<?xml version="1.0" encoding="UTF-8"?>
<collection xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.loc.gov/MARC21/slim http://www.loc.gov/standards/marcxml/schema/MARC21slim.xsd" xmlns="http://www.loc.gov/MARC21/slim">
 <record>
  <leader>00000cab a22000003a 4500</leader>
  <controlfield tag="001">UP-99796217608835589</controlfield>
  <controlfield tag="003">Buklod</controlfield>
  <controlfield tag="005">20231007234834.0</controlfield>
  <controlfield tag="006">m    |o  d |      </controlfield>
  <controlfield tag="007">ta</controlfield>
  <controlfield tag="008">090520s        xx     d | ||r |||||   ||</controlfield>
  <datafield tag="040" ind1=" " ind2=" ">
   <subfield code="a">DENGII</subfield>
  </datafield>
  <datafield tag="041" ind1=" " ind2=" ">
   <subfield code="a">eng</subfield>
  </datafield>
  <datafield tag="100" ind1="0" ind2=" ">
   <subfield code="a">Stephens, John M.</subfield>
  </datafield>
  <datafield tag="245" ind1="0" ind2="0">
   <subfield code="a">MUDD</subfield>
   <subfield code="b">a multi-dimensional data generator.</subfield>
  </datafield>
  <datafield tag="300" ind1=" " ind2=" ">
   <subfield code="a">pp. 104-109</subfield>
  </datafield>
  <datafield tag="520" ind1=" " ind2=" ">
   <subfield code="a">Today's business intelligence systems consist of hundreds of processors with disk subsystems able to handle multiple Giga-bytes of IO-bandwidth. These systems usually contain terabytes of data. Evaluating database system performance of such systems often requires generating synthetic data with well defined statistical properties. To simulate different scenarios, it is important to vary statistical properties including row counts of tables. Foremost, in order to analyze large scale systems, data generators need to be able to produce hundreds of terabytes of data in a timely fashion. In this paper we present MUDD, a multi-dimensional data generator. Originally designed for TPC-DS, a decision support benchmark being developed by the TPC, MUDD is able to generate up to 100 Terabyte of flat file data in hours, utilizing modern multi processor architectures, including clusters. Its novel design separates data generation algorithms from data distribution definitions, enabling users to adjust their workload to individual needs and different scenarios.</subfield>
  </datafield>
  <datafield tag="653" ind1=" " ind2=" ">
   <subfield code="a">Design.</subfield>
  </datafield>
  <datafield tag="653" ind1=" " ind2=" ">
   <subfield code="a">Measurement.</subfield>
  </datafield>
  <datafield tag="653" ind1=" " ind2=" ">
   <subfield code="a">Performance.</subfield>
  </datafield>
  <datafield tag="653" ind1=" " ind2=" ">
   <subfield code="a">Standardization.</subfield>
  </datafield>
  <datafield tag="653" ind1=" " ind2=" ">
   <subfield code="a">TPC-DS.</subfield>
  </datafield>
  <datafield tag="653" ind1=" " ind2=" ">
   <subfield code="a">Decision support.</subfield>
  </datafield>
  <datafield tag="653" ind1=" " ind2=" ">
   <subfield code="a">Performance analysis.</subfield>
  </datafield>
  <datafield tag="773" ind1="0" ind2=" ">
   <subfield code="t">Software engineering notes.</subfield>
   <subfield code="g">29, 1 (2004).</subfield>
  </datafield>
  <datafield tag="905" ind1=" " ind2=" ">
   <subfield code="a">FO</subfield>
  </datafield>
  <datafield tag="852" ind1=" " ind2=" ">
   <subfield code="a">UPD</subfield>
   <subfield code="b">DENG-II</subfield>
  </datafield>
  <datafield tag="942" ind1=" " ind2=" ">
   <subfield code="a">Article</subfield>
  </datafield>
 </record>
</collection>
