<?xml version="1.0" encoding="UTF-8"?>
<collection xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.loc.gov/MARC21/slim http://www.loc.gov/standards/marcxml/schema/MARC21slim.xsd" xmlns="http://www.loc.gov/MARC21/slim">
 <record>
  <leader>00000ctm a22000004a 4500</leader>
  <controlfield tag="001">UP-8027390931316151456</controlfield>
  <controlfield tag="003">Buklod</controlfield>
  <controlfield tag="005">20071106064409.0</controlfield>
  <controlfield tag="006">a     r    |||| u|</controlfield>
  <controlfield tag="007">ta</controlfield>
  <controlfield tag="008">071106s        xx     d     r    |||| u|</controlfield>
  <datafield tag="035" ind1=" " ind2=" ">
   <subfield code="a">(iLib)UPMIN-00000014741</subfield>
  </datafield>
  <datafield tag="040" ind1=" " ind2=" ">
   <subfield code="a">DLC</subfield>
   <subfield code="c">DLC</subfield>
   <subfield code="d">upmin</subfield>
  </datafield>
  <datafield tag="041" ind1=" " ind2=" ">
   <subfield code="a">eng</subfield>
  </datafield>
  <datafield tag="090" ind1=" " ind2=" ">
   <subfield code="a">LG993.5 2006</subfield>
   <subfield code="b">A64 Y36</subfield>
  </datafield>
  <datafield tag="100" ind1="1" ind2=" ">
   <subfield code="a">Yap, Toni Kathrina L.</subfield>
  </datafield>
  <datafield tag="245" ind1="0" ind2="0">
   <subfield code="a">On stability of optimal clusters</subfield>
   <subfield code="c">Toni Kathrina L. Yap.</subfield>
  </datafield>
  <datafield tag="264" ind1=" " ind2="1">
   <subfield code="c">2006</subfield>
  </datafield>
  <datafield tag="300" ind1=" " ind2=" ">
   <subfield code="a">84 leaves</subfield>
  </datafield>
  <datafield tag="502" ind1=" " ind2=" ">
   <subfield code="a">Thesis (BS Applied Mathematics) -- University of the Philippines Mindanao, 2006</subfield>
  </datafield>
  <datafield tag="520" ind1=" " ind2=" ">
   <subfield code="a">Evaluating the quality of clustering results is important in cluster analysis. Two widely used approaches to this evaluation are the validity index and the stability index. This paper compares the stable number of clusters based on the stability index M(K) proposed by Levine and Domany (2001) with the optimal number of clusters from eight validity indices, namely the C index, DB index, Dunn index, Silhouette index, SD index, SD index, CH index, and KL index. The stability index M(K) identified the stable number of clusters with varying dilution factors. The K-means algorithm was used to cluster five data sets: Iris, E. coli, processed Cleveland heart disease, glass, and water treatment. The quality of the clustering results, represented by the number of computed clusters, was evaluated using the validity indices and the stability index. The results of the analysis showed that the stability index was consistent in identifying stable clustering at K = 2 for all the data sets. On the other hand, the eight validity indices performed differently for different data sets. Only the Dunn index and the CH index were consistent with the stability index in identifying the optimal number of clusters that best fit the data. The SD index, SD index, and Silhouette index performed fairly well in identifying optimal numbers of clusters that are stable clustering solutions for all the data sets. The C index and the DB index were likely to favor a large number of clusters. The KL index performed worst among the indices, identifying the stable number of clusters for only two of the data sets used.</subfield>
  </datafield>
  <datafield tag="650" ind1=" " ind2="0">
   <subfield code="a">Clustering.</subfield>
  </datafield>
  <datafield tag="650" ind1=" " ind2="0">
   <subfield code="a">K-Means clustering.</subfield>
  </datafield>
  <datafield tag="650" ind1=" " ind2="0">
   <subfield code="a">Optimal clusters.</subfield>
  </datafield>
  <datafield tag="650" ind1=" " ind2="0">
   <subfield code="a">Resampling.</subfield>
  </datafield>
  <datafield tag="650" ind1=" " ind2="0">
   <subfield code="a">Validity indices.</subfield>
  </datafield>
  <datafield tag="650" ind1=" " ind2="0">
   <subfield code="a">K-Mutual-Neighbor Criterion.</subfield>
  </datafield>
  <datafield tag="650" ind1=" " ind2="0">
   <subfield code="a">Data normalization.</subfield>
  </datafield>
  <datafield tag="658" ind1=" " ind2=" ">
   <subfield code="a">Undergraduate Thesis</subfield>
   <subfield code="c">AMAT200</subfield>
   <subfield code="2">BSAM.</subfield>
  </datafield>
  <datafield tag="905" ind1=" " ind2=" ">
   <subfield code="a">FI</subfield>
  </datafield>
  <datafield tag="905" ind1=" " ind2=" ">
   <subfield code="a">UP</subfield>
  </datafield>
  <datafield tag="852" ind1="0" ind2=" ">
   <subfield code="a">UPMIN</subfield>
   <subfield code="b">UPMIN-MAIN</subfield>
   <subfield code="h">LG993.5 2006 A64 Y36</subfield>
  </datafield>
  <datafield tag="942" ind1=" " ind2=" ">
   <subfield code="a">Thesis</subfield>
  </datafield>
 </record>
</collection>
