00000ctm a22000004a 4500 UP-8027390931316151456 Buklod 20071106064409.0 a r |||| u| ta 071106s xx d r |||| u| (iLib)UPMIN-00000014741 DLC DLC upmin eng LG993.5 2006 A64 Y36 Yap, Toni Kathrina L. On stability of optimal clusters Toni Kathrina L. Yap. 2006 84 leaves Thesis (BS Applied Mathematics) -- University of the Philippines Mindanao, 2006 Evaluating the quality of clustering results is important in cluster analysis. Two widely used approaches to evaluating the quality of clustering results are the validity index and the stability index. This paper compares the stable number of clusters based on the stability index M(K) proposed by Levine and Domany (2001) with the optimal number of clusters from eight validity indices, namely the C index, DB index, Dunn index, Silhouette index, SD index, SD index, CH index, and KL index. The stability index M(K) identified the stable number of clusters under varying dilution factors. The K-means algorithm was used to cluster five data sets: Iris, E. coli, processed Cleveland heart disease, glass, and water treatment. The quality of the clustering results, represented by the number of computed clusters, was evaluated using the validity indices and the stability index. Results of the analysis showed that the stability index was consistent in identifying stable clustering at K = 2 for all the data sets. On the other hand, the eight validity indices performed differently for different data sets. Only the Dunn index and the CH index were consistent with the stability index in identifying the optimal number of clusters that best fit the data. The SD index, SD index, and Silhouette index performed fairly well in identifying optimal numbers of clusters that are stable clustering solutions for all the data sets. The C index and the DB index were likely to favor a large number of clusters. 
The KL index performed worst among the indices, identifying the stable number of clusters for only two of the data sets used. Clustering. K-Means clustering. Optimal clusters. Resampling. Validity indices. K-Mutual-Neighbor Criterion. Data normalization. Undergraduate Thesis AMAT200 BSAM. FI UP UPMIN UPMIN-MAIN LG993.5 2006 A64 Y36 Thesis