Jurnal Publikasi STMIK Pontianak

Clustering Algorithm Comparison of Search Results Documents


Abstract- Document clustering is a popular research topic in data mining. This research focuses on building an application that clusters search result documents using three clustering algorithms: Ant Colony Optimization, Forgy, and ISODATA. The application groups search result documents so that they are easier to work with. The clustered documents are journal articles, theses, thesis proposals, and ebooks. Indexing and searching of the documents use Apache Lucene, a search engine library. The Ant Colony Optimization algorithm is compared with the partitioning clustering algorithms Forgy and ISODATA in terms of clustering processing time, variance, and the sum of squared errors. Experiments were conducted on groups of documents and datasets. The results show that the three methods yield identical variance and produce clusters with high intraclass similarity and low interclass similarity. Of the three algorithms, Ant Colony Optimization takes the longest to cluster the documents.


Keywords: Document Clustering, Ant Colony Optimization, Forgy, ISODATA.

By David, Raymondus Raymond Kosala