Distance Metrics Selection Validity in Cluster Analysis
2011
Peter Grabusts

In cluster analysis, data are divided into groups according to a specific criterion called a metric. Traditionally, the metric of choice has been Euclidean distance. This article studies other distance metrics used in cluster analysis: Manhattan distance, cosine distance and the Pearson correlation measure. These metrics were used in the k-means clustering algorithm to determine cluster centers, and the correctness of the clustering was evaluated. The clustering results were found to be very similar. The article also considers the evaluation of clustering validity criteria.
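
The four measures named in the abstract can be sketched as follows. This is a minimal illustration and not the article's implementation: the function names are hypothetical, and cosine and Pearson distance are assumed to follow the common "1 minus similarity" convention, which this record does not confirm.

```python
import math

def euclidean(a, b):
    # Straight-line distance between two equal-length vectors.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def manhattan(a, b):
    # Sum of absolute coordinate differences.
    return sum(abs(x - y) for x, y in zip(a, b))

def cosine_distance(a, b):
    # Assumed convention: 1 - cosine similarity.
    # Zero-length vectors are not handled in this sketch.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)

def pearson_distance(a, b):
    # Assumed convention: 1 - Pearson correlation, computed as the
    # cosine distance between the mean-centered vectors.
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    centered_a = [x - mean_a for x in a]
    centered_b = [y - mean_b for y in b]
    return cosine_distance(centered_a, centered_b)

def nearest_center(point, centers, distance):
    # Assignment step of k-means: index of the closest center
    # under the chosen distance function.
    return min(range(len(centers)), key=lambda i: distance(point, centers[i]))
```

Passing a different distance function to `nearest_center` changes only the assignment step; how the centers are then updated for non-Euclidean measures is a separate design choice that this sketch leaves open.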


Keywords
clustering algorithms, cluster validity, k-means, metrics
DOI
10.2478/v10143-011-0045-y
Hyperlink
http://www.degruyter.com/view/j/acss.2011.45.issue--1/v10143-011-0045-y/v10143-011-0045-y.xml?format=INT

Grabusts, P. Distance Metrics Selection Validity in Cluster Analysis. Informācijas tehnoloģija un vadības zinātne. No. 49, 2011, pp. 72-77. ISSN 1407-7493. Available from: doi:10.2478/v10143-011-0045-y

Publication language
English (en)