An Improved K-means Algorithm Based on Structure Features
College of Engineering, Forestry, and Natural Sciences, Northern Arizona University, Arizona, USA.
Abstract—In K-means clustering, we are given a set of n data points in multidimensional space, and the problem is to determine the number k of clusters. In this paper, we present three methods for determining the true number of spherical Gaussian clusters in the presence of additional noise features. Our algorithms take into account the structure of Gaussian data sets and the initial centroids, and each has its own emphasis and characteristics. The first method uses the Minkowski distance as a measure of similarity, which suits the discovery of clusters with non-convex spherical shapes or clusters that differ greatly in size. The second method uses a feature-weighted Minkowski distance, which reflects the differing importance of individual features for the clustering result. The third method combines the Minkowski distance with the best feature factors. We evaluate the methods with a variety of general evaluation indexes on Gaussian data sets with and without noise features. The results show that the algorithms achieve higher precision than the traditional K-means algorithm.
Index Terms—K-means, feature weighting, clustering, cluster validity index.
Cite: Qiang Zhan, "An Improved K-means Algorithm Based on Structure Features," Journal of Software, vol. 12, no. 1, pp. 62-80, 2017.
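The Minkowski and feature-weighted Minkowski distances named in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the paper's exact formulas, so the weighting form used here (each feature's term multiplied by a non-negative weight) is an assumption, and the function names `minkowski`, `weighted_minkowski`, and `nearest_centroid` are hypothetical, chosen only for illustration.

```python
def minkowski(x, y, p=2.0):
    """Minkowski distance of order p between two equal-length points."""
    if len(x) != len(y):
        raise ValueError("points must have the same number of features")
    return sum(abs(a - b) ** p for a, b in zip(x, y)) ** (1.0 / p)


def weighted_minkowski(x, y, weights, p=2.0):
    """Feature-weighted Minkowski distance.

    Assumption: each feature's contribution is scaled by a non-negative
    weight, so a weight of 0 removes a (noise) feature from the distance.
    The paper's actual weighting scheme may differ.
    """
    if not (len(x) == len(y) == len(weights)):
        raise ValueError("points and weights must have the same length")
    total = sum(w * abs(a - b) ** p for a, b, w in zip(x, y, weights))
    return total ** (1.0 / p)


def nearest_centroid(point, centroids, weights, p=2.0):
    """Index of the centroid closest to `point` under the weighted distance."""
    distances = [weighted_minkowski(point, c, weights, p) for c in centroids]
    return distances.index(min(distances))


# With p = 2 and equal weights this reduces to the Euclidean distance
# used by the traditional K-means assignment step.
d_euclid = minkowski((0.0, 0.0), (3.0, 4.0), p=2.0)
# Setting the second feature's weight to 0 ignores it entirely.
d_weighted = weighted_minkowski((0.0, 0.0), (3.0, 4.0), (1.0, 0.0), p=2.0)
```

In this sketch, lowering the weight of a feature shrinks its influence on cluster assignment, which is the intuition behind feature weighting for data sets with noise features.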