Academic Journal of Computing & Information Science, 2021, 4(5); doi: 10.25236/AJCIS.2021.040506.
Data Science and Big Data Technology, Shanxi University of Finance and Economics, Taiyuan, Shanxi, 030000, China
K-means clustering is a classical clustering algorithm and one of the representative methods of unsupervised learning. Its simple idea, high efficiency, and ease of implementation have led to wide use in many fields. However, K-means clustering also has limitations, such as the difficulty of selecting the number of clusters K, the selection of the initial cluster centers, and the detection of outliers. This paper introduces the traditional K-means clustering algorithm and its improved variants in detail, analyzes the advantages and disadvantages of the improved algorithms, and points out the problems that remain. Finally, it discusses the development direction and trends of the K-means algorithm.
K-means algorithm, outliers, improved algorithm
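As a point of reference for the limitations named in the abstract, the traditional algorithm can be sketched in a few lines of Python. This is a minimal illustration of the standard assign-then-update iteration, not one of the improved methods the paper reviews. The names `kmeans` and `squared_distance` and the parameters `max_iter` and `seed` are assumptions made for this example. The caller has to supply `k`, and the initial centers are drawn at random from the data. These are the two choices the abstract lists as difficult.

```python
import random


def squared_distance(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, max_iter=100, seed=0):
    """Traditional K-means on a list of equal-length tuples.

    Returns (centers, labels). The number of clusters k must be chosen
    by the caller, and the initial centers are sampled at random from
    the data, so a different seed may give a different result.
    """
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # random initial centers
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest center.
        new_labels = [
            min(range(k), key=lambda j: squared_distance(p, centers[j]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so the run has converged
        labels = new_labels
        # Update step: each center moves to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # an empty cluster keeps its previous center
                centers[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return centers, labels
```

Calling `kmeans(points, k=2)` on a list of 2-D tuples returns the two centers and one label per point. Because every point, including an outlier, contributes to the mean in the update step, a single distant point can pull a center away from the bulk of its cluster.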
Bao Chong. K-means clustering algorithm: a brief review. Academic Journal of Computing & Information Science (2021), Vol. 4, Issue 5: 37-40. https://doi.org/10.25236/AJCIS.2021.040506.