Academic Journal of Computing & Information Science, 2023, 6(6); doi: 10.25236/AJCIS.2023.060612.

An Improved Ball K-Means Clustering Method Based on SOM

Author(s)

Peilun Han, Junxiu An

Corresponding Author

Junxiu An

Affiliation(s)

Chengdu University of Information Technology, Chengdu, China

Abstract

As the dimensionality and volume of sample data grow, the computational cost of the K-Means clustering algorithm rises sharply. Ball K-Means, a recently proposed accelerated exact variant of K-Means, reduces this cost. However, like K-Means, it lacks a global search capability: because the result depends on the initial centers, which Ball K-Means selects at random, the clustering may converge to a local optimum. Choosing suitable initial centers is therefore the key to improving the algorithm. A self-organizing map (SOM) can cluster the data quickly and determine the approximate range of each cluster, and its output can then serve as the initial center vectors. This article accordingly uses an SOM network to preprocess the data and obtain the initial cluster centers for Ball K-Means, which significantly improves the clustering quality of the algorithm. Taking intrusion detection as an example, experiments verify the effectiveness and superiority of the method.
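
The two-stage idea in the abstract (train an SOM, then pass its node weights to the clustering step as initial centers) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not the authors' implementation: a one-dimensional SOM with one node per cluster, linearly decaying learning rate and neighbourhood radius, synthetic two-dimensional data, and plain Lloyd-style K-Means standing in for Ball K-Means. The function names `train_som`, `kmeans`, and `inertia` are hypothetical.

```python
import numpy as np


def sq_dists(data, centers):
    """Squared Euclidean distance from every sample to every center."""
    return ((data[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)


def inertia(data, centers):
    """Sum of squared distances from each sample to its nearest center."""
    return sq_dists(data, centers).min(axis=1).sum()


def train_som(data, n_units, n_iters=2000, lr0=0.5, seed=0):
    """Train a 1-D SOM (assumed topology) and return its node weights."""
    rng = np.random.default_rng(seed)
    # Initialise node weights from randomly chosen samples.
    weights = data[rng.choice(len(data), n_units, replace=False)].astype(float)
    positions = np.arange(n_units)            # node coordinates on the map
    sigma0 = max(n_units / 2.0, 1.0)          # assumed initial radius
    for t in range(n_iters):
        frac = t / n_iters
        lr = lr0 * (1.0 - frac)               # decaying learning rate
        sigma = sigma0 * (1.0 - frac) + 1e-2  # shrinking neighbourhood
        x = data[rng.integers(len(data))]
        bmu = np.argmin(((weights - x) ** 2).sum(axis=1))  # best-matching unit
        h = np.exp(-((positions - bmu) ** 2) / (2.0 * sigma ** 2))
        weights += lr * h[:, None] * (x - weights)
    return weights


def kmeans(data, init_centers, max_iters=100, tol=1e-6):
    """Plain Lloyd iterations (a stand-in for Ball K-Means)."""
    centers = init_centers.copy()
    for _ in range(max_iters):
        labels = sq_dists(data, centers).argmin(axis=1)
        new_centers = centers.copy()
        for k in range(len(centers)):
            members = data[labels == k]
            if len(members):                  # an empty cluster keeps its center
                new_centers[k] = members.mean(axis=0)
        shift = np.abs(new_centers - centers).max()
        centers = new_centers
        if shift < tol:
            break
    # Recompute labels so they match the returned centers.
    return centers, sq_dists(data, centers).argmin(axis=1)


# Synthetic data: three Gaussian blobs (illustrative only).
rng = np.random.default_rng(42)
blob_means = np.array([[0.0, 0.0], [5.0, 5.0], [0.0, 5.0]])
data = np.vstack([m + 0.3 * rng.standard_normal((100, 2)) for m in blob_means])

init = train_som(data, n_units=3)             # stage 1: SOM gives initial centers
centers, labels = kmeans(data, init)          # stage 2: refine with K-Means
print("inertia at SOM init:", inertia(data, init))
print("inertia after K-Means:", inertia(data, centers))
```

Because the refinement step never increases the sum of squared distances, the second printed value is no larger than the first. How much the SOM initialization helps compared with random initialization depends on the data and on the SOM settings, which this sketch does not attempt to tune.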

Keywords

Data mining, Clustering, Ball K-Means, SOM network

Cite This Paper

Peilun Han, Junxiu An. An Improved Ball K-Means Clustering Method Based on SOM. Academic Journal of Computing & Information Science (2023), Vol. 6, Issue 6: 80-83. https://doi.org/10.25236/AJCIS.2023.060612.
