An improved density peaks clustering algorithm based on CURE

Baiyan Chen, Kai Zhou

doi:10.25236/AJCIS.2021.040201

Academic Journal of Computing & Information Science, 2021, 4(2); doi: 10.25236/AJCIS.2021.040201.

Corresponding Author:

Baiyan Chen

Affiliation(s)

School of Computer & Software, Nanjing University of Information Science & Technology, Jiangsu Nanjing, China

Abstract

The clustering by fast search and find of density peaks (DP) algorithm, a recent density-based clustering method, treats every density peak as a potential cluster center. When a single cluster contains multiple density peaks, DP therefore has difficulty determining the correct number of clusters in the data set. To solve this problem, a hybrid density peak clustering algorithm named C-DP is proposed. First, the density peak points are taken as the initial cluster centers and the data set is divided into sub-clusters. Then, borrowing from the Clustering Using Representatives (CURE) algorithm, scattered representative points are selected from each sub-cluster, the sub-clusters that contain the pair of representative points with the smallest distance are merged, and a contraction factor parameter is introduced to control the shape of the clusters. The experimental results show that C-DP clusters better than DP. A comparison of the F-measure index shows that C-DP improves clustering accuracy on data sets that contain multiple density peaks within a single cluster.
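The two stages described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names (`density_peaks`, `representatives`, `c_dp`), the Gaussian density kernel, the farthest-point selection of representatives, the stopping rule (merge until a target `n_clusters` remains), and all default parameter values are assumptions made for illustration; the abstract does not specify them.

```python
import numpy as np

def density_peaks(X, d_c, n_peaks):
    """DP stage: split X into n_peaks sub-clusters around density peaks."""
    D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    # Local density with a Gaussian kernel (assumed); subtract the self term.
    rho = np.exp(-(D / d_c) ** 2).sum(axis=1) - 1.0
    order = np.argsort(-rho)                  # indices by descending density
    n = len(X)
    delta = np.zeros(n)                       # distance to nearest denser point
    parent = np.full(n, -1)                   # index of that denser point
    delta[order[0]] = D[order[0]].max()
    for rank in range(1, n):
        i = order[rank]
        denser = order[:rank]
        j = denser[np.argmin(D[i, denser])]
        delta[i], parent[i] = D[i, j], j
    gamma = rho * delta                       # peaks have large rho and delta
    gamma[order[0]] = np.inf                  # densest point is always a peak
    peaks = np.argsort(-gamma)[:n_peaks]
    labels = np.full(n, -1)
    labels[peaks] = np.arange(n_peaks)
    for i in order:                           # parents are labeled before children
        if labels[i] == -1:
            labels[i] = labels[parent[i]]
    return labels

def representatives(points, n_rep, alpha):
    """CURE stage: pick scattered points, then shrink them toward the centroid."""
    centroid = points.mean(axis=0)
    chosen = [int(np.argmax(np.linalg.norm(points - centroid, axis=1)))]
    min_dist = np.linalg.norm(points - points[chosen[0]], axis=1)
    while len(chosen) < min(n_rep, len(points)):
        nxt = int(np.argmax(min_dist))        # farthest from those already chosen
        chosen.append(nxt)
        min_dist = np.minimum(min_dist, np.linalg.norm(points - points[nxt], axis=1))
    reps = points[chosen]
    return reps + alpha * (centroid - reps)   # alpha is the contraction factor

def c_dp(X, d_c, n_peaks, n_clusters, n_rep=5, alpha=0.3):
    """Merge DP sub-clusters by closest representative pair (assumed stop rule)."""
    labels = density_peaks(X, d_c, n_peaks)
    clusters = {k: np.flatnonzero(labels == k) for k in range(n_peaks)}
    while len(clusters) > n_clusters:
        reps = {k: representatives(X[idx], n_rep, alpha) for k, idx in clusters.items()}
        keys = list(clusters)
        best = None
        for a in range(len(keys)):
            for b in range(a + 1, len(keys)):
                diff = reps[keys[a]][:, None] - reps[keys[b]][None]
                d = np.linalg.norm(diff, axis=-1).min()
                if best is None or d < best[0]:
                    best = (d, keys[a], keys[b])
        _, ka, kb = best
        clusters[ka] = np.concatenate([clusters[ka], clusters.pop(kb)])
    out = np.empty(len(X), dtype=int)
    for new_label, idx in enumerate(clusters.values()):
        out[idx] = new_label
    return out
```

In this sketch, `alpha = 0` leaves the representative points on the boundary of each sub-cluster, while `alpha = 1` collapses them onto the centroid, so intermediate values trade sensitivity to outliers against the ability to follow non-spherical cluster shapes.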

Keywords

Density Peak, Hierarchical Clustering, Cluster Merging, Representative Point, Contraction Factor

Cite This Paper

Baiyan Chen, Kai Zhou. An improved density peaks clustering algorithm based on CURE. Academic Journal of Computing & Information Science (2021), Vol. 4, Issue 2: 1-6. https://doi.org/10.25236/AJCIS.2021.040201.
