Academic Journal of Computing & Information Science, 2023, 6(13); doi: 10.25236/AJCIS.2023.061302.
Pan Jinfeng, Hu Xiaoqin
Quanzhou University of Information Engineering, Quanzhou, Fujian, 362000, China
To improve the parallel clustering scheduling of big data held in distributed storage, this study proposes a decision-tree-based parallel clustering scheduling algorithm. The method first constructs a decision tree model to classify the distributed storage data, and then parallelizes that model on the Storm platform to increase classification speed. Finally, building on the classification results, it schedules the clustering of the distributed storage data with an improved BPSO algorithm. Experimental results show that the proposed method achieves a shorter scheduling time than traditional methods.
Decision tree; Distributed storage big data; Parallel clustering scheduling
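The scheduling step summarized in the abstract can be illustrated with a minimal sketch. It assumes that BPSO stands for binary particle swarm optimization and uses the plain sigmoid-transfer variant, not the improved version the paper proposes, whose details are not given here. The task costs, the node count, the bit encoding (each task's node index stored as a fixed number of bits), the makespan objective, and all function and parameter names are hypothetical choices made for illustration only.

```python
import math
import random


def decode(bits, n_tasks, bits_per_task):
    """Turn a flat bit list into one node index per task."""
    nodes = []
    for t in range(n_tasks):
        chunk = bits[t * bits_per_task:(t + 1) * bits_per_task]
        nodes.append(int("".join(map(str, chunk)), 2))
    return nodes


def makespan(bits, costs, n_nodes, bits_per_task):
    """Total load of the busiest node (lower is better)."""
    load = [0.0] * n_nodes
    for cost, node in zip(costs, decode(bits, len(costs), bits_per_task)):
        load[node] += cost
    return max(load)


def bpso_schedule(costs, n_nodes=4, n_particles=20, n_iters=100,
                  w=0.7, c1=1.5, c2=1.5, v_max=4.0, seed=0):
    """Assign tasks to nodes with plain binary PSO (illustrative only)."""
    bits_per_task = max(1, (n_nodes - 1).bit_length())
    if 2 ** bits_per_task != n_nodes:
        # Assumption of this sketch: every bit pattern maps to a valid node.
        raise ValueError("n_nodes must be a power of two (>= 2)")
    rng = random.Random(seed)
    dim = len(costs) * bits_per_task

    def fitness(bits):
        return makespan(bits, costs, n_nodes, bits_per_task)

    # Random initial positions, zero initial velocities.
    pos = [[rng.randint(0, 1) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_fit = [fitness(p) for p in pos]
    g = min(range(n_particles), key=pbest_fit.__getitem__)
    gbest, gbest_fit = pbest[g][:], pbest_fit[g]

    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(dim):
                v = (w * vel[i][d]
                     + c1 * rng.random() * (pbest[i][d] - pos[i][d])
                     + c2 * rng.random() * (gbest[d] - pos[i][d]))
                v = max(-v_max, min(v_max, v))  # clamp the velocity
                vel[i][d] = v
                # Sigmoid transfer: velocity becomes the probability of a 1 bit.
                pos[i][d] = 1 if rng.random() < 1.0 / (1.0 + math.exp(-v)) else 0
            fit = fitness(pos[i])
            if fit < pbest_fit[i]:
                pbest[i], pbest_fit[i] = pos[i][:], fit
                if fit < gbest_fit:
                    gbest, gbest_fit = pos[i][:], fit

    return decode(gbest, len(costs), bits_per_task), gbest_fit


if __name__ == "__main__":
    task_costs = [4, 3, 7, 2, 5, 6, 1, 8]  # hypothetical per-task costs
    assignment, span = bpso_schedule(task_costs)
    print("node per task:", assignment)
    print("makespan:", span)
```

In a pipeline like the one described, the task costs could be derived from the classes produced by the decision tree stage; that link, like the Storm parallelization, is not modeled in this sketch.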
Pan Jinfeng, Hu Xiaoqin. Distributed Storage Big Data Parallel Clustering Scheduling Algorithm Based on Decision Tree. Academic Journal of Computing & Information Science (2023), Vol. 6, Issue 13: 9-13. https://doi.org/10.25236/AJCIS.2023.061302.