The Frontiers of Society, Science and Technology, 2020, 2(8); doi: 10.25236/FSST.2020.020802.

Research on Classification of Imbalanced Data Set Based on TMDSMOTE Algorithm

Author(s)

Wei Sun1, Chen Cheng1,*, Gaiqing Yu1

Corresponding Author:
Chen Cheng
Affiliation(s)

1 Information Engineering College, Shanghai Maritime University, Shanghai 201306, China

*Corresponding Author


Abstract

Chawla et al. proposed the SMOTE algorithm, whose core idea is random oversampling: by artificially constructing positive (minority-class) samples, it brings the numbers of positive and negative samples in the data set toward balance. Many improved variants of SMOTE have since been proposed. Building on this work, this paper proposes an improved algorithm, TMDSMOTE, which takes into account both the marginalization of the sample distribution and the complexity of the algorithm.
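The oversampling idea described above can be illustrated with a minimal sketch of basic SMOTE-style interpolation. This is not the TMDSMOTE algorithm, whose details are not given in this abstract; it is a simplified variant that picks a random minority sample, chooses one of its k nearest minority neighbors, and creates a synthetic point on the line segment between them. The function name `smote`, its parameters, and the toy data are assumptions made for illustration only.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Return n_new synthetic points interpolated between minority samples.

    Hypothetical helper for illustration; not the paper's TMDSMOTE method.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        # Pick a random minority sample as the base point.
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to the base.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(base, p))
        # Choose one of the k nearest neighbors at random.
        neighbor = rng.choice(others[:k])
        # Interpolate at a random position on the segment base -> neighbor.
        gap = rng.random()
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbor)])
    return synthetic


# Toy example (made-up data): 4 minority samples against 10 majority samples.
minority = [[1.0, 1.0], [1.2, 0.9], [0.8, 1.1], [1.1, 1.3]]
majority_count = 10
new_points = smote(minority, n_new=majority_count - len(minority), k=3)
balanced_minority = minority + new_points
```

Because each synthetic point lies between two existing minority samples, the new points stay inside the region the minority class already occupies. Improved variants of SMOTE typically change which base samples and neighbors are eligible for interpolation.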

Keywords

TMDSMOTE Algorithm, Research on Classification

Cite This Paper

Wei Sun, Chen Cheng, Gaiqing Yu. Research on Classification of Imbalanced Data Set Based on TMDSMOTE Algorithm. The Frontiers of Society, Science and Technology (2020) Vol. 2 Issue 8: 5-12. https://doi.org/10.25236/FSST.2020.020802.

References

[1] Chawla N V, Bowyer K W, Hall L O, Kegelmeyer W P (2002). SMOTE: synthetic minority over-sampling technique [J]. Journal of Artificial Intelligence Research, vol.16, pp.321-357.

[2] Guangyuan Deng (2019). Research and development of power transformer vibration monitoring and diagnosis algorithms and system software based on the Internet of Things [D]. Zhejiang University.

[3] Xu Jin, Lei Wang, Guozi Sun, et al (2019). An Undersampling Method for Unbalanced Data Based on Centroid Space [J]. Computer Science, vol.46, no.2, pp.50-55.

[4] Xinai Xu (2018). Recognition and separation algorithm for data overlap between classes of unbalanced fiber sensing data sets [J]. Laser Magazine, vol.39, no.11, pp.120-125.

[5] Xueyan Wen, Liying Zhao, Kesheng Xu, et al (2018). Application of Improved MDSMOTE and FC-SVM in Imbalanced Data Set Classification [J]. Journal of Harbin University of Science and Technology, vol.23, no.4, pp.87-94.

[6] Pengfei Zhang, Yigui Wang, Zhijun Zhang (2019). Research on personalized recommendation algorithm integrating tags and multiple information [J]. Computer Engineering and Applications, vol.55, no.5, pp.159-165.

[7] Aiying Yin, Yunbing Wu, Xiaohua Yang (2018). Hybrid sampling algorithm for unbalanced data for manufacturing industry [J]. Computer Engineering and Design, vol.39, no.4, pp.1053-1058.

[8] Xueyan Wen, Jianan Chen, Weipeng Jing, et al (2018). Optimization research on classification model for imbalanced data sets [J]. Computer Engineering, vol.44, no.4, pp.268-273 + 293.

[9] Wei Yi, Li Mao, Jun Sun, Linhai Wu (2018). Research on classification of improved SMOTE algorithm on imbalanced data sets [J]. Computer and Modernization, no.3, pp.83-88.

[10] Guoquan Wang (2017). Research on feature selection algorithm for high-dimensional unbalanced data [D]. Harbin Institute of Technology.

[11] Qinghua Zhao, Yihao Zhang, Jianfen Ma, et al (2018). Research on Improved SMOTE Classification Algorithm for Non-balanced Data Sets [J]. Computer Engineering and Applications, vol.54, no.18, pp.168-173.

[12] Yan Zhang (2017). Research on outlier detection for unbalanced data [D]. Qingdao University of Science and Technology.

[13] Yan Li, Yihua Li, Jinhuan Wang (2017). A new music personalized recommendation algorithm based on LDA-MURE model [J]. Journal of Jilin University (Science Edition), vol.55, no.2, pp.371-375.

[14] Yunyi Pei (2016). A Study on Affective Analysis of Chinese Travel Reviews [D]. Beijing Jiaotong University.

[15] Huizhen Zhao, Fuxian Liu, Longyue Li (2016). Collaborative fuzzy C-means algorithm for K-nearest neighbor estimation coordination coefficients [J]. Computer Engineering and Applications, vol.52, no.19, pp.19-24 + 30.

[16] Ruolei Chen (2013). Research on Prediction Methods of Inherently Irregular Protein Structures Based on Multi-scale and Multi-feature [D]. Harbin Engineering University.

[17] Lina Liu, Zhilou Yu, Huaxiang Zhang (2011). Dimension reduction method for imbalanced data sets [J]. Information Technology and Informatization, no.5, pp.62-64.

[18] Shufeng Yang (2009). Application of classification technology in medical diagnosis [D]. Shantou University.

[19] Chaoxue Wang, Zhengmao Pan, Lili Dong, et al (2013). Research on classification of unbalanced data sets based on improved SMOTE [J]. Computer Engineering and Applications, vol.49, no.2, pp.184-187.