Frontiers in Medical Science Research, 2021, 3(6); doi: 10.25236/FMSR.2021.030602.

Classification for the analysis of incomplete medical data

Author(s)

Ximing Ma1, Qi An2

Corresponding Author:
Ximing Ma
Affiliation(s)

1College of Data Science and Application, Inner Mongolia University of Technology, Huhhot, China

2School of Mathematical Sciences, Nanjing Normal University, Nanjing, China

Abstract

Missing data are common in practice, and how they are handled is key to classification. Completing the missing values from the existing reliable data is therefore a common and necessary step, and the choice of imputation method strongly affects how well the fuzziness and uncertainty in a data set are handled. This paper presents a new classification method for incomplete data. First, the center vector representing each class is calculated from the training samples. The missing values are then estimated using the center of each class. The performance of three different imputation (interpolation) methods is compared on different test data sets, and the results show that the proposed method performs best overall.
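
The two steps described in the abstract (computing a center vector per class, then estimating missing values from that center) can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes that missing entries are encoded as NaN, that a class center is the per-feature mean of the observed values in that class, and that each incomplete training sample is filled from the center of its own class. The function names `class_centers` and `impute_with_centers` and the toy data are hypothetical.

```python
import numpy as np

def class_centers(X, y):
    """Return {label: center vector}; each feature is averaged over its observed (non-NaN) values."""
    return {label: np.nanmean(X[y == label], axis=0) for label in np.unique(y)}

def impute_with_centers(X, y, centers):
    """Fill every NaN with the corresponding feature of the sample's own class center."""
    X_filled = X.copy()
    for label, center in centers.items():
        rows = np.where(y == label)[0]
        block = X_filled[rows]
        mask = np.isnan(block)
        # Broadcast the center across the rows of this class and copy it into the gaps.
        block[mask] = np.broadcast_to(center, block.shape)[mask]
        X_filled[rows] = block
    return X_filled

# Toy training set (hypothetical): two classes, two features, two missing entries.
X_train = np.array([[1.0, 2.0],
                    [3.0, np.nan],
                    [10.0, 20.0],
                    [np.nan, 40.0]])
y_train = np.array([0, 0, 1, 1])

centers = class_centers(X_train, y_train)
X_imputed = impute_with_centers(X_train, y_train, centers)
print(X_imputed)
```

The completed matrix could then be passed to any of the classifiers named in the keywords. How the paper assigns a center to a test sample whose class is unknown is not stated in the abstract, so that step is left out of the sketch.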

Keywords

interpolation, missing data, K-nearest neighbors, Decision Tree, Random Forest

Cite This Paper

Ximing Ma, Qi An. Classification for the analysis of incomplete medical data. Frontiers in Medical Science Research (2021) Vol. 3 Issue 6: 6-10. https://doi.org/10.25236/FMSR.2021.030602.