Academic Journal of Medicine & Health Sciences, 2023, 4(3); doi: 10.25236/AJMHS.2023.040305.

A Comparative Study of Seven Machine Learning Algorithms for Breast Cancer Detection and Diagnosis

Author(s)

Qinyi Ruan

Corresponding Author

Qinyi Ruan

Affiliation(s)

Department of Mathematics and Applied Mathematics, Wenzhou University, Wenzhou, Zhejiang, China

Abstract

This paper compares seven machine learning (ML) algorithms for predicting the diagnosis of breast cancer: Linear Discriminant Analysis (LDA), Logistic Regression (LR), K-Nearest Neighbor, Decision Tree Classifier, Random Forest Classifier (RFC), Voting Classifier, and Support Vector Machine. The study used the 30 histological tumor features in the dataset, which were obtained from digital imaging of fine needle aspirates of breast tumor cell masses, and the algorithms achieved an accuracy of approximately 95%. LDA and RFC outperformed the other algorithms in diagnostic accuracy. The results also suggest that diagnostic outcomes are more stable with large-scale data. Finally, after Principal Component Analysis (PCA), the accuracy of LR was below 85%, lower than the accuracy achieved without dimensionality reduction.
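The comparison described in the abstract can be illustrated with a short sketch. The code below is not the author's implementation. It assumes that scikit-learn's bundled copy of the Wisconsin diagnostic dataset (30 features) is a reasonable stand-in for the paper's data. The train/test split, the feature scaling, every hyperparameter, the members of the voting ensemble, and the number of principal components are illustrative choices that the abstract does not specify, so the printed accuracies need not match the reported figures.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.decomposition import PCA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Assumption: scikit-learn's bundled Wisconsin diagnostic data
# (569 samples, 30 features) stands in for the paper's dataset.
X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)


def scaled(estimator):
    """Standardize features before fitting (a choice made here, not taken from the paper)."""
    return make_pipeline(StandardScaler(), estimator)


# The seven algorithms named in the abstract, with illustrative settings.
models = {
    "LDA": LinearDiscriminantAnalysis(),
    "LR": scaled(LogisticRegression(max_iter=1000)),
    "KNN": scaled(KNeighborsClassifier(n_neighbors=5)),
    "DTC": DecisionTreeClassifier(random_state=0),
    "RFC": RandomForestClassifier(n_estimators=100, random_state=0),
    "SVM": scaled(SVC(kernel="rbf")),
}
# Hard-voting ensemble; which base models it combines is an assumption.
models["VC"] = VotingClassifier(
    estimators=[(name, models[name]) for name in ("LR", "KNN", "SVM")],
    voting="hard",
)
# Logistic regression after PCA; 2 components is an arbitrary choice.
models["PCA+LR"] = make_pipeline(
    StandardScaler(), PCA(n_components=2), LogisticRegression(max_iter=1000)
)

accuracies = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    accuracies[name] = model.score(X_test, y_test)  # fraction of correct test predictions
    print(f"{name:7s} accuracy = {accuracies[name]:.3f}")
```

A single held-out split gives only one accuracy estimate per model. Repeating the split or using cross-validation would show how much each estimate varies with the data available.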

Keywords

Machine Learning (ML), Linear Discriminant Analysis, Logistic Regression, K-Nearest Neighbor, Decision Tree Classifier, Random Forest Classifier, Voting Classifier, Support Vector Machine, Breast Cancer, Principal Component Analysis

Cite This Paper

Qinyi Ruan. A Comparative Study of Seven Machine Learning Algorithms for Breast Cancer Detection and Diagnosis. Academic Journal of Medicine & Health Sciences (2023) Vol. 4, Issue 3: 25-33. https://doi.org/10.25236/AJMHS.2023.040305.
