Kejia Zhu1, Pengzhou Fang2, Haocheng Li3, Ruihan Shi4, Yutian Shi5, Yanyuzi Chen6
1University of Nottingham Ningbo China, Ningbo, Zhejiang, China
2University of Toronto, Toronto, Ontario, Canada
3Beijing World Youth Academy, Beijing, China
4Southwest Jiaotong University, Chengdu, Sichuan, China
5Shanghai University of Finance and Economics, Shanghai, China
6The High School Attached to Northwest Normal University, Lanzhou, Gansu, China
These authors contributed equally to this work
Effectively evaluating and identifying the potential fraud risk of borrowers, and calculating their probability of fraud, are fundamental steps in credit risk management that modern financial institutions take before issuing loans. This paper presents a statistical analysis of the historical loan data of financial institutions based on the idea of unbalanced data classification, and establishes prediction models of loan fraud using random forest, decision tree and regression algorithms. The random forest algorithm achieves better prediction performance than the other two methods. Additionally, ranking the importance of the features identifies those that have a remarkable impact on the fraud outcome, which supports more effective judgment of credit risk in the financial field.
Random forest, Bank reference, Prediction of loan fraud, Data mining
Kejia Zhu, Pengzhou Fang, Haocheng Li, Ruihan Shi, Yutian Shi, Yanyuzi Chen. Identification and Prediction Methods of Financial Anti-fraud. Academic Journal of Business & Management (2021) Vol. 3, Issue 8: 39-45. https://doi.org/10.25236/AJBM.2021.030808.