Academic Journal of Computing & Information Science, 2022, 5(10); doi: 10.25236/AJCIS.2022.051009.
Xuehan Peng
School of Statistics, Capital University of Economics and Business, Beijing, 100000, China
With the continued spread of installment payment on e-commerce platforms and of P2P credit platforms, personal credit risk assessment is becoming increasingly important for both e-commerce businesses and users. Taking data from the Tianchi competition platform as a sample, this paper constructs an index system for personal credit risk assessment, calculates the combined weight of each index using the CRITIC method and the entropy weight method, and applies these weights to obtain three factors: basic information, credit information, and lending behavior information. Based on these three quantified factors and an algorithm for handling imbalanced samples, XGBoost and LightGBM are used to predict credit risk, and the two methods are found to perform essentially the same. The SHAP-values interpretable machine learning method is then used to explain the importance of the three factors. The empirical results show that the accuracy of both XGBoost and LightGBM exceeds 80%, and that the factors rank in importance as follows: "basic information" first, "lending behavior information" second, and "credit information" third. Finally, the paper puts forward suggestions for enterprise risk control and the stable development of the credit industry.
credit risk assessment, combination weight, SHAP-values, machine learning
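The combined weighting named in the abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation: the function names, the toy data, the use of population standard deviation, and in particular the multiplicative rule for merging the two weight vectors are assumptions made here for illustration, since the abstract does not state how the CRITIC and entropy weights are combined.

```python
import math
from statistics import mean, pstdev


def normalize(values):
    """Scale non-negative scores so they sum to 1 (uniform if all are zero)."""
    total = sum(values)
    if total == 0:
        return [1.0 / len(values)] * len(values)
    return [v / total for v in values]


def min_max(columns):
    """Rescale each indicator column to [0, 1]; larger is assumed to be better."""
    scaled = []
    for col in columns:
        lo, span = min(col), max(col) - min(col)
        scaled.append([(v - lo) / span if span else 0.0 for v in col])
    return scaled


def entropy_weights(columns):
    """Entropy weight method: indicators with lower entropy get more weight."""
    n = len(columns[0])
    divergence = []
    for col in min_max(columns):
        total = sum(col)
        if total == 0:  # a constant indicator carries no information
            divergence.append(0.0)
            continue
        shares = [v / total for v in col]
        entropy = -sum(p * math.log(p) for p in shares if p > 0) / math.log(n)
        divergence.append(1.0 - entropy)
    return normalize(divergence)


def pearson(a, b):
    """Pearson correlation; returns 0 when either series is constant."""
    ma, mb = mean(a), mean(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    if va == 0 or vb == 0:
        return 0.0
    return cov / math.sqrt(va * vb)


def critic_weights(columns):
    """CRITIC: weight = contrast (std. dev.) times conflict (sum of 1 - r)."""
    scaled = min_max(columns)
    info = []
    for col in scaled:
        conflict = sum(1.0 - pearson(col, other) for other in scaled)
        info.append(pstdev(col) * conflict)
    return normalize(info)


def combined_weights(columns):
    """ASSUMPTION: merge the two weight vectors by a normalized product."""
    w_entropy = entropy_weights(columns)
    w_critic = critic_weights(columns)
    return normalize([a * b for a, b in zip(w_entropy, w_critic)])


if __name__ == "__main__":
    # Hypothetical toy data: three indicators (columns) for five borrowers.
    indicators = [
        [3.0, 5.0, 2.0, 8.0, 6.0],
        [40.0, 55.0, 35.0, 70.0, 50.0],
        [1.0, 0.0, 1.0, 0.0, 0.0],
    ]
    print(combined_weights(indicators))
```

Each inner list holds one indicator across all samples, so a factor score for a borrower would be the weighted sum of that borrower's scaled indicator values. Other combination rules, such as a linear average of the two weight vectors, are equally plausible readings of the abstract.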
Xuehan Peng. Research on Personal Credit Risk Assessment Based on Combination Weight and Shap Interpretable Machine Learning. Academic Journal of Computing & Information Science (2022), Vol. 5, Issue 10: 54-59. https://doi.org/10.25236/AJCIS.2022.051009.