Academic Journal of Computing & Information Science, 2023, 6(8); doi: 10.25236/AJCIS.2023.060808.

Insurance Fraud Detection Based on XGBoost

Author(s)

Haoran Zheng1, Fan Peng2, Yawen Tian3, Zizhou Zhang4, Wenting Zhang5

Corresponding Author

Haoran Zheng

Affiliation(s)

1Software Engineering, Shandong University of Technology, Zibo, Shandong, China

2Internet of Things Project, Hebei University of Technology, Tianjin, China

3McMaster University, Toronto, Ontario, Canada

4Nottingham University, Ningbo, Zhejiang, China

5Qihua Academy Nanchang, Nanchang, Jiangxi, China

Abstract

This study predicts customer car insurance claims using Gradient Boosting Decision Tree (GBDT) and XGBoost models. The workflow covers data exploration, feature engineering, model evaluation, and parameter tuning. The dataset was first explored by variable type and missing values, then processed through mean encoding and outlier removal. Date fields were transformed to derive more informative features. The two models, GBDT and XGBoost, were trained and compared by their AUC (Area Under the Curve) values. Both models showed good predictive power, with GBDT slightly outperforming XGBoost. The results may inform further research on insurance claim prediction and its practical application.
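The preprocessing and evaluation steps named in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the toy data, the function names (`mean_encode`, `days_between`, `auc`), and the choice of elapsed days as the date feature are all assumptions made for illustration, and the model training step is omitted.

```python
from collections import defaultdict
from datetime import date


def mean_encode(categories, targets):
    """Replace each category with the mean target value observed for it."""
    sums, counts = defaultdict(float), defaultdict(int)
    for cat, y in zip(categories, targets):
        sums[cat] += y
        counts[cat] += 1
    return [sums[cat] / counts[cat] for cat in categories]


def days_between(start, end):
    """Derive a numeric feature from two dates: the number of elapsed days."""
    return (end - start).days


def auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs in which the
    positive example has the higher score; ties count as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data (not from the paper): vehicle type and claim flag.
vehicle = ["sedan", "suv", "sedan", "truck", "suv", "sedan"]
claimed = [0, 1, 0, 1, 1, 1]

encoded = mean_encode(vehicle, claimed)                       # one number per row
policy_age = days_between(date(2020, 1, 1), date(2020, 3, 1))  # 60 days
score_auc = auc(claimed, encoded)                              # 0.875 on this toy data
```

Here the encoded feature itself is scored with AUC only to show how the metric is computed. In a real workflow the encoding would normally be fitted on training folds only, since computing category means on the same rows that are later evaluated leaks the target into the feature.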

Keywords

GBDT, XGBoost, Machine Learning, Car Insurance Fraud Detection

Cite This Paper

Haoran Zheng, Fan Peng, Yawen Tian, Zizhou Zhang, Wenting Zhang. Insurance Fraud Detection Based on XGBoost. Academic Journal of Computing & Information Science (2023), Vol. 6, Issue 8: 68-74. https://doi.org/10.25236/AJCIS.2023.060808.
