Insurance Fraud Detection Based on XGBoost

Academic Journal of Computing & Information Science, 2023, 6(8); doi: 10.25236/AJCIS.2023.060808.

Author(s)

Haoran Zheng¹, Fan Peng², Yawen Tian³, Zizhou Zhang⁴, Wenting Zhang⁵

Corresponding Author:

Haoran Zheng

Affiliation(s)

¹Software Engineering, Shandong University of Technology, Zibo, Shandong, China

²Internet of Things Project, Hebei University of Technology, Tianjin, China

³McMaster University, Toronto, Ontario, Canada

⁴Nottingham University, Ningbo, Zhejiang, China

⁵Qihua Academy Nanchang, Nanchang, Jiangxi, China

Abstract

This study predicts customer car insurance claims using Gradient Boosting Decision Tree (GBDT) and XGBoost models. The workflow covers data exploration, feature engineering, model evaluation, and parameter tuning. The dataset was first examined by variable type and missing values, then processed with mean encoding and outlier removal. Date fields were transformed into more informative features. Both models were trained and compared by their AUC (Area Under the Curve) values. Both showed good predictive power, with GBDT slightly outperforming XGBoost. The results offer a practical reference for predicting insurance claims and a basis for further research.
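As a rough illustration of the preprocessing and evaluation steps named in the abstract, the sketch below shows mean encoding of a categorical column, a simple date-derived feature, and a pairwise AUC computation. It is not the authors' implementation: the function names, the smoothing term, the toy data, and the scores are all assumptions made for this example, and the GBDT and XGBoost models themselves are not reproduced here.

```python
from collections import defaultdict
from datetime import date


def mean_encode(categories, targets, smoothing=10.0):
    """Replace each category with a smoothed mean of the binary target.

    The smoothing term is an assumption of this sketch; the abstract only
    says "mean encoding" without giving details.
    """
    global_mean = sum(targets) / len(targets)
    sums, counts = defaultdict(float), defaultdict(int)
    for cat, y in zip(categories, targets):
        sums[cat] += y
        counts[cat] += 1
    # Shrink rare categories toward the global mean to limit overfitting.
    table = {
        cat: (sums[cat] + smoothing * global_mean) / (counts[cat] + smoothing)
        for cat in counts
    }
    return [table[cat] for cat in categories], table


def days_between(start, end):
    """Turn two raw dates into one numeric feature (elapsed days)."""
    return (end - start).days


def auc(labels, scores):
    """AUC as the fraction of positive/negative pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data, invented for illustration only.
regions = ["north", "south", "north", "east", "south", "north"]
claimed = [1, 0, 1, 0, 0, 0]
encoded, table = mean_encode(regions, claimed, smoothing=1.0)

# Hypothetical date feature: days from policy start to claim report.
policy_age = days_between(date(2022, 1, 1), date(2022, 3, 1))

# Hypothetical model scores for the six rows above.
scores = [0.9, 0.2, 0.6, 0.7, 0.1, 0.3]
toy_auc = auc(claimed, scores)
```

In practice the encoding table would be fitted on training rows only and then applied to validation rows, since computing it on the rows being scored leaks the target into the feature.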

Keywords

GBDT, XGBoost, Machine Learning, Car Insurance Fraud Detection

Cite This Paper

Haoran Zheng, Fan Peng, Yawen Tian, Zizhou Zhang, Wenting Zhang. Insurance Fraud Detection Based on XGBoost. Academic Journal of Computing & Information Science (2023), Vol. 6, Issue 8: 68-74. https://doi.org/10.25236/AJCIS.2023.060808.
