Academic Journal of Business & Management, 2023, 5(5); doi: 10.25236/AJBM.2023.050501.

A used car repricing method based on K-Means++ clustering and multiple linear regression


Jiajia Meng

Corresponding Author:
Jiajia Meng

School of Management, Qufu Normal University, Rizhao, Shandong, 276826, China


A predictive pricing method based on K-Means++ clustering and multiple linear regression is proposed to address two problems: the unreasonable initial pricing of used cars and the low accuracy of existing evaluation models. First, all vehicles are divided into three categories according to the transaction period of the data sample. The K-Means++ clustering algorithm is then used to cluster the best-selling cars, and a multiple linear regression equation is fitted for each resulting category and used for prediction and evaluation. Finally, the revaluation of unsold and unsalable vehicles is examined. Two other used car valuation methods, XGBoost and AdaBoost, are included for comparison. The experiments show that the model performs well, which suggests that clustering-based multiple linear regression is a reasonable way to estimate used car prices.
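The cluster-then-regress idea described in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation. The feature set (age, mileage), the synthetic prices, the choice of two clusters, the helper names `kmeans_pp`, `fit_ols` and `predict`, and the rule of assigning a vehicle to its nearest cluster center before pricing it are all assumptions made here for illustration only.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans_pp(points, k, iters=100, seed=0):
    """Lloyd's k-means with k-means++ seeding. Returns (labels, centers)."""
    rng = random.Random(seed)
    centers = [list(rng.choice(points))]
    while len(centers) < k:
        # Squared distance from each point to its nearest chosen center.
        d2 = [min(sq_dist(p, c) for c in centers) for p in points]
        r, acc = rng.random() * sum(d2), 0.0
        for p, w in zip(points, d2):
            acc += w
            if acc >= r:
                break
        centers.append(list(p))  # drawn with probability proportional to d2
    labels = None
    for _ in range(iters):
        new = [min(range(k), key=lambda j: sq_dist(p, centers[j])) for p in points]
        if new == labels:  # assignments stable: converged
            break
        labels = new
        for j in range(k):
            members = [p for p, l in zip(points, labels) if l == j]
            if members:
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centers

def fit_ols(X, y):
    """Least-squares coefficients [intercept, b1, ..., bp] via the normal equations."""
    A = [[1.0] + list(row) for row in X]
    n = len(A[0])
    # Augmented matrix [A^T A | A^T y].
    M = [[sum(a[i] * a[j] for a in A) for j in range(n)]
         + [sum(a[i] * t for a, t in zip(A, y))] for i in range(n)]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        for r in range(n):
            if r != c:
                f = M[r][c] / M[c][c]
                M[r] = [u - f * v for u, v in zip(M[r], M[c])]
    return [M[i][n] / M[i][i] for i in range(n)]

def predict(x, centers, models):
    """Price a vehicle with the regression of its nearest cluster (an assumption)."""
    j = min(range(len(centers)), key=lambda c: sq_dist(x, centers[c]))
    b = models[j]
    return b[0] + sum(bi * xi for bi, xi in zip(b[1:], x))

# Synthetic data (not from the paper): features are (age in years, mileage in 10,000 km).
newer = [(1, 1), (2, 1), (1, 2), (3, 2), (2, 3)]
older = [(10, 15), (11, 17), (12, 15), (10, 16), (12, 17)]
X = newer + older
y = ([20 - 1.0 * a - 0.5 * m for a, m in newer]
     + [12 - 0.4 * a - 0.2 * m for a, m in older])

labels, centers = kmeans_pp(X, k=2)
models = {}
for j in set(labels):
    idx = [i for i, l in enumerate(labels) if l == j]
    models[j] = fit_ols([X[i] for i in idx], [y[i] for i in idx])

print(predict((2, 2), centers, models))
```

Fitting one regression per cluster lets each group of similar vehicles have its own coefficients, rather than forcing a single global linear relationship onto all cars. In practice the features would normally be standardized before clustering, and a library implementation would be preferable to these hand-written routines.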


Used cars; Transaction period; K-Means++ clustering; Multiple linear regression

Cite This Paper

Jiajia Meng. A used car repricing method based on K-Means++ clustering and multiple linear regression. Academic Journal of Business & Management (2023) Vol. 5, Issue 5: 1-8. https://doi.org/10.25236/AJBM.2023.050501.

