Academic Journal of Engineering and Technology Science, 2023, 6(5); doi: 10.25236/AJETS.2023.060505.
Jinzhi Fan¹, Ziyi Fan²
¹Changsha University of Science & Technology, Changsha 410114, China
²Changzhi University, Changzhi 046011, China
For the setting in which the predictor is a time series and the response is a continuous scalar, we propose a time series regression model based on improved PCA and Bagging Algorithms. Unlike standard PCA dimension reduction, the proposed method uses the distance correlation coefficient matrix instead of the Pearson correlation coefficient matrix, which relaxes the distributional assumptions on the original variables. Because PCA is an unsupervised dimension reduction technique and the link functions between the principal components and the response variable are unknown, we use Bagging Algorithms to capture the information in the principal components that is related to the response. In the real-data analysis, the comparison methods are LASSO and PCA-based linear models, and the empirical results show that the proposed method is competitive with them. Finally, because Bagging Algorithms place no restriction on the form of the base model, machine learning methods with higher precision and flexibility can be used as the base model for data tasks of differing complexity.
Time Series Data Regression, Distance Correlation Coefficient, PCA, Bagging Algorithms
Jinzhi Fan, Ziyi Fan. A Time Series Regression Model via Improved PCA and Bagging Algorithms. Academic Journal of Engineering and Technology Science (2023) Vol. 6, Issue 5: 23-29. https://doi.org/10.25236/AJETS.2023.060505.
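The pipeline summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the sample (V-statistic) distance correlation, an eigendecomposition of the resulting matrix in place of the Pearson correlation matrix, and a k-nearest-neighbour regressor as a placeholder base model for bagging. The function names (`distance_correlation`, `fit_dcor_pca`, `transform`, `knn_predict`, `bagging_predict`), the simulated data, and all parameter values are hypothetical choices made for illustration only.

```python
import numpy as np


def distance_correlation(x, y):
    """Sample distance correlation between two 1-D arrays (V-statistic form)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Pairwise absolute distances within each variable.
    a = np.abs(x[:, None] - x[None, :])
    b = np.abs(y[:, None] - y[None, :])
    # Double centering: subtract row and column means, add back the grand mean.
    A = a - a.mean(axis=0, keepdims=True) - a.mean(axis=1, keepdims=True) + a.mean()
    B = b - b.mean(axis=0, keepdims=True) - b.mean(axis=1, keepdims=True) + b.mean()
    dcov2 = (A * B).mean()
    denom = np.sqrt((A * A).mean() * (B * B).mean())
    if denom == 0.0:  # at least one variable is constant
        return 0.0
    return float(np.sqrt(max(dcov2, 0.0) / denom))


def fit_dcor_pca(X, n_components):
    """PCA-style loadings from the distance correlation matrix of X's columns.

    Assumes no column of X is constant (otherwise std would be zero).
    """
    X = np.asarray(X, dtype=float)
    mean, std = X.mean(axis=0), X.std(axis=0)
    p = X.shape[1]
    R = np.eye(p)
    for i in range(p):
        for j in range(i + 1, p):
            R[i, j] = R[j, i] = distance_correlation(X[:, i], X[:, j])
    eigvals, eigvecs = np.linalg.eigh(R)  # R is symmetric
    top = np.argsort(eigvals)[::-1][:n_components]  # largest eigenvalues first
    return mean, std, eigvecs[:, top], R


def transform(X, mean, std, W):
    """Project standardized observations onto the retained components."""
    return ((np.asarray(X, dtype=float) - mean) / std) @ W


def knn_predict(train_X, train_y, test_X, k=5):
    """Placeholder base model: mean response of the k nearest training points."""
    d = np.linalg.norm(test_X[:, None, :] - train_X[None, :, :], axis=2)
    nearest = np.argsort(d, axis=1)[:, :k]
    return train_y[nearest].mean(axis=1)


def bagging_predict(train_X, train_y, test_X, n_estimators=50, k=5, seed=0):
    """Average the base model's predictions over bootstrap resamples."""
    rng = np.random.default_rng(seed)
    n = len(train_y)
    preds = np.zeros((n_estimators, len(test_X)))
    for b in range(n_estimators):
        idx = rng.integers(0, n, size=n)  # sample rows with replacement
        preds[b] = knn_predict(train_X[idx], train_y[idx], test_X, k)
    return preds.mean(axis=0)


if __name__ == "__main__":
    rng = np.random.default_rng(42)
    # Simulated data: each row is a short random-walk series, response is scalar.
    X = rng.normal(size=(120, 8)).cumsum(axis=1)
    y = np.sin(X[:, -1]) + 0.1 * rng.normal(size=120)
    X_train, X_test, y_train, y_test = X[:90], X[90:], y[:90], y[90:]

    mean, std, W, _ = fit_dcor_pca(X_train, n_components=3)
    S_train = transform(X_train, mean, std, W)
    S_test = transform(X_test, mean, std, W)
    y_hat = bagging_predict(S_train, y_train, S_test)
    print("test RMSE:", float(np.sqrt(np.mean((y_hat - y_test) ** 2))))
```

The loadings are fitted on the training rows only and then applied to the test rows, so the dimension reduction never sees the held-out observations. Any other regressor could replace `knn_predict` as the base model without changing the rest of the sketch.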