Academic Journal of Computing & Information Science, 2024, 7(8); doi: 10.25236/AJCIS.2024.070810.
Bohan Zhang
The High School Affiliated to Renmin University of China, Beijing, China
In complex urban environments, autonomous vehicles face multifaceted perception challenges. Traditional 2D object detection methods cannot accurately capture the 3D features of objects, which leads to failures in object identification and behavior prediction. The PETR algorithm addresses this problem through multi-view 3D object detection. This paper optimizes the PETR algorithm from three perspectives to improve the performance and effectiveness of 3D object detection: refining the network backbone, adjusting the image input parameters, and tuning training parameters such as the learning rate, the gradient clipping parameters, and the choice of optimizer. The models are trained on the nuScenes dataset, and the final evaluation and comparison of model performance are based mainly on the mAP (mean average precision) metric.
Object Detection, Urban Traffic, PETR Algorithm
Bohan Zhang. Optimization of Multi-View 3D Object Detection in Urban Traffic Environments Based on the PETR Algorithm. Academic Journal of Computing & Information Science (2024), Vol. 7, Issue 8: 63-73. https://doi.org/10.25236/AJCIS.2024.070810.
[1] Liu, Y., Wang, T., Zhang, X., & Sun, J. (2022, October). PETR: Position embedding transformation for multi-view 3D object detection. In European Conference on Computer Vision (pp. 531-548). Cham: Springer Nature Switzerland.
[2] Chen, Y., Liu, S., Shen, X., & Jia, J. (2020). DSGN: Deep stereo geometry network for 3D object detection. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (pp. 12536-12545).
[3] He, K., Zhang, X., Ren, S., & Sun, J. (2016). Deep residual learning for image recognition. In Proceedings of the IEEE conference on computer vision and pattern recognition (pp. 770-778).
[4] Mousavian, A., Anguelov, D., Flynn, J., & Kosecka, J. (2017). 3D bounding box estimation using deep learning and geometry. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (pp. 7074-7082).
[5] Wang, X., Xiao, T., Jiang, Y., Shao, S., Sun, J., & Shen, C. (2018). Repulsion loss: Detecting pedestrians in a crowd. In Proceedings of the IEEE conference on computer vision and pattern recognition (pp. 7774-7783).
[6] Arnold, E., Al-Jarrah, O. Y., Dianati, M., Fallah, S., Oxtoby, D., & Mouzakitis, A. (2019). A survey on 3D object detection methods for autonomous driving applications. IEEE Transactions on Intelligent Transportation Systems, 20(10), 3782-3795.