
International Journal of Frontiers in Engineering Technology, 2023, 5(11); doi: 10.25236/IJFET.2023.051113.

Road Scene Semantic Segmentation Based on Deep Learning

Author(s)

Zhaoxiang Wang, Kaiqi Huang

Corresponding Author

Kaiqi Huang

Affiliation(s)

School of Electrical Engineering and Automation, Jiangxi University of Science and Technology, Ganzhou, Jiangxi, China

Abstract

This study addresses semantic segmentation in complex road scenes, a problem with significant applications in autonomous driving, traffic monitoring, and urban planning. Our pipeline covers data collection, preprocessing, and annotation. We employ CNN models for data augmentation and introduce the DAFormer semantic segmentation algorithm. We then propose an enhanced DAFormer network architecture that incorporates rare class sampling, Object Category ImageNet Feature Distance (FD), and learning rate warm-up. These techniques help DAFormer interpret image content in complex road scenarios, making it a practical tool for real-world challenges. We evaluate the enhanced network against four traditional algorithms on this task. Experimental results show that the enhanced DAFormer performs significantly better in complex road environments, achieving a mean intersection over union (mIoU) of 0.82, a pixel accuracy (PA) of up to 89%, and improved timeliness. Compared with the other algorithms, it exhibits superior accuracy, stability, and timeliness.
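As a rough illustration of the quantities named in the abstract, the sketch below computes pixel accuracy and mIoU from a confusion matrix, a softmax-based rare class sampling distribution, and a linear learning rate warm-up. It is not the authors' code. All function names, the toy labels, and the default values (temperature, warm-up length, base learning rate) are assumptions chosen for illustration, and the sampling rule follows the form commonly described for rare class sampling, in which rarer classes receive a higher sampling probability.

```python
import math


def confusion_matrix(pred, target, num_classes):
    """Count pixels by (true class, predicted class) over flat label lists."""
    cm = [[0] * num_classes for _ in range(num_classes)]
    for p, t in zip(pred, target):
        cm[t][p] += 1
    return cm


def pixel_accuracy(cm):
    """PA: correctly labelled pixels divided by all pixels."""
    correct = sum(cm[c][c] for c in range(len(cm)))
    total = sum(sum(row) for row in cm)
    return correct / total


def mean_iou(cm):
    """mIoU: mean over classes of TP / (TP + FP + FN)."""
    n = len(cm)
    ious = []
    for c in range(n):
        tp = cm[c][c]
        fn = sum(cm[c]) - tp
        fp = sum(cm[r][c] for r in range(n)) - tp
        denom = tp + fp + fn
        if denom > 0:  # skip classes absent from prediction and ground truth
            ious.append(tp / denom)
    return sum(ious) / len(ious)


def rare_class_sampling_probs(class_freqs, temperature=0.01):
    """Softmax over (1 - f_c) / T, so rare classes are sampled more often.

    The default temperature is an assumed, illustrative value.
    """
    logits = [(1.0 - f) / temperature for f in class_freqs]
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    z = sum(exps)
    return [e / z for e in exps]


def warmup_lr(base_lr, step, warmup_steps):
    """Linear warm-up: ramp from 0 to base_lr, then hold base_lr.

    Any decay schedule after warm-up is omitted in this sketch.
    """
    if step < warmup_steps:
        return base_lr * step / warmup_steps
    return base_lr


# Toy example with three classes and six pixels (hypothetical labels).
target = [0, 0, 1, 1, 2, 2]
pred = [0, 1, 1, 1, 2, 0]
cm = confusion_matrix(pred, target, num_classes=3)
print("PA:", pixel_accuracy(cm))
print("mIoU:", mean_iou(cm))
print("RCS probs:", rare_class_sampling_probs([0.6, 0.3, 0.1], temperature=0.1))
print("LR at step 750 of 1500 warm-up steps:", warmup_lr(6e-5, 750, 1500))
```

In the toy example, four of the six pixels are correct and the per-class IoU values are 1/3, 2/3, and 1/2. The two metrics can therefore differ noticeably on the same prediction, which is why the abstract reports both.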

Keywords

Semantic Segmentation, DAFormer Algorithm, Complex Road Scenes, Unsupervised Domain Adaptation

Cite This Paper

Zhaoxiang Wang, Kaiqi Huang. Road Scene Semantic Segmentation Based on Deep Learning. International Journal of Frontiers in Engineering Technology (2023), Vol. 5, Issue 11: 86-93. https://doi.org/10.25236/IJFET.2023.051113.
