Academic Journal of Computing & Information Science, 2021, 4(1); doi: 10.25236/AJCIS.2021.040111.

Context attention network for occluded pedestrian detection

Author(s)

Shiyang Zhao1, *

Corresponding Author:
Shiyang Zhao
Affiliation(s)

1College of Information Engineering, Shanghai Maritime University, Shanghai 201306, China

*Corresponding author

Abstract

Pedestrian detection in occluded scenes is a long-standing problem in computer vision. Occluded pedestrians vary widely in scale and have low visibility, which makes them difficult to detect. To address this, this paper proposes a model for occluded pedestrian detection that improves on an anchor-free pedestrian detection method. Specifically, we introduce a structure that extracts multi-scale context information to learn a better feature representation, and a channel attention module on each decoder layer that provides global context to guide low-level features in selecting category localization details. Experimental results show that the method achieves an MR-2 (log-average miss rate) of 41.93% on the occlusion subset of the Caltech pedestrian dataset, outperforming the other detectors it is compared against.
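
The MR-2 figure reported above is, by the usual Caltech benchmark convention, the log-average miss rate: the geometric mean of the miss rate sampled at nine false-positives-per-image (FPPI) values spaced evenly in log space between 10^-2 and 10^0. The following is a minimal sketch of that computation, not the evaluation code used in the paper; the function name `log_average_miss_rate` and the sample curve are hypothetical and chosen only for illustration.

```python
import math


def log_average_miss_rate(fppi, miss_rate, num_points=9):
    """Geometric mean of miss rates sampled at FPPI values log-spaced in [1e-2, 1e0].

    fppi and miss_rate are parallel sequences describing a detector's
    miss-rate-versus-FPPI curve (hypothetical inputs for this sketch).
    """
    # Order the curve points by increasing FPPI.
    curve = sorted(zip(fppi, miss_rate))

    # Reference FPPI values: 10^-2, 10^-1.75, ..., 10^0 for num_points=9.
    refs = [10 ** (-2 + 2 * i / (num_points - 1)) for i in range(num_points)]

    sampled = []
    for ref in refs:
        # Take the miss rate at the last curve point whose FPPI does not
        # exceed the reference value. If the curve starts above the
        # reference, treat it as a complete miss (miss rate 1.0).
        candidates = [m for f, m in curve if f <= ref]
        sampled.append(candidates[-1] if candidates else 1.0)

    # Average in log space; the floor avoids log(0) for a perfect detector.
    mean_log = sum(math.log(max(m, 1e-10)) for m in sampled) / len(sampled)
    return math.exp(mean_log)


# Hypothetical curve: miss rate falls as more false positives are tolerated.
example_fppi = [0.001, 0.02, 0.1, 0.5]
example_miss = [0.80, 0.60, 0.40, 0.25]
print(f"MR-2 = {100 * log_average_miss_rate(example_fppi, example_miss):.2f}%")
```

Because the average is taken in log space, the metric weights the low-FPPI end of the curve as heavily as the high-FPPI end, so lower values indicate a better detector across the whole operating range.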

Keywords

Pedestrian detection, multi-scale context, channel attention, anchor-free

Cite This Paper

Shiyang Zhao. Context attention network for occluded pedestrian detection. Academic Journal of Computing & Information Science (2021), Vol. 4, Issue 1: 66-74. https://doi.org/10.25236/AJCIS.2021.040111.
