Academic Journal of Computing & Information Science, 2023, 6(5); doi: 10.25236/AJCIS.2023.060514.

Inverted Non-Maximum Suppression for More Accurate and Neater Face Detection

Author(s)

Lian Liu1,2, Liguo Zhou2

Corresponding Author:
Liguo Zhou
Affiliation(s)

1College of Electronic and Information Engineering, Tongji University, Shanghai, China

2Chair of Robotics, Artificial Intelligence and Real-time Systems, Technical University of Munich, Garching, Germany

Abstract

CNN-based face detection methods have achieved significant progress in recent years. Beyond the strong representation ability of CNNs, post-processing methods are also important to the performance of face detection. In general, a face detection method predicts several candidate bounding boxes for one face, and non-maximum suppression (NMS) is used to filter out inaccurate candidates and keep the most accurate box. The principle of NMS is to select the box with the highest score as the basic box and then delete boxes that have a large overlapping area with the basic box but a lower score. However, the current NMS method and its improved versions do not perform well when face image quality is poor or faces appear in a cluster; in these situations, even after NMS filtering, one face often corresponds to multiple predicted boxes. To reduce this kind of negative result, in this paper we propose a new NMS method that operates in the reverse order of other NMS methods. Our method performs well on low-quality and tiny face samples, and experiments demonstrate that it is effective as a post-processor for different face detection methods. The source code has been released on https://github.com/.
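The classic greedy NMS procedure the abstract describes (keep the highest-scoring box, discard lower-scoring boxes that overlap it too much) can be sketched as follows. This is a minimal illustration, not the inverted method proposed in the paper; the box format `(x1, y1, x2, y2)`, the function names `iou`/`nms`, and the 0.5 IoU threshold are illustrative assumptions.

```python
def iou(a, b):
    """Intersection-over-union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter) if inter > 0 else 0.0

def nms(boxes, scores, iou_thresh=0.5):
    """Greedy NMS: take the highest-scoring remaining box as the basic box,
    then delete all remaining boxes whose IoU with it exceeds iou_thresh."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    while order:
        best = order.pop(0)          # current basic box (highest score left)
        keep.append(best)
        order = [i for i in order    # drop heavily overlapping, lower-scored boxes
                 if iou(boxes[best], boxes[i]) <= iou_thresh]
    return keep
```

For two near-duplicate detections of one face plus one distant face, e.g. boxes `[(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]` with scores `[0.9, 0.8, 0.7]`, the middle box is suppressed and indices `[0, 2]` survive. The paper's inverted variant processes candidates in the reverse order of this scheme.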

Keywords

NMS, Face Detection, CNNs

Cite This Paper

Lian Liu, Liguo Zhou. Inverted Non-Maximum Suppression for More Accurate and Neater Face Detection. Academic Journal of Computing & Information Science (2023), Vol. 6, Issue 5: 101-106. https://doi.org/10.25236/AJCIS.2023.060514.
