Academic Journal of Computing & Information Science, 2023, 6(5); doi: 10.25236/AJCIS.2023.060514.
Lian Liu1,2, Liguo Zhou2
1College of Electronic and Information Engineering, Tongji University, Shanghai, China
2Chair of Robotics, Artificial Intelligence and Real-time Systems, Technical University of Munich, Garching, Germany
CNN-based face detection methods have achieved significant progress in recent years. Beyond the strong representation ability of CNNs, post-processing is also crucial to face detection performance. In general, a face detector predicts several candidate bounding boxes for one face, and non-maximum suppression (NMS) is used to filter out the inaccurate candidates and retain the most accurate box. The principle of NMS is to select the box with the highest score as the base box and then delete every box that overlaps the base box heavily but has a lower score. However, the standard NMS method and its improved versions perform poorly when face image quality is low or faces appear in a cluster: even after NMS filtering, a single face often corresponds to multiple predicted boxes. To reduce this kind of negative result, in this paper we propose a new NMS method that operates in the reverse order of existing NMS methods. Our method performs well on low-quality and tiny face samples, and experiments demonstrate that it is effective as a post-processor for different face detection methods. The source code has been released on https://github.com/.
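To make the baseline concrete, the standard greedy NMS procedure described above (pick the highest-scoring box, suppress overlapping lower-scoring boxes, repeat) can be sketched as follows. This is a minimal illustrative implementation, not the inverted method proposed in the paper; the function name, the `[x1, y1, x2, y2]` box format, and the 0.5 IoU threshold are assumptions for the sketch.

```python
import numpy as np

def greedy_nms(boxes, scores, iou_threshold=0.5):
    """Standard greedy NMS.

    boxes: (N, 4) array of [x1, y1, x2, y2] corner coordinates.
    scores: (N,) array of confidence scores.
    Returns the indices of the kept boxes, highest score first.
    """
    x1, y1, x2, y2 = boxes[:, 0], boxes[:, 1], boxes[:, 2], boxes[:, 3]
    areas = (x2 - x1) * (y2 - y1)
    order = scores.argsort()[::-1]  # candidate indices, descending score
    keep = []
    while order.size > 0:
        i = order[0]          # current highest-scoring box becomes the base box
        keep.append(int(i))
        # Intersection of the base box with all remaining candidates
        xx1 = np.maximum(x1[i], x1[order[1:]])
        yy1 = np.maximum(y1[i], y1[order[1:]])
        xx2 = np.minimum(x2[i], x2[order[1:]])
        yy2 = np.minimum(y2[i], y2[order[1:]])
        inter = np.maximum(0.0, xx2 - xx1) * np.maximum(0.0, yy2 - yy1)
        iou = inter / (areas[i] + areas[order[1:]] - inter)
        # Keep only candidates whose overlap with the base box is small
        order = order[1:][iou <= iou_threshold]
    return keep
```

For example, two heavily overlapping boxes on one face collapse to the higher-scoring one, while a distant box survives; the failure mode the paper targets arises when the duplicate boxes' mutual IoU falls below the threshold, so both are kept.

```python
boxes = np.array([[0, 0, 10, 10], [1, 1, 11, 11], [20, 20, 30, 30]], dtype=float)
scores = np.array([0.9, 0.8, 0.7])
print(greedy_nms(boxes, scores))  # → [0, 2]
```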
NMS, Face Detection, CNNs
Lian Liu, Liguo Zhou. Inverted Non-Maximum Suppression for More Accurate and Neater Face Detection. Academic Journal of Computing & Information Science (2023), Vol. 6, Issue 5: 101-106. https://doi.org/10.25236/AJCIS.2023.060514.