A lightweight image sensitive information detection model based on yolov5s

<p>Yueheng Mao<sup>1,2</sup>, Bin Song<sup>1,2</sup>, Zhiyong Zhang<sup>1,2</sup>, Wenhou Yang<sup>3</sup>, Yu Lan<sup>3</sup></p>

doi:10.25236/AJCIS.2023.060303

Academic Journal of Computing & Information Science, 2023, 6(3); doi: 10.25236/AJCIS.2023.060303.

Corresponding Author:

Yueheng Mao

Affiliation(s)

¹Information Engineering College, Henan University of Science and Technology, Luoyang, 471023, Henan, China

²Henan International Joint Laboratory of Cyberspace Security Applications, Luoyang, 471023, Henan, China

³Sunnetech Ltd., Quzhou, 324003, Zhejiang, China

Abstract

Current sensitive information detection methods often suffer from low detection accuracy, long training times, and slow detection speed, which makes the resulting models unsuitable for practical deployment. To address this, this paper proposes a lightweight image sensitive information detection model based on yolov5s. First, in the feature extraction part, an efficient attention module, GPSA, is designed on the basis of the PSA module; it enables the network to learn richer multi-scale feature representations and improves the model's detection accuracy for sensitive information. Second, in the feature fusion part, the PAN structure of the original model is replaced with the BiFPN structure to improve the model's feature fusion ability. Experimental comparison on a self-made sensitive image dataset shows that the proposed method outperforms current mainstream methods in both detection accuracy and speed: the final mAP reaches 71%, and the detection time for a single image is 2.8 ms, which meets the requirements of network platform deployment in practical applications.
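To make the feature fusion step more concrete, the following is a minimal sketch of the weighted fusion that BiFPN is known for, the "fast normalized fusion" described in the EfficientDet paper. It is not the authors' implementation: whether the proposed model uses this exact variant is an assumption, and the function name `fast_normalized_fusion`, the toy 2x2 feature maps, and the weight values are all hypothetical. Plain Python lists stand in for the tensors a real network would use.

```python
# Sketch of BiFPN-style fast normalized fusion (assumed variant, see lead-in):
#   fused = sum_i (w_i / (eps + sum_j w_j)) * feature_i, with every w_i >= 0.


def fast_normalized_fusion(features, weights, eps=1e-4):
    """Fuse equally shaped 2-D feature maps using one scalar weight per map."""
    if len(features) != len(weights):
        raise ValueError("need exactly one weight per input feature map")
    # ReLU keeps each weight non-negative, so normalized weights lie in [0, 1].
    relu_w = [max(0.0, w) for w in weights]
    denom = eps + sum(relu_w)  # eps avoids division by zero
    rows, cols = len(features[0]), len(features[0][0])
    fused = [[0.0] * cols for _ in range(rows)]
    for fmap, w in zip(features, relu_w):
        for r in range(rows):
            for c in range(cols):
                fused[r][c] += (w / denom) * fmap[r][c]
    return fused


# Hypothetical 2x2 maps standing in for a top-down and a lateral feature.
top_down = [[1.0, 2.0], [3.0, 4.0]]
lateral = [[3.0, 2.0], [1.0, 0.0]]

# Equal weights give (approximately) the element-wise mean of the two maps.
fused_equal = fast_normalized_fusion([top_down, lateral], weights=[1.0, 1.0])

# A negative weight is clipped to zero, so only the lateral map contributes.
fused_clipped = fast_normalized_fusion([top_down, lateral], weights=[-1.0, 2.0])
```

In a trained network the weights would be learned parameters rather than constants, which lets the model decide how much each input scale contributes to the fused feature.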

Keywords

Sensitive Information, yolov5s, PSA Module, Attention Module, BiFPN

Cite This Paper

Yueheng Mao, Bin Song, Zhiyong Zhang, Wenhou Yang, Yu Lan. A lightweight image sensitive information detection model based on yolov5s. Academic Journal of Computing & Information Science (2023), Vol. 6, Issue 3: 20-27. https://doi.org/10.25236/AJCIS.2023.060303.
