Academic Journal of Computing & Information Science, 2023, 6(4); doi: 10.25236/AJCIS.2023.060415.
Zhiyuan Wang1, Yan Li2, Bibo Lu1, Lishan Zhao1, Shisong Zhu1, Yi He3
1College of Computer Science and Technology, Henan Polytechnic University, Jiaozuo, China
2Jiaozuo Metallurgy Building Materials Senior Technical School, Jiaozuo, China
3Henan Zhongyuan Zhixin Technology Limited Company, Jiaozuo, China
For the monitoring of open-door unmanned vending machines, a detection scheme covering multiple states of the user's hand is designed to analyze abnormal behavior during shopping, and a hand multi-state detection algorithm based on YOLOv5 is proposed. To improve inference speed, the 3×3 convolutions in YOLOv5 are replaced with RepVGG blocks, following the idea of structural re-parameterization. Accuracy is improved by adding the CBAM attention mechanism. The model size is greatly reduced while recognition accuracy is maintained. The experimental results show that the proposed algorithm accurately identifies the state of the user's hand during shopping and has practical application value.
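The structural re-parameterization idea mentioned above can be illustrated with a small sketch. It is not the authors' implementation: it assumes a single input and output channel, stride 1, zero padding, and no batch normalization (a real RepVGG block also folds each branch's batch normalization into the fused weights and bias). The function names and the sample values are hypothetical. The sketch shows that a training-time block with a 3×3 branch, a 1×1 branch, and an identity branch produces the same output as a single 3×3 convolution whose kernel is the sum of the three branches.

```python
# Single-channel sketch of RepVGG-style structural re-parameterization.
# Assumptions: one channel, stride 1, zero padding, no batch normalization.

def conv2d_same(image, kernel):
    """Cross-correlate a 2D list with an odd-sized square kernel (zero padding)."""
    h, w = len(image), len(image[0])
    k = len(kernel)
    r = k // 2
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            acc = 0.0
            for di in range(k):
                for dj in range(k):
                    y, x = i + di - r, j + dj - r
                    if 0 <= y < h and 0 <= x < w:
                        acc += kernel[di][dj] * image[y][x]
            out[i][j] = acc
    return out

def multi_branch(image, k3, k1_weight):
    """Training-time form: 3x3 branch + 1x1 branch + identity branch."""
    y3 = conv2d_same(image, k3)
    y1 = conv2d_same(image, [[k1_weight]])
    h, w = len(image), len(image[0])
    return [[y3[i][j] + y1[i][j] + image[i][j] for j in range(w)]
            for i in range(h)]

def fuse_branches(k3, k1_weight):
    """Inference-time form: fold the 1x1 and identity branches into the 3x3 kernel.

    A 1x1 kernel is a 3x3 kernel that is zero everywhere except the center,
    and the identity mapping is a 3x3 kernel with 1 at the center.
    """
    fused = [row[:] for row in k3]
    fused[1][1] += k1_weight + 1.0
    return fused

# Hypothetical sample data, chosen only for illustration.
image = [[1.0, 2.0, 3.0],
         [4.0, 5.0, 6.0],
         [7.0, 8.0, 9.0]]
k3 = [[0.5, -1.0, 0.25],
      [0.0, 2.0, -0.5],
      [1.0, 0.75, -0.25]]
k1_weight = 0.5

train_out = multi_branch(image, k3, k1_weight)
fused_kernel = fuse_branches(k3, k1_weight)
deploy_out = conv2d_same(image, fused_kernel)

# Largest absolute difference between the two forms.
max_diff = max(abs(a - b)
               for row_a, row_b in zip(train_out, deploy_out)
               for a, b in zip(row_a, row_b))
```

Because convolution is linear, the fused kernel reproduces the multi-branch output while needing only one convolution at inference time, which is the source of the speed-up the abstract refers to.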
deep learning, unmanned vending cabinet, YOLOv5, CBAM, model reparameterization
Zhiyuan Wang, Yan Li, Bibo Lu, Lishan Zhao, Shisong Zhu, Yi He. Unmanned vending counter abnormal behavior recognition based on YOLOv5. Academic Journal of Computing & Information Science (2023), Vol. 6, Issue 4: 110-117. https://doi.org/10.25236/AJCIS.2023.060415.