Academic Journal of Engineering and Technology Science, 2023, 6(12); doi: 10.25236/AJETS.2023.061206.

Research on Human Action Analysis and Recognition Methods Based on Deep Learning


Jiale Zhang

Corresponding Author:
Jiale Zhang

School of Information Engineering, Heilongjiang University of Finance and Economics, Harbin, 150000, China


With the rapid development of the Internet and multimedia technology and the widespread adoption of video capture devices such as smartphones and surveillance cameras, the volume of video data has grown explosively. The goal of action recognition is to identify the action a person is performing in a video. As a fundamental but extremely challenging task in computer vision, it has broad application prospects in many fields, such as human-machine interaction, virtual reality, intelligent video surveillance, and public security. This paper summarizes human action analysis and recognition methods based on deep learning, and introduces and categorizes several prevailing action recognition algorithms in detail. Unlike traditional classification schemes, we group the currently popular algorithms into two families from the perspective of feature fusion: action recognition algorithms based on 2D convolution and action recognition algorithms based on 3D convolution.


deep learning, human action analysis, human action recognition

Cite This Paper

Jiale Zhang. Research on Human Action Analysis and Recognition Methods Based on Deep Learning. Academic Journal of Engineering and Technology Science (2023) Vol. 6, Issue 12: 39-44. https://doi.org/10.25236/AJETS.2023.061206.

