DAA-GCN: A Dynamic Adaptive Attention Graph Convolutional Network for Robust Skeleton-Based Action Recognition

Rundong Zhou, Jiawei Wang

Academic Journal of Computing & Information Science, 2025, 8(6); doi: 10.25236/AJCIS.2025.080603.

Corresponding Author:

Rundong Zhou

Affiliation(s)

School of Information Science and Engineering, Chongqing Jiaotong University, Chongqing, China

Abstract

Skeleton-based action recognition has attracted increasing attention due to its compact data representation and robustness to appearance variations. Although Graph Convolutional Networks (GCNs) perform strongly by modeling spatial dependencies among joints, they still struggle to capture long-range interactions and multi-scale temporal dynamics. To address these limitations, we propose DAA-GCN (Dynamic Adaptive Attention Graph Convolutional Network), a framework built on two core modules: the Spatio-Temporal Adaptive Feature Extractor (STAFE) and the Multi-Perspective Fusion Graph Attention (MPFGA). STAFE integrates long-range spatio-temporal graph convolutions with multi-branch temporal convolutions to capture both short-term and long-term motion patterns. MPFGA enhances feature representations by combining global self-attention with local additive attention, balancing global context against local structural information. We evaluate DAA-GCN on two benchmark datasets: NTU RGB+D 60, under the cross-subject and cross-view protocols, and NTU RGB+D 120, under the cross-subject and cross-setup protocols. Experimental results show that DAA-GCN consistently outperforms state-of-the-art methods, and ablation studies confirm that each module contributes to the overall architecture. DAA-GCN thus offers a robust and scalable solution for skeleton-based action recognition, with promising applications in human-computer interaction, video surveillance, and healthcare monitoring.
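The idea of pairing global self-attention with local additive attention can be illustrated with a minimal, dependency-free sketch. It is not the paper's MPFGA implementation: the function names, the three-joint toy skeleton, the scoring vector `v`, the omission of learned projection matrices, and the fixed mixing weight `alpha` are all assumptions made for illustration only.

```python
import math

def softmax(xs):
    # Numerically stable softmax over a list of scores.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def weighted_sum(weights, feats):
    # Weighted sum of feature vectors, one weight per vector.
    dim = len(feats[0])
    return [sum(w * f[d] for w, f in zip(weights, feats)) for d in range(dim)]

def global_self_attention(feats):
    # Scaled dot-product attention: every joint attends to every joint.
    scale = math.sqrt(len(feats[0]))
    out = []
    for q in feats:
        weights = softmax([dot(q, k) / scale for k in feats])
        out.append(weighted_sum(weights, feats))
    return out

def local_additive_attention(feats, neighbors, v):
    # Additive scoring v . tanh(q + k), restricted to each joint
    # and its skeleton neighbours (learned projections omitted: assumption).
    out = []
    for i, q in enumerate(feats):
        idx = [i] + neighbors[i]
        scores = [dot(v, [math.tanh(a + b) for a, b in zip(q, feats[j])])
                  for j in idx]
        out.append(weighted_sum(softmax(scores), [feats[j] for j in idx]))
    return out

def fuse(feats, neighbors, v, alpha=0.5):
    # Convex mix of the global and local views (alpha is hypothetical).
    g = global_self_attention(feats)
    l = local_additive_attention(feats, neighbors, v)
    return [[alpha * a + (1 - alpha) * b for a, b in zip(gr, lr)]
            for gr, lr in zip(g, l)]

# Toy skeleton: three joints in a chain 0 - 1 - 2, two features per joint.
feats = [[1.0, 0.0], [0.5, 0.5], [0.0, 1.0]]
neighbors = [[1], [0, 2], [1]]
v = [1.0, 1.0]
fused = fuse(feats, neighbors, v)
```

In this sketch the global branch lets distant joints influence each other directly, while the local branch only mixes a joint with its immediate skeleton neighbours. Because both branches produce convex combinations of the input features, the fused output stays within the range of the inputs.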

Keywords

Skeleton Recognition, Graph Convolution, Attention Mechanism, Feature Fusion

Cite This Paper

Rundong Zhou, Jiawei Wang. DAA-GCN: A Dynamic Adaptive Attention Graph Convolutional Network for Robust Skeleton-Based Action Recognition. Academic Journal of Computing & Information Science (2025), Vol. 8, Issue 6: 19-26. https://doi.org/10.25236/AJCIS.2025.080603.
