School of Artificial Intelligence and Big Data, Hefei University
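Existing methods based on plain graph convolutional networks rely mainly on local graph convolution operations, which limits their ability to flexibly capture complex correlations between distant joints. This paper proposes a Self-Attention Enhanced Graph Convolutional Network (SGNet). Based on the characteristics of skeleton data, it models each channel of every joint independently and globally, i.e., Channel-Specific Global Spatial Modeling (C-GSM), in parallel with Local Spatial Modeling (LSM), to extract both local and global spatial feature representations. Extensive experiments were conducted on two large and challenging benchmark datasets, NTU RGB+D and NTU RGB+D 120. Compared with recent related methods, SGNet is highly competitive, achieving the highest accuracy of 92.9% on NTU RGB+D X-Sub and 90.7% on NTU RGB+D 120 X-Set.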
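The abstract does not give the layer equations, so the following is only a minimal sketch of the general idea of running a local graph branch in parallel with a per-channel global attention branch on a single frame. Everything in it is an assumption made for illustration rather than the authors' implementation: the function names (`local_branch`, `global_branch`, `sg_block`), the scalar per-channel query/key weights `w_q` and `w_k`, the neighbour-averaging rule, and the fusion of the two branches by summation.

```python
import math

def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def local_branch(x, adj):
    """Local modeling (assumed form): each joint averages itself and its skeleton neighbours."""
    num_joints, num_channels = len(x), len(x[0])
    out = []
    for v in range(num_joints):
        neigh = [u for u in range(num_joints) if adj[v][u] or u == v]
        out.append([sum(x[u][c] for u in neigh) / len(neigh)
                    for c in range(num_channels)])
    return out

def global_branch(x, w_q, w_k):
    """Channel-specific global modeling (assumed form): every channel gets
    its own joint-to-joint attention map spanning all joints."""
    num_joints, num_channels = len(x), len(x[0])
    out = [[0.0] * num_channels for _ in range(num_joints)]
    for c in range(num_channels):
        col = [x[v][c] for v in range(num_joints)]  # channel c across all joints
        for v in range(num_joints):
            # Scalar query of joint v against scalar keys of every joint u.
            scores = [(w_q[c] * col[v]) * (w_k[c] * col[u])
                      for u in range(num_joints)]
            attn = softmax(scores)
            out[v][c] = sum(a * xu for a, xu in zip(attn, col))
    return out

def sg_block(x, adj, w_q, w_k):
    """Run both branches in parallel and fuse them by summation (an assumption)."""
    loc = local_branch(x, adj)
    glo = global_branch(x, w_q, w_k)
    return [[l + g for l, g in zip(lr, gr)] for lr, gr in zip(loc, glo)]

# Toy example: a 4-joint chain skeleton (0-1-2-3) with 2 feature channels.
adj = [[0, 1, 0, 0],
       [1, 0, 1, 0],
       [0, 1, 0, 1],
       [0, 0, 1, 0]]
x = [[1.0, 0.0], [0.0, 1.0], [2.0, -1.0], [0.5, 0.5]]
w_q = [1.0, 0.5]
w_k = [1.0, 0.5]
y = sg_block(x, adj, w_q, w_k)
```

In this sketch the local branch can only mix joints that share a bone, while the global branch lets joint 0 attend to joint 3 directly, and the attention weights differ from channel to channel. That is the contrast between LSM and C-GSM described above, under the stated assumptions.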
Basic information:
DOI:
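CLC number: TP391.41; TP183
Citation:
[1] DING Yue, WU Zhize. Human Skeleton Action Recognition Based on Self-Attention Graph Convolutional Network[J]. Journal of Hefei University, 2024, 41(05): 94-101.
Funding:
National Natural Science Foundation of China project "Research on Unsupervised Daily Activity Recognition for the Elderly with Visual Privacy Protection" (62406095); Anhui Provincial Natural Science Foundation General Program "Research on Non-Intrusive Daily Activity Recognition for the Elderly Based on Learning Using Privileged Information" (2308085MF213); Anhui Provincial Key Research and Development Program "Design and Implementation of a Practical Network Attack and Defense Drill Platform Based on Evolutionary Game Theory" (2022K07020011)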