Multi-Scale Spatial Temporal Graph Convolutional Network for Skeleton-Based Action Recognition

06/27/2022
by Zhan Chen, et al.

Graph convolutional networks have been widely used for skeleton-based action recognition because they model non-Euclidean data well. Since graph convolution is a local operation, it can exploit only short-range joint dependencies and short-term trajectories; it cannot directly model the relations between distant joints or the long-range temporal information that are vital to distinguishing actions. To solve this problem, we present a multi-scale spatial graph convolution (MS-GC) module and a multi-scale temporal graph convolution (MT-GC) module that enlarge the receptive field of the model in the spatial and temporal dimensions. Concretely, the MS-GC and MT-GC modules each decompose the corresponding local graph convolution into a set of sub-graph convolutions, forming a hierarchical residual architecture. Without introducing additional parameters, the features are processed by a series of sub-graph convolutions, so each node can complete multiple spatial and temporal aggregations with its neighborhood. The final equivalent receptive field is enlarged accordingly and can capture both short- and long-range dependencies in the spatial and temporal domains. By coupling these two modules into a basic block, we further propose a multi-scale spatial temporal graph convolutional network (MST-GCN), which stacks multiple blocks to learn effective motion representations for action recognition. The proposed MST-GCN achieves remarkable performance on three challenging benchmark datasets for skeleton-based action recognition: NTU RGB+D, NTU-120 RGB+D and Kinetics-Skeleton.
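The hierarchical residual decomposition described in the abstract can be illustrated with a small NumPy sketch. The abstract gives no equations, so everything here is an assumption made for illustration: the channel-wise split into equal groups, the symmetric adjacency normalization, the pass-through first group, and the names `normalized_adjacency` and `multi_scale_spatial_gc` are hypothetical and are not taken from the authors' implementation. The sketch only shows why chaining sub-graph convolutions enlarges the receptive field.

```python
import numpy as np

def normalized_adjacency(edges, num_joints):
    """Symmetrically normalized adjacency with self-loops (assumed normalization)."""
    A = np.eye(num_joints)
    for i, j in edges:
        A[i, j] = 1.0
        A[j, i] = 1.0
    deg = A.sum(axis=1)
    D_inv_sqrt = np.diag(deg ** -0.5)
    return D_inv_sqrt @ A @ D_inv_sqrt

def multi_scale_spatial_gc(x, A_hat, weights):
    """Hypothetical hierarchical residual graph convolution.

    x:       (num_joints, channels) joint features
    A_hat:   (num_joints, num_joints) normalized adjacency
    weights: list of s-1 matrices, each (channels/s, channels/s)
    """
    s = len(weights) + 1
    chunks = np.split(x, s, axis=1)      # split channels into s groups
    outputs = [chunks[0]]                # assumed: first group passes through unchanged
    prev = None
    for chunk, W in zip(chunks[1:], weights):
        # Each group also receives the previous group's output, so it has
        # already been aggregated once more than the group before it.
        inp = chunk if prev is None else chunk + prev
        prev = A_hat @ inp @ W           # one sub-graph convolution (one-hop aggregation)
        outputs.append(prev)
    return np.concatenate(outputs, axis=1)

# Toy chain skeleton 0-1-2-3-4 with a signal on joint 0 only.
edges = [(0, 1), (1, 2), (2, 3), (3, 4)]
A_hat = normalized_adjacency(edges, num_joints=5)
x = np.zeros((5, 4))
x[0, :] = 1.0
weights = [np.eye(1) for _ in range(3)]  # identity weights to isolate the graph effect

out = multi_scale_spatial_gc(x, A_hat, weights)
# Number of joints reached by the signal in each channel group.
reach = [int(np.count_nonzero(out[:, g])) for g in range(4)]
print(reach)  # [1, 2, 3, 4]
```

In this toy setup each successive channel group sees one more hop of the skeleton, so a single block mixes short- and long-range joint information. The sub-convolution weights total (s-1)(c/s)^2 entries for c channels and s groups, which is below the c^2 of one full-width convolution. An analogous chain along the time axis would be one way to picture the temporal module, again as an assumption rather than the authors' design.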

research
03/31/2020

Disentangling and Unifying Graph Convolutions for Skeleton-Based Action Recognition

Spatial-temporal graphs have been widely used by skeleton-based action r...
research
04/03/2020

TEA: Temporal Excitation and Aggregation for Action Recognition

Temporal modeling is key for action recognition in videos. It normally c...
research
12/17/2021

Self-attention based anchor proposal for skeleton-based action recognition

Skeleton sequences are widely used for action recognition task due to it...
research
08/03/2021

Skeleton Split Strategies for Spatial Temporal Graph Convolution Networks

A skeleton representation of the human body has been proven to be effect...
research
10/12/2022

DG-STGCN: Dynamic Spatial-Temporal Modeling for Skeleton-based Action Recognition

Graph convolution networks (GCN) have been widely used in skeleton-based...
research
08/18/2022

Spatial Temporal Graph Attention Network for Skeleton-Based Action Recognition

It's common for current methods in skeleton-based action recognition to ...
