Spatial Temporal Graph Attention Network for Skeleton-Based Action Recognition

08/18/2022
by Shenglan Liu, et al.

Current methods in skeleton-based action recognition mainly focus on capturing long-term temporal dependencies, since skeleton sequences are typically long (>128 frames), which poses a challenge for previous approaches. Under these conditions, short-term dependencies, which are critical for classifying similar actions, are rarely considered explicitly. Most current approaches consist of interleaved spatial-only and temporal-only modules, which hinders the direct flow of information among joints in adjacent frames and makes these approaches less able to capture short-term motion and to distinguish similar action pairs. To address this limitation, we propose a general framework, named STGAT, to model cross-spacetime information flow. It equips the spatial-only modules with spatial-temporal modeling for regional perception. While STGAT is theoretically effective for spatial-temporal modeling, we propose three simple modules to reduce local spatial-temporal feature redundancy and further unlock the potential of STGAT. These modules (1) narrow the scope of the self-attention mechanism, (2) dynamically weight joints along the temporal dimension, and (3) separate subtle motion from static features, respectively. As a robust feature extractor, STGAT generalizes better than previous methods when classifying similar actions, as shown by both qualitative and quantitative results. STGAT achieves state-of-the-art performance on three large-scale datasets: NTU RGB+D 60, NTU RGB+D 120, and Kinetics Skeleton 400. Code is released.
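
To make the idea of cross-spacetime information flow with a narrowed attention scope more concrete, the following is a minimal NumPy sketch, not the released STGAT code. It assumes features of shape (T, V, C) for T frames, V joints, and C channels, and lets every joint attend to all joints in frames at most `window` steps away. The function name `local_spacetime_attention`, the `window` parameter, and the omission of learned query/key/value projections are all illustrative assumptions.

```python
import numpy as np

def local_spacetime_attention(x, window=1):
    """Hypothetical sketch: self-attention over joints in nearby frames.

    x: array of shape (T, V, C) with T frames, V joints, C channels.
    Each joint at frame t attends to every joint in frames
    t-window .. t+window (clipped at the sequence boundaries).
    Learned projections are omitted for brevity (an assumption).
    """
    T, V, C = x.shape
    tokens = x.reshape(T * V, C)

    # Scaled dot-product scores between all (frame, joint) pairs.
    scores = tokens @ tokens.T / np.sqrt(C)

    # Narrow the attention scope: keep only pairs whose frames
    # are at most `window` apart.
    frame_idx = np.repeat(np.arange(T), V)
    allowed = np.abs(frame_idx[:, None] - frame_idx[None, :]) <= window
    scores = np.where(allowed, scores, -np.inf)

    # Row-wise softmax; every row keeps at least its own frame's joints,
    # so the row maximum is always finite.
    scores = scores - scores.max(axis=1, keepdims=True)
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=1, keepdims=True)

    out = weights @ tokens
    return out.reshape(T, V, C), weights

# Example: 4 frames, 3 joints, 2 channels of random features.
rng = np.random.default_rng(0)
features = rng.normal(size=(4, 3, 2))
attended, attn = local_spacetime_attention(features, window=1)
```

With `window=1`, a joint in frame 0 receives information directly from joints in frame 1 but not from frame 2, which is the kind of adjacent-frame flow that interleaved spatial-only and temporal-only modules do not provide in a single step.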


research
03/31/2020

Disentangling and Unifying Graph Convolutions for Skeleton-Based Action Recognition

Spatial-temporal graphs have been widely used by skeleton-based action r...
research
06/27/2022

Multi-Scale Spatial Temporal Graph Convolutional Network for Skeleton-Based Action Recognition

Graph convolutional networks have been widely used for skeleton-based ac...
research
12/25/2022

StepNet: Spatial-temporal Part-aware Network for Sign Language Recognition

Sign language recognition (SLR) aims to overcome the communication barri...
research
01/23/2023

A noisy-input generalised additive model for relative sea-level change along the Atlantic coast of North America

We propose a Bayesian, noisy-input, spatial-temporal generalised additiv...
research
07/07/2020

Decoupled Spatial-Temporal Attention Network for Skeleton-Based Action Recognition

Dynamic skeletal data, represented as the 2D/3D coordinates of human joi...
research
06/02/2021

TSI: Temporal Saliency Integration for Video Action Recognition

Efficient spatiotemporal modeling is an important yet challenging proble...
