LSTA-Net: Long short-term Spatio-Temporal Aggregation Network for Skeleton-based Action Recognition

11/01/2021
by   Tailin Chen, et al.

Modelling spatio-temporal dependencies is the key to recognising human actions in skeleton sequences. Most existing methods rely heavily on hand-designed traversal rules or graph topologies to capture the dependencies between moving joints, which is inadequate for reflecting the relationships between distant yet important joints. Furthermore, because these methods use local operations, important long-range temporal information is not well explored. To address these issues, we propose LSTA-Net, a novel Long short-term Spatio-Temporal Aggregation Network that effectively captures long- and short-range dependencies in both space and time. The model is a purely factorised architecture that alternates between spatial feature aggregation and temporal feature aggregation. To improve feature aggregation, we also design and employ a channel-wise attention mechanism. Extensive experiments on three public benchmark datasets suggest that our approach captures both long- and short-range dependencies in the space and time domains and outperforms other state-of-the-art methods. Code is available at https://github.com/tailin1009/LSTA-Net.
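
The factorised idea described above (spatial aggregation, then temporal aggregation, then channel-wise re-weighting) can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation: the function names, the row-normalised adjacency, the dilated mean over frames, and the sigmoid channel gate are all assumptions chosen for illustration, and the released code at the URL above should be treated as the reference.

```python
import math

# Features are nested lists indexed as x[t][v][c]:
# t = frame, v = joint, c = channel. All names here are hypothetical.

def spatial_aggregate(x, adj):
    """Mix joint features within each frame using a joint-to-joint weight matrix.

    adj[v][u] is the (assumed row-normalised) weight of joint u when updating joint v.
    """
    T, V, C = len(x), len(x[0]), len(x[0][0])
    return [[[sum(adj[v][u] * x[t][u][c] for u in range(V))
              for c in range(C)] for v in range(V)] for t in range(T)]

def temporal_aggregate(x, dilations=(1, 2)):
    """Average each frame with neighbours at several dilations.

    Small dilations cover short-range motion, larger ones reach further in time.
    Frame indices are clamped at the sequence boundaries.
    """
    T, V, C = len(x), len(x[0]), len(x[0][0])
    out = []
    for t in range(T):
        frames = [t] + [min(max(t + sign * d, 0), T - 1)
                        for d in dilations for sign in (-1, 1)]
        out.append([[sum(x[i][v][c] for i in frames) / len(frames)
                     for c in range(C)] for v in range(V)])
    return out

def channel_attention(x, weight, bias):
    """Scale each channel by a sigmoid gate computed from its global mean."""
    T, V, C = len(x), len(x[0]), len(x[0][0])
    mean = [sum(x[t][v][c] for t in range(T) for v in range(V)) / (T * V)
            for c in range(C)]
    gate = [1.0 / (1.0 + math.exp(-(weight[c] * mean[c] + bias[c])))
            for c in range(C)]
    return [[[x[t][v][c] * gate[c] for c in range(C)]
             for v in range(V)] for t in range(T)]

def factorised_block(x, adj, weight, bias):
    """One spatial step, one temporal step, then channel re-weighting."""
    return channel_attention(temporal_aggregate(spatial_aggregate(x, adj)),
                             weight, bias)

# Toy input: 4 frames, 3 joints in a chain, 2 channels.
T, V, C = 4, 3, 2
x = [[[1.0] * C for _ in range(V)] for _ in range(T)]
adj = [[0.5, 0.5, 0.0],
       [1 / 3, 1 / 3, 1 / 3],
       [0.0, 0.5, 0.5]]
y = factorised_block(x, adj, weight=[0.0] * C, bias=[0.0] * C)
```

Keeping the spatial and temporal steps as separate functions mirrors the alternating structure the abstract describes; in a learned model the fixed averages would presumably be replaced by trainable weights.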


