Global-local Motion Transformer for Unsupervised Skeleton-based Action Learning

07/13/2022
by Boeun Kim, et al.

We propose a new transformer model for unsupervised learning of skeleton motion sequences. The existing transformer model used for unsupervised skeleton-based action learning learns the instantaneous velocity of each joint from adjacent frames, without global motion information. As a result, it has difficulty learning attention globally over whole-body motions and temporally distant joints. In addition, it does not consider person-to-person interactions. To address whole-body motion, long-range temporal dynamics, and person-to-person interactions, we design a global and local attention mechanism in which global body motions and local joint motions attend to each other. We also propose a novel pretraining strategy, multi-interval pose displacement prediction, to learn both global and local attention over diverse time ranges. The proposed model successfully learns the local dynamics of the joints and captures global context from the motion sequences. Our model outperforms state-of-the-art models by notable margins on representative benchmarks. Code is available at https://github.com/Boeun-Kim/GL-Transformer.
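
As a rough illustration of what multi-interval displacement targets could look like, the sketch below computes, for several frame intervals, a global displacement (movement of a body-center joint) and local displacements (movement of each joint relative to that center). This is only a sketch under stated assumptions: the abstract does not define these quantities, so the function name `displacement_targets`, the `center_joint` parameter, and the exact global/local split are hypothetical and are not taken from the authors' code.

```python
from typing import Dict, List, Tuple

# One frame of one person: a list of (x, y, z) joint coordinates.
Pose = List[Tuple[float, float, float]]


def displacement_targets(
    frames: List[Pose], intervals: List[int], center_joint: int = 0
) -> Dict[int, List[dict]]:
    """Compute illustrative global and local displacements per interval.

    Assumptions (not from the source): the "global" displacement is the
    movement of one chosen center joint over n frames, and the "local"
    displacement is the movement of every joint relative to that center.
    """
    targets: Dict[int, List[dict]] = {}
    for n in intervals:
        per_frame = []
        # Frame t is compared with frame t - n, so t starts at n.
        for t in range(n, len(frames)):
            cur, prev = frames[t], frames[t - n]
            c_cur, c_prev = cur[center_joint], prev[center_joint]
            # Movement of the body center over the interval.
            global_disp = tuple(a - b for a, b in zip(c_cur, c_prev))
            # Movement of each joint after subtracting the body center.
            local_disp = [
                tuple(
                    (j_cur[k] - c_cur[k]) - (j_prev[k] - c_prev[k])
                    for k in range(3)
                )
                for j_cur, j_prev in zip(cur, prev)
            ]
            per_frame.append(
                {"frame": t, "global": global_disp, "local": local_disp}
            )
        targets[n] = per_frame
    return targets
```

Using several intervals (for example 1, 5, and 10 frames) yields targets that span short and long time ranges, which is the intuition the abstract gives for learning attention "in diverse time ranges".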

research
08/19/2022

SoMoFormer: Social-Aware Motion Transformer for Multi-Person Motion Prediction

Multi-person motion prediction remains a challenging problem, especially...
research
02/17/2022

Neural Marionette: Unsupervised Learning of Motion Skeleton and Latent Dynamics from Volumetric Video

We present Neural Marionette, an unsupervised approach that discovers th...
research
10/06/2022

Focal and Global Spatial-Temporal Transformer for Skeleton-based Action Recognition

Despite great progress achieved by transformer in various vision tasks, ...
research
05/25/2023

Text-to-Motion Retrieval: Towards Joint Understanding of Human Motion Data and Natural Language

Due to recent advances in pose-estimation methods, human motion can be e...
research
05/20/2021

An Attractor-Guided Neural Networks for Skeleton-Based Human Motion Prediction

Joint relation modeling is a crucial component in human motion prediction...
research
08/09/2023

Joint-Relation Transformer for Multi-Person Motion Prediction

Multi-person motion prediction is a challenging problem due to the depen...
research
10/20/2021

AniFormer: Data-driven 3D Animation with Transformer

We present a novel task, i.e., animating a target 3D object through the ...
