Attention, please: A Spatio-temporal Transformer for 3D Human Motion Prediction

04/18/2020
by Emre Aksan, et al.

In this paper, we propose a novel architecture for the task of 3D human motion modelling. We argue that the problem can be interpreted as a generative modelling task: a network learns the conditional synthesis of human poses, where the model is conditioned on a seed sequence. Our focus lies on generating plausible future motion over longer time horizons, whereas previous work considered shorter time frames of up to 1 second. To mitigate convergence to a static pose, the architecture leverages the recently proposed self-attention concept. Because 3D motion prediction is inherently spatio-temporal, the model learns high-dimensional joint embeddings followed by a decoupled temporal and spatial self-attention mechanism. The two attention blocks operate in parallel, aggregating the most informative components of the sequence to update the joint representation. This allows the model to access past information directly and to capture spatio-temporal dependencies explicitly. We show empirically that this reduces error accumulation over time and allows the generation of perceptually plausible motion sequences over long time horizons, as well as accurate short-term predictions. Accompanying video available at https://youtu.be/yF0cdt2yCNE .
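The decoupled attention described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: it assumes that "joint" refers to a skeletal joint, that each block is single-head scaled dot-product attention, that the temporal block is causally masked, and that the two outputs are summed with a residual connection. The names `spatio_temporal_block` and `init_params`, the weight keys, and the random weights are all hypothetical.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def attention(q, k, v, mask=None):
    # Scaled dot-product attention over the second-to-last axis.
    scores = q @ np.swapaxes(k, -1, -2) / np.sqrt(q.shape[-1])
    if mask is not None:
        scores = np.where(mask, scores, -1e9)  # block masked positions
    return softmax(scores, axis=-1) @ v

def init_params(dim, seed=0):
    # Hypothetical random projection weights, for illustration only.
    rng = np.random.default_rng(seed)
    names = ["Wq_t", "Wk_t", "Wv_t", "Wq_s", "Wk_s", "Wv_s"]
    return {n: rng.normal(scale=dim ** -0.5, size=(dim, dim)) for n in names}

def spatio_temporal_block(x, p):
    # x: (T, N, D) embeddings for T frames, N joints, D features.
    T = x.shape[0]

    # Temporal attention: each joint attends over its own current and
    # past frames (assumed causal mask).
    xt = np.swapaxes(x, 0, 1)                      # (N, T, D)
    causal = np.tril(np.ones((T, T), dtype=bool))  # (T, T)
    temporal = attention(xt @ p["Wq_t"], xt @ p["Wk_t"], xt @ p["Wv_t"], causal)
    temporal = np.swapaxes(temporal, 0, 1)         # back to (T, N, D)

    # Spatial attention: within each frame, every joint attends over
    # all joints of that frame.
    spatial = attention(x @ p["Wq_s"], x @ p["Wk_s"], x @ p["Wv_s"])

    # The two blocks run in parallel on the same input; summing them
    # with a residual connection is an assumption of this sketch.
    return x + temporal + spatial

# Usage: 8 frames, 5 joints, 16-dimensional embeddings.
x = np.random.default_rng(1).normal(size=(8, 5, 16))
out = spatio_temporal_block(x, init_params(16))
print(out.shape)
```

Under these assumptions, the updated representation of a joint at frame t depends only on frames up to t, which is what lets such a block read past information directly rather than through a recurrent state.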


