Recurrent Network Models for Human Dynamics

08/02/2015
by   Katerina Fragkiadaki, et al.

We propose the Encoder-Recurrent-Decoder (ERD) model for recognition and prediction of human body pose in videos and motion capture. The ERD model is a recurrent neural network that incorporates nonlinear encoder and decoder networks before and after recurrent layers. We test instantiations of ERD architectures on the tasks of motion capture (mocap) generation, body pose labeling, and body pose forecasting in videos. Our model handles mocap training data across multiple subjects and activity domains, and synthesizes novel motions while avoiding drift over long periods of time. For human pose labeling, ERD outperforms a per-frame body part detector by resolving left-right body part confusions. For video pose forecasting, ERD predicts body joint displacements across a temporal horizon of 400 ms and outperforms a first-order motion model based on optical flow. ERDs extend previous Long Short-Term Memory (LSTM) models in the literature to jointly learn representations and their dynamics. Our experiments show that such representation learning is crucial for both labeling and prediction in space-time. We find this to be a distinguishing feature of the spatio-temporal visual domain in comparison to 1D text, speech, or handwriting, where straightforward hard-coded representations have shown excellent results when directly combined with recurrent units.
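To make the encoder, recurrent layer, decoder arrangement concrete, the following is a minimal sketch of one possible ERD-style forward pass in plain Python. It is not the authors' implementation: the class name `ERDSketch`, the layer sizes, the single LSTM layer, the single-layer tanh encoder, the two-layer decoder, the omitted biases, and the untrained random weights are all assumptions chosen for illustration. The `rollout` method shows the generation setting, where each predicted pose is fed back as the next input.

```python
import math
import random


def make_matrix(rows, cols, rng):
    """Small random weight matrix (rows x cols) stored as nested lists."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


def matvec(W, x):
    """Matrix-vector product W @ x (biases are omitted for brevity)."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


class ERDSketch:
    """Hypothetical encoder -> LSTM -> decoder stack with untrained weights."""

    def __init__(self, pose_dim, feat_dim, hidden_dim, seed=0):
        rng = random.Random(seed)
        self.hidden_dim = hidden_dim
        # Encoder: pose vector -> learned feature vector.
        self.W_enc = make_matrix(feat_dim, pose_dim, rng)
        # One weight matrix per LSTM gate, acting on [feature; previous hidden].
        self.W_gate = {
            name: make_matrix(hidden_dim, feat_dim + hidden_dim, rng)
            for name in ("input", "forget", "output", "cand")
        }
        # Decoder: hidden state -> feature vector -> pose vector.
        self.W_dec1 = make_matrix(feat_dim, hidden_dim, rng)
        self.W_dec2 = make_matrix(pose_dim, feat_dim, rng)

    def step(self, pose, h, c):
        """Consume one pose; return (predicted next pose, new h, new c)."""
        feat = [math.tanh(z) for z in matvec(self.W_enc, pose)]  # nonlinear encoder
        xh = feat + h  # concatenate encoder output with previous hidden state
        i = [sigmoid(z) for z in matvec(self.W_gate["input"], xh)]
        f = [sigmoid(z) for z in matvec(self.W_gate["forget"], xh)]
        o = [sigmoid(z) for z in matvec(self.W_gate["output"], xh)]
        g = [math.tanh(z) for z in matvec(self.W_gate["cand"], xh)]
        c = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c, i, g)]
        h = [ok * math.tanh(ck) for ok, ck in zip(o, c)]
        dec = [math.tanh(z) for z in matvec(self.W_dec1, h)]  # nonlinear decoder layer
        return matvec(self.W_dec2, dec), h, c

    def rollout(self, seed_poses, n_future):
        """Read a non-empty seed sequence, then feed predictions back as inputs."""
        h = [0.0] * self.hidden_dim
        c = [0.0] * self.hidden_dim
        pred = None
        for pose in seed_poses:
            pred, h, c = self.step(pose, h, c)
        future = []
        for _ in range(n_future):
            future.append(pred)
            pred, h, c = self.step(pred, h, c)
        return future


# Usage: 4 seed frames of a 6-dimensional pose, then 3 generated frames.
model = ERDSketch(pose_dim=6, feat_dim=8, hidden_dim=5)
seed_frames = [[0.1 * t] * 6 for t in range(4)]
generated = model.rollout(seed_frames, n_future=3)
```

In this sketch the encoder and decoder would be trained jointly with the recurrent weights, which is the sense in which the representation and its dynamics are learned together. The pose-labeling and video-forecasting settings would replace the pose-vector input and output with whatever inputs and targets those tasks use, which the abstract does not specify.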
