Conditional Temporal Variational AutoEncoder for Action Video Prediction

08/12/2021
by Xiaogang Xu, et al.

To synthesize a realistic action sequence based on a single human image, it is crucial to model both the motion patterns and the diversity in the action video. This paper proposes an Action Conditional Temporal Variational AutoEncoder (ACT-VAE) to improve motion prediction accuracy and capture movement diversity. ACT-VAE predicts pose sequences for an action clip from a single input image. It is implemented as a deep generative model that maintains temporal coherence according to the action category through novel temporal modeling of the latent space. Further, ACT-VAE is a general action-sequence prediction framework: when connected with a plug-and-play Pose-to-Image (P2I) network, it can synthesize image sequences. Extensive experiments show that our approach predicts accurate poses and synthesizes realistic image sequences, surpassing state-of-the-art approaches. Compared with existing methods, ACT-VAE improves prediction accuracy while preserving diversity.
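For intuition, the sketch below shows how an action-conditioned temporal VAE of this kind can be structured: a learned prior places a distribution over a per-step latent variable conditioned on the action label and a recurrent state, so sampled latents remain temporally coherent while still being stochastic. This is only a minimal sketch under assumed design choices; the layer sizes, module names, and GRU-based recurrence are illustrative, not the authors' implementation, and the P2I network is omitted.

```python
# Minimal sketch of an action-conditioned temporal VAE for pose-sequence
# prediction (in the spirit of ACT-VAE). Layer sizes, module names, and the
# GRU recurrence are illustrative assumptions, not the paper's architecture.
import torch
import torch.nn as nn


class ActionConditionalTemporalVAE(nn.Module):
    def __init__(self, pose_dim=34, num_actions=10, latent_dim=32, hidden_dim=128):
        super().__init__()
        self.action_emb = nn.Embedding(num_actions, hidden_dim)
        self.init_h = nn.Linear(pose_dim, hidden_dim)   # condition on the single input pose
        self.rnn = nn.GRUCell(pose_dim, hidden_dim)     # carries temporal context across steps
        # Posterior q(z_t | pose_t, h_t, action): used only during training.
        self.posterior = nn.Sequential(
            nn.Linear(pose_dim + 2 * hidden_dim, hidden_dim), nn.ReLU(),
            nn.Linear(hidden_dim, 2 * latent_dim))
        # Learned prior p(z_t | h_t, action): ties each latent to the action
        # category and the motion history, keeping samples coherent in time.
        self.prior = nn.Sequential(
            nn.Linear(2 * hidden_dim, hidden_dim), nn.ReLU(),
            nn.Linear(hidden_dim, 2 * latent_dim))
        # Decoder maps (z_t, h_t, action) to the pose at step t.
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim + 2 * hidden_dim, hidden_dim), nn.ReLU(),
            nn.Linear(hidden_dim, pose_dim))

    @staticmethod
    def sample(stats):
        mu, logvar = stats.chunk(2, dim=-1)
        return mu + torch.randn_like(mu) * (0.5 * logvar).exp(), mu, logvar

    def forward(self, init_pose, action, steps, gt_poses=None):
        a = self.action_emb(action)                      # (B, hidden_dim)
        h = torch.tanh(self.init_h(init_pose))           # (B, hidden_dim)
        prev_pose, preds, kl = init_pose, [], 0.0
        for t in range(steps):
            h = self.rnn(prev_pose, h)                   # fold the previous pose into context
            prior_stats = self.prior(torch.cat([h, a], dim=-1))
            if gt_poses is not None:                     # training: posterior sees ground truth
                z, mu_q, lv_q = self.sample(
                    self.posterior(torch.cat([gt_poses[:, t], h, a], dim=-1)))
                mu_p, lv_p = prior_stats.chunk(2, dim=-1)
                kl = kl + 0.5 * (lv_p - lv_q
                                 + (lv_q.exp() + (mu_q - mu_p) ** 2) / lv_p.exp()
                                 - 1).sum(-1).mean()
            else:                                        # inference: sample the learned prior
                z, _, _ = self.sample(prior_stats)
            prev_pose = self.decoder(torch.cat([z, h, a], dim=-1))
            preds.append(prev_pose)
        return torch.stack(preds, dim=1), kl             # (B, steps, pose_dim), KL term


# Inference only needs one input pose and an action label; different random
# draws from the learned prior give diverse yet temporally coherent sequences.
model = ActionConditionalTemporalVAE()
poses, _ = model(torch.randn(4, 34), torch.tensor([2, 0, 5, 1]), steps=16)
```

In such a design, the training objective combines a pose reconstruction loss with the accumulated KL term, and the predicted pose sequence can then be passed to a pose-to-image network to render frames.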


