Executing your Commands via Motion Diffusion in Latent Space

12/08/2022
by Xin Chen, et al.

We study a challenging task, conditional human motion generation, which produces plausible human motion sequences according to various conditional inputs, such as action classes or textual descriptors. Since human motions are highly diverse and follow a distribution quite different from that of the conditional modalities, such as textual descriptors in natural language, it is hard to learn a probabilistic mapping from the desired conditional modality to human motion sequences. Moreover, raw motion data from motion capture systems can be redundant across a sequence and contain noise; directly modeling the joint distribution over raw motion sequences and conditional modalities would incur heavy computational overhead and could introduce artifacts from the capture noise. To learn a better representation of the diverse human motion sequences, we first design a powerful Variational AutoEncoder (VAE) that produces a representative, low-dimensional latent code for a human motion sequence. Then, instead of using a diffusion model to connect raw motion sequences to the conditional inputs directly, we perform the diffusion process in the motion latent space. Our proposed Motion Latent-based Diffusion model (MLD) produces vivid motion sequences conforming to the given conditional inputs while substantially reducing computational overhead in both training and inference. Extensive experiments on various human motion generation tasks demonstrate that MLD achieves significant improvements over state-of-the-art methods, while being two orders of magnitude faster than previous diffusion models operating on raw motion sequences.
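The two-stage design in the abstract can be illustrated with a minimal numpy sketch. All dimensions here are toy assumptions (a 60-frame sequence, 22 joints with 3 coordinates each, an 8-dim latent), the "VAE" is a stand-in linear encoder/decoder rather than MLD's trained transformer VAE, and the denoiser network is omitted; the point is only that the diffusion process operates on a small latent vector instead of the full raw sequence.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy dimensions (assumptions, not from the paper): 60 frames,
# 22 joints x 3 coords per frame, 8-dim motion latent.
T, D_raw, D_lat = 60, 22 * 3, 8

# Stand-in "VAE": a fixed random linear encoder and its pseudo-inverse
# as decoder.  MLD's actual VAE is a learned transformer model.
W_enc = rng.normal(size=(T * D_raw, D_lat)) / np.sqrt(T * D_raw)
W_dec = np.linalg.pinv(W_enc)

def encode(motion):            # (T, D_raw) -> (D_lat,)
    return motion.reshape(-1) @ W_enc

def decode(z):                 # (D_lat,) -> (T, D_raw)
    return (z @ W_dec).reshape(T, D_raw)

# Standard linear beta schedule for the forward diffusion q(z_t | z_0).
steps = 100
betas = np.linspace(1e-4, 0.02, steps)
alpha_bar = np.cumprod(1.0 - betas)

def diffuse(z0, t):
    """Sample z_t = sqrt(abar_t) * z0 + sqrt(1 - abar_t) * eps."""
    eps = rng.normal(size=z0.shape)
    return np.sqrt(alpha_bar[t]) * z0 + np.sqrt(1.0 - alpha_bar[t]) * eps, eps

motion = rng.normal(size=(T, D_raw))   # a fake raw motion sequence
z0 = encode(motion)                    # compress to the latent space
zt, eps = diffuse(z0, t=50)            # noise it at some timestep

# A conditional denoiser (omitted here) would be trained to predict eps
# from (zt, t, condition).  The key saving: it works in D_lat = 8 dims,
# not the T * D_raw = 3960 dims of the raw sequence.
print(z0.shape, motion.size)
```

The computational argument in the abstract falls out directly: every denoising step touches an 8-dimensional latent rather than the 3,960-dimensional raw sequence, and the VAE decoder maps the final latent back to a full motion only once.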


