UDE: A Unified Driving Engine for Human Motion Generation

11/29/2022
by   Zixiang Zhou, et al.

Generating controllable and editable human motion sequences is a key challenge in 3D avatar generation. Generating and animating human motion was long a labor-intensive process, until learning-based approaches were recently developed and applied. However, these approaches are still task-specific or modality-specific. In this paper, we propose "UDE", the first unified driving engine that enables generating human motion sequences from natural language or audio sequences. Specifically, UDE consists of the following key components: 1) a motion quantization module based on VQVAE that represents a continuous motion sequence as discrete latent codes, 2) a modality-agnostic transformer encoder that learns to map modality-aware driving signals to a joint space, 3) a unified token transformer (GPT-like) network that predicts the quantized latent code indices in an auto-regressive manner, and 4) a diffusion motion decoder that takes the motion tokens as input and decodes them into motion sequences with high diversity. We evaluate our method on the HumanML3D and AIST++ benchmarks, and the experimental results demonstrate that our method achieves state-of-the-art performance. Project website: <https://github.com/zixiangzhou916/UDE/>
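
As a rough illustration of the first component, the sketch below shows the nearest-neighbor codebook lookup that VQVAE-style quantization relies on to turn continuous features into discrete code indices. It is not taken from the UDE code: the function names `quantize` and `dequantize`, the toy 2-D features, and the three-entry codebook are all assumptions made for illustration, and a real model would learn both the encoder and the codebook.

```python
from typing import List, Sequence

Vector = Sequence[float]


def quantize(frames: Sequence[Vector], codebook: Sequence[Vector]) -> List[int]:
    """Map each continuous feature vector to the index of its nearest codebook entry."""
    indices = []
    for frame in frames:
        # Squared Euclidean distance from this frame to every codebook vector.
        dists = [sum((f - c) ** 2 for f, c in zip(frame, code)) for code in codebook]
        indices.append(dists.index(min(dists)))
    return indices


def dequantize(indices: Sequence[int], codebook: Sequence[Vector]) -> List[List[float]]:
    """Replace each discrete index with its codebook vector."""
    return [list(codebook[i]) for i in indices]


# Hypothetical toy data: three codebook entries and three per-frame features.
CODEBOOK = [[0.0, 0.0], [1.0, 1.0], [-1.0, 2.0]]
FRAMES = [[0.1, -0.2], [0.9, 1.2], [-0.8, 1.7]]

TOKENS = quantize(FRAMES, CODEBOOK)
RECONSTRUCTED = dequantize(TOKENS, CODEBOOK)
```

In this picture, the token transformer would predict index sequences like `TOKENS`, and a decoder would map them back to motion; the lookup in `dequantize` only stands in for that decoding step.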
