C2G2: Controllable Co-speech Gesture Generation with Latent Diffusion Model

08/29/2023
by Longbin Ji, et al.

Co-speech gesture generation is crucial for automatic digital avatar animation. However, existing methods suffer from unstable training and temporal inconsistency, particularly when generating high-fidelity, comprehensive gestures. They also lack effective control over speaker identity and temporal editing of the generated gestures. To capture temporal latent information and provide practical control, we propose a Controllable Co-speech Gesture Generation framework, named C2G2. Specifically, we propose a two-stage temporal dependency enhancement strategy motivated by latent diffusion models. We further introduce two key features to C2G2: a speaker-specific decoder that generates speaker-related, real-length skeletons, and a repainting strategy for flexible gesture generation and editing. Extensive experiments on benchmark gesture datasets verify the effectiveness of C2G2 compared with several state-of-the-art baselines. The project demo page is available at https://c2g2-gesture.github.io/c2_gesture


