LongDanceDiff: Long-term Dance Generation with Conditional Diffusion Model

08/23/2023
by   Siqi Yang, et al.

Dancing to music has always been an essential human art form for expressing emotion. Due to its high spatio-temporal complexity, generating long-term, realistic 3D dance synchronized with music is challenging. Existing methods suffer from a freezing problem when generating long-term dances, caused by error accumulation and the discrepancy between training and inference. To address this, we design a conditional diffusion model, LongDanceDiff, for sequence-to-sequence long-term dance generation, tackling the challenges of temporal coherency and spatial constraint. LongDanceDiff contains a transformer-based diffusion model whose input is a concatenation of music, past motions, and noised future motions. This partial noising strategy leverages the full-attention mechanism and learns the dependencies between the music and past motions. To enhance the diversity of generated dance motions and mitigate the freezing problem, we introduce a mutual information minimization objective that regularizes the dependency between past and future motions. We also address common visual quality issues in dance generation, such as foot sliding and unsmooth motion, by incorporating spatial constraints through a Global-Trajectory Modulation (GTM) layer and motion perceptual losses, thereby improving the smoothness and naturalness of the generated motion. Extensive experiments demonstrate that our approach significantly improves over existing state-of-the-art methods. We plan to release our code and models soon.
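To illustrate the partial noising strategy described above, the sketch below applies a standard DDPM forward step to the future motion frames only, leaving the past motion clean, and concatenates the result with per-frame music features into one transformer input. This is a minimal NumPy sketch under assumed shapes and names (`T_past`, `T_future`, `partial_noising`, the feature dimensions); it is not the authors' implementation.

```python
import numpy as np

# Hypothetical dimensions, not from the paper:
# T_past/T_future frames, F music features, J joint dims per frame.
rng = np.random.default_rng(0)
T_past, T_future, F, J = 8, 16, 4, 6

music = rng.standard_normal((T_past + T_future, F))   # music features per frame
past = rng.standard_normal((T_past, J))               # observed (clean) past motion
future = rng.standard_normal((T_future, J))           # ground-truth future motion

def partial_noising(past, future, alpha_bar_t, rng):
    """Noise only the future motion (DDPM forward step at noise level
    alpha_bar_t), keeping past motion clean so full attention can
    condition on it without train/inference mismatch."""
    eps = rng.standard_normal(future.shape)
    noised_future = np.sqrt(alpha_bar_t) * future + np.sqrt(1.0 - alpha_bar_t) * eps
    return np.concatenate([past, noised_future], axis=0), eps

motion_seq, eps = partial_noising(past, future, alpha_bar_t=0.5, rng=rng)

# Transformer input: motion tokens concatenated with music features per frame.
x = np.concatenate([motion_seq, music], axis=1)
assert x.shape == (T_past + T_future, J + F)
assert np.allclose(motion_seq[:T_past], past)  # past frames stay untouched
```

At training time the model would be asked to denoise only the future portion, while the clean past frames and music act as conditioning tokens within the same attention window.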

Related research:

08/05/2023 — DiffDance: Cascaded Human Motion Diffusion Model for Dance Generation
When hearing music, it is natural for people to dance to its rhythm. Aut...

04/25/2023 — GTN-Bailando: Genre Consistent Long-Term 3D Dance Generation based on Pre-trained Genre Token Network
Music-driven 3D dance generation has become an intensive research topic ...

08/03/2023 — Synthesizing Long-Term Human Motions with Diffusion Models via Coherent Sampling
Text-to-motion generation has gained increasing attention, but most exis...

05/02/2023 — Long-Term Rhythmic Video Soundtracker
We consider the problem of generating musical soundtracks in sync with r...

03/29/2023 — Robust Dancer: Long-term 3D Dance Synthesis Using Unpaired Data
How to automatically synthesize natural-looking dance movements based on...

09/27/2022 — NEURAL MARIONETTE: A Transformer-based Multi-action Human Motion Synthesis System
We present a neural network-based system for long-term, multi-action hum...

02/01/2023 — Correspondence-free online human motion retargeting
We present a novel data-driven framework for unsupervised human motion r...
