DiffDance: Cascaded Human Motion Diffusion Model for Dance Generation

by Qiaosong Qi, et al.

When hearing music, it is natural for people to dance to its rhythm. Automatic dance generation, however, is a challenging task because the generated motion must respect the physical constraints of the human body and stay rhythmically aligned with the target music. Conventional autoregressive methods introduce compounding errors during sampling and struggle to capture the long-term structure of dance sequences. To address these limitations, we present a novel cascaded motion diffusion model, DiffDance, designed for high-resolution, long-form dance generation. The model comprises a music-to-dance diffusion model and a sequence super-resolution diffusion model. To bridge the gap between music and motion for conditional generation, DiffDance employs a pretrained audio representation learning model to extract music embeddings and further aligns the embedding space with motion via a contrastive loss. While training the cascaded diffusion model, we also incorporate multiple geometric losses to constrain the model outputs to be physically plausible, and add a dynamic loss weight that changes adaptively over diffusion timesteps to facilitate sample diversity. Through comprehensive experiments on the benchmark dataset AIST++, we demonstrate that DiffDance generates realistic dance sequences that align effectively with the input music. These results are comparable to those achieved by state-of-the-art autoregressive methods.
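
The abstract mentions aligning the music embedding space to motion with a contrastive loss but does not give its form. As a rough illustration only, the sketch below assumes a symmetric InfoNCE-style objective over a batch of paired music and motion embeddings. The function name `info_nce_loss`, the `temperature` value, and the embedding shapes are hypothetical and are not taken from the paper.

```python
import numpy as np


def info_nce_loss(music_emb, motion_emb, temperature=0.07):
    """Symmetric InfoNCE loss for a batch of paired embeddings.

    Assumption: row i of `music_emb` and row i of `motion_emb` form a
    positive pair; every other row in the batch acts as a negative.
    This is an illustrative choice, not the paper's stated loss.
    """
    # L2-normalise so the dot product is a cosine similarity.
    music = music_emb / np.linalg.norm(music_emb, axis=1, keepdims=True)
    motion = motion_emb / np.linalg.norm(motion_emb, axis=1, keepdims=True)

    # (batch, batch) similarity matrix, sharpened by the temperature.
    logits = music @ motion.T / temperature

    def diagonal_cross_entropy(scores):
        # Numerically stable log-softmax over each row.
        scores = scores - scores.max(axis=1, keepdims=True)
        log_probs = scores - np.log(np.exp(scores).sum(axis=1, keepdims=True))
        # The correct "class" for row i is column i.
        return -np.mean(np.diag(log_probs))

    # Average the music-to-motion and motion-to-music directions.
    return 0.5 * (diagonal_cross_entropy(logits)
                  + diagonal_cross_entropy(logits.T))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    music = rng.normal(size=(8, 16))
    # Motion embeddings close to their paired music embeddings.
    motion = music + 0.01 * rng.normal(size=(8, 16))
    print("aligned pairs:   ", info_nce_loss(music, motion))
    print("mismatched pairs:", info_nce_loss(music, motion[::-1]))
```

Under these assumptions, the loss is small when each music embedding is closest to its own motion embedding and grows when the pairing is broken, which is the behaviour an alignment objective of this kind relies on.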


LongDanceDiff: Long-term Dance Generation with Conditional Diffusion Model

Dancing with music is always an essential human art form to express emot...

Taming Diffusion Models for Music-driven Conducting Motion Generation

Generating the motion of orchestral conductors from a given piece of sym...

MCM: Multi-condition Motion Synthesis Framework for Multi-scenario

The objective of the multi-condition human motion synthesis task is to i...

EDGE: Editable Dance Generation From Music

Dance is an important human art form, but creating new dances can be dif...

Magic: Multi Art Genre Intelligent Choreography Dataset and Network for 3D Dance Generation

Achieving multiple genres and long-term choreography sequences from give...

Bipartite Graph Diffusion Model for Human Interaction Generation

The generation of natural human motion interactions is a hot topic in co...

Robust Dancer: Long-term 3D Dance Synthesis Using Unpaired Data

How to automatically synthesize natural-looking dance movements based on...
