DisCoHead: Audio-and-Video-Driven Talking Head Generation by Disentangled Control of Head Pose and Facial Expressions

03/14/2023
by Geumbyeol Hwang, et al.

For realistic talking head generation, creating natural head motion while maintaining accurate lip synchronization is essential. To address this challenging task, we propose DisCoHead, a novel method to disentangle and control head pose and facial expressions without supervision. DisCoHead uses a single geometric transformation as a bottleneck to isolate and extract head motion from a head-driving video. Either an affine or a thin-plate spline transformation can be used, and both work well as geometric bottlenecks. We enhance the efficiency of DisCoHead by integrating a dense motion estimator and the encoder of a generator, which are originally separate modules. Taking a step further, we also propose a neural mix approach where dense motion is estimated and applied implicitly by the encoder. After applying the disentangled head motion to a source identity, DisCoHead controls the mouth region according to speech audio, and it blinks the eyes and moves the eyebrows following a separate driving video of the eye region, via the weight modulation of convolutional neural networks. Experiments on multiple datasets show that DisCoHead successfully generates realistic audio-and-video-driven talking heads and outperforms state-of-the-art methods. Project page: https://deepbrainai-research.github.io/discohead/
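
To make the idea of a geometric bottleneck more concrete, the following is a minimal sketch, not the authors' implementation, of how a single affine transformation acts on a grid of normalized image coordinates. The names `make_grid` and `apply_affine` and the example matrices are hypothetical and chosen only for illustration. The point it illustrates is that one affine map has just six parameters, so it can plausibly describe global motion such as shifting, rotating, or scaling the whole head, but has no capacity to describe local changes such as lip or eyelid movement.

```python
from typing import List, Tuple

Point = Tuple[float, float]
# A 2x3 affine matrix ((a, b, tx), (c, d, ty)): six parameters in total.
Affine = Tuple[Tuple[float, float, float], Tuple[float, float, float]]


def make_grid(height: int, width: int) -> List[Point]:
    """Return (x, y) coordinates normalized to [-1, 1], in row-major order."""
    def norm(i: int, n: int) -> float:
        return 0.0 if n == 1 else 2.0 * i / (n - 1) - 1.0

    return [(norm(c, width), norm(r, height))
            for r in range(height)
            for c in range(width)]


def apply_affine(theta: Affine, points: List[Point]) -> List[Point]:
    """Apply the same affine map to every point (a purely global motion)."""
    (a, b, tx), (c, d, ty) = theta
    return [(a * x + b * y + tx, c * x + d * y + ty) for x, y in points]


# Hypothetical example matrices: identity, and a horizontal shift of 0.5.
identity: Affine = ((1.0, 0.0, 0.0), (0.0, 1.0, 0.0))
shift_right: Affine = ((1.0, 0.0, 0.5), (0.0, 1.0, 0.0))

grid = make_grid(2, 2)
warped = apply_affine(shift_right, grid)
```

In a full model, a warped grid like this would be used to resample an image or feature map; that resampling step, and how the six parameters are predicted from the head-driving video, are outside the scope of this sketch.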


research
07/20/2021

Audio2Head: Audio-driven One-shot Talking-head Generation with Natural Head Motion

We propose an audio-driven talking-head method to generate photo-realist...
research
08/23/2022

StyleTalker: One-shot Style-based Audio-driven Talking Head Video Generation

We propose StyleTalker, a novel audio-driven talking head generation mod...
research
11/22/2022

SadTalker: Learning Realistic 3D Motion Coefficients for Stylized Audio-Driven Single Image Talking Face Animation

Generating talking head videos through a face image and a piece of speec...
research
08/01/2023

Context-Aware Talking-Head Video Editing

Talking-head video editing aims to efficiently insert, delete, and subst...
research
11/17/2022

SPACEx: Speech-driven Portrait Animation with Controllable Expression

Animating portraits using speech has received growing attention in recen...
research
10/06/2022

Audio-Visual Face Reenactment

This work proposes a novel method to generate realistic talking head vid...
research
01/03/2022

DFA-NeRF: Personalized Talking Head Generation via Disentangled Face Attributes Neural Rendering

While recent advances in deep neural networks have made it possible to r...
