DFA-NeRF: Personalized Talking Head Generation via Disentangled Face Attributes Neural Rendering

01/03/2022
by Shunyu Yao et al.

While recent advances in deep neural networks have made it possible to render high-quality images, generating photo-realistic and personalized talking heads remains challenging. Given input audio, the key to this task is synchronizing lip movement while simultaneously generating personalized attributes such as head movement and eye blinks. In this work, we observe that the input audio is highly correlated with lip motion but only weakly correlated with other personalized attributes (e.g., head movements). Inspired by this, we propose a novel framework based on a neural radiance field to pursue high-fidelity and personalized talking head generation. Specifically, the neural radiance field takes lip movement features and personalized attributes as two disentangled conditions, where lip movements are predicted directly from the audio input to achieve lip-synchronized generation. Meanwhile, personalized attributes are sampled from a probabilistic model: we design a Transformer-based variational autoencoder whose latent space is sampled from a Gaussian process to learn plausible and natural-looking head poses and eye blinks. Experiments on several benchmarks demonstrate that our method achieves significantly better results than state-of-the-art methods.
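The abstract describes sampling personalized attributes (head pose, eye blink) from a probabilistic model with a Gaussian-process latent space, which encourages temporally smooth, natural-looking motion. A minimal sketch of that idea, drawing a smooth multi-dimensional trajectory from a GP prior with a squared-exponential kernel, is shown below. The function names and parameters (`lengthscale`, `fps`) are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

def rbf_kernel(t, lengthscale=0.5, variance=1.0):
    # Squared-exponential (RBF) kernel over time stamps t (in seconds):
    # nearby frames are strongly correlated, yielding smooth samples.
    d = t[:, None] - t[None, :]
    return variance * np.exp(-0.5 * (d / lengthscale) ** 2)

def sample_gp_trajectory(n_frames=100, dim=3, fps=25.0, seed=0):
    """Sample a smooth latent trajectory (e.g., 3-DoF head pose) from a GP prior."""
    rng = np.random.default_rng(seed)
    t = np.arange(n_frames) / fps
    # Add a small jitter term so the Cholesky factorization is numerically stable.
    K = rbf_kernel(t) + 1e-6 * np.eye(n_frames)
    L = np.linalg.cholesky(K)
    # Each output dimension is an independent sample from the same GP.
    return L @ rng.standard_normal((n_frames, dim))

trajectory = sample_gp_trajectory()  # shape (100, 3)
```

In the paper's framework, such GP-distributed latents would be decoded by the Transformer-based VAE into head pose and eye-blink attributes rather than used directly.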


Related research

02/24/2020
Audio-driven Talking Face Video Generation with Learning-based Personalized Head Pose
Real-world talking faces are often accompanied by natural head movement. How...

08/18/2021
FACIAL: Synthesizing Dynamic Talking Face with Implicit Attribute Learning
In this paper, we propose a talking face generation method that takes an...

07/16/2020
Talking-head Generation with Rhythmic Head Motion
When people deliver a speech, they naturally move their heads, and this rhythm...

02/13/2022
Lip movements information disentanglement for lip sync
The lip movements information is critical for many audio-visual tasks. H...

03/14/2023
DisCoHead: Audio-and-Video-Driven Talking Head Generation by Disentangled Control of Head Pose and Facial Expressions
For realistic talking head generation, creating natural head motion whil...

05/09/2023
StyleSync: High-Fidelity Generalized and Personalized Lip Sync in Style-based Generator
Despite recent advances in syncing lip movements with any audio waves, c...

10/02/2019
Animating Face using Disentangled Audio Representations
All previous methods for audio-driven talking head generation assume the...
