GeneFace: Generalized and High-Fidelity Audio-Driven 3D Talking Face Synthesis

01/31/2023
by   Zhenhui Ye, et al.
3

Generating photo-realistic video portrait with arbitrary speech audio is a crucial problem in film-making and virtual reality. Recently, several works explore the usage of neural radiance field in this task to improve 3D realness and image fidelity. However, the generalizability of previous NeRF-based methods to out-of-domain audio is limited by the small scale of training data. In this work, we propose GeneFace, a generalized and high-fidelity NeRF-based talking face generation method, which can generate natural results corresponding to various out-of-domain audio. Specifically, we learn a variaitional motion generator on a large lip-reading corpus, and introduce a domain adaptative post-net to calibrate the result. Moreover, we learn a NeRF-based renderer conditioned on the predicted facial motion. A head-aware torso-NeRF is proposed to eliminate the head-torso separation problem. Extensive experiments show that our method achieves more generalized and high-fidelity talking face generation compared to previous methods.

READ FULL TEXT

page 8

page 15

research
07/19/2023

MODA: Mapping-Once Audio-driven Portrait Animation with Dual Attentions

Audio-driven portrait animation aims to synthesize portrait videos that ...
research
03/20/2021

AD-NeRF: Audio Driven Neural Radiance Fields for Talking Head Synthesis

Generating high-fidelity talking head video by fitting with the input au...
research
05/01/2023

GeneFace++: Generalized and Stable Real-Time Audio-Driven 3D Talking Face Generation

Generating talking person portraits with arbitrary speech audio is a cru...
research
01/19/2022

Semantic-Aware Implicit Neural Audio-Driven Video Portrait Generation

Animating high-fidelity video portrait with speech audio is crucial for ...
research
01/10/2023

DiffTalk: Crafting Diffusion Models for Generalized Talking Head Synthesis

Talking head synthesis is a promising approach for the video production ...
research
05/09/2023

StyleSync: High-Fidelity Generalized and Personalized Lip Sync in Style-based Generator

Despite recent advances in syncing lip movements with any audio waves, c...
research
06/13/2023

Parametric Implicit Face Representation for Audio-Driven Facial Reenactment

Audio-driven facial reenactment is a crucial technique that has a range ...

Please sign up or login with your details

Forgot password? Click here to reset