ReliTalk: Relightable Talking Portrait Generation from a Single Video

09/05/2023
by Haonan Qiu, et al.

Recent years have witnessed great progress in creating vivid audio-driven portraits from monocular videos. However, how to seamlessly adapt the generated video avatars to new scenarios with different backgrounds and lighting conditions remains unsolved. Meanwhile, existing relighting studies mostly rely on dynamically lit or multi-view data, which are too expensive to collect for video portrait creation. To bridge this gap, we propose ReliTalk, a novel framework for relightable audio-driven talking portrait generation from monocular videos. Our key insight is to decompose the portrait's reflectance from implicitly learned audio-driven facial normals and images. Specifically, we use 3D facial priors derived from audio features to predict delicate normal maps through implicit functions. These predicted normals then play a crucial part in reflectance decomposition by dynamically estimating the lighting condition of the given video. Moreover, the stereoscopic face representation is refined using an identity-consistent loss under simulated multiple lighting conditions, which addresses the ill-posed problem caused by the limited views available from a single monocular video. Extensive experiments validate the superiority of our framework on both real and synthetic datasets. Our code is released at https://github.com/arthur-qiu/ReliTalk.

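To make the decomposition the abstract describes more concrete, the sketch below renders a portrait under a Lambertian image formation model: the image is the product of an albedo map and a shading term computed from surface normals and second-order spherical-harmonics (SH) lighting coefficients. This is a minimal illustration of the general technique, not the paper's exact pipeline; the function names (`sh_basis`, `relight`) and the Lambertian/SH assumptions are ours.

```python
import torch

def sh_basis(normals):
    """Order-2 spherical-harmonics basis evaluated at unit normals.

    normals: tensor of shape (..., 3) holding unit vectors (x, y, z).
    Returns a tensor of shape (..., 9) with the standard real SH basis.
    """
    x, y, z = normals.unbind(-1)
    return torch.stack([
        torch.full_like(x, 0.282095),       # Y_0^0  (constant term)
        0.488603 * y,                       # Y_1^-1
        0.488603 * z,                       # Y_1^0
        0.488603 * x,                       # Y_1^1
        1.092548 * x * y,                   # Y_2^-2
        1.092548 * y * z,                   # Y_2^-1
        0.315392 * (3.0 * z * z - 1.0),     # Y_2^0
        1.092548 * x * z,                   # Y_2^1
        0.546274 * (x * x - y * y),         # Y_2^2
    ], dim=-1)

def relight(albedo, normals, sh_coeffs):
    """Lambertian rendering: image = albedo * (SH basis . lighting coeffs)."""
    shading = (sh_basis(normals) * sh_coeffs).sum(-1, keepdim=True)
    return albedo * shading.clamp(min=0.0)

# Toy usage: relight one 64x64 portrait under two random lighting conditions.
albedo = torch.rand(64, 64, 3)
normals = torch.nn.functional.normalize(torch.randn(64, 64, 3), dim=-1)
img_a = relight(albedo, normals, torch.randn(9))
img_b = relight(albedo, normals, torch.randn(9))
# An identity-consistency objective in the spirit of the abstract would
# compare identity features (e.g. from a face-recognition network) of
# img_a and img_b, since both depict the same person under different light.
```

In this kind of model, fitting the SH coefficients to the observed video frames is what "dynamically estimating the lighting condition" amounts to, while the simulated random lightings (`img_a`, `img_b` above) illustrate how multiple lighting conditions can be generated from a single monocular view to constrain the otherwise ill-posed decomposition.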
