Implicit Identity Representation Conditioned Memory Compensation Network for Talking Head Video Generation

07/19/2023
by Fa-Ting Hong, et al.

Talking head video generation aims to animate the human face in a still source image with dynamic poses and expressions, using motion information derived from a driving video, while preserving the identity of the person in the source image. However, dramatic and complex motions in the driving video cause ambiguous generation: the still source image cannot provide sufficient appearance information for occluded regions or delicate expression variations, which produces severe artifacts and significantly degrades generation quality. To tackle this problem, we propose to learn a global facial representation space and design a novel implicit identity representation conditioned memory compensation network, coined MCNet, for high-fidelity talking head generation. Specifically, we devise a network module that learns a unified spatial facial meta-memory bank from all training samples, providing rich facial structure and appearance priors to compensate the warped source facial features for generation. Furthermore, we propose an effective query mechanism based on implicit identity representations learned from the discrete keypoints of the source image, which greatly facilitates the retrieval of more correlated information from the memory bank for the compensation. Extensive experiments demonstrate that MCNet learns representative and complementary facial memory and clearly outperforms previous state-of-the-art talking head generation methods on the VoxCeleb1 and CelebV datasets. Project page: https://github.com/harlanhong/ICCV2023-MCNET.
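The abstract describes compensating motion-warped source features by querying a globally learned meta-memory bank with an implicit identity representation derived from the source keypoints. The sketch below illustrates that idea at a schematic level only; the module name, tensor shapes, and attention-style query are assumptions made for illustration, not the authors' implementation (see the linked repository for the actual MCNet code).

```python
import torch
import torch.nn as nn

class MemoryCompensation(nn.Module):
    """Schematic sketch of an identity-conditioned memory bank (hypothetical)."""

    def __init__(self, num_slots=512, feat_dim=256, num_kp=10):
        super().__init__()
        # Global spatial meta-memory shared across all training samples.
        self.memory = nn.Parameter(torch.randn(num_slots, feat_dim))
        # Encodes source keypoints into an implicit identity representation.
        self.identity_encoder = nn.Sequential(
            nn.Linear(num_kp * 2, feat_dim), nn.ReLU(),
            nn.Linear(feat_dim, feat_dim),
        )
        # Fuses retrieved memory back into the warped source features.
        self.fuse = nn.Conv2d(feat_dim * 2, feat_dim, kernel_size=1)

    def forward(self, warped_feat, source_kp):
        # warped_feat: (B, C, H, W) source features warped by driving motion.
        # source_kp:   (B, K, 2) discrete keypoints of the source image.
        b, c, h, w = warped_feat.shape
        identity = self.identity_encoder(source_kp.flatten(1))            # (B, C)
        # Condition the per-location query on the implicit identity code.
        query = warped_feat.flatten(2).transpose(1, 2) + identity.unsqueeze(1)  # (B, HW, C)
        attn = torch.softmax(query @ self.memory.t() / c ** 0.5, dim=-1)        # (B, HW, S)
        retrieved = (attn @ self.memory).transpose(1, 2).reshape(b, c, h, w)
        # Compensate occluded or ambiguous regions of the warped features.
        return self.fuse(torch.cat([warped_feat, retrieved], dim=1))
```

In such a setup, the warped feature map and the source keypoints would be passed in per frame, and the compensated features would then be decoded into the output frame; the real MCNet architecture and training details differ and are given in the paper and repository.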


research
04/20/2023

High-Fidelity and Freely Controllable Talking Head Video Generation

Talking head generation is to generate video based on a given source ide...
research
05/05/2023

Avatar Fingerprinting for Authorized Use of Synthetic Talking-Head Videos

Modern generators render talking-head videos with impressive levels of p...
research
04/11/2023

One-Shot High-Fidelity Talking-Head Synthesis with Deformable Neural Radiance Field

Talking head generation aims to generate faces that maintain the identit...
research
11/09/2020

FACEGAN: Facial Attribute Controllable rEenactment GAN

The face reenactment is a popular facial animation method where the pers...
research
09/05/2023

ReliTalk: Relightable Talking Portrait Generation from a Single Video

Recent years have witnessed great progress in creating vivid audio-drive...
research
09/09/2023

Speech2Lip: High-fidelity Speech to Lip Generation by Learning from a Short Video

Synthesizing realistic videos according to a given speech is still an op...
research
11/13/2020

Image Animation with Perturbed Masks

We present a novel approach for image-animation of a source image by a d...
