Identity-Preserving Realistic Talking Face Generation

05/25/2020
by Sanjana Sinha, et al.

Speech-driven facial animation is useful for a variety of applications such as telepresence and chatbots. A realistic face animation requires (1) audio-visual synchronization, (2) identity preservation of the target individual, (3) plausible mouth movements, and (4) natural eye blinks. Existing methods mostly address audio-visual lip synchronization, and only a few recent works have addressed the synthesis of natural eye blinks for overall video realism. In this paper, we propose a method for identity-preserving realistic facial animation from speech. We first generate person-independent facial landmarks from audio using DeepSpeech features for invariance to different voices, accents, etc. To add realism, we impose eye blinks on the facial landmarks using unsupervised learning, and we retarget the person-independent landmarks to person-specific landmarks to preserve the identity-related facial structure, which helps in generating plausible mouth shapes for the target identity. Finally, we use LSGAN to generate the facial texture from the person-specific facial landmarks, using an attention mechanism that helps to preserve identity-related texture. An extensive comparison of our proposed method with current state-of-the-art methods demonstrates a significant improvement in lip synchronization accuracy, image reconstruction quality, sharpness, and identity preservation. A user study also reveals improved realism of our animation results over the state-of-the-art methods. To the best of our knowledge, this is the first work in speech-driven 2D facial animation that simultaneously addresses all of the above-mentioned attributes of realistic speech-driven face animation.
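
The abstract names LSGAN as the texture generator but does not spell out its objective. As a rough illustration only, the sketch below shows the standard least-squares GAN losses with the common label choice of 1 for real and 0 for fake. The function names, the label values, and the use of plain Python lists of discriminator scores are assumptions made for this example; the paper's actual loss terms, weights, and network code are not given here.

```python
# Minimal sketch of the standard least-squares GAN (LSGAN) objectives.
# Assumptions (not from the abstract): labels real=1.0, fake=0.0, and
# discriminator outputs passed in as plain lists of floats.

def lsgan_discriminator_loss(real_scores, fake_scores,
                             real_label=1.0, fake_label=0.0):
    """0.5 * E[(D(x) - real_label)^2] + 0.5 * E[(D(G(z)) - fake_label)^2]."""
    real_term = sum((s - real_label) ** 2 for s in real_scores) / len(real_scores)
    fake_term = sum((s - fake_label) ** 2 for s in fake_scores) / len(fake_scores)
    return 0.5 * (real_term + fake_term)


def lsgan_generator_loss(fake_scores, real_label=1.0):
    """0.5 * E[(D(G(z)) - real_label)^2]: push fake scores toward the real label."""
    return 0.5 * sum((s - real_label) ** 2 for s in fake_scores) / len(fake_scores)


if __name__ == "__main__":
    # Hypothetical discriminator scores for a batch of real and generated frames.
    real = [0.9, 0.8, 1.0]
    fake = [0.2, 0.1, 0.3]
    print("discriminator loss:", lsgan_discriminator_loss(real, fake))
    print("generator loss:", lsgan_generator_loss(fake))
```

Compared with the sigmoid cross-entropy loss of the original GAN, the squared penalty keeps a gradient on generated samples that are classified correctly but lie far from the decision boundary. This is commonly cited as a reason LSGAN training is more stable.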


