VideoReTalking: Audio-based Lip Synchronization for Talking Head Video Editing In the Wild

11/27/2022
by   Kun Cheng, et al.
0

We present VideoReTalking, a new system to edit the faces of a real-world talking head video according to input audio, producing a high-quality and lip-syncing output video even with a different emotion. Our system disentangles this objective into three sequential tasks: (1) face video generation with a canonical expression; (2) audio-driven lip-sync; and (3) face enhancement for improving photo-realism. Given a talking-head video, we first modify the expression of each frame according to the same expression template using the expression editing network, resulting in a video with the canonical expression. This video, together with the given audio, is then fed into the lip-sync network to generate a lip-syncing video. Finally, we improve the photo-realism of the synthesized faces through an identity-aware face enhancement network and post-processing. We use learning-based approaches for all three steps and all our modules can be tackled in a sequential pipeline without any user intervention. Furthermore, our system is a generic approach that does not need to be retrained to a specific person. Evaluations on two widely-used datasets and in-the-wild examples demonstrate the superiority of our framework over other state-of-the-art methods in terms of lip-sync accuracy and visual quality.

READ FULL TEXT

page 1

page 3

page 4

page 5

page 6

page 7

page 8

research
02/24/2020

Audio-driven Talking Face Video Generation with Natural Head Pose

Real-world talking faces often accompany with natural head movement. How...
research
06/04/2019

Text-based Editing of Talking-head Video

Editing talking-head video to change the speech content or to remove fil...
research
01/16/2022

Audio-Driven Talking Face Video Generation with Dynamic Convolution Kernels

In this paper, we present a dynamic convolution kernel (DCK) strategy fo...
research
01/15/2020

Everybody's Talkin': Let Me Talk as You Want

We present a method to edit a target portrait footage by taking a sequen...
research
10/06/2022

Audio-Visual Face Reenactment

This work proposes a novel method to generate realistic talking head vid...
research
06/08/2021

LipSync3D: Data-Efficient Learning of Personalized 3D Talking Faces from Video using Pose and Lighting Normalization

In this paper, we present a video-based learning framework for animating...
research
10/16/2021

Intelligent Video Editing: Incorporating Modern Talking Face Generation Algorithms in a Video Editor

This paper proposes a video editor based on OpenShot with several state-...

Please sign up or login with your details

Forgot password? Click here to reset