A Keypoint Based Enhancement Method for Audio Driven Free View Talking Head Synthesis

by Yichen Han, et al.

Audio driven talking head synthesis is a challenging task that has attracted increasing attention in recent years. Although existing methods based on 2D landmarks or 3D face models can synthesize accurate lip synchronization and rhythmic head pose for arbitrary identities, they still have limitations, such as a cut feeling in the mouth mapping and a lack of skin highlights; the morphed region is also blurry compared to the surrounding face. We propose a Keypoint Based Enhancement (KPBE) method for audio driven free view talking head synthesis to improve the naturalness of the generated video. First, existing methods are used as the backend to synthesize intermediate results. We then use keypoint decomposition to extract video synthesis controlling parameters from the backend output and the source image. Next, the controlling parameters are composited into the source keypoints and the driving keypoints. Finally, a motion field based method generates the final image from the keypoint representation. With the keypoint representation, we overcome the cut feeling in the mouth mapping and the lack of skin highlights. Experiments show that the proposed enhancement method improves the quality of talking-head videos in terms of mean opinion score.
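The compositing step can be illustrated with a small sketch. The abstract does not give the exact formulation, so the code below assumes a decomposition in the style of one-shot free-view talking-head models, where each keypoint is a canonical (identity) keypoint that is rotated, translated, and offset by an expression delta. The function names, parameter layout, and toy values are all hypothetical, and the motion field generator that would consume the two keypoint sets is not shown.

```python
import math


def yaw_rotation(deg):
    """Return a 3x3 rotation matrix about the vertical (y) axis."""
    a = math.radians(deg)
    c, s = math.cos(a), math.sin(a)
    return [[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]]


def compose_keypoints(canonical, rotation, translation, deltas):
    """Assumed composition rule: x_k = R @ c_k + t + delta_k per keypoint."""
    composed = []
    for c_k, d_k in zip(canonical, deltas):
        rotated = [sum(rotation[i][j] * c_k[j] for j in range(3)) for i in range(3)]
        composed.append([rotated[i] + translation[i] + d_k[i] for i in range(3)])
    return composed


# Toy canonical keypoints for the source identity (hypothetical values).
canonical = [[0.1, 0.0, 0.0], [0.0, -0.2, 0.0]]

# Controlling parameters assumed to be extracted from the source image.
source_params = {
    "rotation": yaw_rotation(0),
    "translation": [0.0, 0.0, 0.0],
    "deltas": [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]],
}

# Controlling parameters assumed to be extracted from one backend output frame.
# Replacing the rotation with a user-chosen one is what would give a free view.
driving_params = {
    "rotation": yaw_rotation(90),
    "translation": [0.0, 0.0, 0.0],
    "deltas": [[0.0, 0.0, 0.0], [0.0, -0.05, 0.0]],
}

source_kp = compose_keypoints(canonical, **source_params)
driving_kp = compose_keypoints(canonical, **driving_params)
```

In this sketch both keypoint sets share the same canonical keypoints, so only pose and expression differ between them; that difference is what a motion field based generator would use to warp the source image.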




