Controllable One-Shot Face Video Synthesis With Semantic Aware Prior

by Kangning Liu, et al.

The one-shot talking-head synthesis task aims to animate a source image to another pose and expression, which is dictated by a driving frame. Recent methods warp the appearance feature extracted from the source using motion fields estimated from sparse keypoints that are learned in an unsupervised manner. Due to their lightweight formulation, they are suitable for video conferencing with reduced bandwidth. However, based on our study, current methods suffer from two major limitations: 1) unsatisfactory generation quality in the case of large head poses and when there is observable pose misalignment between the source and the first frame of the driving video; 2) failure to capture fine yet critical face motion details, due to the lack of semantic understanding and appropriate face geometry regularization. To address these shortcomings, we propose a novel method that leverages rich face prior information. The proposed model can generate face videos with improved semantic consistency (improving on the baseline by 7% in average keypoint distance) and expression preservation (outperforming the baseline by 15% in average emotion embedding distance) under equivalent bandwidth. Additionally, incorporating such prior information provides a convenient interface for highly controllable generation in terms of both pose and expression.
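
To make the reported "average keypoint distance" figure concrete, the following is a minimal sketch of how such a metric and a relative improvement over a baseline could be computed. It assumes 2D facial keypoints have already been detected on both the generated and the ground-truth frames and are matched by index. The function names, the input layout, and the numbers in the usage note are illustrative assumptions, not the authors' evaluation code.

```python
import math


def average_keypoint_distance(pred_frames, true_frames):
    """Mean Euclidean distance between matched 2D keypoints.

    Each argument is a list of frames; each frame is a list of (x, y)
    keypoints. This layout is an assumption made for illustration.
    """
    if len(pred_frames) != len(true_frames):
        raise ValueError("both videos must have the same number of frames")
    total, count = 0.0, 0
    for pred_kps, true_kps in zip(pred_frames, true_frames):
        if len(pred_kps) != len(true_kps):
            raise ValueError("each frame pair must have the same keypoint count")
        for (px, py), (tx, ty) in zip(pred_kps, true_kps):
            total += math.hypot(px - tx, py - ty)
            count += 1
    if count == 0:
        raise ValueError("no keypoints to compare")
    return total / count


def relative_improvement(baseline_score, model_score):
    """Fractional reduction of a lower-is-better score versus a baseline."""
    if baseline_score == 0:
        raise ValueError("baseline score must be non-zero")
    return (baseline_score - model_score) / baseline_score
```

With made-up scores, a baseline distance of 2.0 and a model distance of 1.86 give `relative_improvement(2.0, 1.86)` of about 0.07, which is the sense in which a distance metric "improves by 7%".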




High-Fidelity and Freely Controllable Talking Head Video Generation

Talking head generation is to generate video based on a given source ide...

One-Shot Free-View Neural Talking-Head Synthesis for Video Conferencing

We propose a neural talking-head video synthesis model and demonstrate i...

One-Shot High-Fidelity Talking-Head Synthesis with Deformable Neural Radiance Field

Talking head generation aims to generate faces that maintain the identit...

One-Shot Face Reenactment on Megapixels

The goal of face reenactment is to transfer a target expression and head...

ICface: Interpretable and Controllable Face Reenactment Using GANs

This paper presents a generic face animator that is able to control the ...

MA-NeRF: Motion-Assisted Neural Radiance Fields for Face Synthesis from Sparse Images

We address the problem of photorealistic 3D face avatar synthesis from s...

Neural Face Video Compression using Multiple Views

Recent advances in deep generative models led to the development of neur...
