DPE: Disentanglement of Pose and Expression for General Video Portrait Editing

01/16/2023
by Youxin Pang, et al.

One-shot video-driven talking face generation aims at producing a synthetic talking video by transferring the facial motion from a video to an arbitrary portrait image. Head pose and facial expression are always entangled in facial motion and transferred simultaneously. However, this entanglement sets up a barrier for such methods to be used directly in video portrait editing, where one may need to modify the expression only while keeping the pose unchanged. One challenge of decoupling pose and expression is the lack of paired data, such as the same pose with different expressions. Only a few methods attempt to tackle this challenge, relying on 3D Morphable Models (3DMMs) for explicit disentanglement. But 3DMMs are not accurate enough to capture facial details due to the limited number of blendshapes, which has side effects on motion transfer. In this paper, we introduce a novel self-supervised disentanglement framework to decouple pose and expression without 3DMMs or paired data, consisting of a motion editing module, a pose generator, and an expression generator. The editing module projects faces into a latent space where pose motion and expression motion can be disentangled, so pose or expression transfer can be performed in the latent space conveniently via addition. The two generators then render the modified latent codes to images, respectively. Moreover, to guarantee the disentanglement, we propose a bidirectional cyclic training strategy with well-designed constraints. Evaluations demonstrate that our method can control pose or expression independently and can be used for general video editing.
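The core idea of editing in a disentangled latent space can be sketched with a toy example. Everything below is illustrative, not the paper's actual architecture: the latent dimensionality, the random linear encoder, and the assumption that pose and expression occupy fixed halves of the code are all simplifications standing in for the learned motion editing module.

```python
import numpy as np

rng = np.random.default_rng(0)

LATENT = 8  # toy latent dimensionality (illustrative, not the paper's value)

# Toy stand-in for the learned motion editing module: a fixed random
# linear projection of a "face" vector into the shared latent space.
W_enc = rng.normal(size=(LATENT, LATENT))

def encode(face_vec):
    """Project a face (here just a toy vector) into the latent space."""
    return W_enc @ face_vec

# Illustrative assumption: the latent splits into a pose subspace
# (first half) and an expression subspace (second half).
def expression_motion(src_code, drv_code):
    """Motion code carrying only the expression part of the difference."""
    delta = np.zeros(LATENT)
    delta[LATENT // 2:] = (drv_code - src_code)[LATENT // 2:]
    return delta

def pose_motion(src_code, drv_code):
    """Motion code carrying only the pose part of the difference."""
    delta = np.zeros(LATENT)
    delta[:LATENT // 2] = (drv_code - src_code)[:LATENT // 2]
    return delta

src = rng.normal(size=LATENT)   # source portrait
drv = rng.normal(size=LATENT)   # driving frame

z_src, z_drv = encode(src), encode(drv)

# Expression-only transfer: add just the expression component of the
# motion to the source code; the pose component is left untouched.
z_exp_edit = z_src + expression_motion(z_src, z_drv)
```

After this edit, the pose half of `z_exp_edit` still equals the source's, while the expression half matches the driving frame's; a generator would then decode the edited code back to an image. Pose-only transfer is symmetric, using `pose_motion` instead.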
