VAST: Vivify Your Talking Avatar via Zero-Shot Expressive Facial Style Transfer

08/09/2023
by Liyang Chen, et al.

Current talking face generation methods mainly focus on speech-lip synchronization. However, insufficient investigation of facial talking style leads to lifeless and monotonous avatars. Most previous works fail to imitate expressive styles from arbitrary video prompts while ensuring the authenticity of the generated video. This paper proposes an unsupervised variational style transfer model (VAST) to vivify neutral photo-realistic avatars. Our model consists of three key components: a style encoder that extracts facial style representations from the given video prompts; a hybrid facial expression decoder that models accurate speech-related movements; and a variational style enhancer that makes the style space highly expressive and meaningful. With these designs for facial style learning, our model can flexibly capture the expressive facial style from arbitrary video prompts and transfer it onto a personalized image renderer in a zero-shot manner. Experimental results demonstrate that the proposed approach contributes to a more vivid talking avatar with higher authenticity and richer expressiveness.
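The data flow between the three components named in the abstract can be pictured with a toy sketch. Everything in it is an assumption made for illustration: the function names (`encode_style`, `enhance_style`, `decode_expression`), the mean-pooling encoder, the additive decoder, and the use of standard VAE-style reparameterized sampling for the "variational" part. The abstract does not specify any of these details, so this is not the authors' implementation.

```python
import math
import random


def encode_style(prompt_frames):
    """Hypothetical style encoder: mean-pool per-frame expression
    vectors from a video prompt into a single style vector."""
    n = len(prompt_frames)
    dim = len(prompt_frames[0])
    return [sum(frame[i] for frame in prompt_frames) / n for i in range(dim)]


def enhance_style(mean, log_var, rng):
    """Assumed variational step: sample z = mean + sigma * eps with
    eps ~ N(0, 1), the usual VAE reparameterization (not confirmed
    by the abstract)."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mean, log_var)]


def decode_expression(audio_frames, style):
    """Hypothetical decoder: shift each audio-driven expression frame
    by the style vector. A real decoder would be a learned network."""
    return [[a + s for a, s in zip(frame, style)] for frame in audio_frames]


# Toy usage: two prompt frames, one audio-driven frame, 2-D expressions.
rng = random.Random(0)
style_mean = encode_style([[0.0, 2.0], [2.0, 4.0]])
style = enhance_style(style_mean, [-200.0, -200.0], rng)  # near-zero variance
expressions = decode_expression([[1.0, 1.0]], style_mean)
```

In this sketch the style vector depends only on the video prompt and the per-frame input depends only on the audio. That separation is what would let a style taken from an unseen prompt be applied without retraining, which is the zero-shot setting the abstract describes.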


research
05/15/2022

GenerSpeech: Towards Style Transfer for Generalizable Out-Of-Domain Text-to-Speech Synthesis

Style transfer for out-of-domain (OOD) speech synthesis aims to generate...
research
05/28/2023

StyleS2ST: Zero-shot Style Transfer for Direct Speech-to-speech Translation

Direct speech-to-speech translation (S2ST) has gradually become popular ...
research
05/10/2018

Avatar-Net: Multi-scale Zero-shot Style Transfer by Feature Decoration

Zero-shot artistic style transfer is an important image synthesis proble...
research
04/30/2023

StyleLipSync: Style-based Personalized Lip-sync Video Generation

In this paper, we present StyleLipSync, a style-based personalized lip-s...
research
07/30/2023

HierVST: Hierarchical Adaptive Zero-shot Voice Style Transfer

Despite rapid progress in the voice style transfer (VST) field, recent z...
research
03/19/2023

StyleRF: Zero-shot 3D Style Transfer of Neural Radiance Fields

3D style transfer aims to render stylized novel views of a 3D scene with...
research
09/03/2019

Face-to-Parameter Translation for Game Character Auto-Creation

Character customization system is an important component in Role-Playing...
