Realistic Speech-Driven Facial Animation with GANs

06/14/2019
by   Konstantinos Vougioukas, et al.
4

Speech-driven facial animation is the process that automatically synthesizes talking characters based on speech signals. The majority of work in this domain creates a mapping from audio features to visual features. This approach often requires post-processing using computer graphics techniques to produce realistic albeit subject dependent results. We present an end-to-end system that generates videos of a talking head, using only a still image of a person and an audio clip containing speech, without relying on handcrafted intermediate features. Our method generates videos which have (a) lip movements that are in sync with the audio and (b) natural facial expressions such as blinks and eyebrow movements. Our temporal GAN uses 3 discriminators focused on achieving detailed frames, audio-visual synchronization, and realistic expressions. We quantify the contribution of each component in our model using an ablation study and we provide insights into the latent representation of the model. The generated videos are evaluated based on sharpness, reconstruction quality, lip-reading accuracy, synchronization as well as their ability to generate natural blinks.

READ FULL TEXT

page 4

page 6

page 8

page 9

page 11

page 12

page 13

page 14

research
05/23/2018

End-to-End Speech-Driven Facial Animation with Temporal GANs

Speech-driven facial animation is the process which uses speech signals ...
research
03/29/2023

Seeing What You Said: Talking Face Generation Guided by a Lip Reading Expert

Talking face generation, also known as speech-to-lip generation, reconst...
research
01/15/2023

Learning Audio-Driven Viseme Dynamics for 3D Face Animation

We present a novel audio-driven facial animation approach that can gener...
research
09/29/2022

Facial Landmark Predictions with Applications to Metaverse

This research aims to make metaverse characters more realistic by adding...
research
12/12/2019

Speech-driven facial animation using polynomial fusion of features

Speech-driven facial animation involves using a speech signal to generat...
research
05/27/2020

Modality Dropout for Improved Performance-driven Talking Faces

We describe our novel deep learning approach for driving animated faces ...
research
07/22/2022

Visual Speech-Aware Perceptual 3D Facial Expression Reconstruction from Videos

The recent state of the art on monocular 3D face reconstruction from ima...

Please sign up or login with your details

Forgot password? Click here to reset