Invertible Frowns: Video-to-Video Facial Emotion Translation

09/16/2021
by   Ian Magnusson, et al.

We present Wav2Lip-Emotion, a video-to-video translation architecture that modifies facial expressions of emotion in videos of speakers. Previous work modifies emotion in images, uses a single image to produce a video with animated emotion, or puppets facial expressions in videos with landmarks from a reference video. However, many use cases, such as modifying an actor's performance in post-production, coaching individuals to be more animated speakers, or touching up emotion in a teleconference, require a video-to-video translation approach. We explore a method to maintain speakers' lip movements, identity, and pose while translating their expressed emotion. Our approach extends an existing multi-modal lip synchronization architecture to modify the speaker's emotion using L1 reconstruction and pre-trained emotion objectives. We also propose a novel automated emotion evaluation approach and corroborate it with a user study. Both evaluations find that we succeed in modifying emotion while maintaining lip synchronization. Visual quality is somewhat diminished, and model variants exhibit a trade-off between the degree of emotion modification and visual quality. Nevertheless, we demonstrate (1) that facial expressions of emotion can be modified with nothing other than L1 reconstruction and pre-trained emotion objectives and (2) that our automated emotion evaluation approach aligns with human judgements.
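The abstract describes training with an L1 reconstruction term alongside pre-trained emotion objectives on top of a lip-synchronization architecture. A minimal sketch of how such terms might be combined into one generator objective is below; the function name, weight values, and loss names are illustrative assumptions, not details taken from the paper.

```python
# Hypothetical sketch of a combined training objective: an L1 reconstruction
# term, a lip-sync term, and a pre-trained emotion classifier term.
# All weights here are illustrative placeholders, not the paper's values.

def combined_loss(recon_l1: float, sync_loss: float, emotion_loss: float,
                  w_recon: float = 1.0, w_sync: float = 0.03,
                  w_emotion: float = 0.1) -> float:
    """Weighted sum of generator objectives (weights are assumptions)."""
    return w_recon * recon_l1 + w_sync * sync_loss + w_emotion * emotion_loss

# Example with made-up per-batch scalar losses:
total = combined_loss(recon_l1=0.25, sync_loss=0.8, emotion_loss=0.5)
```

In practice each scalar would come from a differentiable loss (e.g. a pixel-wise L1, a sync discriminator, and a frozen emotion classifier), and the weighted sum would be backpropagated through the generator only.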

