Efficient Emotional Adaptation for Audio-Driven Talking-Head Generation

09/10/2023
by   Yuan Gan, et al.

Audio-driven talking-head synthesis is a popular research topic for virtual human-related applications. However, existing methods are inflexible and inefficient: they require expensive end-to-end training to transfer emotions from guidance videos to talking-head predictions. In this work, we propose the Emotional Adaptation for audio-driven Talking-head (EAT) method, which transforms emotion-agnostic talking-head models into emotion-controllable ones in a cost-effective and efficient manner through parameter-efficient adaptations. Our approach starts from a pretrained emotion-agnostic talking-head transformer and introduces three lightweight adaptations (Deep Emotional Prompts, an Emotional Deformation Network, and an Emotional Adaptation Module) that enable precise and realistic emotion control from different perspectives. Our experiments demonstrate that the approach achieves state-of-the-art performance on widely used benchmarks, including LRW and MEAD. Moreover, the parameter-efficient adaptations generalize remarkably well, even when emotional training videos are scarce or unavailable. Project website: https://yuangan.github.io/eat/
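The core idea of parameter-efficient adaptation described above can be illustrated with a minimal sketch (this is not the authors' code; the names, shapes, and the low-rank residual formulation are illustrative assumptions): a pretrained backbone is kept frozen, and only a small adapter branch is trained to inject emotion-specific behavior.

```python
import numpy as np

# Minimal sketch of parameter-efficient adaptation (illustrative, not EAT's
# actual architecture): a frozen emotion-agnostic backbone plus a small
# trainable low-rank adapter added as a residual.

rng = np.random.default_rng(0)
d, bottleneck = 8, 2

# Pretrained, frozen backbone weights (the emotion-agnostic model).
W_frozen = rng.normal(size=(d, d))

# Lightweight adapter: a low-rank bottleneck, the only trainable part.
# Zero-initialized so that training starts from the backbone's behavior.
A_down = np.zeros((d, bottleneck))
A_up = np.zeros((bottleneck, d))

def forward(x, A_down, A_up):
    base = x @ W_frozen               # frozen path, never updated
    emo = (x @ A_down) @ A_up         # trainable emotional residual
    return base + emo

x = rng.normal(size=(1, d))

# With zero-initialized adapters, the output equals the frozen backbone's.
assert np.allclose(forward(x, A_down, A_up), x @ W_frozen)

# The trainable parameter count is small relative to the backbone.
adapter_params = A_down.size + A_up.size   # 2 * d * bottleneck
backbone_params = W_frozen.size            # d * d
```

This mirrors the efficiency argument in the abstract: adapting to a new emotion only requires training the adapter parameters (here 32 values) rather than the full backbone (64 values), a gap that grows much larger at realistic model sizes.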


Related research:

- 04/15/2021: Audio-Driven Emotional Video Portraits
- 05/30/2022: EAMM: One-Shot Emotional Talking Face via Audio-Based Emotion-Aware Motion Model
- 04/25/2021: 3D-TalkEmo: Learning to Synthesize 3D Emotional Talking Head
- 09/21/2023: Emotion-Aware Prosodic Phrasing for Expressive Text-to-Speech
- 10/02/2019: Animating Face using Disentangled Audio Representations
- 03/01/2023: READ Avatars: Realistic Emotion-controllable Audio Driven Avatars
- 04/03/2020: Comparing emotional states induced by 360° videos via head-mounted display and computer screen
