Transformer-S2A: Robust and Efficient Speech-to-Animation

11/18/2021
by Liyang Chen, et al.

We propose a novel, robust, and efficient Speech-to-Animation (S2A) approach for synchronized facial animation generation in human-computer interaction. Compared with conventional approaches, the proposed approach takes phonetic posteriorgrams (PPGs) of spoken phonemes as input to ensure cross-language and cross-speaker ability, and introduces the corresponding prosody features (i.e., pitch and energy) to further enhance the expressiveness of the generated animation. A mixture-of-experts (MOE)-based Transformer is employed to better model contextual information while providing a significant improvement in computational efficiency. Experiments demonstrate the effectiveness of the proposed approach in both objective and subjective evaluations, with a 17x inference speedup compared with the state-of-the-art approach.
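
To make the input representation and the mixture-of-experts idea concrete, the following is a minimal sketch, not the authors' implementation. It assumes a per-frame feature vector formed by concatenating a PPG with scalar pitch and energy values, and a top-1 routed feed-forward layer; the abstract does not specify the routing scheme, layer sizes, or feature dimensions, so every name and number below (`moe_feed_forward`, `n_phonemes`, `n_experts`, `d_hidden`, the random inputs) is hypothetical.

```python
import math
import random

def matvec(vec, mat):
    """Multiply a row vector (length n) by an n x m matrix stored as a list of rows."""
    return [sum(v * row[j] for v, row in zip(vec, mat)) for j in range(len(mat[0]))]

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def moe_feed_forward(frame, gate_w, experts):
    """Top-1 mixture-of-experts feed-forward layer for a single frame.

    Assumption: top-1 routing, so only one expert is evaluated per frame.
    experts is a list of (w1, w2) weight-matrix pairs.
    Returns (output vector, index of the chosen expert).
    """
    probs = softmax(matvec(frame, gate_w))
    k = max(range(len(probs)), key=probs.__getitem__)   # highest-scoring expert
    w1, w2 = experts[k]
    hidden = [max(h, 0.0) for h in matvec(frame, w1)]   # ReLU
    out = [probs[k] * o for o in matvec(hidden, w2)]    # scale by the gate value
    return out, k

def rand_matrix(rng, rows, cols, scale=0.1):
    """Small random weight matrix; stands in for trained parameters."""
    return [[rng.gauss(0.0, scale) for _ in range(cols)] for _ in range(rows)]

rng = random.Random(0)
n_frames, n_phonemes, n_experts, d_hidden = 20, 8, 4, 16  # hypothetical sizes

def fake_frame():
    """Synthetic frame: a PPG (sums to 1) followed by pitch and energy scalars."""
    raw = [rng.random() + 1e-6 for _ in range(n_phonemes)]
    total = sum(raw)
    ppg = [r / total for r in raw]
    pitch, energy = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
    return ppg + [pitch, energy]

frames = [fake_frame() for _ in range(n_frames)]
d_model = n_phonemes + 2

gate_w = rand_matrix(rng, d_model, n_experts)
experts = [
    (rand_matrix(rng, d_model, d_hidden), rand_matrix(rng, d_hidden, d_model))
    for _ in range(n_experts)
]

results = [moe_feed_forward(f, gate_w, experts) for f in frames]
outputs = [out for out, _ in results]
choices = [k for _, k in results]
```

Under this assumed routing, each frame passes through a single expert, so the per-frame cost of the layer stays roughly constant as `n_experts` grows. This illustrates one way an MOE layer can add capacity without a proportional increase in computation; whether it matches the mechanism behind the reported speedup is not stated in the abstract.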

research
06/08/2020

MultiSpeech: Multi-Speaker Text to Speech with Transformer

Transformer-based text to speech (TTS) model (e.g., Transformer TTS <cit...
research
09/12/2021

TEASEL: A Transformer-Based Speech-Prefixed Language Model

Multimodal language analysis is a burgeoning field of NLP that aims to s...
research
06/29/2022

iEmoTTS: Toward Robust Cross-Speaker Emotion Transfer and Control for Speech Synthesis based on Disentanglement between Prosody and Timbre

The capability of generating speech with specific type of emotion is des...
research
06/20/2020

Speaker Independent and Multilingual/Mixlingual Speech-Driven Talking Head Generation Using Phonetic Posteriorgrams

Generating 3D speech-driven talking head has received more and more atte...
research
11/07/2022

Accented Text-to-Speech Synthesis with a Conditional Variational Autoencoder

Accent plays a significant role in speech communication, influencing und...
research
08/21/2023

Can Language Models Learn to Listen?

We present a framework for generating appropriate facial responses from ...
research
01/03/2023

Modeling the Rhythm from Lyrics for Melody Generation of Pop Song

Creating a pop song melody according to pre-written lyrics is a typical ...
