Prediction of head motion from speech waveforms with a canonical-correlation-constrained autoencoder

02/05/2020
by JinHong Lu, et al.

This study investigates the direct use of speech waveforms to predict head motion for speech-driven head-motion synthesis, whereas the literature commonly uses spectral features such as MFCC as basic input features, together with additional features such as energy and F0. We claim that, rather than combining different features that all originate from waveforms, it is more effective to use the waveforms directly to predict the corresponding head motion. The challenge with the waveform-based approach is that waveforms contain a large amount of information irrelevant to predicting head motion, which hinders the training of neural networks. To overcome this problem, we propose a canonical-correlation-constrained autoencoder (CCCAE), in which hidden layers are trained not only to minimise the reconstruction error but also to maximise the canonical correlation with head motion. Compared with an MFCC-based system, the proposed system shows comparable performance in objective evaluation and better performance in subjective evaluation.
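The two training goals named in the abstract, low reconstruction error and high canonical correlation between hidden features and head motion, can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names, the weighting factor `alpha`, the ridge term `eps`, and the choice to sum all canonical correlations are assumptions made for illustration only, since the abstract does not specify how the objective is formulated.

```python
import numpy as np

def inv_sqrt(mat):
    # Inverse square root of a symmetric positive-definite matrix.
    vals, vecs = np.linalg.eigh(mat)
    return vecs @ np.diag(vals ** -0.5) @ vecs.T

def canonical_correlations(h, y, eps=1e-6):
    # h: (n, d_h) hidden features; y: (n, d_y) head-motion parameters.
    # eps is a small ridge term (an assumption) that keeps the covariances invertible.
    n = h.shape[0]
    hc = h - h.mean(axis=0)
    yc = y - y.mean(axis=0)
    s_hh = hc.T @ hc / (n - 1) + eps * np.eye(h.shape[1])
    s_yy = yc.T @ yc / (n - 1) + eps * np.eye(y.shape[1])
    s_hy = hc.T @ yc / (n - 1)
    # Singular values of the whitened cross-covariance are the canonical correlations.
    t = inv_sqrt(s_hh) @ s_hy @ inv_sqrt(s_yy)
    return np.linalg.svd(t, compute_uv=False)

def cccae_style_loss(x, x_hat, h, y, alpha=1.0):
    # Reconstruction error minus a weighted correlation term (hypothetical weighting alpha).
    recon = np.mean((x - x_hat) ** 2)
    corr = canonical_correlations(h, y).sum()
    return recon - alpha * corr

# Toy data: head motion that is an exact linear function of the hidden features.
rng = np.random.default_rng(0)
h = rng.standard_normal((200, 3))
mix = np.array([[1.0, 0.5, 0.0], [0.0, 1.0, 0.5], [0.5, 0.0, 1.0]])
y = h @ mix
x = rng.standard_normal((200, 8))
print(canonical_correlations(h, y))
print(cccae_style_loss(x, x, h, y))
```

Minimising a loss of this shape pushes the hidden layer towards features that both preserve the input and vary together with head motion. In a real training setup the same quantities would be computed in an automatic-differentiation framework so that gradients can flow through the correlation term.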


