Learning robust speech representation with an articulatory-regularized variational autoencoder

04/07/2021
by   Marc-Antoine Georges, et al.

It is increasingly thought that human speech perception and production both rely on articulatory representations. In this paper, we investigate whether this type of representation could improve the performance of a deep generative model (here a variational autoencoder) trained to encode and decode acoustic speech features. First, we develop an articulatory model that associates articulatory parameters describing the jaw, tongue, lip and velum configurations with vocal tract shapes and spectral features. Then we incorporate these articulatory parameters into a variational autoencoder applied to spectral features, using a regularization technique that constrains part of the latent space to follow articulatory trajectories. We show that this articulatory constraint improves model training by decreasing both the time to convergence and the reconstruction loss at convergence, and yields better performance in a speech denoising task.
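
The regularization idea in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact loss, so everything here is an assumption: a squared-error reconstruction term, the standard Gaussian KL term, and a squared-error penalty that pulls the first `n_artic` latent means toward the articulatory parameters. The function name `articulatory_vae_loss` and the weights `beta` and `lam` are hypothetical and are not taken from the paper.

```python
import numpy as np

def articulatory_vae_loss(x, x_hat, mu, log_var, artic, n_artic, beta=1.0, lam=1.0):
    """Hypothetical articulatory-regularized VAE loss (illustrative only).

    x, x_hat : (batch, n_features) spectral features and their reconstruction
    mu, log_var : (batch, n_latent) parameters of the approximate posterior
    artic : (batch, n_artic) articulatory parameters for the same frames
    """
    # Reconstruction error: squared error summed over features, averaged over the batch.
    recon = np.mean(np.sum((x - x_hat) ** 2, axis=1))
    # KL divergence between N(mu, diag(exp(log_var))) and the prior N(0, I).
    kl = np.mean(-0.5 * np.sum(1.0 + log_var - mu ** 2 - np.exp(log_var), axis=1))
    # Articulatory penalty (assumed form): only the first n_artic latent
    # dimensions are constrained; the remaining dimensions stay free.
    artic_pen = np.mean(np.sum((mu[:, :n_artic] - artic) ** 2, axis=1))
    total = recon + beta * kl + lam * artic_pen
    return total, recon, kl, artic_pen

# Usage with random placeholder data (not real speech features).
rng = np.random.default_rng(0)
x = rng.normal(size=(8, 20))
x_hat = x + 0.1 * rng.normal(size=(8, 20))
mu = rng.normal(size=(8, 6))
log_var = np.zeros((8, 6))
artic = rng.normal(size=(8, 4))
total, recon, kl, artic_pen = articulatory_vae_loss(x, x_hat, mu, log_var, artic, n_artic=4)
```

In this sketch, setting `lam` to zero recovers an ordinary VAE objective, which makes it easy to compare training with and without the articulatory constraint.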


research
04/14/2022

Learning and controlling the source-filter representation of speech with a variational autoencoder

Understanding and controlling latent representations in deep generative ...
research
02/12/2021

Guided Variational Autoencoder for Speech Enhancement With a Supervised Classifier

Recently, variational autoencoders have been successfully used to learn ...
research
08/25/2023

Expressive paragraph text-to-speech synthesis with multi-step variational autoencoder

Neural networks have been able to generate high-quality single-sentence ...
research
01/13/2019

Modeling neural dynamics during speech production using a state space variational autoencoder

Characterizing the neural encoding of behavior remains a challenging tas...
research
02/06/2022

Enhancing variational generation through self-decomposition

In this article we introduce the notion of Split Variational Autoencoder...
research
04/05/2019

Combining Sentiment Lexica with a Multi-View Variational Autoencoder

When assigning quantitative labels to a dataset, different methodologies...
research
08/10/2021

Regularized Sequential Latent Variable Models with Adversarial Neural Networks

The recurrent neural networks (RNN) with richly distributed internal sta...
