Disentanglement Learning for Variational Autoencoders Applied to Audio-Visual Speech Enhancement

05/19/2021
by   Guillaume Carbajal, et al.
Recently, the standard variational autoencoder has been successfully used to learn a probabilistic prior over speech signals, which is then used to perform speech enhancement. Variational autoencoders have since been conditioned on a label describing a high-level speech attribute (e.g., speech activity), which allows more explicit control of speech generation. However, the label is not guaranteed to be disentangled from the other latent variables, which limits the performance improvement over the standard variational autoencoder. In this work, we propose an adversarial training scheme for variational autoencoders that disentangles the label from the other latent variables. During training, we use a discriminator that competes with the encoder of the variational autoencoder. Simultaneously, we use an additional encoder that estimates the label for the decoder of the variational autoencoder, which proves crucial for learning disentanglement. We show the benefit of the proposed disentanglement learning when a voice activity label, estimated from visual data, is used for speech enhancement.
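
The abstract does not give the training objectives, so the following is only a minimal sketch of how an adversarial scheme of this kind is commonly written, not the authors' formulation. It assumes a binary label `y`, a discriminator that outputs the probability `disc_prob` of the label given the latent code, and an additional encoder that outputs its own label estimate `label_est_prob`. The function names, the use of binary cross-entropy, and the weight `adv_weight` are all hypothetical choices made for illustration.

```python
import math


def bce(p, y, eps=1e-12):
    """Binary cross-entropy between a predicted probability p and a label y in {0, 1}."""
    p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
    return -(y * math.log(p) + (1 - y) * math.log(1.0 - p))


def gaussian_kl(mu, logvar):
    """KL divergence from N(mu, diag(exp(logvar))) to N(0, I), summed over dimensions."""
    return 0.5 * sum(math.exp(lv) + m * m - 1.0 - lv for m, lv in zip(mu, logvar))


def adversarial_vae_losses(recon_err, mu, logvar, disc_prob, label_est_prob, y,
                           adv_weight=1.0):
    """Return the three losses of an assumed adversarial disentanglement setup.

    recon_err      -- reconstruction error of the VAE decoder (any non-negative scalar)
    mu, logvar     -- parameters of the encoder's Gaussian posterior over the latent code
    disc_prob      -- discriminator's probability that y == 1, predicted from the latent code
    label_est_prob -- additional encoder's probability that y == 1
    y              -- ground-truth binary label (e.g. voice activity)
    adv_weight     -- hypothetical trade-off weight for the adversarial term
    """
    # The discriminator is trained to recover the label from the latent code.
    disc_loss = bce(disc_prob, y)
    # The additional encoder is trained to estimate the label fed to the decoder.
    label_enc_loss = bce(label_est_prob, y)
    # The VAE encoder competes with the discriminator: it is rewarded when the
    # discriminator fails, hence the minus sign on the adversarial term.
    vae_loss = recon_err + gaussian_kl(mu, logvar) - adv_weight * disc_loss
    return {
        "discriminator": disc_loss,
        "label_encoder": label_enc_loss,
        "vae": vae_loss,
    }


if __name__ == "__main__":
    losses = adversarial_vae_losses(
        recon_err=2.0, mu=[0.0, 0.0], logvar=[0.0, 0.0],
        disc_prob=0.5, label_est_prob=0.9, y=1,
    )
    for name, value in losses.items():
        print(f"{name}: {value:.4f}")
```

In a setup like this, the discriminator and the VAE would be updated in alternation, each minimizing its own loss. A discriminator stuck at chance level (`disc_prob` near 0.5) would indicate that the latent code carries little information about the label.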

research
02/12/2021

Guided Variational Autoencoder for Speech Enhancement With a Supervised Classifier

Recently, variational autoencoders have been successfully used to learn ...
research
10/24/2019

A Recurrent Variational Autoencoder for Speech Enhancement

This paper presents a generative approach to speech enhancement based on...
research
11/24/2017

Quantifying the Effects of Enforcing Disentanglement on Variational Autoencoders

The notion of disentangled autoencoders was proposed as an extension to ...
research
09/10/2022

Variational Autoencoder Kernel Interpretation and Selection for Classification

This work proposed kernel selection approaches for probabilistic classif...
research
11/16/2022

A Two-Stage Deep Representation Learning-Based Speech Enhancement Method Using Variational Autoencoder and Adversarial Training

This paper focuses on leveraging deep representation learning (DRL) for ...
research
12/30/2020

Infer-AVAE: An Attribute Inference Model Based on Adversarial Variational Autoencoder

Facing the sparsity of user attributes on social networks, attribute inf...
research
01/13/2019

Modeling neural dynamics during speech production using a state space variational autoencoder

Characterizing the neural encoding of behavior remains a challenging tas...
