A deep representation learning speech enhancement method using β-VAE

05/11/2022
by Yang Xiang, et al.

In previous work, we proposed a variational autoencoder (VAE)-based Bayesian permutation training speech enhancement (SE) method (PVAE), which showed that deep representation learning (DRL) can improve the SE performance of traditional deep neural network (DNN)-based methods. Building on that work, this paper proposes using β-VAE to further improve PVAE's representation learning ability. More specifically, our β-VAE improves PVAE's capacity to disentangle different latent variables from the observed signal without incurring the trade-off between disentanglement and signal reconstruction that is common in previous β-VAE algorithms. Unlike those algorithms, the proposed β-VAE strategy can also be used to optimize the DNN's structure, so the proposed method not only improves PVAE's SE performance but also reduces the number of PVAE training parameters. Experimental results show that the proposed method acquires better speech and noise latent representations than PVAE, and that it also obtains higher scale-invariant signal-to-distortion ratio, speech quality, and speech intelligibility.
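
For readers unfamiliar with the trade-off mentioned above, the following is a minimal sketch of the commonly used generic β-VAE objective: a reconstruction term plus a KL term weighted by β. It is not the strategy proposed in this paper, which the abstract says differs from earlier β-VAE algorithms. The function names, the squared-error reconstruction term, the diagonal-Gaussian posterior with a standard normal prior, and the default `beta=4.0` are all assumptions made for illustration.

```python
import math


def gaussian_kl(mu, logvar):
    """KL( N(mu, diag(exp(logvar))) || N(0, I) ), summed over latent dims.

    Assumes a diagonal-Gaussian posterior and a standard normal prior
    (an illustrative assumption, not taken from the paper).
    """
    return 0.5 * sum(
        m * m + math.exp(lv) - 1.0 - lv for m, lv in zip(mu, logvar)
    )


def beta_vae_loss(x, x_hat, mu, logvar, beta=4.0):
    """Generic beta-VAE loss: reconstruction error + beta * KL.

    The squared-error reconstruction term and beta=4.0 are hypothetical
    choices for this sketch.
    """
    recon = sum((a - b) ** 2 for a, b in zip(x, x_hat))
    return recon + beta * gaussian_kl(mu, logvar)


if __name__ == "__main__":
    # Toy values: one observed frame, its reconstruction, and a 2-D latent.
    x, x_hat = [1.0, 2.0], [0.5, 2.0]
    mu, logvar = [0.3, -0.1], [0.0, -0.2]
    for beta in (1.0, 4.0):
        print(beta, beta_vae_loss(x, x_hat, mu, logvar, beta=beta))
```

In this generic form, raising `beta` puts more weight on the KL term relative to the reconstruction term, which is the usual source of the disentanglement versus reconstruction trade-off that the abstract refers to.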


research
01/24/2022

A Bayesian Permutation training deep representation learning method for speech enhancement with variational autoencoder

Recently, variational autoencoder (VAE), a deep representation learning ...
research
11/16/2022

A Two-Stage Deep Representation Learning-Based Speech Enhancement Method Using Variational Autoencoder and Adversarial Training

This paper focuses on leveraging deep representation learning (DRL) for ...
research
06/23/2021

Unsupervised Speech Enhancement using Dynamical Variational Auto-Encoders

Dynamical variational auto-encoders (DVAEs) are a class of deep generati...
research
05/16/2020

Improved Prosody from Learned F0 Codebook Representations for VQ-VAE Speech Waveform Reconstruction

Vector Quantized Variational AutoEncoders (VQ-VAE) are a powerful repres...
research
12/23/2019

Mixture of Inference Networks for VAE-based Audio-visual Speech Enhancement

In this paper, we are interested in unsupervised speech enhancement usin...
research
01/25/2019

Unsupervised speech representation learning using WaveNet autoencoders

We consider the task of unsupervised extraction of meaningful latent rep...
research
10/13/2021

DeepA: A Deep Neural Analyzer For Speech And Singing Vocoding

Conventional vocoders are commonly used as analysis tools to provide int...
