Speech denoising by parametric resynthesis

04/02/2019
by Soumi Maiti, et al.

This work proposes the use of clean speech vocoder parameters as the target for a neural network performing speech enhancement. These parameters were designed for text-to-speech synthesis so that they both produce high-quality resyntheses and are straightforward to model with neural networks, but they have not been used in speech enhancement until now. Compared with a matched text-to-speech system that is given the ground-truth transcripts of the noisy speech, our model produces more natural speech because it has access to the true prosody in the noisy speech. Compared with two denoising systems, the oracle Wiener mask and a DNN-based mask predictor, our model equals the oracle Wiener mask in subjective quality and intelligibility and surpasses the DNN-based mask predictor, which is the realistic system of the two. A vocoder-based upper bound shows that there is still room for improvement with this approach beyond the oracle Wiener mask. We test speaker dependence with two speakers and show that a single model can be used for multiple speakers.
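For readers unfamiliar with the oracle Wiener mask used as a baseline above, the following is a minimal sketch of its usual textbook form, in which each time-frequency bin is scaled by the ratio of clean speech power to total power. It is not taken from the paper: the function names, the `eps` constant, and the toy values are all assumptions made for illustration, and the authors' exact formulation may differ. The mask is called an oracle because it requires the clean speech and noise powers, which are only available in controlled experiments.

```python
def oracle_wiener_mask(speech_power, noise_power, eps=1e-12):
    """Per-bin gain |S|^2 / (|S|^2 + |N|^2) for one spectral frame.

    speech_power and noise_power are equal-length lists of non-negative
    power values. eps (an assumed constant) avoids division by zero.
    """
    return [s / (s + n + eps) for s, n in zip(speech_power, noise_power)]


def apply_mask(mask, noisy_magnitude):
    """Scale each noisy magnitude bin by its mask gain."""
    return [m * y for m, y in zip(mask, noisy_magnitude)]


# Toy frame with three frequency bins (hypothetical values):
# bin 0 is speech only, bin 1 is an equal mix, bin 2 is noise only.
speech_power = [4.0, 1.0, 0.0]
noise_power = [0.0, 1.0, 2.0]
mask = oracle_wiener_mask(speech_power, noise_power)
enhanced = apply_mask(mask, [2.0, 1.5, 1.4])
```

In this toy frame the gains are close to 1, 0.5, and 0, so the speech-only bin passes through, the mixed bin is attenuated, and the noise-only bin is suppressed.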


research
11/14/2019

Speaker independence of neural vocoders and their effect on parametric resynthesis speech enhancement

Traditional speech enhancement systems produce speech with compromised q...
research
06/16/2019

Parametric Resynthesis with neural vocoders

Noise suppression systems generally produce output speech with compromise...
research
05/16/2020

Sparse Mixture of Local Experts for Efficient Speech Enhancement

In this paper, we investigate a deep learning approach for speech denois...
research
06/02/2020

Dilated U-net based approach for multichannel speech enhancement from First-Order Ambisonics recordings

We present a CNN architecture for speech enhancement from multichannel f...
research
02/11/2021

Speech enhancement with mixture-of-deep-experts with clean clustering pre-training

In this study we present a mixture of deep experts (MoDE) neural-network...
research
03/31/2022

Subjective intelligibility of speech sounds enhanced by ideal ratio mask via crowdsourced remote experiments with effective data screening

It is essential to perform speech intelligibility (SI) experiments with ...
research
11/06/2018

Kernel Machines Beat Deep Neural Networks on Mask-based Single-channel Speech Enhancement

We apply a fast kernel method for mask-based single-channel speech enhan...
