DeepAI AI Chat
Log In Sign Up

Parametric Resynthesis with neural vocoders

06/16/2019
by   Soumi Maiti, et al.
CUNY Law School
0

Noise suppression systems generally produce output speech with copromised quality. We propose to utilize the high quality speech generation capability of neural vocoders for noise suppression. We use a neural network to predict clean mel-spectrogram features from noisy speech and then compare two neural vocoders, WaveNet and WaveGlow, for synthesizing clean speech from the predicted mel spectrogram. Both WaveNet and WaveGlow achieve better subjective and objective quality scores than the source separation model Chimera++. Further, WaveNet and WaveGlow also achieve significantly better subjective quality ratings than the oracle Wiener mask. Moreover, we observe that between WaveNet and WaveGlow, WaveNet achieves the best subjective quality scores, although at the cost of much slower waveform generation.

READ FULL TEXT

page 1

page 2

page 3

page 4

04/02/2019

Speech denoising by parametric resynthesis

This work proposes the use of clean speech vocoder parameters as the tar...
10/05/2021

DNSMOS P.835: A Non-Intrusive Perceptual Objective Speech Quality Metric to Evaluate Noise Suppressors

Human subjective evaluation is the gold standard to evaluate speech qual...
02/12/2021

Enhancing into the codec: Noise Robust Speech Coding with Vector-Quantized Autoencoders

Audio codecs based on discretized neural autoencoders have recently been...
06/27/2022

Wideband Audio Waveform Evaluation Networks: Efficient, Accurate Estimation of Speech Qualities

Wideband Audio Waveform Evaluation Networks (WAWEnets) are convolutional...
12/03/2020

Individually amplified text-to-speech

Text-to-speech (TTS) offers the opportunity to compensate for a hearing ...
02/23/2021

Handling Background Noise in Neural Speech Generation

Recent advances in neural-network based generative modeling of speech ha...
07/31/2020

A Pyramid Recurrent Network for Predicting Crowdsourced Speech-Quality Ratings of Real-World Signals

The real-world capabilities of objective speech quality measures are lim...