A Supervised Speech Enhancement Approach with Residual Noise Control for Voice Communication

12/08/2019
by Andong Li, et al.

For voice communication, it is important to extract speech from its noisy version without introducing unnatural artificial noise. By studying the subband mean-squared error (MSE) of the speech in unsupervised speech enhancement approaches and revealing its relationship with existing loss functions for supervised approaches, this paper derives a generalized loss function for supervised approaches that takes residual noise control into account. The generalized loss function contains the well-known MSE loss function and many other commonly used loss functions as special cases. Compared with traditional loss functions, it is more flexible in trading off speech distortion against noise reduction, because a group of well-studied noise shaping schemes can be introduced to control residual noise in practical applications. Objective and subjective test results verify the importance of residual noise control for supervised speech enhancement.
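
To make the trade-off between speech distortion and noise reduction concrete, the following is a minimal sketch, not the loss derived in the paper. It assumes a real-valued per-subband gain applied to a noisy signal, and it splits the error into a speech distortion term and a residual noise term joined by a hypothetical weight `mu`. The function name `tradeoff_loss` and all of its parameters are illustrative assumptions. With `mu = 1` the sum matches the subband MSE in expectation when speech and noise are uncorrelated, which is one way an MSE loss can appear as a special case of a weighted form.

```python
def tradeoff_loss(gain, speech, noise, mu=1.0):
    """Illustrative weighted loss over subbands (an assumption, not the paper's formula).

    gain   -- per-subband gains, typically in [0, 1]
    speech -- per-subband clean-speech magnitudes
    noise  -- per-subband noise magnitudes
    mu     -- hypothetical weight on the residual noise term
    """
    if not (len(gain) == len(speech) == len(noise)) or not gain:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(gain)
    # Speech distortion: energy of the speech removed by the gain.
    distortion = sum(((1.0 - g) * s) ** 2 for g, s in zip(gain, speech)) / n
    # Residual noise: energy of the noise that passes through the gain.
    residual = sum((g * v) ** 2 for g, v in zip(gain, noise)) / n
    return distortion + mu * residual


# Two subbands: the first is passed unchanged, the second is attenuated by half.
gain = [1.0, 0.5]
speech = [2.0, 2.0]
noise = [1.0, 2.0]
print(tradeoff_loss(gain, speech, noise, mu=1.0))  # 1.5
print(tradeoff_loss(gain, speech, noise, mu=2.0))  # 2.5
```

In this sketch, raising `mu` penalizes residual noise more heavily and so favors stronger suppression at the cost of more speech distortion, while lowering it does the opposite.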


research
05/23/2019

A Perceptual Weighting Filter Loss for DNN Training in Speech Enhancement

Single-channel speech enhancement with deep neural networks (DNNs) has s...
research
01/28/2020

Weighted Speech Distortion Losses for Neural-network-based Real-time Speech Enhancement

This paper investigates several aspects of training a RNN (recurrent neu...
research
09/03/2019

On Loss Functions for Supervised Monaural Time-Domain Speech Enhancement

Many deep learning-based speech enhancement algorithms are designed to m...
research
07/29/2020

On Loss Functions and Recurrency Training for GAN-based Speech Enhancement Systems

Recent work has shown that it is feasible to use generative adversarial ...
research
08/14/2019

Components Loss for Neural Networks in Mask-Based Speech Enhancement

Estimating time-frequency domain masks for single-channel speech enhance...
research
11/08/2020

Frequency Gating: Improved Convolutional Neural Networks for Speech Enhancement in the Time-Frequency Domain

One of the strengths of traditional convolutional neural networks (CNNs)...
research
06/02/2023

Towards Robust FastSpeech 2 by Modelling Residual Multimodality

State-of-the-art non-autoregressive text-to-speech (TTS) models based on...
