DeepAI AI Chat
Log In Sign Up

Constrained Convolutional-Recurrent Networks to Improve Speech Quality with Low Impact on Recognition Accuracy

02/16/2018
by   Rasool Fakoor, et al.
The University of Texas at Arlington
Microsoft
0

For a speech-enhancement algorithm, it is highly desirable to simultaneously improve perceptual quality and recognition rate. Thanks to computational costs and model complexities, it is challenging to train a model that effectively optimizes both metrics at the same time. In this paper, we propose a method for speech enhancement that combines local and global contextual structures information through convolutional-recurrent neural networks that improves perceptual quality. At the same time, we introduce a new constraint on the objective function using a language model/decoder that limits the impact on recognition rate. Based on experiments conducted with real user data, we demonstrate that our new context-augmented machine-learning approach for speech enhancement improves PESQ and WER by an additional 24.5 respectively, when compared to the best-performing methods in the literature.

READ FULL TEXT

page 1

page 2

page 3

page 4

05/20/2022

NeuralEcho: A Self-Attentive Recurrent Neural Network For Unified Acoustic Echo Suppression And Speech Enhancement

Acoustic echo cancellation (AEC) plays an important role in the full-dup...
04/08/2021

Phoneme-based Distribution Regularization for Speech Enhancement

Existing speech enhancement methods mainly separate speech from noises a...
08/24/2020

AMRConvNet: AMR-Coded Speech Enhancement Using Convolutional Neural Networks

Speech is converted to digital signals using speech coding for efficient...
05/02/2018

Convolutional-Recurrent Neural Networks for Speech Enhancement

We propose an end-to-end model based on convolutional and recurrent neur...
05/20/2020

TinyLSTMs: Efficient Neural Speech Enhancement for Hearing Aids

Modern speech enhancement algorithms achieve remarkable noise suppressio...
10/24/2022

TridentSE: Guiding Speech Enhancement with 32 Global Tokens

In this paper, we present TridentSE, a novel architecture for speech enh...
01/22/2021

Towards efficient models for real-time deep noise suppression

With recent research advancements, deep learning models are becoming att...