Deep Xi as a Front-End for Robust Automatic Speech Recognition

06/18/2019
by Aaron Nicolson, et al.

Front-end techniques for robust automatic speech recognition (ASR) have been dominated by masking- and mapping-based deep learning approaches to speech enhancement. Previously, minimum mean-square error (MMSE) approaches to speech enhancement using Deep Xi (a deep learning approach to a priori SNR estimation) achieved higher quality and intelligibility scores than recent masking- and mapping-based deep learning approaches. Given this speech enhancement performance, we investigate the use of Deep Xi as a front-end for robust ASR. Deep Xi is evaluated using real-world non-stationary and coloured noise sources at multiple SNR levels. Deep Xi achieved a relative word error rate reduction of 23.2% compared with a deep learning-based front-end. The results presented in this work show that Deep Xi is a viable front-end and is able to significantly increase the robustness of an ASR system. Availability: Deep Xi is available at https://github.com/anicolson/DeepXi
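
To make two quantities in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It assumes the Wiener filter gain as a stand-in for an a priori SNR-driven gain function (the abstract does not say which MMSE estimator is used), and it uses the standard definition of relative word error rate (WER) reduction. The function names and all numeric inputs are hypothetical and chosen only for illustration.

```python
def db_to_linear(snr_db):
    """Convert an SNR in dB to a linear power ratio."""
    return 10.0 ** (snr_db / 10.0)


def wiener_gain(xi):
    """Wiener filter gain for a linear-scale a priori SNR `xi`.

    Assumption: used here as a simple example of a gain function
    driven by the a priori SNR; the source does not name this estimator.
    """
    if xi < 0.0:
        raise ValueError("a priori SNR must be non-negative on a linear scale")
    return xi / (1.0 + xi)


def relative_wer_reduction(wer_baseline, wer_system):
    """Relative WER reduction, in percent, of a system over a baseline."""
    if wer_baseline <= 0.0:
        raise ValueError("baseline WER must be positive")
    return 100.0 * (wer_baseline - wer_system) / wer_baseline


# Hypothetical inputs for illustration only (not values from the paper).
gain_at_0_db = wiener_gain(db_to_linear(0.0))
example_reduction = relative_wer_reduction(20.0, 15.0)
```

In this sketch, a higher estimated a priori SNR gives a gain closer to 1 (the component is kept mostly intact), while a lower one gives a gain closer to 0 (the component is suppressed). The relative WER reduction is computed against the baseline WER, so it is a ratio rather than an absolute difference in percentage points.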

research
10/24/2022

Time-Domain Speech Enhancement for Robust Automatic Speech Recognition

It has been shown that the intelligibility of noisy speech can be improv...
research
03/09/2020

Improving noise robust automatic speech recognition with single-channel time-domain enhancement network

With the advent of deep learning, research on noise-robust automatic spe...
research
09/02/2015

Enhancement and Recognition of Reverberant and Noisy Speech by Extending Its Coherence

Most speech enhancement algorithms make use of the short-time Fourier tr...
research
05/30/2017

Deep Learning for Environmentally Robust Speech Recognition: An Overview of Recent Developments

Eliminating the negative effect of non-stationary environmental noise is...
research
06/22/2016

A Curriculum Learning Method for Improved Noise Robustness in Automatic Speech Recognition

The performance of automatic speech recognition systems under noisy envi...
research
11/28/2018

Acoustics-guided evaluation (AGE): a new measure for estimating performance of speech enhancement algorithms for robust ASR

One challenging problem of robust automatic speech recognition (ASR) is ...
research
11/23/2021

Effect of noise suppression losses on speech distortion and ASR performance

Deep learning based speech enhancement has made rapid development toward...
