Improving noise robust automatic speech recognition with single-channel time-domain enhancement network

03/09/2020
by Keisuke Kinoshita, et al.

With the advent of deep learning, research on noise-robust automatic speech recognition (ASR) has progressed rapidly. However, the ASR performance of single-channel systems in noisy conditions remains unsatisfactory. Indeed, most single-channel speech enhancement (SE) methods (denoising) have brought only limited performance gains over a state-of-the-art ASR back-end trained on multi-condition training data. Recently, there has been much research on neural network-based SE methods working in the time domain, showing levels of performance never attained before. However, it has not been established whether the high enhancement performance achieved by such time-domain approaches can be translated into ASR gains. In this paper, we show that a single-channel time-domain denoising approach can significantly improve ASR performance, providing more than 30% relative word error reduction over a strong ASR back-end on the real evaluation data of the single-channel track of the CHiME-4 dataset. These positive results demonstrate that single-channel noise reduction can still improve ASR performance, which should open the door to more research in that direction.

research
11/01/2021

SNRi Target Training for Joint Speech Enhancement and Recognition

This study aims to improve the performance of automatic speech recogniti...
research
08/26/2021

Cross-domain Single-channel Speech Enhancement Model with Bi-projection Fusion Module for Noise-robust ASR

In recent decades, many studies have suggested that phase information is...
research
07/25/2020

Robust Front-End for Multi-Channel ASR using Flow-Based Density Estimation

For multi-channel speech recognition, speech enhancement techniques such...
research
02/03/2022

The RoyalFlush System of Speech Recognition for M2MeT Challenge

This paper describes our RoyalFlush system for the track of multi-speake...
research
07/19/2022

ESPnet-SE++: Speech Enhancement for Robust Speech Recognition, Translation, and Understanding

This paper presents recent progress on integrating speech separation and...
research
07/04/2021

TENET: A Time-reversal Enhancement Network for Noise-robust ASR

Due to the unprecedented breakthroughs brought about by deep learning, s...
research
06/18/2019

Deep Xi as a Front-End for Robust Automatic Speech Recognition

Front-end techniques for robust automatic speech recognition (ASR) have ...
