Reinforcement Learning Based Speech Enhancement for Robust Speech Recognition

11/10/2018
by   Yih-Liang Shen, et al.

Conventional deep neural network (DNN)-based speech enhancement (SE) approaches aim to minimize the mean square error (MSE) between the enhanced speech and the clean reference. An MSE-optimized model may not directly improve the performance of an automatic speech recognition (ASR) system. If the goal is to minimize the recognition error, the recognition results should be used to design the objective function for optimizing the SE model. However, an ASR system consists of multiple units, such as acoustic and language models, so its structure is usually complex and not differentiable. In this study, we propose adopting a reinforcement learning algorithm to optimize the SE model based on the recognition results. We evaluated the proposed SE system on the Mandarin Chinese broadcast news corpus (MATBN). Experimental results demonstrate that the proposed method can effectively improve the ASR results with a notable 12.40 ratio at 0 dB and 5 dB conditions, respectively.
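To illustrate why reinforcement learning suits a non-differentiable recognizer, here is a minimal sketch of a generic policy-gradient (REINFORCE-style) update driven only by a scalar score from a black box. It is not the authors' algorithm, which the abstract does not specify. Every name in it (`black_box_error_rate`, `train_policy`, the four candidate "enhancement actions" and their error rates) is a hypothetical stand-in invented for this example.

```python
import math
import random


def softmax(prefs):
    """Convert action preferences into a probability distribution."""
    m = max(prefs)
    exps = [math.exp(p - m) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]


def black_box_error_rate(action, rng):
    """Hypothetical stand-in for an ASR system.

    Returns a noisy error rate for the chosen enhancement action. The values
    are made up for illustration; no gradient is available from this function,
    which mimics a non-differentiable recognizer.
    """
    base = [0.60, 0.45, 0.30, 0.50][action]
    return min(1.0, max(0.0, base + rng.uniform(-0.05, 0.05)))


def train_policy(num_actions=4, steps=3000, lr=0.2, seed=0):
    """Learn which action minimizes the black-box error rate."""
    rng = random.Random(seed)
    prefs = [0.0] * num_actions  # policy parameters (one preference per action)
    baseline = 0.0               # running mean reward, used to reduce variance
    for step in range(1, steps + 1):
        probs = softmax(prefs)
        action = rng.choices(range(num_actions), weights=probs)[0]
        # Reward is the negative error rate: fewer recognition errors is better.
        reward = -black_box_error_rate(action, rng)
        baseline += (reward - baseline) / step
        advantage = reward - baseline
        # Gradient of log softmax w.r.t. each preference: onehot(action) - probs.
        for a in range(num_actions):
            grad = (1.0 if a == action else 0.0) - probs[a]
            prefs[a] += lr * advantage * grad
    return softmax(prefs)


if __name__ == "__main__":
    final_probs = train_policy()
    print([round(p, 3) for p in final_probs])
```

The update only needs the score returned by the recognizer, not its gradient, which is the property that makes this family of methods applicable when the ASR pipeline cannot be differentiated. A real system would replace the discrete toy actions with the parameters or outputs of an SE model.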

