Speaker Reinforcement Using Target Source Extraction for Robust Automatic Speech Recognition

05/09/2022
by Catalin Zorila, et al.

Improving the accuracy of single-channel automatic speech recognition (ASR) in noisy conditions is challenging. Strong speech enhancement front-ends are available; however, they typically require the ASR model to be retrained to cope with the processing artifacts. In this paper, we explore a speaker reinforcement strategy for improving recognition performance without retraining the acoustic model (AM). This is achieved by remixing the enhanced signal with the unprocessed input to alleviate the processing artifacts. We evaluate the proposed approach using a DNN speaker extraction based speech denoiser trained with a perceptually motivated loss function. Results show that (without AM retraining) our method yields about 23 compared with the unprocessed for the monaural simulated and real CHiME-4 evaluation sets, respectively, and outperforms a state-of-the-art reference method.
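
The remixing idea can be illustrated with a minimal sketch. The abstract does not state the mixing rule, so the code below is an assumption rather than the authors' method: it adds a scaled copy of the unprocessed input back to the enhanced signal so that the power ratio between the two components equals a chosen value in dB. The names `mixing_gain`, `remix`, and `ratio_db` are hypothetical.

```python
import numpy as np


def mixing_gain(noisy, enhanced, ratio_db, eps=1e-12):
    """Gain applied to the unprocessed input (assumed rule, not from the paper).

    Chosen so that power(enhanced) / power(gain * noisy) equals ratio_db in dB.
    """
    p_noisy = np.mean(np.square(noisy))
    p_enh = np.mean(np.square(enhanced))
    # eps guards against division by zero for an all-zero input.
    return np.sqrt(p_enh / (p_noisy * 10.0 ** (ratio_db / 10.0) + eps))


def remix(noisy, enhanced, ratio_db=10.0):
    """Remix the enhanced signal with the unprocessed input."""
    noisy = np.asarray(noisy, dtype=np.float64)
    enhanced = np.asarray(enhanced, dtype=np.float64)
    if noisy.shape != enhanced.shape:
        raise ValueError("noisy and enhanced must have the same shape")
    gain = mixing_gain(noisy, enhanced, ratio_db)
    # The enhanced signal dominates; the attenuated input masks artifacts.
    return enhanced + gain * noisy


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    noisy = rng.standard_normal(16000)           # stand-in for a noisy recording
    enhanced = 0.5 * rng.standard_normal(16000)  # stand-in for the denoiser output
    mixed = remix(noisy, enhanced, ratio_db=10.0)
    print(mixed.shape)
```

In this sketch a larger `ratio_db` keeps the output closer to the enhanced signal, while a smaller value restores more of the unprocessed input. The remixed waveform would then be passed to the unchanged recognizer.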


