Deep Learning Based Dereverberation of Temporal Envelopes for Robust Speech Recognition

08/07/2020
by   Anurenjan Purushothaman, et al.

Automatic speech recognition in reverberant conditions is a challenging task because the long-term envelopes of reverberant speech are temporally smeared. In this paper, we propose a neural model for enhancing sub-band temporal envelopes in order to dereverberate speech. The temporal envelopes are derived using the autoregressive modeling framework of frequency domain linear prediction (FDLP). The proposed neural enhancement model performs an envelope-gain-based enhancement of the temporal envelopes and consists of a series of convolutional and recurrent neural network layers. The enhanced sub-band envelopes are used to generate features for automatic speech recognition (ASR). The ASR experiments are performed on the REVERB challenge dataset as well as the CHiME-3 dataset. In these experiments, the proposed neural enhancement approach provides significant improvements over a baseline ASR system with beamformed audio (average relative improvements of 21% on the development set and about 11% on the evaluation set of the REVERB challenge dataset).
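To make the FDLP step concrete, the following is a minimal sketch of how an autoregressive temporal envelope can be estimated, not the authors' implementation. It assumes the standard FDLP recipe: take the DCT of a signal segment, fit a linear predictor to the DCT coefficients with the autocorrelation method, and read the all-pole model's power response as an envelope over time. The function name `fdlp_envelope`, the model order of 20, and the full-band processing (no sub-band split) are illustrative assumptions; the neural gain-based enhancement is not shown.

```python
import numpy as np
from scipy.fft import dct
from scipy.linalg import solve_toeplitz


def fdlp_envelope(x, order=20, n_points=None):
    """Estimate a smooth temporal envelope of x with FDLP (illustrative sketch).

    `order` and `n_points` are hypothetical parameters chosen for this example.
    """
    x = np.asarray(x, dtype=float)
    n = len(x)
    n_points = n_points or n

    # Step 1: move to the DCT domain. An impulse at time n0 becomes a cosine
    # over the DCT index, so time plays the role that frequency plays in
    # ordinary (time-domain) linear prediction.
    c = dct(x, type=2, norm="ortho")

    # Step 2: autocorrelation of the DCT sequence for lags 0..order.
    r = np.array([np.dot(c[: n - k], c[k:]) for k in range(order + 1)])
    r[0] *= 1.0 + 1e-9  # tiny diagonal loading for numerical stability

    # Step 3: solve the Yule-Walker (Toeplitz) equations for the predictor.
    a = solve_toeplitz(r[:order], r[1 : order + 1])
    gain = r[0] - np.dot(a, r[1 : order + 1])  # prediction error power

    # Step 4: evaluate gain / |A|^2 on a grid of angles in (0, pi); the angle
    # pi * (t + 0.5) / n corresponds to time sample t under the DCT-II.
    poly = np.concatenate(([1.0], -a))
    omega = np.pi * (np.arange(n_points) + 0.5) / n_points
    response = np.exp(-1j * np.outer(omega, np.arange(order + 1))) @ poly
    return gain / np.abs(response) ** 2
```

Calling `fdlp_envelope(segment, order=20)` on a segment returns one envelope value per input sample. For sub-band envelopes, one would typically window the DCT coefficients into bands before the linear prediction step; the band layout is left out here because the abstract does not specify it.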

research
08/12/2021

Dereverberation of Autoregressive Envelopes for Far-field Speech Recognition

The task of speech recognition in far-field environments is adversely af...
research
11/13/2019

3-D Feature and Acoustic Modeling for Far-Field Speech Recognition

Automatic speech recognition in multi-channel reverberant conditions is ...
research
02/07/2018

Learning from Past Mistakes: Improving Automatic Speech Recognition Output via Noisy-Clean Phrase Context Modeling

Automatic speech recognition (ASR) systems lack joint optimization durin...
research
09/06/2020

Non causal deep learning based dereverberation

In this paper we demonstrate the effectiveness of non-causal context for...
research
08/01/2016

Blind phoneme segmentation with temporal prediction errors

Phonemic segmentation of speech is a critical step of speech recognition...
research
03/25/2021

Radically Old Way of Computing Spectra: Applications in End-to-End ASR

We propose a technique to compute spectrograms using Frequency Domain Li...
research
02/04/2022

Polyphonic pitch detection with convolutional recurrent neural networks

Recent directions in automatic speech recognition (ASR) research have sh...
