Automatic context window composition for distant speech recognition

05/26/2018
by   Mirco Ravanelli, et al.

Deep learning is revolutionizing distant speech recognition, yielding systems that significantly outperform previous HMM-GMM approaches. A key factor behind the rapid rise and success of DNNs is their ability to better manage large time contexts. In this regard, asymmetric context windows that embed more past than future frames have recently been used with feed-forward neural networks. This context configuration turns out to be useful not only for low-latency speech recognition, but also for boosting recognition performance under reverberant conditions. This paper investigates the mechanisms inside DNNs that make asymmetric contexts effective. In particular, we propose a novel method for automatic context window composition based on a gradient analysis. The experiments, performed with different acoustic environments, features, DNN architectures, microphone settings, and recognition tasks, show that our simple and efficient strategy leads to a less redundant frame configuration, which makes DNN training more effective in reverberant scenarios.
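
To make the notion of an asymmetric context window concrete, here is a minimal sketch of frame splicing for a feed-forward network input. It is an illustration only and not the authors' implementation: the function name `splice_frames`, the parameters `n_past` and `n_future`, the edge padding by frame repetition, and the toy feature values are all assumptions. The gradient-based composition method proposed in the paper is not reproduced here.

```python
def splice_frames(frames, n_past, n_future):
    """Concatenate each frame with its neighbours into one input vector.

    frames   : list of per-frame feature vectors (lists of floats)
    n_past   : number of preceding frames to include (hypothetical name)
    n_future : number of following frames to include (hypothetical name)

    Assumption: utterance edges are padded by repeating the first or
    last frame, which is one common choice among several.
    """
    if not frames:
        return []
    last = len(frames) - 1
    spliced = []
    for t in range(len(frames)):
        window = []
        for offset in range(-n_past, n_future + 1):
            idx = min(max(t + offset, 0), last)  # clamp index to the utterance
            window.extend(frames[idx])
        spliced.append(window)
    return spliced


# Toy input: 5 frames of 2-dimensional features (made-up values).
frames = [[float(t), float(10 * t)] for t in range(5)]

# Both windows span 5 frames, so the input dimensionality is identical;
# only the balance between past and future context differs.
symmetric = splice_frames(frames, n_past=2, n_future=2)
asymmetric = splice_frames(frames, n_past=3, n_future=1)
```

With the same total number of frames, the asymmetric window looks ahead by only one frame instead of two, which is why such configurations are attractive for low-latency recognition.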

research
12/17/2017

Deep Learning for Distant Speech Recognition

Deep learning is an emerging technology that is considered one of the mo...
research
06/29/2016

Optimising The Input Window Alignment in CD-DNN Based Phoneme Recognition for Low Latency Processing

We present a systematic analysis on the performance of a phonetic recogn...
research
10/10/2017

Contaminated speech training methods for robust DNN-HMM distant speech recognition

Despite the significant progress made in the last years, state-of-the-ar...
research
11/23/2018

Improved Frequency Modulation Features for Multichannel Distant Speech Recognition

Frequency modulation features capture the fine structure of speech forma...
research
04/15/2018

Twin Regularization for online speech recognition

Online speech recognition is crucial for developing natural human-machin...
research
01/13/2017

Kernel Approximation Methods for Speech Recognition

We study large-scale kernel methods for acoustic modeling in speech reco...
research
06/19/2018

A Survey of Recent DNN Architectures on the TIMIT Phone Recognition Task

In this survey paper, we have evaluated several recent deep neural netwo...
