Ensemble of Jointly Trained Deep Neural Network-Based Acoustic Models for Reverberant Speech Recognition

08/17/2016
by Jeehye Lee, et al.

Distant speech recognition is challenging, largely because reverberation corrupts the speech signal when the speaker is far from the microphone. To cope with the wide range of reverberation found in real-world situations, we present novel approaches to acoustic modeling: an ensemble of deep neural networks (DNNs) and an ensemble of jointly trained DNNs. First, in a setup step, multiple DNNs are established, each corresponding to a different reverberation time 60 (RT60). Each model in the ensemble of DNN acoustic models is then jointly trained across both feature mapping and acoustic modeling, where the feature mapping serves as a dereverberation front-end. In the testing phase, the two most likely DNNs are chosen from the ensemble using maximum a posteriori (MAP) probabilities, which are computed online by maximum likelihood (ML)-based blind RT60 estimation. The posterior probability outputs of the two DNNs are then combined in a simple average using the ML-based weights. Extensive experiments demonstrate that the proposed approach substantially improves speech recognition accuracy over conventional DNN baseline systems under diverse reverberant conditions.
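
The testing-phase combination described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `select_top_two` and `combine_posteriors` are hypothetical, the per-model probabilities are assumed to be supplied already (the abstract does not say how they are derived from the blind RT60 estimate), and renormalizing the two selected weights so they sum to one is an assumption made for illustration.

```python
from typing import List, Sequence, Tuple


def select_top_two(model_probs: Sequence[float]) -> Tuple[List[int], List[float]]:
    """Pick the two most probable ensemble members and renormalize their weights.

    model_probs[i] is an assumed probability that DNN i (trained for one RT60
    condition) matches the current utterance.
    """
    if len(model_probs) < 2:
        raise ValueError("need at least two models in the ensemble")
    # Rank model indices from most to least probable.
    ranked = sorted(range(len(model_probs)), key=lambda i: model_probs[i], reverse=True)
    top = ranked[:2]
    total = sum(model_probs[i] for i in top)
    if total <= 0:
        raise ValueError("selected model probabilities must be positive")
    # Assumption: weights of the two selected models are rescaled to sum to 1.
    weights = [model_probs[i] / total for i in top]
    return top, weights


def combine_posteriors(
    state_posteriors: Sequence[Sequence[float]],
    top: Sequence[int],
    weights: Sequence[float],
) -> List[float]:
    """Weighted average of the state posteriors of the two selected DNNs for one frame."""
    first, second = state_posteriors[top[0]], state_posteriors[top[1]]
    w1, w2 = weights
    return [w1 * p + w2 * q for p, q in zip(first, second)]


# Toy example: three DNNs, three output states, one frame.
model_probs = [0.1, 0.6, 0.3]
state_posteriors = [
    [0.2, 0.5, 0.3],  # DNN 0 (not selected)
    [0.7, 0.2, 0.1],  # DNN 1
    [0.4, 0.4, 0.2],  # DNN 2
]
top, weights = select_top_two(model_probs)
combined = combine_posteriors(state_posteriors, top, weights)
```

Because each selected DNN outputs a distribution over states and the two weights sum to one, the combined vector is again a valid distribution, so it can be passed to the decoder in place of a single model's posteriors.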

research
06/30/2014

Building DNN Acoustic Models for Large Vocabulary Speech Recognition

Deep neural networks (DNNs) are now a central component of nearly all st...
research
12/14/2015

Small-footprint Deep Neural Networks with Highway Connections for Speech Recognition

For speech recognition, deep neural networks (DNNs) have significantly i...
research
12/21/2018

End-to-End Classification of Reverberant Rooms using DNNs

Reverberation is present in our workplaces, our homes and even in places...
research
01/12/2018

Speech Dereverberation Based on Integrated Deep and Ensemble Learning

Reverberation, which is generally caused by sound reflections from walls...
research
01/12/2018

Speech Dereverberation Based on Integrated Deep and Ensemble Learning Algorithm

Reverberation, which is generally caused by sound reflections from walls...
research
03/18/2016

A Comparison between Deep Neural Nets and Kernel Acoustic Models for Speech Recognition

We study large-scale kernel methods for acoustic modeling and compare to...
