Environmental Noise Embeddings for Robust Speech Recognition

01/11/2016
by Suyoun Kim, et al.

We propose a novel deep neural network architecture for speech recognition that explicitly incorporates knowledge of the background environmental noise into a deep neural network acoustic model. A separate deep neural network is trained to predict the acoustic environment in which the system is being used. The discriminative embedding generated at the bottleneck layer of this network is then concatenated with traditional acoustic features and fed as input to the deep neural network acoustic model. Through a series of experiments on the Resource Management, CHiME-3, and Aurora4 tasks, we show that the proposed approach significantly improves speech recognition accuracy in noisy and highly reverberant environments, outperforming multi-condition training, noise-aware training, the i-vector framework, and multi-task learning on both in-domain and unseen noise.
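
The data flow described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the layer sizes, the ReLU activations, the random untrained weights, the per-frame embedding, and all names (`noise_embedding`, `acoustic_model_logits`, `FEAT_DIM`, and so on) are assumptions chosen only to show how a bottleneck embedding from an environment classifier might be concatenated with acoustic features before the acoustic model.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions; the abstract does not specify any of these.
FEAT_DIM = 40          # acoustic features per frame (e.g. filterbanks)
HIDDEN_DIM = 256       # hidden layer width
BOTTLENECK_DIM = 32    # size of the noise embedding
NUM_NOISE_TYPES = 4    # environment classes the first network predicts
NUM_SENONES = 100      # acoustic model output classes


def init(n_in, n_out):
    """Random, untrained weights; a real system would learn these."""
    return rng.standard_normal((n_in, n_out)) * 0.1, np.zeros(n_out)


def dense(x, w, b, activation=True):
    """Fully connected layer with an optional ReLU."""
    y = x @ w + b
    return np.maximum(y, 0.0) if activation else y


# Environment (noise) classifier: features -> hidden -> bottleneck -> classes.
env_w1, env_b1 = init(FEAT_DIM, HIDDEN_DIM)
env_w2, env_b2 = init(HIDDEN_DIM, BOTTLENECK_DIM)
env_w3, env_b3 = init(BOTTLENECK_DIM, NUM_NOISE_TYPES)

# Acoustic model: its input is the features plus the noise embedding.
am_w1, am_b1 = init(FEAT_DIM + BOTTLENECK_DIM, HIDDEN_DIM)
am_w2, am_b2 = init(HIDDEN_DIM, NUM_SENONES)


def noise_embedding(feats):
    """Bottleneck activations of the environment classifier."""
    hidden = dense(feats, env_w1, env_b1)
    return dense(hidden, env_w2, env_b2)


def noise_logits(feats):
    """Environment class scores, used only to train the first network."""
    return dense(noise_embedding(feats), env_w3, env_b3, activation=False)


def acoustic_model_logits(feats):
    """Concatenate features with the embedding, then run the acoustic model."""
    augmented = np.concatenate([feats, noise_embedding(feats)], axis=1)
    hidden = dense(augmented, am_w1, am_b1)
    return dense(hidden, am_w2, am_b2, activation=False)


frames = rng.standard_normal((5, FEAT_DIM))   # 5 frames of dummy features
embedding = noise_embedding(frames)
logits = acoustic_model_logits(frames)
print(embedding.shape, logits.shape)
```

In this sketch the environment classifier's output layer serves only as a training target; at recognition time just the bottleneck activations are passed on, so the acoustic model sees an input of width `FEAT_DIM + BOTTLENECK_DIM`.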

