Spatial Diffuseness Features for DNN-Based Speech Recognition in Noisy and Reverberant Environments

10/09/2014
by Andreas Schwarz, et al.

We propose a spatial diffuseness feature for deep neural network (DNN)-based automatic speech recognition that improves recognition accuracy in reverberant and noisy environments. The feature is computed in real time from multiple microphone signals, requires neither knowledge nor estimation of the direction of arrival, and represents the relative amount of diffuse noise in each time-frequency bin. Using the diffuseness feature as an additional input to a DNN-based acoustic model reduces the word error rate on the REVERB challenge corpus, compared both with logmelspec features extracted from the noisy signals and with features enhanced by spectral subtraction.
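To make the idea of a per-bin diffuseness value concrete, here is a minimal sketch for two microphones. It is not the estimator from the paper, which the abstract does not specify. It assumes a simple proxy: one minus the magnitude of the short-time spatial coherence between the two channels. The names `stft` and `diffuseness_proxy`, the frame length, hop size, and smoothing factor `alpha` are all illustrative assumptions.

```python
import numpy as np


def stft(x, frame_len=512, hop=256):
    """Hann-windowed STFT; returns shape (frames, frame_len // 2 + 1)."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(x) - frame_len) // hop
    frames = np.stack(
        [x[i * hop : i * hop + frame_len] * window for i in range(n_frames)]
    )
    return np.fft.rfft(frames, axis=1)


def diffuseness_proxy(x1, x2, alpha=0.8, eps=1e-12):
    """Per-bin diffuseness proxy in [0, 1] from two microphone signals.

    ASSUMPTION: diffuseness is approximated as 1 - |coherence|. This is an
    illustrative stand-in, not the estimator used in the paper.
    """
    X1, X2 = stft(x1), stft(x2)
    n_frames, n_bins = X1.shape

    # Recursively smoothed auto- and cross-power spectra (causal, so the
    # value for each frame depends only on current and past frames).
    p11 = np.zeros(n_bins)
    p22 = np.zeros(n_bins)
    p12 = np.zeros(n_bins, dtype=complex)

    out = np.empty((n_frames, n_bins))
    for t in range(n_frames):
        p11 = alpha * p11 + (1 - alpha) * np.abs(X1[t]) ** 2
        p22 = alpha * p22 + (1 - alpha) * np.abs(X2[t]) ** 2
        p12 = alpha * p12 + (1 - alpha) * X1[t] * np.conj(X2[t])

        # Coherence magnitude: near 1 for a single directional source,
        # lower when the two channels are only weakly correlated.
        coherence = np.abs(p12) / np.sqrt(p11 * p22 + eps)
        out[t] = 1.0 - np.clip(coherence, 0.0, 1.0)
    return out
```

Because the proxy uses only the magnitude of the coherence, it needs no direction-of-arrival estimate, and the recursive averaging keeps it causal. The resulting matrix has one row per frame, so it could be appended along the feature axis to the spectral features fed to the acoustic model. The very first frames are biased toward zero because the averages have not yet accumulated enough data.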

research
08/27/2018

Augmenting Bottleneck Features of Deep Neural Network Employing Motor State for Speech Recognition at Humanoid Robots

As for the humanoid robots, the internal noise, which is generated by mo...
research
11/05/2019

Spatial Attention for Far-field Speech Recognition with Deep Beamforming Neural Networks

In this paper, we introduce spatial attention for refining the informati...
research
06/30/2020

Multi-view Frequency LSTM: An Efficient Frontend for Automatic Speech Recognition

Acoustic models in real-time speech recognition systems typically stack ...
research
02/20/2023

A DNN based Normalized Time-frequency Weighted Criterion for Robust Wideband DoA Estimation

Deep neural networks (DNNs) have greatly benefited direction of arrival ...
research
03/23/2018

An improved DNN-based spectral feature mapping that removes noise and reverberation for robust automatic speech recognition

Reverberation and additive noise have detrimental effects on the perform...
research
05/29/2017

DNN-based uncertainty estimation for weighted DNN-HMM ASR

In this paper, the uncertainty is defined as the mean square error betwe...
research
02/16/2018

Articulatory information and Multiview Features for Large Vocabulary Continuous Speech Recognition

This paper explores the use of multi-view features and their discriminat...
