Exploring the robustness of features and enhancement on speech recognition systems in highly-reverberant real environments

03/23/2018
by José Novoa, et al.

This paper evaluates the robustness of a DNN-HMM-based speech recognition system in highly-reverberant real environments using the HRRE database. The performance of locally-normalized filter bank (LNFB) and Mel filter bank (MelFB) features, in combination with the Non-negative Matrix Factorization (NMF), Suppression of Slowly-varying components and the Falling edge (SSF), and Weighted Prediction Error (WPE) enhancement methods, is discussed and evaluated. Two training conditions were considered: clean and reverberated (Reverb). With Reverb training the use of WPE and LNFB provides WERs that are 3 provides WERs that are 11 respectively. With clean training, which represents a significant mismatch between testing and training conditions, LNFB features clearly outperform MelFB features. The results show that different types of training, parametrization, and enhancement techniques may each work better for a specific combination of speaker-microphone distance and reverberation time. This suggests that there could be some degree of complementarity between systems trained with different enhancement and parametrization methods.
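
The abstract compares systems by word error rate (WER). As a reminder of how that metric is usually computed, here is a minimal sketch in Python using the standard word-level edit distance. The function names `word_error_rate` and `relative_wer_reduction` are illustrative and not taken from the paper, and the abstract does not say whether its reported WER differences are absolute or relative, so the second helper is only one possible reading.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                      # deletion of a reference word
                curr[j - 1] + 1,                  # insertion of a hypothesis word
                prev[j - 1] + substitution_cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


def relative_wer_reduction(baseline_wer: float, system_wer: float) -> float:
    """Fraction by which system_wer is lower than baseline_wer (assumed relative reading)."""
    if baseline_wer <= 0:
        raise ValueError("baseline_wer must be positive")
    return (baseline_wer - system_wer) / baseline_wer
```

For example, `word_error_rate("the cat sat on the mat", "the cat sit on mat")` counts one substitution and one deletion against six reference words.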

research
10/31/2017

Statistical Speech Enhancement Based on Probabilistic Integration of Variational Autoencoder and Non-Negative Matrix Factorization

This paper presents a statistical method of single-channel speech enhanc...
research
06/17/2019

On combining features for single-channel robust speech recognition in reverberant environments

This paper addresses the combination of complementary parallel speech re...
research
11/16/2018

Semi-supervised multichannel speech enhancement with variational autoencoders and non-negative matrix factorization

In this paper we address speaker-independent multichannel speech enhance...
research
01/29/2018

Highly-Reverberant Real Environment database: HRRE

Speech recognition in highly-reverberant real environments remains a maj...
research
07/14/2015

Feature Normalisation for Robust Speech Recognition

Speech recognition system performance degrades in noisy environments. If...
research
03/24/2017

Batch-normalized joint training for DNN-based distant speech recognition

Improving distant speech recognition is a crucial step towards flexible ...
research
11/07/2018

On the use of DNN Autoencoder for Robust Speaker Recognition

In this paper, we present an analysis of a DNN-based autoencoder for spe...
