Feature Normalisation for Robust Speech Recognition

07/14/2015
by D. S. Pavan Kumar, et al.

The performance of speech recognition systems degrades in noisy environments. If the acoustic models are built using features of clean utterances, the features of a noisy test utterance are acoustically mismatched with the trained models. This mismatch yields poor likelihoods and, in turn, poor recognition accuracy. Model adaptation and feature normalisation are two broad areas that address this problem. While the former often gives better performance, the latter requires estimating fewer parameters, which makes it more feasible for practical implementations. This research focuses on the efficacy of various subspace-based, statistical and stereo-based feature normalisation techniques. A subspace-projection-based method is investigated both as a standalone and as an adjunct technique; it reconstructs noisy speech features from a precomputed set of clean speech building blocks. The building blocks, which form a basis for the clean speech subspace, are learned by applying non-negative matrix factorisation (NMF) to log-Mel filter bank coefficients. The work provides a detailed study of how the method can be incorporated into the extraction process of Mel-frequency cepstral coefficients. Experimental results show that the new features are robust to noise and achieve better results when combined with existing techniques. The work also proposes a modification to the training process of the SPLICE algorithm for noise-robust speech recognition. The modification is based on feature correlations, and it enables this stereo-based algorithm to improve performance in all noise conditions, especially unseen ones. Further, the modified framework is extended to non-stereo datasets, which require clean and noisy training utterances but not their stereo counterparts. Finally, a computationally efficient, MLLR-based run-time noise adaptation method is proposed within the SPLICE framework.
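
The subspace-projection idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the multiplicative-update rule (Euclidean cost), the 23 filter bank channels, the rank of 5, the synthetic data and the function names `learn_basis` and `reconstruct` are all assumptions made for illustration, and the features are assumed to be floored so that they are non-negative. A basis `W` is learned from clean features, and noisy features are then reconstructed by keeping `W` fixed and estimating only the activations.

```python
import numpy as np

EPS = 1e-9  # guards against division by zero in the multiplicative updates


def learn_basis(V, rank, n_iter=200, seed=0):
    """Factorise non-negative V (channels x frames) as W @ H.

    Uses the standard multiplicative updates for the Euclidean cost;
    the abstract does not say which NMF variant the authors used.
    """
    rng = np.random.default_rng(seed)
    W = rng.random((V.shape[0], rank)) + EPS
    H = rng.random((rank, V.shape[1])) + EPS
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + EPS)
        W *= (V @ H.T) / (W @ H @ H.T + EPS)
    return W, H


def reconstruct(V_noisy, W, n_iter=200, seed=0):
    """Reconstruct noisy features from a fixed clean basis W.

    Only the activations H are updated, so the result W @ H is
    constrained to the cone spanned by the clean building blocks.
    """
    rng = np.random.default_rng(seed)
    H = rng.random((W.shape[1], V_noisy.shape[1])) + EPS
    for _ in range(n_iter):
        H *= (W.T @ V_noisy) / (W.T @ W @ H + EPS)
    return W @ H


# Synthetic stand-ins for floored (non-negative) log-Mel features.
rng = np.random.default_rng(1)
true_W = rng.random((23, 5))            # hypothetical: 23 channels, 5 blocks
clean_train = true_W @ rng.random((5, 300))
W, _ = learn_basis(clean_train, rank=5)

clean_test = true_W @ rng.random((5, 50))
noisy = clean_test + 0.3 * rng.random(clean_test.shape)  # toy additive noise
recon = reconstruct(noisy, W)
print(recon.shape)
```

In a real front end, the reconstructed filter bank features would replace the noisy ones before the cepstral transform; how exactly that step is integrated is what the full text studies.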

research
07/15/2013

Modified SPLICE and its Extension to Non-Stereo Data for Noise Robust Speech Recognition

In this paper, a modification to the training process of the popular SPL...
research
07/16/2019

Towards Adapting NMF Dictionaries Using Total Variability Modeling for Noise-Robust Acoustic Features

We propose an algorithm to extract noise-robust acoustic features from n...
research
03/23/2018

Exploring the robustness of features and enhancement on speech recognition systems in highly-reverberant real environments

This paper evaluates the robustness of a DNN-HMM-based speech recognitio...
research
03/09/2015

Modeling State-Conditional Observation Distribution using Weighted Stereo Samples for Factorial Speech Processing Models

This paper investigates the effectiveness of factorial speech processing...
research
10/16/2021

Towards Robust Waveform-Based Acoustic Models

We propose an approach for learning robust acoustic models in adverse en...
research
07/01/2017

Rank-1 Constrained Multichannel Wiener Filter for Speech Recognition in Noisy Environments

Multichannel linear filters, such as the Multichannel Wiener Filter (MWF...
research
08/27/2022

Minimal Feature Analysis for Isolated Digit Recognition for varying encoding rates in noisy environments

This research work is about recent development made in speech recognitio...
