Exploiting Low-dimensional Structures to Enhance DNN Based Acoustic Modeling in Speech Recognition

01/22/2016
by Pranay Dighe, et al.

We propose to model the acoustic space of deep neural network (DNN) class-conditional posterior probabilities as a union of low-dimensional subspaces. To that end, the training posteriors are used for dictionary learning and sparse coding. Sparse representation of the test posteriors using this dictionary projects them onto the space of the training data. Because the intrinsic dimensions of the posterior subspaces are very small and the matrix of all posteriors belonging to a class has very low rank, these low-dimensional structures can be used to further enhance the posteriors and to rectify spurious errors caused by mismatched conditions. The enhanced acoustic modeling method improves a continuous speech recognition task using the hybrid DNN-HMM (hidden Markov model) framework in both clean and noisy conditions, achieving up to a 15.4% relative reduction in word error rate (WER).
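The low-rank idea above can be illustrated with a minimal sketch. It is not the authors' method: it swaps dictionary learning and sparse coding for a plain truncated SVD fitted to one class's training posteriors, and it runs on synthetic data. The function names (`fit_class_subspace`, `enhance`), the prototype vectors `u` and `v`, the rank, and the noise level are all assumptions made for illustration.

```python
import numpy as np


def fit_class_subspace(train_post, rank):
    """Fit a rank-`rank` affine subspace to one class's posteriors.

    train_post: (n_frames, n_classes) array; each row is a posterior vector.
    Returns the mean vector and the top `rank` principal directions.
    """
    mean = train_post.mean(axis=0)
    # Rows of vt are the principal directions of the centered posteriors.
    _, _, vt = np.linalg.svd(train_post - mean, full_matrices=False)
    return mean, vt[:rank]


def enhance(post, mean, basis, eps=1e-8):
    """Project posteriors onto the class subspace and renormalize each row."""
    proj = mean + (post - mean) @ basis.T @ basis
    proj = np.clip(proj, eps, None)  # keep entries strictly positive
    return proj / proj.sum(axis=1, keepdims=True)


rng = np.random.default_rng(0)

# Hypothetical setup: posteriors of one class vary along a single direction
# between two prototype vectors, so the centered class matrix has rank 1.
u = np.array([0.9, 0.1, 0.0, 0.0, 0.0])
v = np.array([0.6, 0.4, 0.0, 0.0, 0.0])


def sample_clean(n):
    t = rng.uniform(0.0, 1.0, size=(n, 1))
    return (1.0 - t) * u + t * v


train = sample_clean(200)
clean = sample_clean(50)

# Simulate mismatch: additive noise, then clip and renormalize to the simplex.
noisy = np.clip(clean + rng.normal(0.0, 0.05, clean.shape), 0.0, None)
noisy /= noisy.sum(axis=1, keepdims=True)

mean, basis = fit_class_subspace(train, rank=1)
enhanced = enhance(noisy, mean, basis)

err_noisy = np.linalg.norm(noisy - clean, axis=1).mean()
err_enh = np.linalg.norm(enhanced - clean, axis=1).mean()
print(f"mean L2 error: noisy={err_noisy:.4f}, enhanced={err_enh:.4f}")
```

In this toy setting the projection discards the noise components that fall outside the class subspace, so the enhanced posteriors sit closer to the clean ones. With real DNN posteriors the class of each test frame is unknown, which is one reason a learned dictionary with sparse coding, as described in the abstract, is a more natural fit than a per-class projection.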


