Low-rank and Sparse Soft Targets to Learn Better DNN Acoustic Models

10/18/2016
by Pranay Dighe, et al.

Conventional deep neural networks (DNNs) for speech acoustic modeling rely on Gaussian mixture models (GMMs) and hidden Markov models (HMMs) to obtain binary class labels as the targets for DNN training. Subword classes in speech recognition systems correspond to context-dependent tied states, or senones. The present work addresses some limitations of GMM-HMM senone alignments for DNN training. We hypothesize that the senone probabilities obtained from a DNN trained with binary labels can provide more accurate targets to learn better acoustic models. However, DNN outputs bear inaccuracies which are exhibited as high-dimensional unstructured noise, whereas the informative components are structured and low-dimensional. We exploit principal component analysis (PCA) and sparse coding to characterize the senone subspaces. Enhanced probabilities obtained from low-rank and sparse reconstructions are used as soft targets for DNN acoustic modeling, which also enables training with untranscribed data. Experiments conducted on the AMI corpus show a 4.6% relative reduction in word error rate.
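
To make the low-rank idea concrete, the following is a minimal sketch of how senone posteriors from a first-pass DNN might be projected onto a few principal components and turned back into soft targets. It is an illustration under stated assumptions, not the authors' procedure: the function name low_rank_soft_targets, the use of a single global PCA on log-posteriors (rather than per-senone subspaces), the eps smoothing constant, and the final renormalization are all choices made here for the example. The sparse coding variant mentioned in the abstract is not shown.

```python
import numpy as np


def low_rank_soft_targets(posteriors, rank, eps=1e-10):
    """Low-rank PCA reconstruction of frame-level posteriors (illustrative).

    posteriors: array of shape (n_frames, n_classes), each row a distribution.
    rank: number of principal components to keep (hypothetical parameter).
    Returns an array of the same shape whose rows sum to 1.
    """
    post = np.asarray(posteriors, dtype=np.float64)

    # Assumption: PCA is applied in the log domain, with eps to avoid log(0).
    log_post = np.log(post + eps)

    # Center the data, as PCA requires.
    mean = log_post.mean(axis=0, keepdims=True)
    centered = log_post - mean

    # Rows of vt are the principal directions, ordered by explained variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    basis = vt[:rank]  # shape (rank, n_classes)

    # Project onto the retained subspace and map back to the original space.
    recon = centered @ basis.T @ basis + mean

    # Return to the probability domain; subtract the row max for stability.
    recon = recon - recon.max(axis=1, keepdims=True)
    soft = np.exp(recon)
    return soft / soft.sum(axis=1, keepdims=True)


if __name__ == "__main__":
    # Toy stand-in for DNN outputs: 20 frames, 5 classes.
    rng = np.random.default_rng(0)
    logits = rng.normal(size=(20, 5))
    post = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)
    soft = low_rank_soft_targets(post, rank=2)
    print(soft.shape)
```

In a setup like the one the abstract describes, the rows returned here would replace the binary labels as training targets for a second DNN. Because the targets come from DNN outputs rather than from transcriptions, the same step could in principle be run on untranscribed audio.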
