Interpreting DNN output layer activations: A strategy to cope with unseen data in speech recognition

02/16/2018
by Vikramjit Mitra et al.

Unseen data can degrade the performance of deep neural network (DNN) acoustic models, and adaptation techniques are deployed to cope with it. For unlabeled unseen data, one must generate a hypothesis from an existing model and use that hypothesis as the label for model adaptation. Assessing the goodness of such a hypothesis can be difficult, however, and an erroneous hypothesis can lead to poorly trained models. In such cases, a strategy that selects only the data with reliable hypotheses can ensure better model adaptation. This work proposes a data-selection strategy for DNN model adaptation in which the DNN output layer activations are used to ascertain the goodness of a generated hypothesis. In a DNN acoustic model, the output layer activations are used to generate the target class probabilities. Under unseen data conditions, the difference between the most probable target and the next most probable target decreases relative to that observed for seen data, indicating that the model may be uncertain while generating its hypothesis. This work therefore assesses the model's performance by analyzing the output layer activations with a distance measure between the most likely target and the next most likely target, and uses that measure to select data for unsupervised adaptation.
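As a minimal sketch of the idea described above — not the paper's exact implementation — the per-frame distance between the top-1 and top-2 target posteriors can be computed from the output layer activations, and an utterance kept for adaptation only when its average margin is large. The function names, the use of a softmax over raw activations, and the mean-margin threshold of 0.5 are all illustrative assumptions:

```python
import numpy as np

def top2_margin(logits):
    """Per-frame margin between the most likely and next most likely targets.

    logits: (num_frames, num_classes) array of DNN output-layer activations.
    Returns a (num_frames,) array of posterior differences in [0, 1].
    """
    # Softmax over classes to obtain target posteriors (shift for stability).
    shifted = logits - logits.max(axis=1, keepdims=True)
    probs = np.exp(shifted)
    probs /= probs.sum(axis=1, keepdims=True)
    # Sort posteriors per frame; margin = top-1 minus top-2 probability.
    top2 = np.sort(probs, axis=1)[:, -2:]
    return top2[:, 1] - top2[:, 0]

def select_utterance(logits, threshold=0.5):
    """Keep an utterance for unsupervised adaptation only if its mean
    top-2 margin exceeds the threshold (hypothesis deemed reliable).
    The threshold value is a hypothetical choice for illustration."""
    return top2_margin(logits).mean() > threshold
```

A confident model concentrates probability mass on one target per frame, yielding margins near 1; on unseen data the posteriors flatten, the margin shrinks, and the utterance's hypothesis is excluded from adaptation.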


