Differentiable Pooling for Unsupervised Acoustic Model Adaptation

03/31/2016
by Pawel Swietojanski, et al.

We present a deep neural network (DNN) acoustic model that includes parametrised and differentiable pooling operators. Unsupervised acoustic model adaptation is cast as the problem of updating the decision boundaries implemented by each pooling operator. In particular, we experiment with two types of pooling parametrisations: learned L_p-norm pooling and weighted Gaussian pooling, in which the weights of both operators are treated as speaker-dependent. We perform investigations using three different large vocabulary speech recognition corpora: AMI meetings, TED talks and Switchboard conversational telephone speech. We demonstrate that differentiable pooling operators provide a robust and relatively low-dimensional way to adapt acoustic models, with relative word error rate reductions ranging from 5% to 20% with respect to unadapted systems, which themselves are better than the baseline fully-connected DNN-based acoustic models. We also investigate how the proposed techniques work under various adaptation conditions, including the quality of adaptation data and complementarity to other feature- and model-space adaptation methods, and we provide an analysis of the characteristics of each of the proposed approaches.
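The abstract does not give formulas for the two pooling operators, so the following is only a minimal sketch of one common way to write them, not the authors' implementation. It assumes each pooling unit sees a small group of activations, that L_p-norm pooling takes the p-th root of the mean of the absolute activations raised to p, and that Gaussian pooling weights each activation by a normalised Gaussian kernel over its position in the group. The function names `lp_norm_pool` and `gaussian_pool`, and the parameters `p`, `mu` and `beta`, are hypothetical; in an adaptation setting they would be the per-speaker quantities being updated.

```python
import numpy as np


def lp_norm_pool(acts, p):
    """L_p-norm pooling over each row of `acts`.

    acts: array of shape (num_pools, pool_size), activations per pooling unit.
    p:    array of shape (num_pools,), one norm order per pooling unit
          (assumed to be the adaptable parameter, p >= 1).
    """
    acts = np.asarray(acts, dtype=float)
    p = np.asarray(p, dtype=float)[:, None]
    # Mean of |a_k|^p inside each pool, followed by the 1/p root.
    mean_pow = np.mean(np.abs(acts) ** p, axis=1, keepdims=True)
    return (mean_pow ** (1.0 / p))[:, 0]


def gaussian_pool(acts, mu, beta):
    """Gaussian-weighted pooling over each row of `acts`.

    mu:   array of shape (num_pools,), kernel centre in pool-index units.
    beta: array of shape (num_pools,), kernel precision (beta = 0 gives
          uniform weights, i.e. average pooling).
    Both are assumed to be the adaptable parameters.
    """
    acts = np.asarray(acts, dtype=float)
    mu = np.asarray(mu, dtype=float)[:, None]
    beta = np.asarray(beta, dtype=float)[:, None]
    idx = np.arange(acts.shape[1], dtype=float)[None, :]
    # Unnormalised Gaussian weight for every position in the pool.
    weights = np.exp(-0.5 * beta * (idx - mu) ** 2)
    # Normalise so the weights in each pool sum to one.
    weights = weights / weights.sum(axis=1, keepdims=True)
    return (weights * acts).sum(axis=1)


# Usage: p = 1 behaves like average pooling of magnitudes; a large p
# moves the output towards the largest magnitude in the pool.
example = np.array([[3.0, 4.0]])
avg_like = lp_norm_pool(example, p=[1.0])
max_like = lp_norm_pool(example, p=[50.0])
```

Under these assumptions each pooling unit carries only one or two adaptable scalars, which is consistent with the abstract's description of the approach as relatively low-dimensional.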

