Adversarial Speaker Adaptation

04/29/2019
by   Zhong Meng, et al.
We propose a novel adversarial speaker adaptation (ASA) scheme, in which adversarial learning is applied to regularize the distribution of deep hidden features in a speaker-dependent (SD) deep neural network (DNN) acoustic model to be close to that of a fixed speaker-independent (SI) DNN acoustic model during adaptation. An additional discriminator network is introduced to distinguish the deep features generated by the SD model from those produced by the SI model. In ASA, with a fixed SI model as the reference, the SD model is jointly optimized with the discriminator network to minimize the senone classification loss and, simultaneously, to mini-maximize the SI/SD discrimination loss on the adaptation data. With ASA, the SD model learns a senone-discriminative deep feature whose distribution is similar to that of the SI model. With such a regularized and adapted deep feature, the SD model achieves improved automatic speech recognition on the target speaker's speech. Evaluated on the Microsoft short message dictation dataset, ASA achieves 14.4% and 7.9% relative word error rate improvements for supervised and unsupervised adaptation, respectively, over an SI model trained on 2600 hours of data, with 200 adaptation utterances per speaker.
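The two competing objectives described above can be illustrated with a small numerical sketch. The abstract does not state the loss formulas, so everything here is an assumption made for illustration: cross-entropy for the senone classification loss, binary cross-entropy for the SI/SD discrimination loss (with the discriminator labelling SI features as 1 and SD features as 0), and a hypothetical trade-off weight `lam`. The function names are likewise invented for this example and are not taken from the paper.

```python
import numpy as np

def softmax(z):
    # Row-wise softmax with max-subtraction for numerical stability.
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def senone_loss(sd_logits, labels):
    # Mean cross-entropy of the SD model's senone posteriors (assumed form).
    p = softmax(sd_logits)
    return -np.mean(np.log(p[np.arange(len(labels)), labels] + 1e-12))

def discrimination_loss(d_si, d_sd):
    # Binary cross-entropy on discriminator scores (assumed form):
    # target 1 for SI deep features, target 0 for SD deep features.
    loss_si = -np.mean(np.log(sigmoid(d_si) + 1e-12))
    loss_sd = -np.mean(np.log(1.0 - sigmoid(d_sd) + 1e-12))
    return loss_si + loss_sd

def asa_objectives(sd_logits, labels, d_si, d_sd, lam=1.0):
    # lam is a hypothetical weight balancing the two losses.
    l_sen = senone_loss(sd_logits, labels)
    l_disc = discrimination_loss(d_si, d_sd)
    sd_objective = l_sen - lam * l_disc  # SD model: classify senones, confuse the discriminator
    disc_objective = l_disc              # discriminator: separate SI from SD features
    return sd_objective, disc_objective

# Toy usage with random scores for 8 frames and 5 senone classes.
rng = np.random.default_rng(0)
sd_obj, disc_obj = asa_objectives(
    sd_logits=rng.normal(size=(8, 5)),
    labels=rng.integers(0, 5, size=8),
    d_si=rng.normal(size=8),
    d_sd=rng.normal(size=8),
    lam=0.5,
)
print(sd_obj, disc_obj)
```

In a real system each objective would be minimized by gradient descent over its own parameters, with the SI model held fixed; the sketch only evaluates the two quantities to show how the discrimination loss enters them with opposite signs.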
