Empirical Evaluation of Speaker Adaptation on DNN based Acoustic Model

03/27/2018
by   Ke Wang, et al.
Speaker adaptation aims to estimate a speaker-specific acoustic model from a speaker-independent one, minimizing the mismatch between training and testing conditions that arises from speaker variability. A variety of neural network adaptation methods have been proposed since deep learning models became mainstream. However, an experimental comparison of these methods is still lacking, especially now that DNN-based acoustic models have advanced greatly. In this paper, we aim to close this gap by providing an empirical evaluation of three typical speaker adaptation methods: LIN, LHUC and KLD. Adaptation experiments with different amounts of adaptation data are conducted on a strong TDNN-LSTM acoustic model. To make the task more challenging, the source and target models are a standard Mandarin speaker model and an accented Mandarin speaker model, respectively. We compare the performance of the individual methods and their combinations. Speaker adaptation performance is also examined as a function of each speaker's degree of accent.
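The abstract names the three methods without describing them. As a rough illustration only, the sketch below follows their commonly cited formulations: LIN as a speaker-specific affine transform on the input features, LHUC as a per-unit rescaling of hidden activations, and KLD as an interpolation of the training target with the speaker-independent model's posterior. The function names, argument layout, and toy values are assumptions made for this example and are not taken from the paper.

```python
import math

def lin_transform(x, W, b):
    """LIN (assumed form): affine transform W x + b applied to the input
    features. W is typically initialized to the identity and b to zero."""
    return [sum(w_ij * x_j for w_ij, x_j in zip(row, x)) + b_i
            for row, b_i in zip(W, b)]

def lhuc_scale(h, r):
    """LHUC (assumed form): rescale each hidden activation h_k by the
    amplitude 2 * sigmoid(r_k), which lies in (0, 2). r_k = 0 leaves
    the unit unchanged."""
    return [2.0 / (1.0 + math.exp(-r_k)) * h_k for h_k, r_k in zip(h, r)]

def kld_target(one_hot, si_posterior, rho):
    """KLD regularization (assumed form): interpolate the hard label with
    the speaker-independent posterior. rho = 0 ignores the SI model,
    rho = 1 ignores the adaptation label."""
    return [(1.0 - rho) * y + rho * p for y, p in zip(one_hot, si_posterior)]

if __name__ == "__main__":
    x = [1.5, -2.0]
    identity = [[1.0, 0.0], [0.0, 1.0]]
    print(lin_transform(x, identity, [0.0, 0.0]))   # identity init keeps x
    print(lhuc_scale([0.3, 0.7], [0.0, 0.0]))       # r = 0 keeps h
    print(kld_target([1.0, 0.0], [0.6, 0.4], 0.5))  # softened target
```

In all three cases only a small set of parameters (the transform, the per-unit amplitudes, or the adapted weights under a regularized target) is updated per speaker, which is what makes these methods usable with limited adaptation data.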


research
04/29/2019

Adversarial Speaker Adaptation

We propose a novel adversarial speaker adaptation (ASA) scheme, in which...
research
04/02/2018

Speaker-Invariant Training via Adversarial Learning

We propose a novel adversarial multi-task learning scheme, aiming at act...
research
08/11/2020

Why Did the x-Vector System Miss a Target Speaker? Impact of Acoustic Mismatch Upon Target Score on VoxCeleb Data

Modern automatic speaker verification (ASV) relies heavily on machine le...
research
06/27/2019

Lattice-Based Unsupervised Test-Time Adaptation of Neural Network Acoustic Models

Acoustic model adaptation to unseen test recordings aims to reduce the m...
research
08/30/2018

Learning to adapt: a meta-learning approach for speaker adaptation

The performance of automatic speech recognition systems can be improved ...
research
11/17/2022

Multi-source Domain Adaptation for Text-independent Forensic Speaker Recognition

Adapting speaker recognition systems to new environments is a widely-use...
research
07/31/2018

Scaling and bias codes for modeling speaker-adaptive DNN-based speech synthesis systems

Most neural-network based speaker-adaptive acoustic models for speech sy...
