Learning to adapt: a meta-learning approach for speaker adaptation

08/30/2018
by Ondřej Klejch, et al.

The performance of automatic speech recognition systems can be improved by adapting an acoustic model to compensate for the mismatch between training and testing conditions, for example by adapting to unseen speakers. The success of speaker adaptation methods relies on selecting weights that are suitable for adaptation and on using good adaptation schedules to update these weights without overfitting to the adaptation data. In this paper we investigate a principled way of adapting all the weights of the acoustic model using meta-learning. We show that the meta-learner can learn to perform supervised and unsupervised speaker adaptation and that it outperforms a strong baseline adapting LHUC parameters when adapting a DNN acoustic model (AM) with 1.5M parameters. We also report initial experiments on adapting TDNN AMs, where the meta-learner achieves performance comparable to LHUC.
