Multilingual Training and Cross-lingual Adaptation on CTC-based Acoustic Model

11/27/2017
by Sibo Tong, et al.

Phoneme-based multilingual training and different cross-lingual adaptation techniques for Automatic Speech Recognition (ASR) are explored in Connectionist Temporal Classification (CTC)-based systems. The multilingual model is trained to model a universal IPA-based phone set using the CTC loss function. Because the same IPA symbol does not necessarily correspond to acoustically similar sounds across languages, Learning Hidden Unit Contribution (LHUC) is investigated. Given the multilingual model, different approaches are compared for adapting it to a target language with limited adaptation data. Dropout during cross-lingual adaptation is also studied as a way to mitigate overfitting. Experiments show that the performance of the universal phoneme-based CTC system can be improved by applying LHUC, and that the system is extensible to new phonemes during cross-lingual adaptation. Updating all the parameters shows consistent improvement on limited data. Applying dropout during adaptation further improves the system and achieves performance competitive with Deep Neural Network (DNN)/Hidden Markov Model (HMM) systems, even with 21 hours of data.
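To make the two mechanisms named in the abstract more concrete, the following is a minimal sketch, not the authors' implementation. It assumes the commonly used LHUC parameterization in which each hidden unit is rescaled by an amplitude 2·sigmoid(r), and it assumes that new phonemes are handled by appending rows to the output layer; the abstract does not state either detail. All function names and toy values are hypothetical.

```python
import math


def lhuc_scale(r):
    """Map an unconstrained LHUC parameter r to an amplitude in (0, 2).

    Assumption: the common 2 * sigmoid(r) form, so r = 0 gives amplitude 1.
    """
    return 2.0 / (1.0 + math.exp(-r))


def apply_lhuc(hidden, r_params):
    """Rescale each hidden activation by its language-specific amplitude."""
    if len(hidden) != len(r_params):
        raise ValueError("one LHUC parameter is needed per hidden unit")
    return [lhuc_scale(r) * h for h, r in zip(hidden, r_params)]


def extend_output_layer(weights, biases, num_new, init=0.0):
    """Append output rows for phonemes unseen in multilingual training.

    Existing rows are copied unchanged, so phonemes already in the
    universal set keep their trained weights (hypothetical scheme).
    """
    hidden_dim = len(weights[0])
    new_weights = [row[:] for row in weights]
    new_weights += [[init] * hidden_dim for _ in range(num_new)]
    new_biases = list(biases) + [init] * num_new
    return new_weights, new_biases


# Toy shared hidden activations (hypothetical values).
hidden = [0.5, -1.0, 2.0]

# With r = 0 every amplitude is 1, so the layer matches the unadapted model.
identity = apply_lhuc(hidden, [0.0, 0.0, 0.0])

# Hypothetical language-specific parameters after adaptation.
adapted = apply_lhuc(hidden, [1.0, -1.0, 0.0])

# Extend a toy 2-phoneme output layer with one new target-language phoneme.
ext_w, ext_b = extend_output_layer([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]],
                                   [0.1, 0.2], num_new=1)
```

In this sketch only the per-unit parameters `r_params` differ between languages, which is what would let a shared IPA-based model account for the same symbol sounding different across languages while keeping the number of language-specific parameters small.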
