Towards Lifelong Learning of Multilingual Text-To-Speech Synthesis

10/09/2021
by   Mu Yang, et al.
12

This work presents a lifelong learning approach to train a multilingual Text-To-Speech (TTS) system, where each language was seen as an individual task and was learned sequentially and continually. It does not require pooled data from all languages altogether, and thus alleviates the storage and computation burden. One of the challenges of lifelong learning methods is "catastrophic forgetting": in TTS scenario it means that model performance quickly degrades on previous languages when adapted to a new language. We approach this problem via a data-replay-based lifelong learning method. We formulate the replay process as a supervised learning problem, and propose a simple yet effective dual-sampler framework to tackle the heavily language-imbalanced training samples. Through objective and subjective evaluations, we show that this supervised learning formulation outperforms other gradient-based and regularization-based lifelong learning methods, achieving 43 Distortion reduction compared to a fine-tuning baseline.

READ FULL TEXT

page 1

page 2

page 3

page 4

research
10/30/2021

Pseudo-Labeling for Massively Multilingual Speech Recognition

Semi-supervised learning through pseudo-labeling has become a staple of ...
research
11/21/2022

Towards continually learning new languages

Multilingual speech recognition with neural networks is often implemente...
research
05/12/2023

Prompt Learning to Mitigate Catastrophic Forgetting in Cross-lingual Transfer for Open-domain Dialogue Generation

Dialogue systems for non-English languages have long been under-explored...
research
05/25/2023

Multilingual Text-to-Speech Synthesis for Turkic Languages Using Transliteration

This work aims to build a multilingual text-to-speech (TTS) synthesis sy...
research
09/18/2023

Enhancing Multilingual Speech Recognition through Language Prompt Tuning and Frame-Level Language Adapter

Multilingual intelligent assistants, such as ChatGPT, have recently gain...
research
10/27/2022

Weight Averaging: A Simple Yet Effective Method to Overcome Catastrophic Forgetting in Automatic Speech Recognition

Adapting a trained Automatic Speech Recognition (ASR) model to new tasks...

Please sign up or login with your details

Forgot password? Click here to reset