Efficient neural speech synthesis for low-resource languages through multilingual modeling

09/02/2020
by Esther Klabbers, et al.

Recent advances in neural TTS have led to models that can produce high-quality synthetic speech. However, these models typically require large amounts of training data, which can make it costly to produce a new voice with the desired quality. Although multi-speaker modeling can reduce the data requirements necessary for a new voice, this approach is usually not viable for many low-resource languages for which abundant multi-speaker data is not available. In this paper, we therefore investigated to what extent multilingual multi-speaker modeling can be an alternative to monolingual multi-speaker modeling, and explored how data from foreign languages may best be combined with low-resource language data. We found that multilingual modeling can increase the naturalness of low-resource language speech, showed that multilingual models can produce speech with a naturalness comparable to monolingual multi-speaker models, and saw that the target language naturalness was affected by the strategy used to add foreign language data.
