Multilingual Multiaccented Multispeaker TTS with RADTTS

01/24/2023
by Rohan Badlani, et al.

We aim to create a multilingual speech synthesis system that can generate speech with the proper accent while retaining the characteristics of an individual voice. This is challenging because bilingual training data in multiple languages is expensive to obtain, and the lack of such data produces strong correlations that entangle speaker, language, and accent, resulting in poor transfer capabilities. To overcome this, we present a multilingual, multiaccented, multispeaker speech synthesis model based on RADTTS with explicit control over accent, language, speaker, and fine-grained F_0 and energy features. Our proposed model does not rely on bilingual training data. We demonstrate an ability to control synthesized accent for any speaker in an open-source dataset comprising 7 accents. Human subjective evaluation demonstrates that our model retains a speaker's voice and accent quality better than controlled baselines while synthesizing fluent speech in all target languages and accents in our dataset.

