Low-data? No problem: low-resource, language-agnostic conversational text-to-speech via F0-conditioned data augmentation

07/29/2022
by Giulia Comini, et al.

The availability of data in expressive styles across languages is limited, and recording sessions are costly and time-consuming. To overcome these issues, we demonstrate how to build low-resource neural text-to-speech (TTS) voices with only 1 hour of conversational speech, when no other conversational data are available in the same language. Assuming the availability of non-expressive speech data in that language, we propose a 3-step technology: 1) we train an F0-conditioned voice conversion (VC) model as a data augmentation technique; 2) we train an F0 predictor to control the conversational flavour of the voice-converted synthetic data; 3) we train a TTS system that consumes the augmented data. We show that our technology enables F0 controllability, is scalable across speakers and languages, and is competitive in naturalness with a state-of-the-art baseline model, another augmentation-based method that does not make use of F0 information.
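The data flow of the three steps can be illustrated with a minimal Python sketch. It is not the authors' implementation: every name in it (`Utterance`, `augment`, `build_tts_training_set`, `toy_f0_predictor`, `toy_voice_converter`) is hypothetical, the two "models" are toy stand-ins rather than trained networks, and the assumption that the F0 predictor takes the source utterance as input is ours, since the abstract does not specify its inputs.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Utterance:
    text: str
    speaker: str
    style: str            # "neutral" or "conversational"
    f0: List[float]       # frame-level F0 in Hz; 0.0 marks an unvoiced frame
    synthetic: bool = False


def toy_f0_predictor(utt: Utterance) -> List[float]:
    # Stand-in for the trained F0 predictor (step 2). The 1.5 scaling is an
    # arbitrary placeholder, not a value taken from the paper.
    return [f * 1.5 if f > 0 else 0.0 for f in utt.f0]


def toy_voice_converter(utt: Utterance, f0: List[float], target: str) -> Utterance:
    # Stand-in for the F0-conditioned VC model (step 1): keep the text,
    # switch to the target speaker, and impose the supplied F0 contour.
    return Utterance(utt.text, target, "conversational", f0, synthetic=True)


def augment(
    neutral_corpus: List[Utterance],
    f0_predictor: Callable[[Utterance], List[float]],
    voice_converter: Callable[[Utterance, List[float], str], Utterance],
    target_speaker: str,
) -> List[Utterance]:
    # Convert every non-expressive utterance, conditioning on a predicted contour.
    return [
        voice_converter(utt, f0_predictor(utt), target_speaker)
        for utt in neutral_corpus
    ]


def build_tts_training_set(
    real: List[Utterance], augmented: List[Utterance]
) -> List[Utterance]:
    # Step 3: the TTS system is trained on real plus synthetic data.
    return list(real) + list(augmented)


# Toy corpora: non-expressive data from another speaker, plus the small
# conversational set recorded by the target speaker.
neutral = [Utterance("hello there", "spk_A", "neutral", [100.0, 0.0, 120.0])]
real = [Utterance("oh, really?", "spk_T", "conversational", [180.0, 210.0])]

augmented = augment(neutral, toy_f0_predictor, toy_voice_converter, "spk_T")
training_set = build_tts_training_set(real, augmented)
```

Passing the predictor and converter in as callables keeps the augmentation loop independent of any particular model, so trained components could replace the toy ones without changing the surrounding code.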


research
11/11/2020

Low-resource expressive text-to-speech using data augmentation

While recent neural text-to-speech (TTS) systems perform remarkably well...
research
04/21/2022

Cross-Speaker Emotion Transfer for Low-Resource Text-to-Speech Using Non-Parallel Voice Conversion with Pitch-Shift Data Augmentation

Data augmentation via voice conversion (VC) has been successfully applie...
research
04/06/2022

Using Synthetic Data for Conversational Response Generation in Low-resource Settings

Response generation is a task in natural language processing (NLP) where...
research
02/13/2022

Distribution augmentation for low-resource expressive text-to-speech

This paper presents a novel data augmentation technique for text-to-spee...
research
01/11/2023

Modelling low-resource accents without accent-specific TTS frontend

This work focuses on modelling a speaker's accent that does not have a d...
research
03/07/2022

Building and curating conversational corpora for diversity-aware language science and technology

We present a pipeline and tools to build a maximally natural data set of...
research
02/28/2023

The 2022 NIST Language Recognition Evaluation

In 2022, the U.S. National Institute of Standards and Technology (NIST) ...
