Distribution augmentation for low-resource expressive text-to-speech

02/13/2022
by Mateusz Łajszczak, et al.

This paper presents a novel data augmentation technique for text-to-speech (TTS) that generates new (text, audio) training examples without requiring any additional data. Our goal is to increase the diversity of text conditionings available during training, which helps to reduce overfitting, especially in low-resource settings. Our method relies on substituting text and audio fragments in a way that preserves syntactic correctness. We take additional measures to ensure that synthesized speech does not contain artifacts caused by combining inconsistent audio samples. Perceptual evaluations show that our method improves speech quality across a number of datasets, speakers, and TTS architectures. We also demonstrate that it greatly improves the robustness of attention-based TTS models.
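
The fragment substitution described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the `Fragment` structure, the syntactic tags ("NP", "VP"), and the `substitute` and `render` helpers are all hypothetical, and the sketch assumes that each utterance has already been split into text fragments aligned with their audio samples. It only shows the basic idea of swapping a fragment for another one with the same syntactic role, so that the new (text, audio) pair stays well-formed. It does not attempt the additional measures against audio inconsistency that the abstract mentions.

```python
import random
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Fragment:
    """Hypothetical unit: a text span, its syntactic tag, and its aligned audio."""
    text: str
    tag: str            # assumed syntactic label, e.g. "NP" or "VP"
    audio: List[float]  # waveform samples aligned to this text span


Utterance = List[Fragment]


def substitute(target: Utterance, donor: Utterance,
               rng: random.Random) -> Optional[Utterance]:
    """Replace one fragment of `target` with a same-tag fragment from `donor`.

    Returns None when the two utterances share no compatible fragments.
    """
    pairs = [
        (i, j)
        for i, t in enumerate(target)
        for j, d in enumerate(donor)
        if t.tag == d.tag and t.text != d.text
    ]
    if not pairs:
        return None
    i, j = rng.choice(pairs)
    # Text and audio travel together, so the new example stays aligned.
    return target[:i] + [donor[j]] + target[i + 1:]


def render(utterance: Utterance) -> Tuple[str, List[float]]:
    """Concatenate fragments into one (text, audio) training example."""
    text = " ".join(f.text for f in utterance)
    audio = [sample for f in utterance for sample in f.audio]
    return text, audio


# Toy data: two utterances with the same tag sequence (NP, VP).
a = [Fragment("the cat", "NP", [0.1, 0.2]), Fragment("sat down", "VP", [0.3])]
b = [Fragment("a dog", "NP", [0.5]), Fragment("ran away", "VP", [0.6, 0.7])]

new = substitute(a, b, random.Random(0))
if new is not None:
    print(render(new))
```

Because the replacement fragment carries the same tag as the one it replaces, the tag sequence of the augmented utterance is unchanged, which is a simple stand-in for the syntactic constraint described above.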

research
07/20/2022

When Is TTS Augmentation Through a Pivot Language Useful?

Developing Automatic Speech Recognition (ASR) for low-resource languages...
research
04/21/2022

Cross-Speaker Emotion Transfer for Low-Resource Text-to-Speech Using Non-Parallel Voice Conversion with Pitch-Shift Data Augmentation

Data augmentation via voice conversion (VC) has been successfully applie...
research
07/29/2022

Low-data? No problem: low-resource, language-agnostic conversational text-to-speech via F0-conditioned data augmentation

The availability of data in expressive styles across languages is limite...
research
05/20/2023

ComedicSpeech: Text To Speech For Stand-up Comedies in Low-Resource Scenarios

Text to Speech (TTS) models can generate natural and high-quality speech...
research
02/17/2022

Curriculum optimization for low-resource speech recognition

Modern end-to-end speech recognition models show astonishing results in ...
research
12/23/2020

Speech Synthesis as Augmentation for Low-Resource ASR

Speech synthesis might hold the key to low-resource speech recognition. ...
research
07/07/2019

Improving short text classification through global augmentation methods

We study the effect of different approaches to text augmentation. To do ...
