fairseq S^2: A Scalable and Integrable Speech Synthesis Toolkit

09/14/2021
by   Changhan Wang, et al.
0

This paper presents fairseq S^2, a fairseq extension for speech synthesis. We implement a number of autoregressive (AR) and non-AR text-to-speech models, and their multi-speaker variants. To enable training speech synthesis models with less curated data, a number of preprocessing tools are built and their importance is shown empirically. To facilitate faster iteration of development and analysis, a suite of automatic metrics is included. Apart from the features added specifically for this extension, fairseq S^2 also benefits from the scalability offered by fairseq and can be easily integrated with other state-of-the-art systems provided in this framework. The code, documentation, and pre-trained models are available at https://github.com/pytorch/fairseq/tree/master/examples/speech_synthesis.

READ FULL TEXT
research
09/14/2023

FunCodec: A Fundamental, Reproducible and Integrable Open-source Toolkit for Neural Speech Codec

This paper presents FunCodec, a fundamental neural speech codec toolkit,...
research
10/11/2020

fairseq S2T: Fast Speech-to-Text Modeling with fairseq

We introduce fairseq S2T, a fairseq extension for speech-to-text (S2T) m...
research
05/12/2020

Flowtron: an Autoregressive Flow-based Generative Network for Text-to-Speech Synthesis

generative network for text-to-speech synthesis with control over speec...
research
01/16/2020

SqueezeWave: Extremely Lightweight Vocoders for On-device Speech Synthesis

Automatic speech synthesis is a challenging task that is becoming increa...
research
06/16/2021

Multi-Speaker ASR Combining Non-Autoregressive Conformer CTC and Conditional Speaker Chain

Non-autoregressive (NAR) models have achieved a large inference computat...
research
10/06/2020

Neural Speech Synthesis for Estonian

This technical report describes the results of a collaboration between t...
research
10/15/2021

ESPnet2-TTS: Extending the Edge of TTS Research

This paper describes ESPnet2-TTS, an end-to-end text-to-speech (E2E-TTS)...

Please sign up or login with your details

Forgot password? Click here to reset