VRAIN-UPV MLLP's system for the Blizzard Challenge 2021

This paper presents the VRAIN-UPV MLLP speech synthesis system for the SH1 task of the Blizzard Challenge 2021. The SH1 task consisted of building a Spanish text-to-speech system trained on (but not limited to) the corpus released by the Blizzard Challenge 2021 organizers, which comprises 5 hours of studio-quality recordings from a native Spanish female speaker. In our case, this dataset alone was used to build a two-stage neural text-to-speech pipeline composed of a non-autoregressive acoustic model with explicit duration modeling and a HiFi-GAN neural vocoder. Our team is identified as J in the evaluation results. Our system performed well in the subjective evaluation tests: only one of the 11 other participating systems achieved better naturalness than ours. Concretely, our system achieved a naturalness MOS of 3.61, compared to 4.21 for real samples.
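
To illustrate what explicit duration modeling means in a non-autoregressive acoustic model, the following minimal sketch repeats each phoneme-level encoder state for a given number of frames, so that all output frames can then be decoded in parallel. The function name `length_regulate`, the toy state vectors, and the durations are assumptions made for illustration; they are not taken from the system described here.

```python
from typing import List, Sequence


def length_regulate(
    states: Sequence[Sequence[float]], durations: Sequence[int]
) -> List[List[float]]:
    """Repeat each phoneme-level state `durations[i]` times (hypothetical helper)."""
    if len(states) != len(durations):
        raise ValueError("need exactly one duration per phoneme state")
    frames: List[List[float]] = []
    for state, n_frames in zip(states, durations):
        if n_frames < 0:
            raise ValueError("durations must be non-negative")
        # Copy the state so that each frame is an independent list.
        frames.extend(list(state) for _ in range(n_frames))
    return frames


# Toy input: three phonemes with 2-dimensional states (illustrative values only).
phoneme_states = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]
# Durations in frames; in a real system these would come from a duration predictor.
durations = [2, 0, 3]
frames = length_regulate(phoneme_states, durations)
```

In this sketch the frame-level sequence has as many entries as the sum of the durations, and a phoneme with duration zero contributes no frames.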


