KazakhTTS: An Open-Source Kazakh Text-to-Speech Synthesis Dataset

04/17/2021
by Saida Mussakhojayeva, et al.

This paper introduces a high-quality open-source speech synthesis dataset for Kazakh, a low-resource language spoken by over 13 million people worldwide. The dataset consists of about 93 hours of transcribed audio recorded by two professional speakers (one female, one male). It is the first publicly available large-scale dataset developed to promote Kazakh text-to-speech (TTS) applications in both academia and industry. We share our experience by describing the dataset development procedures and the challenges we faced, and we discuss important future directions. To demonstrate the reliability of the dataset, we built baseline end-to-end TTS models and evaluated them using the subjective mean opinion score (MOS) measure. The evaluation results show that the best TTS models trained on our dataset achieve a MOS above 4 for both speakers, which makes them suitable for practical use. The dataset, training recipe, and pretrained TTS models are freely available.
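As a rough illustration of the MOS measure mentioned above, the sketch below averages listener ratings on the usual 1 to 5 scale and attaches a normal-approximation 95% confidence interval. The function name `mean_opinion_score` and the sample ratings are hypothetical and do not come from the paper; the paper's own evaluation protocol may differ.

```python
import math
import statistics


def mean_opinion_score(ratings):
    """Return (MOS, 95% CI half-width) for a list of 1-5 listener ratings.

    Assumption: the interval uses a normal approximation
    (1.96 * standard error), not necessarily what the paper used.
    """
    if len(ratings) < 2:
        raise ValueError("need at least two ratings")
    if any(r < 1 or r > 5 for r in ratings):
        raise ValueError("ratings must lie between 1 and 5")
    mos = statistics.mean(ratings)
    # Sample standard deviation divided by sqrt(n) gives the standard error.
    half_width = 1.96 * statistics.stdev(ratings) / math.sqrt(len(ratings))
    return mos, half_width


# Hypothetical ratings for one synthesized voice (not data from the paper).
ratings = [5, 4, 4, 5, 3, 4, 5, 4]
mos, ci = mean_opinion_score(ratings)
print(f"MOS = {mos:.2f} ± {ci:.2f}")
```

With real listening tests, ratings would typically be pooled per system across many listeners and utterances before applying a function like this.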


