A Crowdsourced Open-Source Kazakh Speech Corpus and Initial Speech Recognition Baseline

09/22/2020
by   Yerbolat Khassanov, et al.

We present an open-source speech corpus for the Kazakh language. The Kazakh speech corpus (KSC) contains around 335 hours of transcribed audio comprising over 154,000 utterances spoken by participants from different regions, age groups, and genders. It was carefully inspected by native Kazakh speakers to ensure high quality. The KSC is the largest publicly available database developed to advance various Kazakh speech and language processing applications. In this paper, we first describe the data collection and preprocessing procedures, followed by the database specifications. We also share our experience and the challenges faced during database construction. To demonstrate the reliability of the database, we performed preliminary speech recognition experiments. The experimental results imply that the quality of the audio and transcripts is promising. To enable experiment reproducibility and ease corpus usage, we also released an ESPnet recipe.


