THCHS-30: A Free Chinese Speech Corpus

12/07/2015
by Dong Wang, et al.

Speech data is crucially important for speech recognition research. Quite a few speech databases can be purchased at prices that are reasonable for most research institutes. However, for young researchers who are just starting out, or for those who have only recently become interested in this direction, the cost of data is still an annoying barrier. We support the "free data" movement in speech recognition: research institutes (particularly those supported by public funds) publish their data freely so that new researchers can obtain sufficient data to kick off their careers. In this paper, we follow this trend and release a free Chinese speech database, THCHS-30, that can be used to build a full-fledged Chinese speech recognition system. We report the baseline system established with this database, including its performance under highly noisy conditions.

research
06/17/2019

Speech Recognition With No Speech Or With Noisy Speech Beyond English

In this paper we demonstrate continuous noisy speech recognition using c...
research
09/16/2017

AISHELL-1: An Open-Source Mandarin Speech Corpus and A Speech Recognition Baseline

An open-source Mandarin speech corpus called AISHELL-1 is released. It i...
research
12/30/2017

Multichannel Robot Speech Recognition Database: MChRSR

In real human robot interaction (HRI) scenarios, speech recognition repr...
research
08/09/2019

Challenging the Boundaries of Speech Recognition: The MALACH Corpus

There has been huge progress in speech recognition over the last several...
research
03/24/2022

Does human speech follow Benford's Law?

Researchers have observed that the frequencies of leading digits in many...
research
08/13/2019

IMS-Speech: A Speech to Text Tool

We present the IMS-Speech, a web based tool for German and English speec...
research
08/31/2018

AISHELL-2: Transforming Mandarin ASR Research Into Industrial Scale

AISHELL-1 is by far the largest open-source speech corpus available for ...