AISHELL-2: Transforming Mandarin ASR Research Into Industrial Scale

08/31/2018
by   Jiayu Du, et al.
0

AISHELL-1 is by far the largest open-source speech corpus available for Mandarin speech recognition research. It was released with a baseline system containing solid training and testing pipelines for Mandarin ASR. In AISHELL-2, 1000 hours of clean read-speech data from iOS is published, which is free for academic usage. On top of AISHELL-2 corpus, an improved recipe is developed and released, containing key components for industrial applications, such as Chinese word segmentation, flexible vocabulary expension and phone set transformation etc. Pipelines support various state-of-the-art techniques, such as time-delayed neural networks and Lattic-Free MMI objective funciton. In addition, we also release dev and test data from other channels(Android and Mic). For research community, we hope that AISHELL-2 corpus can be a solid resource for topics like transfer learning and robust ASR. For industry, we hope AISHELL-2 recipe can be a helpful reference for building meaningful industrial systems and products.

READ FULL TEXT
research
02/09/2021

BembaSpeech: A Speech Recognition Corpus for the Bemba Language

We present a preprocessed, ready-to-use automatic speech recognition cor...
research
09/16/2017

AISHELL-1: An Open-Source Mandarin Speech Corpus and A Speech Recognition Baseline

An open-source Mandarin speech corpus called AISHELL-1 is released. It i...
research
11/26/2019

ATCSpeech: a multilingual pilot-controller speech corpus from real Air Traffic Control environment

Automatic Speech Recognition (ASR) is greatly developed in recent years,...
research
12/23/2018

Pansori: ASR Corpus Generation from Open Online Video Contents

This paper introduces Pansori, a program used to create ASR (automatic s...
research
04/22/2021

Earnings-21: A Practical Benchmark for ASR in the Wild

Commonly used speech corpora inadequately challenge academic and commerc...
research
12/07/2015

THCHS-30 : A Free Chinese Speech Corpus

Speech data is crucially important for speech recognition research. Ther...
research
09/08/2022

Goodness of Pronunciation Pipelines for OOV Problem

In the following report we propose pipelines for Goodness of Pronunciati...

Please sign up or login with your details

Forgot password? Click here to reset