speechocean762: An Open-Source Non-native English Speech Corpus For Pronunciation Assessment

04/03/2021
by Junbo Zhang, et al.

This paper introduces "speechocean762", a new open-source speech corpus designed for pronunciation assessment. It consists of 5000 English utterances from 250 non-native speakers, half of whom are children. Five experts annotated each utterance at the sentence, word, and phoneme levels. A baseline system is released as open source to illustrate the phoneme-level pronunciation assessment workflow on this corpus. The corpus may be used freely for both commercial and non-commercial purposes. It is available for free download from OpenSLR, and the corresponding baseline system is published in the Kaldi speech recognition toolkit.

research
01/22/2020

TLT-school: a Corpus of Non Native Children Speech

This paper describes "TLT-school" a corpus of speech utterances collecte...
research
04/13/2021

Experiments of ASR-based mispronunciation detection for children and adult English learners

Pronunciation is one of the fundamentals of language learning, and it is...
research
09/22/2020

A Crowdsourced Open-Source Kazakh Speech Corpus and Initial Speech Recognition Baseline

We present an open-source speech corpus for the Kazakh language. The Kaz...
research
10/06/2017

The DIRHA-English corpus and related tasks for distant-speech recognition in domestic environments

This paper introduces the contents and the possible usage of the DIRHA-E...
research
06/07/2021

Weakly-supervised word-level pronunciation error detection in non-native English speech

We propose a weakly-supervised model for word-level mispronunciation det...
research
12/12/2021

Learning Nigerian accent embeddings from speech: preliminary results based on SautiDB-Naija corpus

This paper describes foundational efforts with SautiDB-Naija, a novel co...
research
04/30/2020

The role of context in neural pitch accent detection in English

Prosody is a rich information source in natural language, serving as a m...
