CUCHILD: A Large-Scale Cantonese Corpus of Child Speech for Phonology and Articulation Assessment

08/07/2020
by Si-Ioi Ng, et al.

This paper describes the design and development of CUCHILD, a large-scale Cantonese corpus of child speech. The corpus contains spoken words collected from 1,986 child speakers aged 3 to 6 years. The speech materials comprise 130 words of 1 to 4 syllables in length. The speakers include both typically developing (TD) children and children with speech disorders. The corpus is intended to support scientific and clinical research, as well as technology development related to child speech assessment. The design of the corpus, including word selection, participant recruitment, the data acquisition process, and data pre-processing, is described in detail. Results of acoustic analysis are presented to illustrate the properties of child speech. Potential applications of the corpus in automatic speech recognition, phonological error detection, and speaker diarization are also discussed.

