The CUHK-TUDELFT System for The SLT 2021 Children Speech Recognition Challenge

11/12/2020
by Si-Ioi Ng, et al.

This technical report describes our submission to Track 1 of the SLT 2021 Children Speech Recognition Challenge (CSRC). Our approach combines a joint CTC-attention end-to-end (E2E) speech recognition framework, transfer learning, data augmentation, and several language models. We describe the data pre-processing procedures, the background, and the course of system development. The experimental results are analysed, and the E2E system is compared in detail with a DNN-HMM hybrid system. Our system achieved a character error rate (CER) of 20.1% on our designated test set and 23.6% on the official evaluation set, which placed it 10th overall.
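The results above are reported as character error rate. As a minimal sketch of how such a figure is typically computed, the snippet below divides the character-level Levenshtein distance between a reference and a hypothesis by the reference length. The function names `edit_distance` and `cer` are hypothetical, the whitespace stripping is an assumption, and this is not the challenge's official scoring script.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (substitutions, insertions, deletions)."""
    prev = list(range(len(hyp) + 1))  # distance from empty reference prefix
    for i, r in enumerate(ref, start=1):
        curr = [i]  # distance from reference prefix to empty hypothesis
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def cer(reference, hypothesis):
    """Character error rate in percent; spaces are ignored (an assumption of this sketch)."""
    ref_chars = list(reference.replace(" ", ""))
    hyp_chars = list(hypothesis.replace(" ", ""))
    if not ref_chars:
        raise ValueError("reference must contain at least one character")
    return 100.0 * edit_distance(ref_chars, hyp_chars) / len(ref_chars)


if __name__ == "__main__":
    # One substituted character out of five reference characters.
    print(cer("abcde", "abxde"))
```

In practice the error counts are usually summed over all utterances in a test set before dividing by the total number of reference characters, rather than averaging per-utterance rates.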