Kid-Whisper: Towards Bridging the Performance Gap in Automatic Speech Recognition for Children VS. Adults

09/12/2023
by Ahmed Adel Attia, et al.

Recent advancements in Automatic Speech Recognition (ASR) systems, exemplified by Whisper, have demonstrated the potential of these systems to approach human-level performance given sufficient data. However, this progress does not readily extend to ASR for children, because suitable child-specific databases are scarce and children's speech has distinct characteristics. A recent study investigated leveraging the My Science Tutor (MyST) children's speech corpus to enhance Whisper's performance in recognizing children's speech, and demonstrated some improvement on a limited test set. This paper builds on these findings by enhancing the utility of the MyST dataset through more efficient data preprocessing. We reduce the Word Error Rate (WER) on the MyST test set from baselines of 13.93% and 13.23%, and show that this improvement generalizes to unseen datasets. We also highlight important challenges in improving children's ASR performance. The results showcase the viable and efficient integration of Whisper for effective children's speech recognition.
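
For readers unfamiliar with the metric, WER is conventionally computed as the word-level edit distance (substitutions, deletions, and insertions) between a reference transcript and the recognizer output, divided by the number of reference words. The following is a minimal sketch of that standard definition, not the paper's scoring pipeline: the function name `word_error_rate` is hypothetical, and it assumes whitespace tokenization with no text normalization (casing, punctuation, number formatting), which real evaluations usually apply first.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words matches an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,                       # deletion
                curr[j - 1] + 1,                   # insertion
                prev[j - 1] + substitution_cost,   # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One substituted word out of four reference words.
    print(word_error_rate("a b c d", "a x c d"))
```

Because insertions count as errors, WER computed this way can exceed 1.0 when the hypothesis is much longer than the reference.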


research
03/24/2022

Automatic Speech recognition for Speech Assessment of Preschool Children

The acoustic and linguistic features of preschool speech are investigate...
research
06/19/2022

Transfer Learning for Robust Low-Resource Children's Speech ASR with Transformers and Source-Filter Warping

Automatic Speech Recognition (ASR) systems are known to exhibit difficul...
research
11/08/2016

Automatic recognition of child speech for robotic applications in noisy environments

Automatic speech recognition (ASR) allows a natural and intuitive interf...
research
09/13/2023

Enhancing Child Vocalization Classification in Multi-Channel Child-Adult Conversations Through Wav2vec2 Children ASR Features

Autism Spectrum Disorder (ASD) is a neurodevelopmental disorder that oft...
research
06/07/2023

An ASR-Based Tutor for Learning to Read: How to Optimize Feedback to First Graders

The interest in employing automatic speech recognition (ASR) in applicat...
research
09/06/2019

Neural Network-Based Modeling of Phonetic Durations

A deep neural network (DNN)-based model has been developed to predict no...
research
11/13/2020

The SLT 2021 children speech recognition challenge: Open datasets, rules and baselines

Automatic speech recognition (ASR) has been significantly advanced with ...
