Improving Accented Speech Recognition with Multi-Domain Training

03/14/2023
by Lucas Maison, et al.

Thanks to the rise of self-supervised learning, automatic speech recognition (ASR) systems now achieve near-human performance on a wide variety of datasets. However, they still lack generalization capability and are not robust to domain shifts such as accent variations. In this work, we use speech audio representing four different French accents to create fine-tuning datasets that improve the robustness of pre-trained ASR models. By incorporating various accents in the training set, we obtain both in-domain and out-of-domain improvements. Our numerical experiments show that we can reduce error rates by up to 25% (relative) on African and Belgian accents compared to single-domain training while maintaining good performance on standard French.
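
A relative reduction in error rate is measured against the baseline error rate, not as a difference in absolute points. The sketch below illustrates that arithmetic. The function name `relative_reduction` and the error rates used are hypothetical and chosen only for illustration; they are not results from the paper.

```python
def relative_reduction(baseline_wer: float, new_wer: float) -> float:
    """Return the error-rate reduction as a fraction of the baseline error rate."""
    if baseline_wer <= 0:
        raise ValueError("baseline_wer must be positive")
    return (baseline_wer - new_wer) / baseline_wer


# Hypothetical word error rates (in percent), NOT taken from the paper.
single_domain_wer = 20.0  # assumed: model fine-tuned on a single accent
multi_domain_wer = 15.0   # assumed: model fine-tuned on several accents

reduction = relative_reduction(single_domain_wer, multi_domain_wer)
print(f"Relative reduction: {reduction:.0%}")
```

With these made-up values, a drop of 5 absolute points from a baseline of 20 corresponds to a 25% relative reduction.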


research
07/24/2023

Adaptation of Whisper models to child speech recognition

Automatic Speech Recognition (ASR) systems often struggle with transcrib...
research
04/06/2022

Can Self-Supervised Learning solve the problem of child speech recognition?

Despite recent advancements in deep learning technologies, Child Speech ...
research
06/01/2023

Some voices are too common: Building fair speech recognition systems using the Common Voice dataset

Automatic speech recognition (ASR) systems become increasingly efficient...
research
04/06/2021

Comparing CTC and LFMMI for out-of-domain adaptation of wav2vec 2.0 acoustic model

In this work, we investigate if the wav2vec 2.0 self-supervised pretrain...
research
03/30/2022

Improving Distortion Robustness of Self-supervised Speech Processing Tasks with Domain Adaptation

Speech distortions are a long-standing problem that degrades the perform...
research
06/29/2022

The THUEE System Description for the IARPA OpenASR21 Challenge

This paper describes the THUEE team's speech recognition system for the ...
research
10/24/2022

ESB: A Benchmark For Multi-Domain End-to-End Speech Recognition

Speech recognition applications cover a range of different audio and tex...
