Joint Modeling of Accents and Acoustics for Multi-Accent Speech Recognition

02/07/2018
by Xuesong Yang, et al.

The performance of automatic speech recognition systems degrades with increasing mismatch between the training and testing scenarios. Differences in speaker accents are a significant source of such mismatch. The traditional approach to dealing with multiple accents involves pooling data from several accents during training and building a single model in multi-task fashion, where tasks correspond to individual accents. In this paper, we explore an alternate model where we jointly learn an accent classifier and a multi-task acoustic model. Experiments on the American English Wall Street Journal and British English Cambridge corpora demonstrate that our joint model outperforms the strong multi-task acoustic model baseline. We obtain a 5.94% relative improvement in word error rate on British English, and a 9.47% relative improvement on American English. This illustrates that jointly modeling with accent information improves acoustic model performance.
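The improvements quoted in the abstract are easiest to read as relative reductions in word error rate (WER) against the multi-task baseline. The sketch below shows how such a figure is commonly computed. It is an illustration only: the function name `relative_wer_improvement` and the example WER values are hypothetical and are not taken from the paper.

```python
def relative_wer_improvement(baseline_wer: float, new_wer: float) -> float:
    """Return the relative WER reduction of new_wer over baseline_wer, in percent."""
    if baseline_wer <= 0:
        raise ValueError("baseline_wer must be positive")
    return 100.0 * (baseline_wer - new_wer) / baseline_wer


# Hypothetical WER values for illustration; these are NOT results from the paper.
example = relative_wer_improvement(baseline_wer=10.0, new_wer=9.0)
print(f"{example:.2f}% relative improvement")
```

Under this definition, a drop from a 10.0% WER to a 9.0% WER counts as a 10% relative improvement, even though the absolute change is only one percentage point.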

research
02/23/2021

Senone-aware Adversarial Multi-task Training for Unsupervised Child to Adult Speech Adaptation

Acoustic modeling for child speech is challenging due to the high acoust...
research
02/02/2019

Using multi-task learning to improve the performance of acoustic-to-word and conventional hybrid models

Acoustic-to-word (A2W) models that allow direct mapping from acoustic si...
research
11/12/2018

Multi-encoder multi-resolution framework for end-to-end speech recognition

Attention-based methods and Connectionist Temporal Classification (CTC) ...
research
02/01/2018

Phonetic and Graphemic Systems for Multi-Genre Broadcast Transcription

State-of-the-art English automatic speech recognition systems typically ...
research
05/07/2020

The Perceptimatic English Benchmark for Speech Perception Models

We present the Perceptimatic English Benchmark, an open experimental ben...
research
06/30/2020

Multi-view Frequency LSTM: An Efficient Frontend for Automatic Speech Recognition

Acoustic models in real-time speech recognition systems typically stack ...
research
11/05/2020

Multi-Accent Adaptation based on Gate Mechanism

When only a limited amount of accented speech data is available, to prom...
