Learning Fast Adaptation on Cross-Accented Speech Recognition

03/04/2020
by Genta Indra Winata, et al.

Local dialects lead people to pronounce words of the same language differently. The great variability and complex characteristics of accents create a major challenge for training a robust, accent-agnostic automatic speech recognition (ASR) system. In this paper, we introduce a cross-accented English speech recognition task, built on the existing CommonVoice corpus, as a benchmark for measuring a model's ability to adapt to unseen accents. We also propose an accent-agnostic approach that extends the model-agnostic meta-learning (MAML) algorithm for fast adaptation to unseen accents. In terms of word error rate, our approach significantly outperforms joint training in the zero-shot, few-shot, and all-shot scenarios, in both the mixed-region and cross-region settings.
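
The abstract reports results as word error rate (WER). For readers unfamiliar with the metric, the following is a minimal sketch of the standard definition: the word-level edit distance (substitutions, deletions, insertions) between a reference transcript and a hypothesis, divided by the number of reference words. The function name `word_error_rate` and the whitespace tokenization are assumptions made for illustration; this is not the authors' evaluation code, and real evaluations typically normalize text (casing, punctuation) first.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Illustrative helper (hypothetical name); tokenizes on whitespace only.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the first i-1 reference
    # words and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                      # deletion
                curr[j - 1] + 1,                  # insertion
                prev[j - 1] + substitution_cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)
```

Because insertions count as errors, WER can exceed 1.0 when the hypothesis is much longer than the reference.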
