REDAT: Accent-Invariant Representation for End-to-End ASR by Domain Adversarial Training with Relabeling

12/14/2020
by Hu Hu, et al.

Accent mismatch is a critical problem for end-to-end ASR. This paper aims to address this problem by building an accent-robust RNN-T system with domain adversarial training (DAT). We unveil the magic behind DAT and provide, for the first time, a theoretical guarantee that DAT learns accent-invariant representations. We also prove that performing the gradient reversal in DAT is equivalent to minimizing the Jensen-Shannon divergence between domain output distributions. Motivated by the proof of equivalence, we introduce reDAT, a novel technique based on DAT, which relabels data using either unsupervised clustering or soft labels. Experiments on 23K hours of multi-accent data show that DAT achieves competitive results over accent-specific baselines on both native and non-native English accents, and up to 13% relative WER reduction on unseen accents; our reDAT yields a further 3% relative improvement over DAT on non-native accents of American and British English.
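
As a rough illustration of the quantity the abstract refers to, the sketch below computes the Jensen-Shannon divergence between two discrete distributions. It is not the paper's implementation: the function names and the example distributions `accent_a` and `accent_b` are hypothetical, chosen only to show that the divergence is zero when two domains produce identical output distributions and reaches its maximum of log 2 (in nats) when they do not overlap at all.

```python
import math

def kl_divergence(p, q):
    """KL(p || q) in nats for discrete distributions given as lists."""
    # Terms with p_i == 0 contribute nothing, by the usual convention.
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def js_divergence(p, q):
    """Jensen-Shannon divergence: mean KL of p and q to their mixture m."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)

# Hypothetical output distributions for two accents (assumed values).
accent_a = [0.7, 0.2, 0.1]
accent_b = [0.1, 0.2, 0.7]

print(js_divergence(accent_a, accent_b))  # positive: the accents are distinguishable
print(js_divergence(accent_a, accent_a))  # 0.0: the distributions are identical
```

Under this reading, an accent-invariant representation is one for which this divergence between accent domains is driven toward zero.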
