REDAT: Accent-Invariant Representation for End-to-End ASR by Domain Adversarial Training with Relabeling

by   Hu Hu, et al.

Accents mismatching is a critical problem for end-to-end ASR. This paper aims to address this problem by building an accent-robust RNN-T system with domain adversarial training (DAT). We unveil the magic behind DAT and provide, for the first time, a theoretical guarantee that DAT learns accent-invariant representations. We also prove that performing the gradient reversal in DAT is equivalent to minimizing the Jensen-Shannon divergence between domain output distributions. Motivated by the proof of equivalence, we introduce reDAT, a novel technique based on DAT, which relabels data using either unsupervised clustering or soft labels. Experiments on 23K hours of multi-accent data show that DAT achieves competitive results over accent-specific baselines on both native and non-native English accents but up to 13 unseen accents; our reDAT yields further improvements over DAT by 3 relatively on non-native accents of American and British English.



There are no comments yet.


page 1

page 2

page 3

page 4


AccentDB: A Database of Non-Native English Accents to Assist Neural Speech Recognition

Modern Automatic Speech Recognition (ASR) technology has evolved to iden...

ELITR Non-Native Speech Translation at IWSLT 2020

This paper is an ELITR system submission for the non-native speech trans...

Dual Script E2E framework for Multilingual and Code-Switching ASR

India is home to multiple languages, and training automatic speech recog...

End-to-End ASR for Code-switched Hindi-English Speech

End-to-end (E2E) models have been explored for large speech corpora and ...

Fine-tuning of Pre-trained End-to-end Speech Recognition with Generative Adversarial Networks

Adversarial training of end-to-end (E2E) ASR systems using generative ad...

The NTNU System at the Interspeech 2020 Non-Native Children's Speech ASR Challenge

This paper describes the NTNU ASR system participating in the Interspeec...

Beyond ℋ-Divergence: Domain Adaptation Theory With Jensen-Shannon Divergence

We reveal the incoherence between the widely-adopted empirical domain ad...
This week in AI

Get the week's most popular data science and artificial intelligence research sent straight to your inbox every Saturday.