Make More of Your Data: Minimal Effort Data Augmentation for Automatic Speech Recognition and Translation

10/27/2022
by Tsz Kin Lam, et al.

Data augmentation is a technique to generate new training data based on existing data. We evaluate the simple and cost-effective method of concatenating the original data examples to build new training instances. Continued training with such augmented data improves off-the-shelf Transformer and Conformer models that were optimized on the original data only. We demonstrate considerable improvements on the LibriSpeech-960h test sets (WER 2.83 and 6.87 for test-clean and test-other), which carry over to models combined with shallow fusion (WER 2.55 and 6.27). Our method of continued training also leads to improvements of up to 0.9 WER on the ASR part of CoVoST-2 for four non-English languages, and we observe that the gains are highly dependent on the size of the original training data. We compare different concatenation strategies and find that our method does not need speaker information to achieve its improvements. Finally, we demonstrate on two datasets that our method also works for speech translation tasks.
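
The concatenation idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say how examples are paired, so uniform random pairing, the function name `concat_augment`, and the representation of audio as a plain list of floats are all assumptions made for illustration.

```python
import random
from typing import List, Sequence, Tuple

# Hypothetical stand-in for a training example: (audio samples, transcript).
Example = Tuple[List[float], str]


def concat_augment(
    data: Sequence[Example],
    num_new: int,
    seed: int = 0,
) -> List[Example]:
    """Build `num_new` extra examples by joining two randomly drawn originals.

    Assumption: pairs are drawn uniformly at random, without using
    speaker information. `data` must hold at least two examples.
    """
    rng = random.Random(seed)
    augmented: List[Example] = []
    for _ in range(num_new):
        # Draw two distinct originals and join audio and text in the same order.
        (audio_a, text_a), (audio_b, text_b) = rng.sample(list(data), 2)
        augmented.append((audio_a + audio_b, f"{text_a} {text_b}"))
    return augmented


original = [
    ([0.1, 0.2], "hello world"),
    ([0.3], "good morning"),
    ([0.4, 0.5, 0.6], "thank you"),
]
# Continued training would then see the originals plus the concatenations.
train_set = original + concat_augment(original, num_new=2)
```

For speech translation, the same sketch would apply with the transcript replaced by the target-language text, again as an assumption rather than a description of the paper's setup.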

research
02/27/2020

SkinAugment: Auto-Encoding Speaker Conversions for Automatic Speech Translation

We propose autoencoding speaker conversion for training data augmentatio...
research
04/03/2021

On-the-Fly Aligned Data Augmentation for Sequence-to-Sequence ASR

We propose an on-the-fly data augmentation method for automatic speech r...
research
09/14/2019

Harnessing Indirect Training Data for End-to-End Automatic Speech Translation: Tricks of the Trade

For automatic speech translation (AST), end-to-end approaches are outper...
research
05/13/2022

Personalized Adversarial Data Augmentation for Dysarthric and Elderly Speech Recognition

Despite the rapid progress of automatic speech recognition (ASR) technol...
research
07/09/2021

Noisy Training Improves E2E ASR for the Edge

Automatic speech recognition (ASR) has become increasingly ubiquitous on...
research
12/11/2019

SpecAugment on Large Scale Datasets

Recently, SpecAugment, an augmentation scheme for automatic speech recog...
research
12/20/2022

An Augmentation Strategy for Visually Rich Documents

Many business workflows require extracting important fields from form-li...
