Data Techniques For Online End-to-end Speech Recognition

01/24/2020
by Yang Chen, et al.

Practitioners often need to build ASR systems for new use cases quickly and with limited in-domain data. Although recently developed end-to-end methods greatly simplify the modeling pipeline, they still suffer from data sparsity. In this work, we explore a few simple-to-implement techniques for building online ASR systems in an end-to-end fashion with only a small amount of transcribed data in the target domain. These techniques include data augmentation in the target domain, domain adaptation using models previously trained on a large source domain, and knowledge distillation on non-transcribed target-domain data; they are applicable in real scenarios with different types of resources. Our experiments demonstrate that each technique is independently useful in the low-resource setting, and that combining them yields a significant improvement in online ASR performance in the target domain.
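To make the knowledge-distillation idea concrete, here is a minimal sketch of one common formulation: a student model is trained to match a teacher's per-frame output distribution on audio that has no transcript. The abstract does not state which objective the authors use, so the soft-target cross-entropy, the `temperature` parameter, and the function names `log_softmax`, `distillation_loss`, and `utterance_loss` are all assumptions made for illustration.

```python
import math


def log_softmax(logits, temperature=1.0):
    """Return log-probabilities for one frame of raw scores."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    log_total = math.log(sum(math.exp(x - peak) for x in scaled))
    return [x - peak - log_total for x in scaled]


def distillation_loss(teacher_logits, student_logits, temperature=1.0):
    """Cross-entropy of the student against the teacher's soft targets.

    Hypothetical helper: no transcript is needed, because the teacher's
    output distribution plays the role of the label.
    """
    teacher_probs = [math.exp(lp) for lp in log_softmax(teacher_logits, temperature)]
    student_log_probs = log_softmax(student_logits, temperature)
    return -sum(t * s for t, s in zip(teacher_probs, student_log_probs))


def utterance_loss(teacher_frames, student_frames, temperature=1.0):
    """Average the frame-level loss over one non-transcribed utterance."""
    losses = [
        distillation_loss(t, s, temperature)
        for t, s in zip(teacher_frames, student_frames)
    ]
    return sum(losses) / len(losses)


if __name__ == "__main__":
    # Toy scores over a 3-symbol vocabulary for a 2-frame utterance.
    teacher = [[2.0, 0.0, -1.0], [0.5, 1.5, 0.0]]
    student = [[1.0, 0.5, -0.5], [0.0, 1.0, 0.2]]
    print(utterance_loss(teacher, student))
```

For a fixed teacher, this loss is smallest when the student reproduces the teacher's distribution exactly, which is what lets non-transcribed target-domain audio serve as a training signal.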
