Continual Learning for On-Device Speech Recognition using Disentangled Conformers

12/02/2022
by Anuj Diwan, et al.

Automatic speech recognition (ASR) research focuses on training and evaluating on static datasets. Yet as speech models are increasingly deployed on personal devices, they encounter user-specific distributional shifts. To simulate this real-world scenario, we introduce LibriContinual, a continual learning benchmark for speaker-specific domain adaptation derived from LibriVox audiobooks, with data for 118 individual speakers and 6 train splits of different sizes per speaker. Additionally, current speech recognition models and continual learning algorithms are not optimized to be compute-efficient. We adapt NetAug, a general-purpose training algorithm, for ASR and create a novel Conformer variant called the DisConformer (Disentangled Conformer). This algorithm produces ASR models consisting of a frozen 'core' network for general-purpose use and several tunable 'augment' networks for speaker-specific tuning. Using such models, we propose a novel compute-efficient continual learning algorithm called DisentangledCL. Our experiments show that the DisConformer models significantly outperform baselines on general ASR, i.e. LibriSpeech (15.58%). On speaker-specific LibriContinual, they significantly outperform trainable-parameter-matched baselines (by 20.65%) and match fully finetuned baselines in some settings.
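The split between a frozen 'core' and tunable per-speaker 'augment' parameters can be illustrated with a toy sketch. This is not the DisConformer or DisentangledCL itself: the class name `ToyDisentangledModel`, the linear model, the additive way core and augment outputs are combined, and the speaker id `spk_a` are all assumptions made purely for illustration. It only shows the general idea that speaker-specific tuning touches the augment weights and leaves the shared core untouched.

```python
from typing import Dict, List, Optional, Sequence, Tuple


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    """Plain dot product of two equal-length vectors."""
    return sum(p * q for p, q in zip(a, b))


class ToyDisentangledModel:
    """Hypothetical toy model: frozen core weights plus per-speaker augment weights."""

    def __init__(self, core: Sequence[float]) -> None:
        # Stored as a tuple so the core cannot be modified in place (frozen).
        self.core: Tuple[float, ...] = tuple(core)
        # One tunable augment weight vector per speaker (assumed layout).
        self.augments: Dict[str, List[float]] = {}

    def predict(self, x: Sequence[float], speaker: Optional[str] = None) -> float:
        # General-purpose use: core only. Speaker-specific use: core + augment.
        y = dot(self.core, x)
        if speaker is not None:
            y += dot(self.augments[speaker], x)
        return y

    def finetune(self, speaker: str, data, lr: float = 0.05, epochs: int = 200) -> None:
        # Augment weights start at zero, so tuning begins from the core's output.
        aug = self.augments.setdefault(speaker, [0.0] * len(self.core))
        for _ in range(epochs):
            for x, target in data:
                err = self.predict(x, speaker) - target
                # Gradient step on 0.5 * err**2, applied to augment weights only.
                for i, xi in enumerate(x):
                    aug[i] -= lr * err * xi


model = ToyDisentangledModel(core=[1.0, -1.0])
speaker_data = [([1.0, 0.0], 1.5), ([0.0, 1.0], -0.5)]  # made-up toy targets
model.finetune("spk_a", speaker_data)
```

After tuning, `model.predict(x)` without a speaker still returns the core-only output, while `model.predict(x, "spk_a")` reflects the speaker-specific adjustment. Only the augment vector was trained, which is the source of the compute savings in this toy setting.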

