Replay to Remember: Continual Layer-Specific Fine-tuning for German Speech Recognition

07/14/2023
by Theresa Pekarek-Rosin, et al.

While Automatic Speech Recognition (ASR) models have advanced significantly with the introduction of unsupervised and self-supervised training techniques, these improvements are still limited to a subset of languages and speakers. Transfer learning enables the adaptation of large-scale multilingual models not only to low-resource languages but also to more specific speaker groups. However, fine-tuning on data from a new domain usually reduces performance on the original domain. In our experiments, we therefore examine how closely the performance of large-scale ASR models can be approximated for smaller domains, using our own dataset of German Senior Voice Commands (SVC-de), and how much of the general speech recognition performance can be preserved by selectively freezing parts of the model during training. To further increase the robustness of the ASR model to vocabulary and speakers outside the fine-tuned domain, we apply Experience Replay for continual learning. By adding only a fraction of data from the original domain, we reach Word Error Rates (WERs) below 5% on the new domain while stabilizing general speech recognition performance at acceptable WERs.
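
As a rough illustration of two ideas named in the abstract, the sketch below mixes a small share of original-domain samples into the new-domain training set (the core of Experience Replay) and computes a Word Error Rate from word-level edit distance. The function names, the `replay_fraction` parameter, its reading as a share of the original-domain pool, and the sample sentences are assumptions made for this example; the paper's actual sampling scheme and training setup may differ.

```python
import random


def mix_with_replay(new_domain, original_domain, replay_fraction=0.1, seed=0):
    """Return all new-domain samples plus a random subset of the original domain.

    Assumption: replay_fraction is the share of the original-domain pool
    that is replayed (rounded down). This is illustrative only.
    """
    if not 0.0 <= replay_fraction <= 1.0:
        raise ValueError("replay_fraction must be between 0 and 1")
    rng = random.Random(seed)
    pool = list(original_domain)
    n_replay = int(replay_fraction * len(pool))
    replayed = rng.sample(pool, n_replay)
    mixed = list(new_domain) + replayed
    rng.shuffle(mixed)  # interleave old and new samples
    return mixed


def word_error_rate(reference, hypothesis):
    """WER = (substitutions + deletions + insertions) / reference word count."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the processed reference
    # prefix and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # Hypothetical sample identifiers standing in for audio/transcript pairs.
    new_samples = [("new", i) for i in range(20)]
    general_samples = [("general", i) for i in range(100)]
    train_set = mix_with_replay(new_samples, general_samples, replay_fraction=0.1)
    print(len(train_set))
    print(word_error_rate("schalte das licht an", "schalte licht an"))
```

In this sketch the replayed subset is drawn once before training. Resampling it every epoch would be an equally plausible variant.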
