Residual Adapters for Parameter-Efficient ASR Adaptation to Atypical and Accented Speech

09/14/2021
by Katrin Tomanek, et al.

Automatic Speech Recognition (ASR) systems are often optimized to work best for speakers with canonical speech patterns. Unfortunately, these systems perform poorly when tested on atypical speech and heavily accented speech. It has previously been shown that personalization through model fine-tuning substantially improves performance. However, maintaining such large models per speaker is costly and difficult to scale. We show that by adding a relatively small number of extra parameters to the encoder layers via so-called residual adapters, we can achieve adaptation gains similar to those of model fine-tuning, while only updating a tiny fraction (less than 0.5%) of the model parameters. We demonstrate this on two speech adaptation tasks (atypical and accented speech) and for two state-of-the-art ASR architectures.
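
To make the idea of a residual adapter concrete, here is a minimal sketch in plain Python. The abstract does not spell out the adapter's internals, so everything below is an assumption based on the commonly used bottleneck design: normalize the layer output, project down to a small dimension, apply a ReLU, project back up, and add the result to the input. The names `ResidualAdapter`, `d_model`, and `bottleneck`, the zero initialization of the up-projection, and the sizes in the usage lines are all hypothetical and not taken from the paper.

```python
import random


def layer_norm(x, eps=1e-6):
    # Simplified layer norm without a learnable scale or bias (assumption).
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    return [(v - mean) / (var + eps) ** 0.5 for v in x]


def matvec(w, x):
    # Multiply a matrix (list of rows) by a vector.
    return [sum(wi * xi for wi, xi in zip(row, x)) for row in w]


class ResidualAdapter:
    """Hypothetical bottleneck adapter applied to one encoder layer output."""

    def __init__(self, d_model, bottleneck, seed=0):
        rng = random.Random(seed)
        # Down-projection: d_model -> bottleneck, small random init.
        self.w_down = [[rng.gauss(0.0, 0.02) for _ in range(d_model)]
                       for _ in range(bottleneck)]
        self.b_down = [0.0] * bottleneck
        # Up-projection: bottleneck -> d_model, zero init (assumption), so the
        # adapter leaves the frozen base model unchanged before training.
        self.w_up = [[0.0] * bottleneck for _ in range(d_model)]
        self.b_up = [0.0] * d_model

    def __call__(self, x):
        h = layer_norm(x)
        h = [max(0.0, v + b) for v, b in zip(matvec(self.w_down, h), self.b_down)]
        h = [v + b for v, b in zip(matvec(self.w_up, h), self.b_up)]
        # Residual connection: the adapter only adds a correction to x.
        return [xi + hi for xi, hi in zip(x, h)]

    def num_params(self):
        d, k = len(self.w_up), len(self.w_down)
        return 2 * d * k + d + k  # two weight matrices plus two bias vectors


# Hypothetical sizes for illustration only, not the paper's configuration.
adapter = ResidualAdapter(d_model=512, bottleneck=16)
frame = [0.1 * i for i in range(512)]
adapted = adapter(frame)
num_layers, base_params = 12, 100_000_000
fraction = num_layers * adapter.num_params() / base_params
print(f"adapter parameters as a fraction of the base model: {fraction:.4%}")
```

Under these assumed sizes, only the adapter weights would be trained and stored per speaker while the base model stays frozen and shared, which is the source of the parameter savings the abstract describes.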


