From English to More Languages: Parameter-Efficient Model Reprogramming for Cross-Lingual Speech Recognition

01/19/2023
by   Chao-Han Huck Yang, et al.

In this work, we propose a new parameter-efficient learning framework based on neural model reprogramming for cross-lingual speech recognition, which can re-purpose well-trained English automatic speech recognition (ASR) models to recognize other languages. We design different auxiliary neural architectures focusing on learnable pre-trained feature enhancement that, for the first time, empowers model reprogramming on ASR. Specifically, we investigate how to select trainable components (i.e., encoder) of a conformer-based RNN-Transducer used as a frozen pre-trained backbone. Experiments on a seven-language Multilingual LibriSpeech (MLS) task show that model reprogramming requires only 4.2% [...] of its original trainable parameters from a full ASR model to achieve competitive results in a range of 11.9% [...]. In addition, we discover different setups to make large-scale pre-trained ASR succeed in both monolingual and multilingual speech recognition. Our methods outperform existing ASR tuning architectures and their extension with self-supervised losses (e.g., w2v-bert) in terms of lower WER and better training efficiency.
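The core idea of reprogramming, keeping a pre-trained model frozen and training only a small transformation applied to its input, can be illustrated with a toy sketch. This is not the paper's architecture: a fixed 3x3 linear map stands in for the frozen conformer RNN-Transducer backbone, a learnable additive offset stands in for the auxiliary feature-enhancement modules, and all names (`frozen_backbone`, `reprogram`, `train_delta`, `TRUE_SHIFT`) are hypothetical.

```python
# Toy illustration only: a fixed linear map plays the role of the frozen
# pre-trained backbone; the only trainable parameters are the entries of
# `delta`, an additive offset applied to the input features.

def frozen_backbone(x, W):
    """Frozen model: a fixed linear map whose weights W are never updated."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def reprogram(x, delta):
    """Trainable input transformation: add a learnable offset to the features."""
    return [xi + di for xi, di in zip(x, delta)]

def train_delta(data, W, dim, lr=0.05, epochs=200):
    """Fit delta by SGD on 0.5 * ||W(x + delta) - y||^2, leaving W untouched."""
    delta = [0.0] * dim
    for _ in range(epochs):
        for x, y in data:
            pred = frozen_backbone(reprogram(x, delta), W)
            err = [p - t for p, t in zip(pred, y)]
            # Gradient with respect to delta is W^T err.
            grad = [sum(W[i][j] * err[i] for i in range(len(W)))
                    for j in range(dim)]
            delta = [d - lr * g for d, g in zip(delta, grad)]
    return delta

# Assumed toy setup: the "new task" differs from the original one by an
# unknown shift of the input features.
W = [[1.0, 0.5, 0.0],
     [0.0, 1.0, 0.5],
     [0.5, 0.0, 1.0]]
TRUE_SHIFT = [0.3, -0.2, 0.1]
inputs = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0], [1.0, 1.0, 1.0]]
data = [(x, frozen_backbone(reprogram(x, TRUE_SHIFT), W)) for x in inputs]

delta = train_delta(data, W, dim=3)
n_trainable = len(delta)                  # 3 offset parameters
n_frozen = sum(len(row) for row in W)     # 9 backbone weights
print([round(d, 4) for d in delta], n_trainable, n_frozen)
```

In this sketch the backbone weights never change; the task adaptation lives entirely in the three offset parameters, which mirrors (at a much smaller scale) the abstract's point that only a small fraction of the parameters needs training.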

research
05/09/2023

Exploration of Language Dependency for Japanese Self-Supervised Speech Representation Models

Self-supervised learning (SSL) has been dramatically successful not only...
research
11/04/2020

Cross-Lingual Machine Speech Chain for Javanese, Sundanese, Balinese, and Bataks Speech Recognition and Synthesis

Even though over seven hundred ethnic languages are spoken in Indonesia,...
research
06/10/2021

PARP: Prune, Adjust and Re-Prune for Self-Supervised Speech Recognition

Recent work on speech self-supervised learning (speech SSL) demonstrated...
research
07/21/2023

Prompting Large Language Models with Speech Recognition Abilities

Large language models have proven themselves highly flexible, able to so...
research
05/02/2020

A language score based output selection method for multilingual speech recognition

The quality of a multilingual speech recognition system can be improved ...
research
05/25/2023

INTapt: Information-Theoretic Adversarial Prompt Tuning for Enhanced Non-Native Speech Recognition

Automatic Speech Recognition (ASR) systems have attained unprecedented p...
research
09/02/2021

Coarse-To-Fine And Cross-Lingual ASR Transfer

End-to-end neural automatic speech recognition systems achieved recently...
