Prompting Large Language Models for Zero-Shot Domain Adaptation in Speech Recognition

06/28/2023
by Yuang Li, et al.

Integrating language models (LMs) has proven to be an effective way to address domain shift in speech recognition. However, these approaches usually require a significant amount of target-domain text data to train the LMs. In contrast, we propose two zero-shot ASR domain adaptation methods that need only a domain-specific text prompt and use LLaMA, a 7-billion-parameter large language model (LLM). The LLM is used in two ways: 1) second-pass rescoring, which reranks the N-best hypotheses of a given ASR system with LLaMA; and 2) deep LLM-fusion, which incorporates the LLM into the decoder of an encoder-decoder based ASR system. Experiments show that, with only one domain prompt, both methods effectively reduce word error rate (WER) on the out-of-domain TedLium-2 and SPGISpeech datasets. In particular, deep LLM-fusion has the advantage of better recall of entities and out-of-vocabulary words.
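As a rough illustration of the second-pass rescoring idea, the sketch below reranks an N-best list by adding a weighted LM log-probability to each ASR score. It is not the authors' implementation: the abstract does not state how the scores are combined, so the log-linear interpolation, the `lm_weight` parameter, and the names `make_toy_lm` and `rescore_nbest` are all assumptions. A smoothed unigram model built from the domain prompt stands in for LLaMA so that the example runs without any model weights.

```python
import math
from collections import Counter
from typing import Callable, List, Tuple

Hypothesis = Tuple[str, float]  # (text, ASR log-score)


def make_toy_lm(domain_prompt: str, vocab_size: int = 10_000) -> Callable[[str], float]:
    """Hypothetical stand-in for an LLM conditioned on a domain prompt.

    Returns a function that scores text with an add-one-smoothed unigram
    log-probability whose counts come from the prompt itself.
    """
    counts = Counter(domain_prompt.lower().split())
    total = sum(counts.values())

    def log_prob(text: str) -> float:
        return sum(
            math.log((counts[word] + 1) / (total + vocab_size))
            for word in text.lower().split()
        )

    return log_prob


def rescore_nbest(
    nbest: List[Hypothesis],
    lm_log_prob: Callable[[str], float],
    lm_weight: float = 0.5,
) -> List[Hypothesis]:
    """Rerank hypotheses by ASR score plus weighted LM score (assumed combination)."""
    rescored = [
        (text, asr_score + lm_weight * lm_log_prob(text))
        for text, asr_score in nbest
    ]
    # Highest combined log-score first.
    return sorted(rescored, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    lm = make_toy_lm("quarterly earnings call revenue guidance fiscal year")
    nbest = [
        ("the fiscal ear guidance", -4.0),   # preferred by the ASR system alone
        ("the fiscal year guidance", -4.2),  # contains an in-domain word
    ]
    for text, score in rescore_nbest(nbest, lm, lm_weight=0.5):
        print(f"{score:.3f}  {text}")
```

In a real setup, `lm_log_prob` would instead return the log-likelihood of the hypothesis under the LLM given the domain prompt as context, and `lm_weight` would be tuned on held-out data.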


