Zero-shot Domain-sensitive Speech Recognition with Prompt-conditioning Fine-tuning

07/18/2023
by Feng-Ting Liao, et al.

In this work, we propose a method to create domain-sensitive speech recognition models that utilize textual domain information by conditioning their generation on a given text prompt. This is accomplished by fine-tuning a pre-trained, end-to-end model (Whisper) to learn from demonstrations with prompt examples. We show that this ability generalizes to different domains and even to various prompt contexts, with our model gaining a Word Error Rate (WER) reduction of up to 33% in domains such as medical conversation, air traffic control communication, and financial meetings. Considering the limited availability of audio-transcript pair data, we further extend our method to text-only fine-tuning to achieve domain sensitivity as well as domain adaptation. We demonstrate that our text-only fine-tuned model can also attend to various prompt contexts, with the model reaching a WER reduction of up to 29%.
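
For readers unfamiliar with the metric, the sketch below shows one common way to compute WER (word-level edit distance divided by the number of reference words) and a relative WER reduction between a baseline and an adapted model. It is an illustration only: the function names are hypothetical, the whitespace tokenization is a simplification, and reading the reported reductions as relative is an assumption that the abstract does not state.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / reference word count."""
    ref = reference.split()  # simplification: plain whitespace tokenization
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,                      # deletion
                curr[j - 1] + 1,                  # insertion
                prev[j - 1] + substitution_cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


def relative_wer_reduction(baseline_wer: float, adapted_wer: float) -> float:
    """Fraction by which the adapted model lowers the baseline WER (assumed relative)."""
    if baseline_wer <= 0:
        raise ValueError("baseline WER must be positive")
    return (baseline_wer - adapted_wer) / baseline_wer


if __name__ == "__main__":
    # Made-up transcripts and scores for illustration, not data from the paper.
    print(word_error_rate("descend to flight level two four zero",
                          "descend to flight level two for zero"))
    print(relative_wer_reduction(baseline_wer=0.30, adapted_wer=0.20))
```

Under this reading, a model whose WER drops from 0.30 to 0.20 would show a relative reduction of about one third.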

