Contextual Biasing of Named-Entities with Large Language Models

09/01/2023
by   Chuanneng Sun, et al.

This paper studies contextual biasing with Large Language Models (LLMs), where additional contextual information is provided to an LLM during second-pass rescoring to boost Automatic Speech Recognition (ASR) performance. We propose to leverage prompts for an LLM without fine-tuning during rescoring; the prompts incorporate a biasing list and few-shot examples that serve as additional information when calculating the score for a hypothesis. In addition to few-shot prompt learning, we propose multi-task training of the LLM to predict both the entity class and the next token. To improve the efficiency of contextual biasing and to avoid exceeding LLMs' maximum sequence lengths, we propose dynamic prompting, where we select the most likely class using the class tag prediction and use only the entities in this class as context for next-token prediction. Word Error Rate (WER) evaluation is performed on i) an internal calling, messaging, and dictation dataset, and ii) the SLUE-Voxpopuli dataset. Results indicate that biasing lists and few-shot examples can achieve 17.8% [...] multi-task training and dynamic prompting can achieve 20.0% [...] WER improvement, respectively.
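
The rescoring flow described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the prompt layout, the interpolation weight, and the word-overlap scorer standing in for an LLM log-probability are all assumptions made for illustration. It shows a prompt built from few-shot examples and a biasing list, dynamic prompting that keeps only the entities of the most likely class, and second-pass selection among first-pass hypotheses.

```python
from typing import Callable, Dict, List, Tuple


def build_prompt(entities: List[str], few_shot_examples: List[str]) -> str:
    """Join few-shot examples and a biasing list into a text prefix (hypothetical layout)."""
    lines = list(few_shot_examples)
    lines.append("Entities: " + ", ".join(entities))
    return "\n".join(lines) + "\n"


def select_entities(class_scores: Dict[str, float],
                    entities_by_class: Dict[str, List[str]]) -> List[str]:
    """Dynamic prompting: keep only the entities of the most likely class."""
    best_class = max(class_scores, key=class_scores.get)
    return entities_by_class.get(best_class, [])


def rescore(hypotheses: List[Tuple[str, float]],
            prompt: str,
            lm_score: Callable[[str, str], float],
            weight: float = 0.5) -> str:
    """Return the hypothesis maximizing first-pass score + weight * LM score."""
    best = max(hypotheses, key=lambda h: h[1] + weight * lm_score(prompt, h[0]))
    return best[0]


def toy_lm_score(prompt: str, text: str) -> float:
    """Stand-in for an LLM score: counts hypothesis words that occur in the prompt."""
    prompt_words = set(prompt.lower().replace(",", " ").split())
    return float(sum(word in prompt_words for word in text.lower().split()))


# Hypothetical biasing lists grouped by entity class, and class-tag scores.
entities_by_class = {"contact": ["Chuanneng", "Maria"], "city": ["Seattle"]}
class_scores = {"contact": 0.8, "city": 0.2}

# First-pass hypotheses paired with made-up first-pass scores.
hypotheses = [("call john neng", -1.0), ("call chuanneng", -1.2)]

prompt = build_prompt(select_entities(class_scores, entities_by_class), ["call mom"])
best_hypothesis = rescore(hypotheses, prompt, toy_lm_score)
```

In a real system, `toy_lm_score` would be replaced by the log-probability an LLM assigns to the hypothesis given the prompt. Restricting the prompt to one class is what keeps it short when the full biasing list is large.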

