RescoreBERT: Discriminative Speech Recognition Rescoring with BERT

02/02/2022
by Liyan Xu, et al.

Second-pass rescoring is an important component of automatic speech recognition (ASR) systems: it improves the output of a first-pass decoder through lattice rescoring or n-best re-ranking. While pretraining with a masked language model (MLM) objective has achieved great success in various natural language understanding (NLU) tasks, it has not gained traction as a rescoring model for ASR. Specifically, training a bidirectional model like BERT on a discriminative objective such as minimum WER (MWER) has not been explored. Here we show how to train a BERT-based rescoring model with MWER loss, to incorporate the improvements of a discriminative loss into fine-tuning of deep bidirectional pretrained models for ASR. Specifically, we propose a fusion strategy that incorporates the MLM into the discriminative training process to effectively distill knowledge from a pretrained model. We further propose an alternative discriminative loss. This approach, which we call RescoreBERT, reduces WER by 6.6%/3.4% relative on the LibriSpeech clean/other test sets over a BERT baseline without a discriminative objective. We also evaluate our method on an internal dataset from a conversational agent and find that it reduces both latency and WER (by 3 to 8% relative) over an LSTM rescoring model.
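To make the MWER objective mentioned in the abstract concrete, here is a minimal sketch of the commonly used n-best formulation: the expected number of word errors over the hypothesis list, measured relative to the list's average. It is not the authors' implementation. The function names, the toy hypotheses, and the sign convention (a higher score means a better hypothesis) are assumptions made for illustration; a real system would compute the scores with a BERT-based rescorer and backpropagate through them.

```python
import math


def word_errors(ref: str, hyp: str) -> int:
    """Word-level Levenshtein distance between a reference and a hypothesis."""
    r, h = ref.split(), hyp.split()
    prev = list(range(len(h) + 1))  # distances from an empty reference prefix
    for i, rw in enumerate(r, start=1):
        cur = [i]
        for j, hw in enumerate(h, start=1):
            cur.append(min(
                prev[j] + 1,               # deletion
                cur[j - 1] + 1,            # insertion
                prev[j - 1] + (rw != hw),  # substitution (or match, cost 0)
            ))
        prev = cur
    return prev[-1]


def mwer_loss(scores: list[float], errors: list[int]) -> float:
    """Expected word errors over an n-best list, relative to the list average.

    Assumption: higher score = better hypothesis, so the posterior over
    hypotheses is softmax(scores).
    """
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    probs = [e / z for e in exps]
    mean_err = sum(errors) / len(errors)
    return sum(p * (e - mean_err) for p, e in zip(probs, errors))


# Toy n-best list (hypothetical data).
reference = "turn on the light"
nbest = ["turn on the light", "turn on the night", "turn on light"]
errors = [word_errors(reference, hyp) for hyp in nbest]

loss_good = mwer_loss([5.0, 0.0, 0.0], errors)  # correct hypothesis ranked first
loss_bad = mwer_loss([0.0, 5.0, 5.0], errors)   # erroneous hypotheses ranked first
```

The loss falls as probability mass moves toward hypotheses with fewer word errors than the list average, which is what makes the objective discriminative: it trains the rescorer to rank hypotheses rather than to model text likelihood.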

