
A Light-weight contextual spelling correction model for customizing transducer-based speech recognition systems

by Xiaoqiang Wang, et al.

Customizing a transducer-based automatic speech recognition (ASR) system with context information is challenging because that information is dynamic and unavailable during model training. In this work, we introduce a light-weight contextual spelling correction model to correct context-related recognition errors in transducer-based ASR systems. We incorporate the context information into the spelling correction model with a shared context encoder and use a filtering algorithm to handle large context lists. Experiments show that the model improves on the baseline ASR model with about 50% relative word error rate reduction and significantly outperforms baseline methods such as contextual LM biasing. The model also shows excellent performance on out-of-vocabulary terms not seen during training.
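The abstract does not describe how the filtering algorithm works. Purely as an illustration of the general idea of pruning a large context list before correction, the sketch below ranks context phrases by how closely they match some span of the ASR hypothesis and keeps the best few. The function name `filter_context`, the `top_k` parameter, and the use of character-level similarity from Python's `difflib` are all assumptions made for this example, not the authors' method.

```python
from difflib import SequenceMatcher


def filter_context(hypothesis: str, context_list: list[str], top_k: int = 3) -> list[str]:
    """Keep the top_k context phrases most similar to some span of the hypothesis.

    Hypothetical helper: similarity is a plain character-level ratio, chosen
    only to keep the example self-contained.
    """
    words = hypothesis.lower().split()

    def best_score(phrase: str) -> float:
        n = len(phrase.split())
        target = phrase.lower()
        # Compare the phrase against every hypothesis window of the same word length.
        windows = [" ".join(words[i:i + n]) for i in range(max(1, len(words) - n + 1))]
        return max(SequenceMatcher(None, target, w).ratio() for w in windows)

    # Highest-scoring phrases first; ties keep their original list order.
    ranked = sorted(context_list, key=best_score, reverse=True)
    return ranked[:top_k]


if __name__ == "__main__":
    hyp = "call john smyth now"
    contexts = ["Alice Wong", "John Smith", "Zurich Office"]
    print(filter_context(hyp, contexts, top_k=1))
```

In this toy setup, only the retained phrases would be passed to the correction model, so the cost of the correction step depends on `top_k` rather than on the size of the full context list.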



