Instant One-Shot Word-Learning for Context-Specific Neural Sequence-to-Sequence Speech Recognition

07/05/2021
by Christian Huber, et al.

Neural sequence-to-sequence systems deliver state-of-the-art performance for automatic speech recognition (ASR). When using appropriate modeling units, e.g., byte-pair encoded characters, these systems are in principle open-vocabulary systems. In practice, however, they often fail to recognize words not seen during training, e.g., named entities, numbers, or technical terms. To alleviate this problem, we supplement an end-to-end ASR system with a word/phrase memory and a mechanism to access this memory so that these words and phrases are recognized correctly. After the ASR system has been trained, and even once it has been deployed, a relevant word can be added or removed instantly without any further training. In this paper we demonstrate that through this mechanism our system is able to recognize more than 85% of newly added words that it previously failed to recognize, compared to a strong baseline.
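
The abstract does not spell out how the memory is built or accessed, so the following is only a toy sketch of the general idea: a phrase store that can be edited at inference time and queried with attention, with no retraining step. Everything in it is an assumption made for illustration, including the names `PhraseMemory` and `embed`, the hashed character-bigram embedding, and the softmax temperature; none of it is taken from the paper.

```python
import math


def embed(text, dim=16):
    """Toy stand-in for a learned encoder (assumption): hashed
    character-bigram counts, L2-normalised."""
    vec = [0.0] * dim
    padded = f" {text.lower()} "
    for a, b in zip(padded, padded[1:]):
        vec[(ord(a) * 31 + ord(b)) % dim] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


class PhraseMemory:
    """Hypothetical editable word/phrase memory, not the authors' model."""

    def __init__(self):
        self._entries = {}  # phrase -> key vector

    def add(self, phrase):
        # Adding a phrase only stores a key; no model weights change.
        self._entries[phrase] = embed(phrase)

    def remove(self, phrase):
        # Removing a phrase is equally immediate.
        self._entries.pop(phrase, None)

    def attend(self, query, temperature=0.1):
        """Return softmax attention weights of `query` over stored phrases."""
        if not self._entries:
            return {}
        scores = {
            phrase: sum(q * k for q, k in zip(query, key)) / temperature
            for phrase, key in self._entries.items()
        }
        top = max(scores.values())  # subtract the max for numerical stability
        exps = {phrase: math.exp(s - top) for phrase, s in scores.items()}
        total = sum(exps.values())
        return {phrase: e / total for phrase, e in exps.items()}


memory = PhraseMemory()
memory.add("Kannada")
memory.add("byte-pair encoding")
weights = memory.attend(embed("Kannada"))  # distribution over both phrases
memory.remove("Kannada")                   # takes effect on the next query
```

In a real system the query would presumably come from the decoder state and the keys from a trained phrase encoder; the sketch only shows why editing such a memory needs no gradient updates.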


research
03/29/2022

Short-Term Word-Learning in a Dynamically Changing Environment

Neural sequence-to-sequence automatic speech recognition (ASR) systems a...
research
06/16/2015

Recognize Foreign Low-Frequency Words with Similar Pairs

Low-frequency words pose a major challenge for automatic speech recogni...
research
07/09/2018

Foreign English Accent Adjustment by Learning Phonetic Patterns

State-of-the-art automatic speech recognition (ASR) systems struggle wit...
research
07/27/2022

Subword Dictionary Learning and Segmentation Techniques for Automatic Speech Recognition in Tamil and Kannada

We present automatic speech recognition (ASR) systems for Tamil and Kann...
research
02/20/2023

Emphasizing Unseen Words: New Vocabulary Acquisition for End-to-End Speech Recognition

Due to the dynamic nature of human language, automatic speech recognitio...
research
07/16/2021

A Comparison of Methods for OOV-word Recognition on a New Public Dataset

A common problem for automatic speech recognition systems is how to reco...
research
11/20/2018

WEST: Word Encoded Sequence Transducers

Most of the parameters in large vocabulary models are used in embedding ...
