Multilingual Contextual Adapters To Improve Custom Word Recognition In Low-resource Languages

07/03/2023
by Devang Kulshreshtha, et al.

Connectionist Temporal Classification (CTC) models are popular for their balance between speed and performance in Automatic Speech Recognition (ASR). However, CTC models still struggle in other areas, such as personalization towards custom words. A recent approach explores Contextual Adapters, wherein an attention-based biasing model for CTC is used to improve the recognition of custom entities. While this approach works well with enough data, we show that it is not an effective strategy for low-resource languages. In this work, we propose a supervision loss for smoother training of the Contextual Adapters. Further, we explore a multilingual strategy to improve performance with limited training data. Our method achieves a 48% F1 improvement in retrieving unseen custom entities for a low-resource language. Interestingly, as a by-product of training the Contextual Adapters, we see a 5-11% Word Error Rate (WER) reduction in the performance of the base CTC model as well.
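The abstract does not spell out the biasing mechanism, so the following is only a minimal sketch of what attention-based biasing over a list of custom words could look like. It assumes that each encoder frame attends over embeddings of the custom entries with scaled dot-product attention, that index 0 of the catalog is a hypothetical "no-bias" entry, and that the attended vector is simply added back to the frame. The names `bias_frames` and `catalog` are illustrative, and the learned query/key/value projections that a real adapter would presumably use are omitted.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def bias_frames(frames, catalog):
    """Bias each encoder frame towards the catalog entries it attends to.

    frames:  list of encoder output vectors, one per audio frame
    catalog: list of custom-word embeddings; index 0 is assumed to be a
             "no-bias" entry (an illustrative convention, not from the source)
    Returns (biased_frames, attention_weights).
    """
    dim = len(frames[0])
    scale = math.sqrt(dim)
    biased, weights = [], []
    for frame in frames:
        # Scaled dot-product score between this frame and every catalog entry.
        scores = [
            sum(f * c for f, c in zip(frame, entry)) / scale
            for entry in catalog
        ]
        attn = softmax(scores)
        # Biasing vector: attention-weighted sum of the catalog embeddings.
        bias = [
            sum(a * entry[d] for a, entry in zip(attn, catalog))
            for d in range(dim)
        ]
        # Add the biasing vector to the frame (residual-style combination).
        biased.append([f + b for f, b in zip(frame, bias)])
        weights.append(attn)
    return biased, weights


# Toy data: two frames, a no-bias entry plus two custom-word embeddings.
frames = [[1.0, 0.0], [0.0, 1.0]]
catalog = [[0.0, 0.0], [4.0, 0.0], [0.0, 4.0]]
biased, weights = bias_frames(frames, catalog)
```

In this toy setup each frame puts most of its attention on the catalog entry that points in the same direction. A supervision loss of the kind the abstract mentions would presumably act on attention weights like these, but its exact form is not given here.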
