Multilingual Adaptation of RNN Based ASR Systems

11/13/2017
by Markus Müller, et al.

Automatic speech recognition (ASR) systems require a large amount of data to achieve good performance. While such data is readily available for languages like English, there exists a long tail of languages with only limited language resources. This problem can be mitigated by using data from additional source languages. In this work, we focus on multilingual systems based on recurrent neural networks (RNNs), trained using the Connectionist Temporal Classification (CTC) loss function. Using a multilingual set of acoustic units to train systems jointly on multiple languages poses difficulties: while the same phones share the same symbols across languages, they are pronounced slightly differently because of, e.g., small shifts in tongue position. To address this issue, we previously proposed Language Feature Vectors (LFVs) to train language-adaptive multilingual systems. In this work, we extend this approach by introducing a novel technique for adding LFVs, which we call "modulation". We evaluated our approach in multiple conditions, showing improvements in both full- and low-resource conditions as well as for grapheme- and phone-based systems.
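The abstract names "modulation" as a way of adding LFVs but does not define it. As a rough illustration only, the sketch below contrasts two ways a language vector could enter a network: appending it to the acoustic features, and scaling hidden activations element-wise. Reading "modulation" as element-wise scaling is an assumption made here, not something this abstract states. The function names `append_lfv` and `modulate` and all numeric values are hypothetical.

```python
from typing import List


def append_lfv(features: List[float], lfv: List[float]) -> List[float]:
    """Concatenate a language vector to the acoustic feature vector."""
    return features + lfv


def modulate(hidden: List[float], lfv: List[float]) -> List[float]:
    """ASSUMPTION: 'modulation' read as element-wise scaling of hidden
    activations by the language vector (not confirmed by the abstract)."""
    if len(hidden) != len(lfv):
        raise ValueError("hidden layer and LFV must have the same size")
    return [h * g for h, g in zip(hidden, lfv)]


# Made-up values for illustration; real LFVs would come from a trained network.
hidden = [0.5, 1.0, 2.0]
lfv_lang_a = [1.0, 0.0, 0.5]  # hypothetical vector for language A
lfv_lang_b = [0.0, 1.0, 0.5]  # hypothetical vector for language B

# The same hidden activations are reshaped differently per language.
print(modulate(hidden, lfv_lang_a))
print(modulate(hidden, lfv_lang_b))
```

Under this reading, the language vector acts as a per-unit gate, so one shared set of hidden units can emphasize different features depending on the language, whereas appending only supplies the language information as extra input.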


