Multilingual Speech Recognition for Low-Resource Indian Languages using Multi-Task conformer

08/22/2021
by Krishna D N, et al.

Transformers have recently become very popular for sequence-to-sequence applications such as machine translation and speech recognition. In this work, we propose a multi-task learning-based transformer model for low-resource multilingual speech recognition for Indian languages. Our proposed model consists of a conformer [1] encoder and two parallel transformer decoders: a phoneme decoder (PHN-DEC) for the phoneme recognition task and a grapheme decoder (GRP-DEC) that predicts the grapheme sequence. We treat phoneme recognition as an auxiliary task in our multi-task learning framework and jointly optimize the network for both phoneme and grapheme recognition using Joint CTC-Attention [2] training. We use a conditional decoding scheme to inject language information into the model before it predicts the grapheme sequence. Our experiments show that the proposed approach obtains a significant improvement over previous approaches [4]. We also show that our conformer-based dual-decoder approach outperforms both the transformer-based dual-decoder approach and the single-decoder approach. Finally, we compare monolingual ASR models with our proposed multilingual ASR approach.
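
The training objective described in the abstract can be illustrated with a minimal sketch. The interpolation of CTC and attention losses follows the general form of joint CTC-attention training; everything else here is an assumption made for illustration and is not taken from the paper: the names `TaskLosses`, `joint_ctc_attention`, `multitask_loss`, and `condition_on_language`, the default weights, the convex combination of the two tasks, and the idea of realizing conditional decoding by prepending a language tag to the grapheme sequence.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class TaskLosses:
    """CTC and attention losses for one decoder (scalars computed elsewhere)."""
    ctc: float
    attention: float


def joint_ctc_attention(losses: TaskLosses, ctc_weight: float) -> float:
    """Interpolate the CTC and attention losses for a single task."""
    if not 0.0 <= ctc_weight <= 1.0:
        raise ValueError("ctc_weight must be in [0, 1]")
    return ctc_weight * losses.ctc + (1.0 - ctc_weight) * losses.attention


def multitask_loss(
    grapheme: TaskLosses,
    phoneme: TaskLosses,
    ctc_weight: float = 0.3,      # assumed value, not from the paper
    phoneme_weight: float = 0.3,  # assumed value, not from the paper
) -> float:
    """Combine the main (grapheme) and auxiliary (phoneme) objectives.

    Assumption: the two tasks are mixed with a convex combination.
    """
    if not 0.0 <= phoneme_weight <= 1.0:
        raise ValueError("phoneme_weight must be in [0, 1]")
    grapheme_loss = joint_ctc_attention(grapheme, ctc_weight)
    phoneme_loss = joint_ctc_attention(phoneme, ctc_weight)
    return (1.0 - phoneme_weight) * grapheme_loss + phoneme_weight * phoneme_loss


def condition_on_language(grapheme_tokens: List[str], language: str) -> List[str]:
    """Prepend a language tag so the grapheme decoder sees it first.

    Assumption: one possible way to inject language information before
    grapheme prediction; the tag format "<lang>" is hypothetical.
    """
    return [f"<{language}>"] + list(grapheme_tokens)
```

In this sketch the phoneme branch only contributes to the training loss, which matches its role as an auxiliary task; at inference time only the grapheme decoder output would be needed.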

research
08/22/2021

A Dual-Decoder Conformer for Multilingual Speech Recognition

Transformer-based models have recently become very popular for sequence-...
research
11/02/2020

Dual-decoder Transformer for Joint Automatic Speech Recognition and Multilingual Speech Translation

We introduce dual-decoder Transformer, a new model architecture that joi...
research
10/21/2020

A General Multi-Task Learning Framework to Leverage Text Data for Speech to Text Tasks

Attention-based sequence-to-sequence modeling provides a powerful and el...
research
07/15/2021

Multi-task Learning with Cross Attention for Keyword Spotting

Keyword spotting (KWS) is an important technique for speech applications...
research
04/05/2016

Character-Level Neural Translation for Multilingual Media Monitoring in the SUMMA Project

The paper steps outside the comfort-zone of the traditional NLP tasks li...
research
09/27/2016

Multi-task Recurrent Model for True Multilingual Speech Recognition

Research on multilingual speech recognition remains attractive yet chall...
research
12/03/2020

Adapt-and-Adjust: Overcoming the Long-Tail Problem of Multilingual Speech Recognition

One crucial challenge of real-world multilingual speech recognition is t...
