Multilingual sequence-to-sequence speech recognition: architecture, transfer learning, and language modeling

10/04/2018
by   Jaejin Cho, et al.

The sequence-to-sequence (seq2seq) approach to low-resource ASR is a relatively new direction in speech research. Its main benefit is that models can be trained without a lexicon or alignments. However, this poses a new problem: seq2seq models require more data than conventional DNN-HMM systems. In this work, we use data from 10 BABEL languages to build a multilingual seq2seq model as a prior model, and then port it to 4 other BABEL languages using transfer learning. We also explore different architectures for improving the prior multilingual seq2seq model, and we discuss the effect of integrating a recurrent neural network language model (RNNLM) with a seq2seq model during decoding. Experimental results show that transfer learning from the multilingual model yields substantial gains over monolingual models across all 4 BABEL languages. Incorporating an RNNLM also brings significant improvements and achieves recognition performance comparable to that of models trained with twice as much training data.
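The abstract does not say how the RNNLM is combined with the seq2seq model during decoding. One common option is log-linear interpolation of the two models' per-token log-probabilities at each decoding step, often called shallow fusion. The sketch below illustrates that option only and is not necessarily the method used in the paper; the function names and the `lm_weight` value are hypothetical.

```python
import math


def log_softmax(scores):
    """Convert raw scores into log-probabilities (numerically stable)."""
    m = max(scores)
    log_z = m + math.log(sum(math.exp(s - m) for s in scores))
    return [s - log_z for s in scores]


def fuse_step(s2s_logits, lm_logits, lm_weight=0.3):
    """Combine seq2seq and LM scores for one decoding step.

    Assumed scoring rule (shallow fusion, not confirmed by the source):
        score(token) = log p_s2s(token) + lm_weight * log p_lm(token)
    Both inputs are raw scores over the same output vocabulary.
    """
    s2s_logp = log_softmax(s2s_logits)
    lm_logp = log_softmax(lm_logits)
    return [a + lm_weight * b for a, b in zip(s2s_logp, lm_logp)]


def best_token(s2s_logits, lm_logits, lm_weight=0.3):
    """Return the index of the highest-scoring token after fusion."""
    fused = fuse_step(s2s_logits, lm_logits, lm_weight)
    return max(range(len(fused)), key=fused.__getitem__)


# Toy 3-token vocabulary: the seq2seq model slightly prefers token 0,
# while the language model strongly prefers token 1.
s2s_logits = [2.0, 1.9, 0.0]
lm_logits = [0.0, 3.0, 0.0]

without_lm = best_token(s2s_logits, lm_logits, lm_weight=0.0)
with_lm = best_token(s2s_logits, lm_logits, lm_weight=0.3)
```

In this toy case the language model overturns a near-tie in the seq2seq scores. In a real decoder the same per-step score would be accumulated along each beam-search hypothesis, and the weight would be tuned on held-out data.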


