Leveraging Text Data Using Hybrid Transformer-LSTM Based End-to-End ASR in Transfer Learning

05/21/2020
by Zhiping Zeng, et al.

In this work, we study leveraging extra text data to improve low-resource end-to-end ASR under the cross-lingual transfer learning setting. To this end, we extend our prior work [1] and propose a hybrid Transformer-LSTM based architecture. This architecture not only takes advantage of the highly effective encoding capacity of the Transformer network but also benefits from extra text data thanks to its LSTM-based independent language model network. We conduct experiments on our in-house Malay corpus, which contains limited labeled data and a large amount of extra text. Results show that the proposed architecture outperforms the previous LSTM-based architecture [1] by 24.2% relative word error rate (WER) when both are trained using limited labeled data. Starting from this, we obtain a further 25.4% relative WER reduction by transfer learning from another resource-rich language. Moreover, we obtain an additional 13.6% relative WER reduction by boosting the LSTM decoder of the transferred model with the extra text data after transfer learning. Overall, our best model outperforms the vanilla Transformer ASR by 11.9% relative WER. Last but not least, the proposed hybrid architecture offers much faster inference compared to both LSTM and Transformer architectures.
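To make the described architecture concrete, below is a minimal PyTorch sketch of the idea: a Transformer encoder over acoustic features, an LSTM branch that conditions only on previous tokens (an independent language model), and a fusion step that combines the two for the ASR output. All module names, layer sizes, and the text-only `lm_forward` path are illustrative assumptions, not the authors' implementation.

```python
# A minimal sketch, assuming a Transformer acoustic encoder plus an LSTM
# language-model branch that never sees the encoder output, so it can also
# be trained on unpaired text. Names and dimensions are illustrative only.
import torch
import torch.nn as nn


class HybridTransformerLSTMASR(nn.Module):
    def __init__(self, vocab_size=1000, d_model=256, n_heads=4, n_layers=6):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        enc_layer = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(enc_layer, n_layers)
        # LSTM branch: conditions only on previous tokens, i.e. an
        # independent language model that extra text can train directly.
        self.lm_lstm = nn.LSTM(d_model, d_model, num_layers=2, batch_first=True)
        # One cross-attention step fuses LM state with acoustic context.
        self.cross_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.out = nn.Linear(2 * d_model, vocab_size)     # ASR objective
        self.lm_out = nn.Linear(d_model, vocab_size)      # text-only objective

    def forward(self, feats, prev_tokens):
        """feats: (B, T, d_model) acoustic features; prev_tokens: (B, U)."""
        enc = self.encoder(feats)                         # acoustic encoding
        lm_h, _ = self.lm_lstm(self.embed(prev_tokens))   # LM hidden states
        ctx, _ = self.cross_attn(lm_h, enc, enc)          # acoustic context
        return self.out(torch.cat([lm_h, ctx], dim=-1))   # token logits

    def lm_forward(self, prev_tokens):
        # Text-only path: lets a large unpaired text corpus update the
        # LSTM branch without any speech input.
        lm_h, _ = self.lm_lstm(self.embed(prev_tokens))
        return self.lm_out(lm_h)
```

Under this reading, transfer learning would initialize the network from a resource-rich language, and the text-only path is what would let the large amount of extra Malay text boost the LSTM decoder after transfer.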


