Improving RNN Transducer Based ASR with Auxiliary Tasks

11/05/2020
by Chunxi Liu, et al.

End-to-end automatic speech recognition (ASR) models with a single neural network have recently demonstrated state-of-the-art results compared to conventional hybrid speech recognizers. Specifically, the recurrent neural network transducer (RNN-T) has shown competitive ASR performance on various benchmarks. In this work, we examine ways in which RNN-T can achieve better ASR accuracy by performing auxiliary tasks. We propose (i) using the same auxiliary task as the primary RNN-T ASR task, and (ii) performing context-dependent graphemic state prediction as in conventional hybrid modeling. In transcribing social media videos with varying training data sizes, we first evaluate streaming ASR performance on three languages: Romanian, Turkish and German. We find that both proposed methods provide consistent improvements. Next, we observe that both auxiliary tasks are effective for learning deep transformer encoders under the RNN-T criterion, thus achieving competitive results - 2.0 LibriSpeech test-clean/other - as compared to prior top-performing models.
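
The abstract describes training with auxiliary tasks alongside the primary RNN-T objective. A common way to combine such objectives is a weighted sum of losses; the sketch below illustrates that general idea only. The function name `multitask_loss`, the parameter `aux_weight`, and the numeric values are assumptions for illustration, and the abstract does not state how the authors weight or attach their auxiliary losses.

```python
def multitask_loss(primary_loss, aux_losses, aux_weight=0.3):
    """Combine a primary loss with auxiliary losses via a weighted sum.

    Hypothetical helper: `aux_weight` is an assumed hyperparameter,
    not a value taken from the paper.
    """
    if aux_weight < 0:
        raise ValueError("aux_weight must be non-negative")
    # Total objective = primary loss + weight * (sum of auxiliary losses).
    return primary_loss + aux_weight * sum(aux_losses)


# Made-up per-batch loss values, purely for illustration.
total = multitask_loss(primary_loss=2.0, aux_losses=[1.0, 3.0], aux_weight=0.5)
```

In this kind of setup the auxiliary terms typically influence only training; at inference time the primary model output is used on its own.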

11/03/2020

Improving RNN transducer with normalized jointer network

Recurrent neural transducer (RNN-T) is a promising end-to-end (E2E) mode...

08/12/2020

Transfer Learning Approaches for Streaming End-to-End Speech Recognition System

Transfer learning (TL) is widely used in conventional hybrid automatic s...

11/09/2020

Benchmarking LF-MMI, CTC and RNN-T Criteria for Streaming ASR

In this work, to measure the accuracy and efficiency for a latency-contr...

01/14/2022

A Study of Transducer based End-to-End ASR with ESPnet: Architecture, Auxiliary Loss and Decoding Strategies

In this study, we present recent developments of models trained with the...

05/19/2020

Fast, Simpler and More Accurate Hybrid ASR Systems Using Wordpieces

In this work, we first show that on the widely used LibriSpeech benchmar...

04/07/2015

Transferring Knowledge from a RNN to a DNN

Deep Neural Network (DNN) acoustic models have yielded many state-of-the...

11/01/2021

Sequence Transduction with Graph-based Supervision

The recurrent neural network transducer (RNN-T) objective plays a major ...