End-to-end Speech Recognition with Word-based RNN Language Models

08/08/2018
by   Takaaki Hori, et al.
0

This paper investigates the impact of word-based RNN language models (RNN-LMs) on the performance of end-to-end automatic speech recognition (ASR). In our prior work, we have proposed a multi-level LM, in which character-based and word-based RNN-LMs are combined in hybrid CTC/attention-based ASR. Although this multi-level approach achieves significant error reduction in the Wall Street Journal (WSJ) task, two different LMs need to be trained and used for decoding, which increase the computational cost and memory usage. In this paper, we further propose a novel word-based RNN-LM, which allows us to decode with only the word-based LM, where it provides look-ahead word probabilities to predict next characters instead of the character-based LM, leading competitive accuracy with less computation compared to the multi-level LM. We demonstrate the efficacy of the word-based RNN-LMs using a larger corpus, LibriSpeech, in addition to WSJ we used in the prior work. Furthermore, we show that the proposed model achieves 5.1 size is increased, which is the best WER reported for end-to-end ASR systems on this benchmark.

READ FULL TEXT

page 1

page 2

page 3

page 4

research
06/15/2020

Exploration of End-to-End ASR for OpenSTT – Russian Open Speech-to-Text Dataset

This paper presents an exploration of end-to-end automatic speech recogn...
research
09/13/2016

Character-Level Language Modeling with Hierarchical Recurrent Neural Networks

Recurrent neural network (RNN) based character-level language models (CL...
research
10/08/2021

Hierarchical Conditional End-to-End ASR with CTC and Multi-Granular Subword Units

In end-to-end automatic speech recognition (ASR), a model is expected to...
research
04/07/2022

3M: Multi-loss, Multi-path and Multi-level Neural Networks for speech recognition

Recently, Conformer based CTC/AED model has become a mainstream architec...
research
05/21/2023

Hystoc: Obtaining word confidences for fusion of end-to-end ASR systems

End-to-end (e2e) systems have recently gained wide popularity in automat...
research
04/27/2021

On Addressing Practical Challenges for RNN-Transducer

In this paper, several works are proposed to address practical challenge...
research
08/02/2022

Multi-Module G2P Converter for Persian Focusing on Relations between Words

In this paper, we investigate the application of end-to-end and multi-mo...

Please sign up or login with your details

Forgot password? Click here to reset