Character-Level Incremental Speech Recognition with Recurrent Neural Networks

01/25/2016
by   Kyuyeon Hwang, et al.

In real-time speech recognition applications, latency is an important issue. We have developed a character-level incremental speech recognition (ISR) system that responds quickly even while the user is still speaking, gradually improving its hypotheses as the speech proceeds. The algorithm employs a speech-to-character unidirectional recurrent neural network (RNN), which is trained end-to-end with connectionist temporal classification (CTC), and an RNN-based character-level language model (LM). The output values of the CTC-trained RNN are character-level probabilities, which are processed by beam search decoding. The RNN LM augments the decoding by providing long-term dependency information. We propose tree-based online beam search with additional depth-pruning, which enables the system to process infinitely long input speech with low latency. This system not only responds quickly to speech but can also dictate out-of-vocabulary (OOV) words according to their pronunciation. The proposed model achieves a word error rate (WER) of 8.90% on the Wall Street Journal (WSJ) Nov'92 20K evaluation set when trained on the WSJ SI-284 training set.
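To make the decoding step more concrete, the following is a minimal sketch of frame-by-frame beam search over CTC character probabilities. It uses the standard CTC prefix beam search, not the tree-based online beam search with depth-pruning proposed in the paper, and it omits the acoustic RNN entirely. The function name `incremental_ctc_decode`, the `lm` callback, and the `lm_weight` parameter are assumptions made for illustration; the `lm` hook only stands in for the character-level RNN LM.

```python
import math
from collections import defaultdict

NEG_INF = float("-inf")


def logsumexp(*xs):
    """Numerically stable log(sum(exp(x))) for a handful of values."""
    m = max(xs)
    if m == NEG_INF:
        return NEG_INF
    return m + math.log(sum(math.exp(x - m) for x in xs))


def incremental_ctc_decode(frames, alphabet, beam_width=8, blank=0,
                           lm=None, lm_weight=0.0):
    """Yield the best partial hypothesis (text, log-score) after every frame.

    frames:   iterable of per-frame log-probabilities over labels
              (index `blank` is the CTC blank label).
    alphabet: alphabet[i] is the character for label i.
    lm:       optional callable (prefix_text, next_char) -> log-probability.
              Hypothetical hook standing in for a character-level LM.
    """
    # prefix -> (log p of paths ending in blank, log p ending in non-blank)
    beam = {(): (0.0, NEG_INF)}
    for frame in frames:
        nxt = defaultdict(lambda: (NEG_INF, NEG_INF))
        for prefix, (p_b, p_nb) in beam.items():
            for c, lp in enumerate(frame):
                if c == blank:
                    # A blank keeps the prefix unchanged.
                    b, nb = nxt[prefix]
                    nxt[prefix] = (logsumexp(b, p_b + lp, p_nb + lp), nb)
                    continue
                bonus = 0.0
                if lm is not None:
                    text = "".join(alphabet[i] for i in prefix)
                    bonus = lm_weight * lm(text, alphabet[c])
                new_prefix = prefix + (c,)
                b, nb = nxt[new_prefix]
                if prefix and prefix[-1] == c:
                    # A repeated character extends the prefix only after a blank...
                    nxt[new_prefix] = (b, logsumexp(nb, p_b + lp + bonus))
                    # ...otherwise the repeat collapses into the same prefix.
                    b, nb = nxt[prefix]
                    nxt[prefix] = (b, logsumexp(nb, p_nb + lp))
                else:
                    nxt[new_prefix] = (
                        b, logsumexp(nb, p_b + lp + bonus, p_nb + lp + bonus))
        # Keep only the `beam_width` most probable prefixes.
        ranked = sorted(nxt.items(), key=lambda kv: logsumexp(*kv[1]),
                        reverse=True)
        beam = dict(ranked[:beam_width])
        best_prefix, best_scores = ranked[0]
        yield "".join(alphabet[i] for i in best_prefix), logsumexp(*best_scores)


# Toy input: 4 frames over the labels [blank, "a", "b"].
toy_frames = [[math.log(p) for p in row] for row in (
    [0.1, 0.8, 0.1],
    [0.1, 0.8, 0.1],
    [0.8, 0.1, 0.1],
    [0.1, 0.1, 0.8],
)]
hypotheses = [text for text, _ in
              incremental_ctc_decode(toy_frames, ["_", "a", "b"])]
```

Because the decoder is a generator, a caller can show the current best hypothesis after each incoming frame instead of waiting for the utterance to end, which is the behaviour the abstract describes as incremental. Memory use here still grows with the length of the prefixes, so this sketch does not address the unbounded-input case that the paper's depth-pruning targets.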


