Character-Level Language Modeling with Hierarchical Recurrent Neural Networks

09/13/2016
by Kyuyeon Hwang, et al.

Recurrent neural network (RNN) based character-level language models (CLMs) are extremely useful for modeling out-of-vocabulary words by nature. However, their performance is generally much worse than that of word-level language models (WLMs), since CLMs need to consider a longer history of tokens to properly predict the next one. We address this problem by proposing hierarchical RNN architectures, which consist of multiple modules with different timescales. Despite the multi-timescale structure, the input and output layers operate with the character-level clock, which allows existing RNN CLM training approaches to be applied directly without any modifications. Our CLMs show better perplexity than Kneser-Ney (KN) 5-gram WLMs on the One Billion Word Benchmark with only 2% of the parameters. We also present character-level end-to-end speech recognition examples on the Wall Street Journal (WSJ) corpus, where replacing traditional mono-clock RNN CLMs with the proposed models results in better recognition accuracies even though the number of parameters is reduced to 30%.
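The key architectural idea is a multi-timescale recurrence: a fast module that ticks on every character and a slower module that is updated only at word-like boundaries, with both states feeding the character-level output layer. Below is a minimal sketch of that idea in PyTorch, not the authors' implementation; the GRU cells, the layer sizes, and the rule of treating whitespace as the boundary signal are illustrative assumptions.

# A minimal two-timescale character-level LM sketch (illustrative, not the paper's exact model).
# A fast GRU ticks every character; a slow GRU is updated only at word boundaries
# (here assumed to be a designated whitespace character id).
import torch
import torch.nn as nn

class HierarchicalCharLM(nn.Module):
    def __init__(self, vocab_size, emb_dim=32, fast_dim=64, slow_dim=64, space_id=0):
        super().__init__()
        self.space_id = space_id                      # character id treated as a word boundary (assumption)
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.fast = nn.GRUCell(emb_dim, fast_dim)     # character-clock module
        self.slow = nn.GRUCell(fast_dim, slow_dim)    # slower, word-clock module
        self.out = nn.Linear(fast_dim + slow_dim, vocab_size)

    def forward(self, chars):                         # chars: (batch, time) int64 character ids
        B, T = chars.shape
        h_fast = chars.new_zeros(B, self.fast.hidden_size, dtype=torch.float)
        h_slow = chars.new_zeros(B, self.slow.hidden_size, dtype=torch.float)
        logits = []
        for t in range(T):
            x = self.embed(chars[:, t])
            h_fast = self.fast(x, h_fast)
            # Update the slow state only where the current character is a boundary;
            # elsewhere the slow state is simply carried over.
            boundary = (chars[:, t] == self.space_id).float().unsqueeze(1)
            h_slow = boundary * self.slow(h_fast, h_slow) + (1 - boundary) * h_slow
            logits.append(self.out(torch.cat([h_fast, h_slow], dim=1)))
        return torch.stack(logits, dim=1)             # (batch, time, vocab) next-character scores

if __name__ == "__main__":
    model = HierarchicalCharLM(vocab_size=128)
    demo = torch.randint(0, 128, (2, 16))             # two dummy character sequences
    print(model(demo).shape)                          # torch.Size([2, 16, 128])

Gating the slow state with a boundary mask, rather than running it on a separate clock, keeps the whole model differentiable end to end, so standard character-level training runs unchanged; this mirrors the paper's point that the input and output layers stay on the character-level clock and existing RNN CLM training applies without modification.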


