Recurrent Neural Network Regularization

09/08/2014
by Wojciech Zaremba, et al.
Google

We present a simple regularization technique for Recurrent Neural Networks (RNNs) with Long Short-Term Memory (LSTM) units. Dropout, the most successful technique for regularizing neural networks, does not work well with RNNs and LSTMs. In this paper, we show how to correctly apply dropout to LSTMs, and show that it substantially reduces overfitting on a variety of tasks. These tasks include language modeling, speech recognition, image caption generation, and machine translation.
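
The abstract does not spell out the method, so the following is only a minimal sketch under a stated assumption: that, as described in the full paper, dropout is applied to the non-recurrent connections only (the input flowing upward into each layer and the top layer's output), while the hidden and cell states carried across time steps are left untouched. The function names (`dropout`, `lstm_step`, `run_stacked_lstm`, `init_params`), the gate ordering, the weight layout, and the use of inverted-dropout scaling are illustrative choices, not taken from the paper.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def dropout(x, rate, rng, train=True):
    """Inverted dropout: zero each unit with probability `rate`, rescale the rest."""
    if not train or rate == 0.0:
        return x
    mask = rng.random(x.shape) >= rate
    return x * mask / (1.0 - rate)


def lstm_step(x, h_prev, c_prev, W, b):
    """One LSTM step. W has shape (input_dim + hidden_dim, 4 * hidden_dim)."""
    z = np.concatenate([x, h_prev]) @ W + b
    i, f, o, g = np.split(z, 4)  # gate ordering is an assumption of this sketch
    c = sigmoid(f) * c_prev + sigmoid(i) * np.tanh(g)
    h = sigmoid(o) * np.tanh(c)
    return h, c


def init_params(input_dim, hidden_dim, num_layers, rng):
    """Small random weights; every layer uses the same hidden size."""
    params = []
    for layer in range(num_layers):
        in_dim = input_dim if layer == 0 else hidden_dim
        W = rng.normal(0.0, 0.1, size=(in_dim + hidden_dim, 4 * hidden_dim))
        b = np.zeros(4 * hidden_dim)
        params.append((W, b))
    return params


def run_stacked_lstm(inputs, params, rate, rng, train=True):
    """Run a stack of LSTM layers over a sequence of input vectors.

    Dropout touches only the vertical (non-recurrent) connections:
    the input to each layer and the output of the top layer.
    The (h, c) state passed from one time step to the next is never dropped.
    """
    hidden_dim = params[0][1].shape[0] // 4
    states = [(np.zeros(hidden_dim), np.zeros(hidden_dim)) for _ in params]
    outputs = []
    for x in inputs:
        layer_in = x
        for layer, (W, b) in enumerate(params):
            h_prev, c_prev = states[layer]
            h, c = lstm_step(dropout(layer_in, rate, rng, train), h_prev, c_prev, W, b)
            states[layer] = (h, c)  # recurrent state stored without dropout
            layer_in = h
        outputs.append(dropout(layer_in, rate, rng, train))
    return np.stack(outputs)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    params = init_params(input_dim=3, hidden_dim=5, num_layers=2, rng=rng)
    sequence = rng.normal(size=(7, 3))
    print(run_stacked_lstm(sequence, params, rate=0.5, rng=rng).shape)
```

Leaving the recurrent path undisturbed means the memory cell can still carry information across many time steps, while the noise injected on the vertical connections provides the regularization. With `train=False` the sketch is deterministic, since every dropout call becomes the identity.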

02/05/2014

Long Short-Term Memory Based Recurrent Neural Network Architectures for Large Vocabulary Speech Recognition

Long Short-Term Memory (LSTM) is a recurrent neural network (RNN) archit...
11/05/2013

Dropout improves Recurrent Neural Networks for Handwriting Recognition

Recurrent neural networks (RNNs) with Long Short-Term memory cells curre...
08/07/2017

Regularizing and Optimizing LSTM Language Models

Recurrent neural networks (RNNs), such as long short-term memory network...
08/23/2020

Variational Inference-Based Dropout in Recurrent Neural Networks for Slot Filling in Spoken Language Understanding

This paper proposes to generalize the variational recurrent neural netwo...
09/15/2017

Learning Intrinsic Sparse Structures within Long Short-Term Memory

Model compression is significant for the wide adoption of Recurrent Neur...
11/05/2016

Quasi-Recurrent Neural Networks

Recurrent neural networks are a powerful tool for modeling sequential da...
10/17/2014

Learning to Execute

Recurrent Neural Networks (RNNs) with Long Short-Term Memory units (LSTM...

Code Repositories

tensorflow-rnn-shakespeare

Code from the "Tensorflow and deep learning - without a PhD, Part 2" session on Recurrent Neural Networks.

pytorch-intro

A couple of scripts to illustrate how to do CNNs and RNNs in PyTorch

tensorflow-statereader

This repository provides scripts to train an LSTM and then extract states from it in Tensorflow.

bAbi_QA

Solving Question-Answering Problem Using Deep Learning