The unreasonable effectiveness of the forget gate

04/13/2018
by Jos van der Westhuizen et al.

Given the success of the gated recurrent unit, a natural question is whether all the gates of the long short-term memory (LSTM) network are necessary. Previous research has shown that the forget gate is one of the most important gates in the LSTM. Here we show that a forget-gate-only version of the LSTM with chrono-initialized biases not only provides computational savings but also outperforms the standard LSTM on multiple benchmark datasets and competes with some of the best contemporary models. Our proposed network, the JANET, achieves accuracies that exceed 99% and 92.5% on the MNIST and pMNIST datasets, outperforming the standard LSTM, which yields accuracies of 98.5% and 91% on these datasets.
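The abstract's idea can be illustrated with a minimal sketch of a forget-gate-only recurrent cell. This is not the paper's exact formulation (the published JANET includes an additional shift term in the update gate); it is a simplified version in which a single forget gate interpolates between the previous hidden state and a candidate update, with the chrono initialization applied to the forget-gate bias. All names, sizes, and weight scales below are illustrative assumptions.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def janet_cell_step(x_t, h_prev, params):
    """One step of a simplified forget-gate-only cell (JANET-style sketch).

    A single forget gate f interpolates between the previous hidden state
    and a tanh candidate; there are no input or output gates.
    """
    Wf, Uf, bf, Wc, Uc, bc = params
    f = sigmoid(Wf @ x_t + Uf @ h_prev + bf)        # forget gate in (0, 1)
    c_tilde = np.tanh(Wc @ x_t + Uc @ h_prev + bc)  # candidate update
    return f * h_prev + (1.0 - f) * c_tilde         # convex combination

def chrono_forget_bias(hidden_size, t_max, rng):
    """Chrono initialization: b_f = log(U(1, t_max - 1)), biasing the gate
    toward remembering over timescales up to roughly t_max."""
    return np.log(rng.uniform(1.0, t_max - 1.0, size=hidden_size))

# Usage with hypothetical sizes: 4 inputs, 8 hidden units, sequences ~100 long.
n_in, n_hid, t_max = 4, 8, 100
rng = np.random.default_rng(1)
params = (rng.standard_normal((n_hid, n_in)) * 0.1,
          rng.standard_normal((n_hid, n_hid)) * 0.1,
          chrono_forget_bias(n_hid, t_max, rng),
          rng.standard_normal((n_hid, n_in)) * 0.1,
          rng.standard_normal((n_hid, n_hid)) * 0.1,
          np.zeros(n_hid))
h = janet_cell_step(rng.standard_normal(n_in), np.zeros(n_hid), params)
```

Dropping the input and output gates roughly halves the parameter count relative to a standard LSTM cell, which is the source of the computational savings the abstract mentions.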


Related research

Empirical Evaluation of A New Approach to Simplifying Long Short-term Memory (LSTM) (12/12/2016)
Bivariate Beta LSTM (05/25/2019)
Towards Binary-Valued Gates for Robust LSTM Training (06/08/2018)
Working Memory Connections for LSTM (08/31/2021)
Inception LSTM (04/12/2020)
Quantization Loss Re-Learning Method (05/30/2019)
Image Matching via Loopy RNN (06/10/2017)
