Dual Rectified Linear Units (DReLUs): A Replacement for Tanh Activation Functions in Quasi-Recurrent Neural Networks

07/25/2017
by Fréderic Godin, et al.

In this paper, we introduce a novel type of Rectified Linear Unit (ReLU), called a Dual Rectified Linear Unit (DReLU). A DReLU, which has an unbounded image on both the positive and the negative side, can be used as a drop-in replacement for the tanh activation function in the recurrent step of Quasi-Recurrent Neural Networks (QRNNs; Bradbury et al., 2017). Similar to ReLUs, DReLUs are less prone to the vanishing gradient problem, are robust to noise, and induce sparse activations. We independently reproduce the QRNN experiments of Bradbury et al. (2017) and compare our DReLU-based QRNNs with the original tanh-based QRNNs and with Long Short-Term Memory networks (LSTMs) on sentiment classification and word-level language modeling. Additionally, we evaluate on character-level language modeling, showing that we can stack up to eight QRNN layers with DReLUs and thereby improve the state of the art in character-level language modeling over shallow LSTM-based architectures.
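As a rough illustration of the idea, the sketch below assumes the formulation DReLU(a, b) = max(0, a) - max(0, b), applied to two separate linear projections of the input in place of the single tanh candidate of a QRNN. The projection matrices W1 and W2, the dimensions, and the variable names are illustrative assumptions, not taken from the paper.

import numpy as np

def drelu(a, b):
    # Dual Rectified Linear Unit: the difference of two rectified inputs.
    # Unlike tanh, its image is unbounded on both the positive and the
    # negative side, and it is exactly zero whenever both inputs are
    # non-positive, which encourages sparse activations.
    return np.maximum(0.0, a) - np.maximum(0.0, b)

# Hypothetical QRNN-style candidate computation for one timestep:
# the tanh candidate z_t = tanh(W x_t) is replaced by a DReLU over two
# separate linear projections of the same input.
rng = np.random.default_rng(0)
x_t = rng.standard_normal(64)         # input vector at timestep t (assumed size)
W1 = rng.standard_normal((128, 64))   # first candidate projection (assumed)
W2 = rng.standard_normal((128, 64))   # second candidate projection (assumed)

z_t = drelu(W1 @ x_t, W2 @ x_t)       # unbounded, sparse candidate vector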


Related research

- Multiplicative Models for Recurrent Language Modeling (06/30/2019)
- Pyramidal Recurrent Unit for Language Modeling (08/27/2018)
- Improving Language Modeling using Densely Connected Recurrent Neural Networks (07/19/2017)
- Quasi-Recurrent Neural Networks (11/05/2016)
- Regularizing and Optimizing LSTM Language Models (08/07/2017)
- Trellis Networks for Sequence Modeling (10/15/2018)
- Learned-Norm Pooling for Deep Feedforward and Recurrent Neural Networks (11/07/2013)
