Path-Normalized Optimization of Recurrent Neural Networks with ReLU Activations

05/23/2016
by Behnam Neyshabur, et al.

We investigate the parameter-space geometry of recurrent neural networks (RNNs) and develop an adaptation of the path-SGD optimization method, attuned to this geometry, that can learn plain RNNs with ReLU activations. On several datasets that require capturing long-term dependency structure, we show that path-SGD significantly improves the trainability of ReLU RNNs compared to training with SGD, even under various recently proposed initialization schemes.
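For intuition, the sketch below illustrates the path-normalization idea underlying path-SGD on a two-layer ReLU feedforward network: each gradient coordinate is rescaled by a per-parameter term kappa(e), the sum over paths through that weight of the product of the other squared weights on the path. This is a minimal illustrative sketch under simplifying assumptions, not the authors' code; the RNN adaptation in the paper additionally handles recurrent weights shared across time steps, and all names, shapes, and the toy loss here are hypothetical.

```python
import numpy as np

# Minimal sketch of path-SGD-style scaling for a 2-layer ReLU network.
# Illustrative only: the paper's method extends this idea to RNNs with
# shared recurrent weights. Names, sizes, and the toy loss are assumptions.

rng = np.random.default_rng(0)
d, h, c = 8, 16, 4                     # input, hidden, output sizes
W1 = rng.normal(0.0, 0.1, (h, d))
W2 = rng.normal(0.0, 0.1, (c, h))

def path_scaling(W1, W2):
    """Per-parameter scaling kappa(e): sum over paths through edge e of the
    product of the *other* squared weights on that path, computed via
    forward/backward passes of the squared-weight network on all-ones inputs."""
    gamma_in_hidden = np.ones(d) @ (W1 ** 2).T          # (h,) squared-path mass into each hidden unit
    gamma_out_hidden = (W2 ** 2).T @ np.ones(c)         # (h,) squared-path mass out of each hidden unit
    kappa_W1 = np.outer(gamma_out_hidden, np.ones(d))   # scaling for W1[i, j]: sum_k W2[k, i]^2
    kappa_W2 = np.outer(np.ones(c), gamma_in_hidden)    # scaling for W2[k, i]: sum_j W1[i, j]^2
    return kappa_W1, kappa_W2

def path_sgd_step(W1, W2, g1, g2, lr=0.1, eps=1e-8):
    """One path-SGD-style update: rescale each gradient by 1/kappa."""
    k1, k2 = path_scaling(W1, W2)
    W1 -= lr * g1 / (k1 + eps)
    W2 -= lr * g2 / (k2 + eps)
    return W1, W2

# Toy usage: gradients of a squared-error loss on one random example.
x, y = rng.normal(size=d), rng.normal(size=c)
hpre = W1 @ x
hact = np.maximum(hpre, 0.0)                            # ReLU activation
err = W2 @ hact - y
g2 = np.outer(err, hact)                                # dL/dW2
g1 = np.outer((W2.T @ err) * (hpre > 0), x)             # dL/dW1
W1, W2 = path_sgd_step(W1, W2, g1, g2)
```

Because kappa depends only on the magnitudes of the other weights along each path, the resulting update is invariant to the node-wise rescalings that leave ReLU networks functionally unchanged, which is the geometric property the abstract refers to.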


