Breaking the Activation Function Bottleneck through Adaptive Parameterization

05/22/2018
by Sebastian Flennerhag, et al.

Standard neural network architectures are non-linear only by virtue of a simple element-wise activation function, making them both brittle and excessively large. In this paper, we consider methods for making the feed-forward layer more flexible while preserving its basic structure. We develop simple drop-in replacements that learn to adapt their parameterization conditional on the input, thereby increasing statistical efficiency significantly. We present an adaptive LSTM that advances the state of the art on the Penn Treebank and WikiText-2 word-modeling tasks while using fewer parameters and converging in half as many iterations.
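To make the idea of adapting a layer's parameterization conditional on the input concrete, below is a minimal sketch of an input-conditioned feed-forward layer. It assumes a PyTorch-style implementation; the `AdaptiveLinear` module, its per-unit scaling adapter, and the near-identity initialization are illustrative assumptions, not the paper's exact construction.

```python
# Minimal sketch (not the authors' exact formulation): a feed-forward layer
# whose effective parameters are modulated by a small network conditioned on
# the input, as a stand-in for "adaptive parameterization". Module and
# variable names are illustrative assumptions.

import torch
import torch.nn as nn


class AdaptiveLinear(nn.Module):
    """Linear layer with input-conditioned, per-output-unit scaling.

    Scaling the output element-wise is equivalent to rescaling the rows of
    the weight matrix (and the bias) separately for each input example.
    """

    def __init__(self, in_features: int, out_features: int):
        super().__init__()
        self.base = nn.Linear(in_features, out_features)
        # Small "adapter" that maps the input to a per-output-unit scale.
        self.adapter = nn.Linear(in_features, out_features)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Scale lies in (0, 2) and is roughly 1 at initialization, so the
        # layer starts out close to a standard linear layer.
        scale = 2.0 * torch.sigmoid(self.adapter(x))
        return scale * self.base(x)


if __name__ == "__main__":
    layer = AdaptiveLinear(16, 8)
    y = layer(torch.randn(4, 16))
    print(y.shape)  # torch.Size([4, 8])
```

Keeping the adaptive scale near one at initialization is a design choice that preserves the behavior of the ordinary feed-forward layer early in training, letting the adaptation be learned gradually.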


