Is it Time to Swish? Comparing Deep Learning Activation Functions Across NLP Tasks

01/09/2019
by Steffen Eger, et al.

Activation functions play a crucial role in neural networks because they are the nonlinearities to which much of the success of deep learning has been attributed. One of the currently most popular activation functions is ReLU, but several competitors have recently been proposed or 'discovered', including LReLU functions and swish. While most works compare newly proposed activation functions on a few tasks (usually from image classification) and against a few competitors (usually ReLU), we perform the first large-scale comparison of 21 activation functions across eight different NLP tasks. We find that a largely unknown activation function, the so-called penalized tanh function, performs most stably across all tasks. We also show that it can successfully replace the sigmoid and tanh gates in LSTM cells, leading to a 2 percentage point (pp) improvement over the standard choices on a challenging NLP task.
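
As a concrete reference, below is a minimal NumPy sketch of the two activation functions highlighted above, assuming the commonly used definitions swish(x) = x * sigmoid(x) and penalized tanh(x) = tanh(x) for x > 0 and a * tanh(x) otherwise, with the slope a = 0.25 typically reported for penalized tanh; the exact variants evaluated in the paper may differ.

    import numpy as np

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    def swish(x):
        # swish(x) = x * sigmoid(x)
        return x * sigmoid(x)

    def penalized_tanh(x, a=0.25):
        # tanh(x) for x > 0, a * tanh(x) otherwise (a = 0.25 assumed here)
        t = np.tanh(x)
        return np.where(x > 0, t, a * t)

    # quick sanity check on a few inputs
    xs = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])
    print(swish(xs))
    print(penalized_tanh(xs))

Both are element-wise functions, so they can be dropped into a feed-forward layer in place of ReLU; the LSTM result mentioned above corresponds to substituting penalized tanh for the cell's sigmoid and tanh gates in the same element-wise fashion.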

Related research

EIS – a family of activation functions combining Exponential, ISRU, and Softplus (09/28/2020)
  Activation functions play a pivotal role in the function learning using ...

Flatten-T Swish: a thresholded ReLU-Swish-like activation function for deep learning (12/15/2018)
  Activation functions are essential for deep learning methods to learn an...

Logical Activation Functions: Logit-space equivalents of Boolean Operators (10/22/2021)
  Neuronal representations within artificial neural networks are commonly ...

Searching for Activation Functions (10/16/2017)
  The choice of activation functions in deep networks has a significant ef...

Regularized Flexible Activation Function Combinations for Deep Neural Networks (07/26/2020)
  Activation in deep neural networks is fundamental to achieving non-linea...

Evolution of Activation Functions for Deep Learning-Based Image Classification (06/24/2022)
  Activation functions (AFs) play a pivotal role in the performance of neu...

Learning Combinations of Activation Functions (01/29/2018)
  In the last decade, an active area of research has been devoted to desig...
