Activate or Not: Learning Customized Activation

09/10/2020
by Ningning Ma, et al.

Modern activation layers use non-linear functions to activate the neurons. In this paper, we present a simple but effective activation function, termed ACON, which learns whether or not to activate the neurons. Surprisingly, we find that Swish, the recent popular NAS-searched activation, can be interpreted as a smooth approximation to ReLU. Intuitively, in the same way, we approximate the variants of the ReLU family by a smooth family we call ACON, which makes Swish a special case of ACON and remarkably improves performance. Next, we present meta-ACON, which explicitly learns to optimize the parameter switching between the non-linear (activate) and linear (inactivate) regimes and provides a new design space. By simply changing the activation function, we improve the ImageNet top-1 accuracy by 6.7%.
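The abstract only describes ACON at a high level. The sketch below assumes the ACON-C form given in the full paper, f(x) = (p1 - p2) * x * sigmoid(beta * (p1 - p2) * x) + p2 * x, with learnable per-channel p1, p2 and switching factor beta; it is an illustrative PyTorch module, not the authors' reference implementation. Setting p1 = 1, p2 = 0 recovers Swish x * sigmoid(beta * x); beta near 0 gives the linear mean ((p1 + p2) / 2) * x (the "do not activate" case), while large beta approaches max(p1 * x, p2 * x), i.e. a ReLU-like kink.

```python
import torch
import torch.nn as nn

class ACONC(nn.Module):
    """Illustrative sketch of an ACON-C style activation (assumed form, not the
    authors' reference code):
        f(x) = (p1 - p2) * x * sigmoid(beta * (p1 - p2) * x) + p2 * x
    p1 = 1, p2 = 0 with beta = 1 reduces to Swish; beta -> 0 gives a linear
    response; large beta approaches max(p1 * x, p2 * x)."""

    def __init__(self, channels: int):
        super().__init__()
        # Per-channel learnable parameters, shaped for NCHW feature maps.
        self.p1 = nn.Parameter(torch.randn(1, channels, 1, 1))
        self.p2 = nn.Parameter(torch.randn(1, channels, 1, 1))
        self.beta = nn.Parameter(torch.ones(1, channels, 1, 1))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        dp = (self.p1 - self.p2) * x
        return dp * torch.sigmoid(self.beta * dp) + self.p2 * x


if __name__ == "__main__":
    act = ACONC(channels=16)
    y = act(torch.randn(2, 16, 8, 8))
    print(y.shape)  # torch.Size([2, 16, 8, 8])
```

In meta-ACON, the fixed learnable beta above would instead be generated by a small module conditioned on the input (for example per channel), which is the explicit switch between activating and not activating that the abstract refers to.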


Related research

09/27/2021
SAU: Smooth activation function using convolution with approximate identities
Well-known activation functions like ReLU or Leaky ReLU are non-differen...

11/06/2020
Parametric Flatten-T Swish: An Adaptive Non-linear Activation Function For Deep Learning
Activation function is a key component in deep learning that performs no...

01/01/2019
Dense Morphological Network: An Universal Function Approximator
Artificial neural networks are built on the basic operation of linear co...

05/15/2023
ReLU soothes the NTK condition number and accelerates optimization for wide neural networks
Rectified linear unit (ReLU), as a non-linear activation function, is we...

12/31/2022
Smooth Mathematical Function from Compact Neural Networks
This is paper for the smooth function approximation by neural networks (...

06/18/2023
Learn to Enhance the Negative Information in Convolutional Neural Network
This paper proposes a learnable nonlinear activation mechanism specifica...

02/11/2023
Global Convergence Rate of Deep Equilibrium Models with General Activations
In a recent paper, Ling et al. investigated the over-parametrized Deep E...
