Learning specialized activation functions with the Piecewise Linear Unit

04/08/2021
by Yucong Zhou, et al.

The choice of activation function is crucial for modern deep neural networks. Popular hand-designed activation functions such as the Rectified Linear Unit (ReLU) and its variants show promising performance across various tasks and models. Swish, an automatically discovered activation function, has been proposed and outperforms ReLU on many challenging datasets. However, the search that produced it has two main drawbacks. First, the tree-based search space is highly discrete and restricted, which makes searching difficult. Second, the sample-based search method is inefficient, making it infeasible to find specialized activation functions for each dataset or neural architecture. To tackle these drawbacks, we propose a new activation function called the Piecewise Linear Unit (PWLU), which incorporates a carefully designed formulation and learning method. It can learn specialized activation functions and achieves state-of-the-art performance on large-scale datasets such as ImageNet and COCO. For example, on the ImageNet classification dataset, PWLU improves top-1 accuracy over Swish by 0.9% on ResNet-18, with consistent gains on ResNet-50/MobileNet-V2/MobileNet-V3/EfficientNet-B0. PWLU is also easy to implement and efficient at inference, so it can be widely applied in real-world applications.
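Since the abstract describes PWLU only at a high level, the following is a minimal PyTorch sketch of the underlying idea: a learnable piecewise linear function defined by equal-width intervals over a bounded input region, with linear extrapolation outside it. The region bounds, interval count, channel-shared parameterization, and ReLU initialization here are illustrative assumptions, not the paper's exact formulation.

import torch
import torch.nn as nn

class PWLU(nn.Module):
    # Minimal sketch of a piecewise linear activation in the spirit of PWLU.
    # Assumptions not taken from the paper: a fixed input region [left, right],
    # a single channel-shared function, and ReLU initialization.
    def __init__(self, num_intervals=16, left=-8.0, right=8.0):
        super().__init__()
        self.left = left
        self.num_intervals = num_intervals
        self.width = (right - left) / num_intervals
        # Learnable function values at the N+1 breakpoints, initialized to ReLU.
        xs = torch.linspace(left, right, num_intervals + 1)
        self.points = nn.Parameter(torch.relu(xs))

    def forward(self, x):
        # Continuous interval coordinate; the floor gives the interval index.
        t = (x - self.left) / self.width
        k = t.floor().clamp(0, self.num_intervals - 1).long()
        y0, y1 = self.points[k], self.points[k + 1]
        # Linear interpolation inside the region; inputs outside the region
        # extrapolate linearly with the slope of the boundary interval.
        return y0 + (t - k.float()) * (y1 - y0)

act = PWLU(num_intervals=8)
y = act(torch.randn(4, 16))  # output has the same shape as the input

Because each output depends on only two breakpoint values, inference reduces to one index computation and one interpolation per element, which is consistent with the abstract's claim of inference efficiency.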


