Exploring Alternatives to Softmax Function

11/23/2020
by Kunal Banerjee et al.

The softmax function is widely used in artificial neural networks for multiclass classification, multilabel classification, attention mechanisms, etc. However, its efficacy is often questioned in the literature. The log-softmax loss has been shown to belong to a more generic class of loss functions, called the spherical family, and its member log-Taylor softmax loss is arguably the best alternative in this class. In another approach, which aims to enhance the discriminative nature of the softmax function, soft-margin softmax (SM-softmax) has been proposed as the most suitable alternative. In this work, we investigate Taylor softmax, SM-softmax and our proposed SM-Taylor softmax, an amalgamation of the earlier two functions, as alternatives to the softmax function. Furthermore, we explore the effect of expanding Taylor softmax up to ten terms (the original work proposed expanding only to two terms), along with the ramifications of treating Taylor softmax as a finite or infinite series during backpropagation. Our experiments on the image classification task across different datasets reveal that there is always a configuration of the SM-Taylor softmax function that outperforms the normal softmax function and its other alternatives.
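To make the compared functions concrete, below is a minimal NumPy sketch of Taylor softmax and SM-Taylor softmax for a single logit vector. The function names, the margin value m, and the default expansion order n are illustrative assumptions, not the paper's reference implementation; the key ideas are that Taylor softmax replaces exp with its Taylor polynomial (even orders keep the polynomial strictly positive, e.g. 1 + z + z²/2 > 0 for n = 2), and the soft-margin variant subtracts a margin from the target-class logit before normalization.

```python
import numpy as np
from math import factorial

def taylor_exp(z, n=2):
    """Taylor polynomial of exp(z) up to order n: sum_{i=0}^{n} z^i / i!.

    For even n the polynomial stays strictly positive for all real z,
    so it can replace exp inside a softmax-style normalization.
    """
    return sum(z ** i / factorial(i) for i in range(n + 1))

def taylor_softmax(logits, n=2):
    # Normalize the Taylor-approximated exponentials into probabilities.
    f = taylor_exp(np.asarray(logits, dtype=float), n)
    return f / f.sum()

def sm_taylor_softmax(logits, target, m=0.3, n=2):
    # Soft-margin variant: subtract a margin m from the target-class
    # logit before normalizing, which forces the network to produce a
    # larger raw score for the true class to achieve the same probability.
    z = np.asarray(logits, dtype=float).copy()
    z[target] -= m
    return taylor_softmax(z, n)

# Example: the margin lowers the probability assigned to the target class,
# so minimizing the training loss pushes its logit above the others.
z = [2.0, 1.0, 0.1]
print(taylor_softmax(z, n=2))           # plain Taylor softmax
print(sm_taylor_softmax(z, target=0))   # with soft margin on class 0
```

Treating the Taylor polynomial as a finite series means backpropagating through the truncated sum exactly as written above; treating it as an infinite series would instead use the gradient of the true exp, which is one of the design choices the paper examines.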


Related research

11/16/2015 · An Exploration of Softmax Alternatives Belonging to the Spherical Loss Family
In a multi-class classification problem, it is standard to model the out...

05/10/2018 · Ensemble Soft-Margin Softmax Loss for Image Classification
Softmax loss is arguably one of the most popular losses to train CNN mod...

12/23/2021 · Sparse-softmax: A Simpler and Faster Alternative Softmax Transformation
The softmax function is widely used in artificial neural networks for th...

05/08/2018 · Online normalizer calculation for softmax
The Softmax function is ubiquitous in machine learning, multiple previou...

08/16/2021 · Escaping the Gradient Vanishing: Periodic Alternatives of Softmax in Attention Mechanism
Softmax is widely used in neural networks for multiclass classification,...

04/29/2016 · The Z-loss: a shift and scale invariant classification loss belonging to the Spherical Family
Despite being the standard loss function to train multi-class neural net...

04/11/2023 · r-softmax: Generalized Softmax with Controllable Sparsity Rate
Nowadays artificial neural network models achieve remarkable results in ...
