Sparse-softmax: A Simpler and Faster Alternative Softmax Transformation

12/23/2021
by Shaoshi Sun, et al.

The softmax function is widely used in artificial neural networks for multiclass classification problems: the softmax transformation constrains the outputs to be positive and sum to one, and the corresponding loss function allows the model to be optimized by the maximum likelihood principle. However, in high-dimensional classification, softmax leaves the loss function a large margin over which to optimize, which degrades performance to some extent. In this paper, we provide an empirical study of a simple and concise softmax variant, namely sparse-softmax, to alleviate the problems that the traditional softmax encounters in high-dimensional classification. We evaluate our approach on several interdisciplinary tasks; the experimental results show that sparse-softmax is simpler, faster, and produces better results than the baseline models.
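To make the contrast concrete, below is a minimal NumPy sketch. The standard softmax is exactly as the abstract describes; the sparse-softmax body is an assumption based on the abstract's description of a sparse variant for many-class outputs: it normalizes only over the k largest logits and zeros out the rest, so probability mass (and gradient signal) is confined to a small candidate set. The function name sparse_softmax and the parameter k are illustrative, not taken from the paper.

import numpy as np

def softmax(z):
    # Standard softmax: positive outputs that sum to one.
    z = z - np.max(z)           # subtract the max for numerical stability
    e = np.exp(z)
    return e / e.sum()

def sparse_softmax(z, k=5):
    # Hypothetical top-k sparse variant (assumption, not the paper's exact
    # formulation): keep only the k largest logits, normalize over them,
    # and set every other class probability to zero.
    z = np.asarray(z, dtype=float)
    topk = np.argpartition(z, -k)[-k:]   # indices of the k largest logits
    p = np.zeros_like(z)
    e = np.exp(z[topk] - z[topk].max())  # stabilized exponentials over top-k
    p[topk] = e / e.sum()
    return p

logits = np.array([2.0, 1.0, 0.1, -1.2, 0.5, 3.3])
print(softmax(logits))               # dense distribution over all classes
print(sparse_softmax(logits, k=2))   # mass only on the two largest logits

Under this reading, the dense softmax assigns a small but nonzero probability to every class, so with many classes the loss keeps pushing on a long tail of near-zero probabilities, whereas the sparse version restricts both the output and the optimization to a few plausible classes.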


