Online normalizer calculation for softmax

05/08/2018
by Maxim Milakov, et al.

The Softmax function is ubiquitous in machine learning, and multiple previous works have suggested faster alternatives to it. In this paper we propose a way to compute the classical Softmax with fewer memory accesses and hypothesize that this reduction in memory accesses should improve Softmax performance on actual hardware. The benchmarks confirm this hypothesis: Softmax accelerates by up to 1.3x, and Softmax+TopK combined by up to 5x.
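
The "online normalizer calculation" of the title refers to building the safe-Softmax normalizer in a single pass over the input: a running maximum and a running sum of exponentials are updated together, rather than in two separate passes (one for the maximum, one for the sum). Below is a minimal NumPy sketch of that recurrence, written from the abstract's description; the function name and structure are illustrative and not the paper's reference implementation.

    import numpy as np

    def online_softmax(x):
        # Running maximum m and running normalizer d = sum_j exp(x_j - m)
        # are maintained together, so the input is read only once to
        # compute the normalizer instead of twice.
        m = float("-inf")
        d = 0.0
        for xi in x:
            m_new = max(m, float(xi))
            # Rescale the old partial sum to the new maximum, then add the new term.
            d = d * np.exp(m - m_new) + np.exp(float(xi) - m_new)
            m = m_new
        # A final pass writes the normalized outputs.
        return np.exp(np.asarray(x, dtype=float) - m) / d

    # Agrees with the classical two-pass safe Softmax:
    x = np.array([1.0, -2.0, 3.0, 0.5])
    ref = np.exp(x - x.max()) / np.exp(x - x.max()).sum()
    assert np.allclose(online_softmax(x), ref)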

