Unbiased scalable softmax optimization

03/22/2018
by Francois Fagan, et al.

Recent neural network and language models rely on softmax distributions with an extremely large number of categories. Since calculating the softmax normalizing constant in this context is prohibitively expensive, there is a growing literature of efficiently computable but biased estimates of the softmax. In this paper we propose the first unbiased algorithms for maximizing the softmax likelihood whose work per iteration is independent of the number of classes and datapoints (and no extra work is required at the end of each epoch). We show that our proposed unbiased methods comprehensively outperform the state-of-the-art on seven real-world datasets.
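To make the cost problem concrete: for a single example with K output classes, the log-likelihood is the target logit minus the log of a sum of exponentials over all K classes, so every exact gradient step is O(K). The snippet below is a minimal NumPy sketch, not the paper's algorithm; the array sizes, class index, and sample counts are hypothetical. It shows that O(K) normalizer and why the obvious shortcut of subsampling the sum gives a biased log-likelihood estimate: the rescaled partial sum is unbiased for the normalizer Z, but by Jensen's inequality E[log Z_hat] <= log Z.

```python
# Illustration only (not the paper's method): exact softmax log-likelihood is
# O(K) per example, and naively subsampling the normalizer biases the estimate.
import numpy as np

rng = np.random.default_rng(0)
K = 50_000            # number of classes (hypothetical, "extremely large")
logits = rng.normal(size=K)
y = 123               # observed class index (hypothetical)

# Exact log-likelihood: logits[y] - log sum_k exp(logits[k]).
# The normalizer touches all K classes -- O(K) work per example.
log_Z = np.log(np.sum(np.exp(logits - logits.max()))) + logits.max()
exact_ll = logits[y] - log_Z

# Naive shortcut: sample m classes uniformly and rescale the partial sum.
# Z_hat is unbiased for Z, but log is concave, so E[log Z_hat] <= log Z and
# the log-likelihood estimate logits[y] - log(Z_hat) is biased upward.
m = 100
estimates = []
for _ in range(2_000):
    idx = rng.choice(K, size=m, replace=False)
    Z_hat = (K / m) * np.sum(np.exp(logits[idx]))
    estimates.append(logits[y] - np.log(Z_hat))

print(f"exact log-likelihood : {exact_ll:.4f}")
print(f"mean sampled estimate: {np.mean(estimates):.4f}  (systematically higher)")
```

This bias in cheap sampled estimators is exactly the gap the abstract points at; the paper's contribution is constructing estimators that stay unbiased while keeping per-iteration work independent of the number of classes and datapoints.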


Related research

Effectiveness of Hierarchical Softmax in Large Scale Classification Tasks (12/13/2018)
Typically, Softmax is used in the final layer of a neural network to get...

How to Protect Copyright Data in Optimization of Large Language Models? (08/23/2023)
Large language models (LLMs) and generative AI have played a transformat...

Sampled Softmax with Random Fourier Features (07/24/2019)
The computational cost of training with softmax cross entropy loss grows...

Zero-th Order Algorithm for Softmax Attention Optimization (07/17/2023)
Large language models (LLMs) have brought about significant transformati...

A Constant-time Adaptive Negative Sampling (12/31/2020)
Softmax classifiers with a very large number of classes naturally occur ...

Doubly Sparse: Sparse Mixture of Sparse Experts for Efficient Softmax Inference (01/30/2019)
Computations for the softmax function are significantly expensive when t...

Towards Unbiased Exploration in Partial Label Learning (07/02/2023)
We consider learning a probabilistic classifier from partially-labelled ...
