Uses and Abuses of the Cross-Entropy Loss: Case Studies in Modern Deep Learning

11/10/2020
by Elliott Gordon-Rodriguez et al.

Modern deep learning is primarily an experimental science, in which empirical advances occasionally come at the expense of probabilistic rigor. Here we focus on one such example: the use of the categorical cross-entropy loss to model data that is not strictly categorical, but rather takes values on the simplex. This practice is standard in neural network architectures with label smoothing and in actor-mimic reinforcement learning, among others. Drawing on the recently discovered continuous-categorical distribution, we propose probabilistically inspired alternatives to these models, providing an approach that is more principled and theoretically appealing. Through careful experimentation, including an ablation study, we identify the potential for these alternatives to outperform the standard approach, thereby highlighting the importance of a proper probabilistic treatment, as well as illustrating some of its failure modes.
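To make the practice under study concrete, the following is a minimal NumPy sketch (not taken from the paper): smoothed_targets and soft_cross_entropy illustrate the standard combination of label smoothing with the categorical cross-entropy, i.e. cross-entropy applied to simplex-valued targets, while continuous_bernoulli_nll sketches the two-class case of the continuous-categorical alternative (the continuous Bernoulli), whose likelihood differs from binary cross-entropy only by a normalizing constant that depends on the predicted parameter. All function names, the smoothing value eps, and the example data are illustrative choices, not the authors' code.

import numpy as np

def log_softmax(z):
    z = z - z.max(axis=-1, keepdims=True)
    return z - np.log(np.exp(z).sum(axis=-1, keepdims=True))

def smoothed_targets(labels, num_classes, eps=0.1):
    # Label smoothing turns hard one-hot labels into points in the
    # interior of the simplex, i.e. simplex-valued targets.
    one_hot = np.eye(num_classes)[labels]
    return (1.0 - eps) * one_hot + eps / num_classes

def soft_cross_entropy(targets, logits):
    # The common practice: categorical cross-entropy -sum_k t_k log p_k
    # applied to targets that are not one-hot but lie on the simplex.
    return -(targets * log_softmax(logits)).sum(axis=-1).mean()

def continuous_bernoulli_nll(targets, logits):
    # Two-class continuous categorical (= continuous Bernoulli) negative
    # log-likelihood: binary cross-entropy minus log C(lambda), where
    # C(lambda) = log(lambda / (1 - lambda)) / (2 * lambda - 1) is the
    # normalizing constant, with C(1/2) = 2.
    lam = 1.0 / (1.0 + np.exp(-logits))
    bce = -(targets * np.log(lam) + (1.0 - targets) * np.log(1.0 - lam))
    t = 2.0 * lam - 1.0
    small = np.abs(t) < 1e-6
    ratio = np.log(lam / (1.0 - lam)) / np.where(small, 1.0, t)
    log_norm = np.log(np.where(small, 2.0, ratio))   # log C(lambda)
    return (bce - log_norm).mean()

# Example: 4 samples, 3 classes for the soft cross-entropy;
# scalar targets in (0, 1) for the two-class continuous categorical.
rng = np.random.default_rng(0)
logits = rng.normal(size=(4, 3))
targets = smoothed_targets(np.array([0, 2, 1, 0]), num_classes=3)
print(soft_cross_entropy(targets, logits))
print(continuous_bernoulli_nll(rng.uniform(size=4), rng.normal(size=4)))

In this two-class sketch the only difference between the two losses is the log C(lambda) term; the general K-class continuous categorical has a more involved normalizing constant, which is where a proper probabilistic treatment departs from plain cross-entropy on soft targets.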


Related research:

08/15/2022 - Rényi Cross-Entropy Measures for Common Distributions and Processes with Memory
Two Rényi-type generalizations of the Shannon cross-entropy, the Rényi c...

06/14/2022 - Loss Functions for Classification using Structured Entropy
Cross-entropy loss is the standard metric used to train classification m...

01/25/2019 - Deep Learning on Small Datasets without Pre-Training using Cosine Loss
Two things seem to be indisputable in the contemporary deep learning dis...

06/12/2020 - Evaluation of Neural Architectures Trained with Square Loss vs Cross-Entropy in Classification Tasks
Modern neural architectures for classification tasks are trained using t...

02/20/2020 - The continuous categorical: a novel simplex-valued exponential family
Simplex-valued data appear throughout statistics and machine learning, f...

11/22/2019 - Instance Cross Entropy for Deep Metric Learning
Loss functions play a crucial role in deep metric learning thus a variet...

06/05/2019 - Learning to Rank for Plausible Plausibility
Researchers illustrate improvements in contextual encoding strategies vi...
