
Uses and Abuses of the Cross-Entropy Loss: Case Studies in Modern Deep Learning

11/10/2020
by Elliott Gordon-Rodriguez, et al. (Layer 6 AI, Columbia University)

Modern deep learning is primarily an experimental science, in which empirical advances occasionally come at the expense of probabilistic rigor. Here we focus on one such example: the use of the categorical cross-entropy loss to model data that is not strictly categorical, but rather takes values on the simplex. This practice is standard in neural network architectures with label smoothing and actor-mimic reinforcement learning, amongst others. Drawing on the recently discovered continuous-categorical distribution, we propose probabilistically inspired alternatives to these models, providing an approach that is more principled and theoretically appealing. Through careful experimentation, including an ablation study, we identify the potential for outperformance in these models, thereby highlighting the importance of a proper probabilistic treatment, as well as illustrating some of its failure modes.
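To make the contrast concrete, below is a minimal PyTorch sketch of the two losses the abstract refers to: the categorical cross-entropy applied to simplex-valued (e.g. label-smoothed) targets, and a continuous-categorical negative log-likelihood in the spirit of the proposed alternatives. The function names are illustrative rather than the authors' code, and the closed-form normalizer shown assumes distinct predicted probabilities; it follows from the divided-difference identity for the exponential over the simplex and becomes numerically delicate when probabilities nearly coincide.

```python
import torch


def soft_label_cross_entropy(logits, targets):
    """Categorical cross-entropy applied to simplex-valued targets.

    This is the common practice the paper examines: the same loss used for
    one-hot labels is applied to soft targets (e.g. label-smoothed labels),
    even though it is not a proper likelihood for simplex-valued data.
    """
    log_probs = torch.log_softmax(logits, dim=-1)
    return -(targets * log_probs).sum(dim=-1).mean()


def continuous_categorical_nll(logits, targets, eps=1e-8):
    """Negative log-likelihood under a continuous-categorical model (sketch).

    The unnormalized log-density matches the cross-entropy term above; the
    difference is the log-normalizing constant log C(lambda). The closed form
    used here,

        1 / C(lambda) = sum_k lambda_k / prod_{j != k} (log lambda_k - log lambda_j),

    assumes the lambda_k are distinct and is numerically unstable when they
    are nearly equal, so treat this as an illustrative sketch rather than a
    production implementation.
    """
    probs = torch.softmax(logits, dim=-1)          # lambda, a point on the simplex
    log_probs = torch.log(probs + eps)
    # Pairwise differences log(lambda_k) - log(lambda_j); set the diagonal to 1
    # so the product over j effectively skips j == k.
    diff = log_probs.unsqueeze(-1) - log_probs.unsqueeze(-2)
    k = probs.shape[-1]
    diff = diff + torch.eye(k, device=logits.device, dtype=logits.dtype)
    inv_norm = (probs / diff.prod(dim=-1)).sum(dim=-1)   # 1 / C(lambda)
    log_norm = -torch.log(inv_norm)                      # log C(lambda)
    cross_entropy = -(targets * log_probs).sum(dim=-1)
    return (cross_entropy - log_norm).mean()
```

Note that the two losses differ only by the log-normalizing constant, which depends on the predicted probabilities and therefore changes the gradients, rather than adding a constant offset.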


Related research

08/15/2022 · Rényi Cross-Entropy Measures for Common Distributions and Processes with Memory
Two Rényi-type generalizations of the Shannon cross-entropy, the Rényi c...

06/14/2022 · Loss Functions for Classification using Structured Entropy
Cross-entropy loss is the standard metric used to train classification m...

01/25/2019 · Deep Learning on Small Datasets without Pre-Training using Cosine Loss
Two things seem to be indisputable in the contemporary deep learning dis...

06/12/2020 · Evaluation of Neural Architectures Trained with Square Loss vs Cross-Entropy in Classification Tasks
Modern neural architectures for classification tasks are trained using t...

02/20/2020 · The continuous categorical: a novel simplex-valued exponential family
Simplex-valued data appear throughout statistics and machine learning, f...

11/22/2019 · Instance Cross Entropy for Deep Metric Learning
Loss functions play a crucial role in deep metric learning thus a variet...

06/05/2019 · Learning to Rank for Plausible Plausibility
Researchers illustrate improvements in contextual encoding strategies vi...