What's in a Loss Function for Image Classification?

10/30/2020
by   Simon Kornblith, et al.
31

It is common to use the softmax cross-entropy loss to train neural networks on classification datasets where a single class label is assigned to each example. However, it has been shown that modifying softmax cross-entropy with label smoothing or regularizers such as dropout can lead to higher performance. This paper studies a variety of loss functions and output layer regularization strategies on image classification tasks. We observe meaningful differences in model predictions, accuracy, calibration, and out-of-distribution robustness for networks trained with different objectives. However, differences in hidden representations of networks trained with different objectives are restricted to the last few layers; representational similarity reveals no differences among network layers that are not close to the output. We show that all objectives that improve over vanilla softmax loss produce greater class separation in the penultimate layer of the network, which potentially accounts for improved performance on the original task, but results in features that transfer worse to other tasks.

READ FULL TEXT

Authors

page 7

page 17

page 18

08/30/2019

Handwritten Chinese Character Recognition by Convolutional Neural Network and Similarity Ranking

Convolution Neural Networks (CNN) have recently achieved state-of-the ar...
02/19/2020

Being Bayesian about Categorical Probability

Neural networks utilize the softmax as a building block in classificatio...
08/31/2021

Chi-square Loss for Softmax: an Echo of Neural Network Structure

Softmax working with cross-entropy is widely used in classification, whi...
12/23/2018

Leveraging Class Similarity to Improve Deep Neural Network Robustness

Traditionally artificial neural networks (ANNs) are trained by minimizin...
05/01/2019

On Expected Accuracy

We empirically investigate the (negative) expected accuracy as an altern...
09/22/2020

Role of Orthogonality Constraints in Improving Properties of Deep Networks for Image Classification

Standard deep learning models that employ the categorical cross-entropy ...
03/08/2018

Rethinking Feature Distribution for Loss Functions in Image Classification

We propose a large-margin Gaussian Mixture (L-GM) loss for deep neural n...
This week in AI

Get the week's most popular data science and artificial intelligence research sent straight to your inbox every Saturday.