What's in a Loss Function for Image Classification?

10/30/2020
by Simon Kornblith, et al.

It is common to use the softmax cross-entropy loss to train neural networks on classification datasets where each example is assigned a single class label. However, modifying softmax cross-entropy with label smoothing, or adding regularizers such as dropout, has been shown to improve performance. This paper studies a variety of loss functions and output layer regularization strategies on image classification tasks. We observe meaningful differences in model predictions, accuracy, calibration, and out-of-distribution robustness for networks trained with different objectives. However, differences in the hidden representations of networks trained with different objectives are restricted to the last few layers: representational similarity analysis reveals no differences among layers that are not close to the output. We show that all objectives that improve over vanilla softmax loss produce greater class separation in the penultimate layer of the network, which potentially accounts for improved performance on the original task, but results in features that transfer worse to other tasks.
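The abstract refers to label smoothing and to class separation in the penultimate layer without showing either concretely. Below is a minimal sketch, assuming a PyTorch setup; smoothed_cross_entropy and class_separation are illustrative names, and the separation score is a simplified stand-in for the paper's measure, not its released code.

```python
import torch
import torch.nn.functional as F

def smoothed_cross_entropy(logits, targets, alpha=0.1):
    """Softmax cross-entropy against label-smoothed targets.

    With smoothing strength alpha, the one-hot target becomes a mixture of
    the one-hot vector and the uniform distribution over classes; alpha = 0
    recovers vanilla softmax cross-entropy.
    """
    log_probs = F.log_softmax(logits, dim=-1)
    # -log p(true class) for each example
    nll = -log_probs.gather(-1, targets.unsqueeze(-1)).squeeze(-1)
    # (1/K) * sum_k -log p_k, the uniform-target component
    uniform = -log_probs.mean(dim=-1)
    return ((1.0 - alpha) * nll + alpha * uniform).mean()

def class_separation(features, labels):
    """Rough proxy for penultimate-layer class separation: one minus the
    ratio of mean within-class cosine distance to mean overall cosine
    distance. Higher values mean tighter, better-separated classes.
    """
    features = F.normalize(features, dim=-1)   # cosine geometry
    dists = 1.0 - features @ features.T        # pairwise cosine distances
    same_class = labels.unsqueeze(0) == labels.unsqueeze(1)
    return 1.0 - dists[same_class].mean() / dists.mean()

# Toy usage: random logits and features for a 10-class problem.
logits = torch.randn(8, 10)
labels = torch.randint(0, 10, (8,))
loss = smoothed_cross_entropy(logits, labels, alpha=0.1)
score = class_separation(torch.randn(8, 64), labels)
```

Per the abstract's finding, objectives that improve over vanilla softmax (label smoothing among them) increase penultimate-layer class separation while producing features that transfer worse; the two functions above correspond to one such training knob and the quantity being measured.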


Related research

08/30/2019 - Handwritten Chinese Character Recognition by Convolutional Neural Network and Similarity Ranking
Convolution Neural Networks (CNN) have recently achieved state-of-the ar...

02/19/2020 - Being Bayesian about Categorical Probability
Neural networks utilize the softmax as a building block in classificatio...

08/31/2021 - Chi-square Loss for Softmax: an Echo of Neural Network Structure
Softmax working with cross-entropy is widely used in classification, whi...

11/14/2022 - Interpreting Bias in the Neural Networks: A Peek Into Representational Similarity
Neural networks trained on standard image classification data sets are s...

05/01/2019 - On Expected Accuracy
We empirically investigate the (negative) expected accuracy as an altern...

09/22/2020 - Role of Orthogonality Constraints in Improving Properties of Deep Networks for Image Classification
Standard deep learning models that employ the categorical cross-entropy ...

06/16/2019 - Mixture separability loss in a deep convolutional network for image classification
In machine learning, the cost function is crucial because it measures ho...
