On the Separability of Classes with the Cross-Entropy Loss Function

09/16/2019
by Rudrajit Das, et al.

In this paper, we study the separability of classes under the cross-entropy loss function for classification problems. We theoretically analyze the intra-class distance and the inter-class distance (i.e. the distance between two points belonging to the same class and to different classes, respectively) in the feature space, that is, the space of representations learnt by a neural network. Specifically, we consider an arbitrary network architecture whose final layer is fully connected with Softmax activation and which is trained using the cross-entropy loss.

We derive expressions for the value and the distribution of the squared L2 norm of the product of a network-dependent matrix and a random intra-class or inter-class distance vector (i.e. the vector between two points belonging to the same class or to different classes, respectively) in the learnt feature space just before the Softmax activation, as a function of the cross-entropy loss value. The main result of our analysis is a lower bound on the probability that the inter-class distance exceeds the intra-class distance in this feature space, again as a function of the loss value. We obtain this bound by combining some empirical statistical observations, mild assumptions, and sound theoretical analysis. In line with intuition, this probability decreases as the loss value increases; that is, the classes are better separated when the loss value is low. To the best of our knowledge, this is the first theoretical work attempting to explain the separability of classes in the feature space learnt by neural networks trained with the cross-entropy loss function.
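To make the quantities concrete, the sketch below empirically estimates the probability that the inter-class distance exceeds the intra-class distance, the quantity the paper lower-bounds. Note this is not the paper's analysis: the "features" here are synthetic Gaussians standing in for pre-Softmax representations, and the probability is estimated by sampling rather than derived from the loss value.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy pre-Softmax features for two classes (synthetic Gaussians,
# not a trained network's representations).
class_a = rng.normal(loc=0.0, scale=1.0, size=(100, 8))
class_b = rng.normal(loc=3.0, scale=1.0, size=(100, 8))

def pairwise_sq_dists(x, y):
    # Squared L2 distance between every row of x and every row of y.
    diff = x[:, None, :] - y[None, :, :]
    return np.sum(diff ** 2, axis=-1)

# Intra-class distances: pairs of points from the same class
# (upper triangle excludes self-pairs and duplicates).
intra = np.concatenate([
    pairwise_sq_dists(class_a, class_a)[np.triu_indices(100, k=1)],
    pairwise_sq_dists(class_b, class_b)[np.triu_indices(100, k=1)],
])

# Inter-class distances: pairs of points from different classes.
inter = pairwise_sq_dists(class_a, class_b).ravel()

# Empirical analogue of the paper's quantity of interest:
# P(inter-class distance > intra-class distance).
p = np.mean(rng.choice(inter, 10000) > rng.choice(intra, 10000))
print(f"P(inter > intra) ~ {p:.3f}")
```

With well-separated class means, this probability is close to 1; shrinking the gap between `loc=0.0` and `loc=3.0` drives it toward 0.5, mirroring the paper's observation that separability degrades as the loss grows.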


