Orthogonal Projection Loss

by Kanchana Ranasinghe, et al.

Deep neural networks have achieved remarkable performance on a range of classification tasks, with softmax cross-entropy (CE) loss emerging as the de facto objective function. The CE loss encourages the features of a class to have a higher projection score on the true class vector than on the negative class vectors. However, this is a relative constraint and does not explicitly force features of different classes to be well-separated. Motivated by the observation that the ground-truth class representations in CE loss are orthogonal (one-hot encoded vectors), we develop a novel loss function termed `Orthogonal Projection Loss' (OPL) which imposes orthogonality in the feature space. OPL augments the properties of CE loss and directly enforces inter-class separation alongside intra-class clustering in the feature space through orthogonality constraints at the mini-batch level. Compared with other alternatives to CE, OPL offers unique advantages: it introduces no additional learnable parameters, requires no careful negative mining, and is not sensitive to batch size. Given the plug-and-play nature of OPL, we evaluate it on a diverse range of tasks including image recognition (CIFAR-100), large-scale classification (ImageNet), domain generalization (PACS) and few-shot learning (miniImageNet, CIFAR-FS, tiered-ImageNet and Meta-dataset) and demonstrate its effectiveness across the board. Furthermore, OPL offers better robustness against practical nuisances such as adversarial attacks and label noise. Code is available at: https://github.com/kahnchana/opl.
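The core idea can be sketched with a few lines of NumPy: on each mini-batch, pull same-class features toward cosine similarity 1 (clustering) and push different-class features toward cosine similarity 0 (orthogonality). This is only an illustrative sketch of the mini-batch orthogonality constraint described in the abstract; the function name and the `gamma` weighting term are assumptions for illustration, and the authors' reference implementation lives in the linked repository.

```python
import numpy as np

def orthogonal_projection_loss(features, labels, gamma=0.5):
    """Illustrative sketch of a mini-batch orthogonality loss.

    features: (B, D) float array of feature vectors for one mini-batch.
    labels:   (B,) int array of class labels.
    gamma:    weight on the inter-class term (hypothetical default).
    """
    # L2-normalise so that dot products become cosine similarities.
    f = features / np.linalg.norm(features, axis=1, keepdims=True)
    sim = f @ f.T  # (B, B) pairwise cosine similarity

    # Masks for same-class and different-class pairs (self-pairs excluded).
    same = (labels[:, None] == labels[None, :]).astype(float)
    np.fill_diagonal(same, 0.0)
    diff = 1.0 - same - np.eye(len(labels))

    # Mean intra-class and inter-class cosine similarity over the batch.
    s = (sim * same).sum() / max(same.sum(), 1.0)
    d = (sim * diff).sum() / max(diff.sum(), 1.0)

    # Drive s toward 1 (intra-class clustering) and |d| toward 0
    # (inter-class orthogonality).
    return (1.0 - s) + gamma * np.abs(d)
```

In this sketch a batch whose same-class features are perfectly aligned and whose different-class features are mutually orthogonal incurs zero loss; any deviation in either direction increases it, which would be added to the usual CE objective during training.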




