Intriguing Properties of Contrastive Losses

by Ting Chen et al.

Contrastive loss and its variants have become very popular recently for learning visual representations without supervision. In this work, we first generalize the standard contrastive loss based on cross entropy to a broader family of losses that share an abstract form of ℒ_alignment + λℒ_distribution, where hidden representations are encouraged to (1) be aligned under some transformations/augmentations, and (2) match a prior distribution of high entropy. We show that various instantiations of the generalized loss perform similarly in the presence of a multi-layer non-linear projection head, and the temperature scaling (τ) widely used in the standard contrastive loss is (within a range) inversely related to the weighting (λ) between the two loss terms. We then study an intriguing phenomenon of feature suppression among competing features shared across augmented views, such as "color distribution" vs "object class". We construct datasets with explicit and controllable competing features, and show that, for contrastive learning, a few bits of easy-to-learn shared features can suppress, and even fully prevent, the learning of other sets of competing features. Interestingly, this characteristic is much less detrimental in autoencoders based on a reconstruction loss. Existing contrastive learning methods critically rely on data augmentation to favor certain sets of features over others, while one may wish that a network would learn all competing features as well as its capacity allows.
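To make the abstract form ℒ_alignment + λℒ_distribution concrete, here is a minimal sketch of one possible instantiation. The alignment term and the Gaussian-potential uniformity term used below (in the style of Wang & Isola's alignment/uniformity decomposition) are illustrative choices, not the paper's exact formulation; the function name and the value of λ are assumptions for this example.

```python
import numpy as np

def generalized_contrastive_loss(z1, z2, lam=1.0):
    """Illustrative sketch of L_alignment + lam * L_distribution.

    z1, z2: L2-normalized embeddings of two augmented views, shape (n, d).
    The alignment term pulls each positive pair together; the distribution
    term pushes the pooled embeddings toward a high-entropy prior (here,
    uniform on the hypersphere, via a Gaussian-potential repulsion).
    """
    n = z1.shape[0]
    # Alignment: mean squared distance between positive pairs.
    l_align = np.mean(np.sum((z1 - z2) ** 2, axis=1))
    # Distribution: log of the average Gaussian potential over all
    # distinct pairs of pooled embeddings; minimized by a uniform spread.
    z = np.concatenate([z1, z2], axis=0)
    sq_dists = np.sum((z[:, None, :] - z[None, :, :]) ** 2, axis=-1)
    off_diag = ~np.eye(2 * n, dtype=bool)  # exclude self-pairs
    l_dist = np.log(np.mean(np.exp(-2.0 * sq_dists[off_diag])))
    return l_align + lam * l_dist
```

Note that in this form λ plays the role the temperature τ plays in the standard cross-entropy contrastive loss: a smaller τ (stronger repulsion between negatives) corresponds to a larger weight on the distribution-matching term, consistent with the inverse relationship described above.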


