CLOOB: Modern Hopfield Networks with InfoLOOB Outperform CLIP

10/21/2021
by Andreas Fürst, et al.

Contrastive learning with the InfoNCE objective is exceptionally successful in various self-supervised learning tasks. Recently, the CLIP model yielded impressive results on zero-shot transfer learning when using InfoNCE for learning visual representations from natural language supervision. However, InfoNCE, as a lower bound on the mutual information, has been shown to perform poorly for high mutual information. In contrast, the InfoLOOB upper bound (leave one out bound) works well for high mutual information but suffers from large variance and instabilities. We introduce "Contrastive Leave One Out Boost" (CLOOB), in which modern Hopfield networks boost learning with the InfoLOOB objective. Modern Hopfield networks replace the original embeddings with retrieved embeddings in the InfoLOOB objective. The retrieved embeddings give InfoLOOB two assets. First, they stabilize InfoLOOB, since they are less noisy and more similar to one another than the original embeddings. Second, they are enriched by correlations, since the covariance structure of the embeddings is reinforced through retrievals. We compare CLOOB to CLIP after learning on the Conceptual Captions and YFCC datasets with respect to their zero-shot transfer learning performance on other datasets. CLOOB consistently outperforms CLIP at zero-shot transfer learning across all considered architectures and datasets.
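
To make the mechanism concrete, here is a minimal PyTorch sketch of the two ingredients the abstract describes: a one-step modern Hopfield retrieval (a softmax readout over the batch embeddings) and the InfoLOOB objective, in which the positive pair is excluded from the denominator. The function names, the way retrieved anchors and positives are paired in cloob_loss, and the values beta=8.0 and inv_tau=30.0 are illustrative assumptions for this sketch, not the authors' reference implementation.

```python
import torch
import torch.nn.functional as F


def hopfield_retrieval(state, stored, beta=8.0):
    # One update step of a modern (continuous) Hopfield network:
    # each state pattern is replaced by a softmax-weighted average of
    # the stored patterns, with similarities scaled by beta.
    attn = F.softmax(beta * state @ stored.t(), dim=-1)  # (n, m)
    return attn @ stored                                  # (n, d)


def info_loob(anchors, positives, inv_tau=30.0):
    # InfoLOOB ("leave one out" bound): as in InfoNCE, the matching
    # pair appears in the numerator, but it is excluded from the
    # denominator, which sums only over the negatives.
    sims = inv_tau * anchors @ positives.t()              # (n, n)
    pos = sims.diagonal()
    mask = torch.eye(sims.size(0), dtype=torch.bool, device=sims.device)
    neg = sims.masked_fill(mask, float("-inf"))
    return (torch.logsumexp(neg, dim=-1) - pos).mean()


def cloob_loss(image_emb, text_emb, beta=8.0, inv_tau=30.0):
    # Sketch of the CLOOB objective as described in the abstract:
    # normalize the embeddings, retrieve both modalities from the image
    # memory and from the text memory, re-normalize, and apply InfoLOOB
    # to the retrieved embeddings instead of the original ones.
    x = F.normalize(image_emb, dim=-1)
    y = F.normalize(text_emb, dim=-1)

    # Retrievals from the image memory (stored patterns = x).
    u_x = F.normalize(hopfield_retrieval(x, x, beta), dim=-1)
    u_y = F.normalize(hopfield_retrieval(y, x, beta), dim=-1)
    # Retrievals from the text memory (stored patterns = y).
    v_x = F.normalize(hopfield_retrieval(x, y, beta), dim=-1)
    v_y = F.normalize(hopfield_retrieval(y, y, beta), dim=-1)

    return info_loob(u_x, u_y, inv_tau) + info_loob(v_y, v_x, inv_tau)
```

Excluding the positive pair from the denominator avoids InfoNCE's saturation at high mutual information, and feeding the loss with retrieved (softmax-averaged) embeddings is what the abstract credits with stabilizing the otherwise high-variance InfoLOOB bound while reinforcing the covariance structure of the embeddings.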
