CLOOB: Modern Hopfield Networks with InfoLOOB Outperform CLIP

10/21/2021
by Andreas Fürst, et al.

Contrastive learning with the InfoNCE objective is exceptionally successful in various self-supervised learning tasks. Recently, the CLIP model yielded impressive results on zero-shot transfer learning when using InfoNCE for learning visual representations from natural language supervision. However, InfoNCE, as a lower bound on the mutual information, has been shown to perform poorly for high mutual information. In contrast, the InfoLOOB upper bound (leave one out bound) works well for high mutual information but suffers from large variance and instabilities. We introduce "Contrastive Leave One Out Boost" (CLOOB), where modern Hopfield networks boost learning with the InfoLOOB objective. Modern Hopfield networks replace the original embeddings by retrieved embeddings in the InfoLOOB objective. The retrieved embeddings give InfoLOOB two advantages. First, the retrieved embeddings stabilize InfoLOOB, since they are less noisy and more similar to one another than the original embeddings. Second, they are enriched by correlations, since the covariance structure of the embeddings is reinforced through retrievals. We compare CLOOB to CLIP after training on the Conceptual Captions and YFCC datasets with respect to their zero-shot transfer learning performance on other datasets. CLOOB consistently outperforms CLIP at zero-shot transfer learning across all considered architectures and datasets.
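To make the mechanism concrete, here is a minimal PyTorch sketch of how Hopfield retrieval can be combined with the InfoLOOB objective, based only on the abstract's description. InfoLOOB differs from InfoNCE solely in the denominator of the contrastive ratio, which sums over the negatives but leaves the positive pair out. The function names, the memory/query pairing, and the hyperparameters beta (retrieval inverse temperature) and inv_tau are illustrative assumptions, not the authors' reference implementation.

```python
import torch
import torch.nn.functional as F


def hopfield_retrieval(state, memory, beta=8.0):
    # Modern Hopfield network retrieval step: each query (state pattern)
    # is replaced by a softmax-weighted average of the stored patterns.
    # Averaging over similar patterns denoises the embeddings and
    # reinforces their covariance structure.
    attn = F.softmax(beta * state @ memory.T, dim=-1)  # (N, N) retrieval weights
    return F.normalize(attn @ memory, dim=-1)          # (N, d) retrieved embeddings


def info_loob(anchors, samples, inv_tau=30.0):
    # InfoLOOB ("leave one out" bound): identical to InfoNCE except that
    # the positive pair is excluded from the denominator, so the bound
    # does not saturate at high mutual information.
    sims = inv_tau * anchors @ samples.T               # (N, N) similarities
    pos = sims.diagonal()
    mask = torch.eye(sims.size(0), dtype=torch.bool, device=sims.device)
    neg = sims.masked_fill(mask, float("-inf"))        # leave the positive out
    return -(pos - torch.logsumexp(neg, dim=-1)).mean()


def cloob_loss(x, y, beta=8.0, inv_tau=30.0):
    # CLOOB sketch: replace the original image embeddings x and text
    # embeddings y (both L2-normalized, shape (N, d)) by Hopfield
    # retrievals before applying InfoLOOB. Which memory each modality
    # retrieves from is an assumption of this sketch.
    u_x = hopfield_retrieval(x, memory=x, beta=beta)   # image query, image memory
    u_y = hopfield_retrieval(y, memory=x, beta=beta)   # text query, image memory
    v_x = hopfield_retrieval(x, memory=y, beta=beta)   # image query, text memory
    v_y = hopfield_retrieval(y, memory=y, beta=beta)   # text query, text memory
    return info_loob(u_x, u_y, inv_tau) + info_loob(v_y, v_x, inv_tau)
```

Because the retrievals are softmax averages over the batch, they are smoother and more correlated than the raw embeddings, which is what the abstract credits for stabilizing the otherwise high-variance InfoLOOB objective.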


