Contrastive Learning Inverts the Data Generating Process

by Roland S. Zimmermann, et al.

Contrastive learning has recently seen tremendous success in self-supervised learning. So far, however, it is largely unclear why the learned representations generalize so effectively to a large variety of downstream tasks. Here we prove that feedforward models trained with objectives belonging to the commonly used InfoNCE family learn to implicitly invert the underlying generative model of the observed data. While the proofs make certain statistical assumptions about the generative model, we observe empirically that our findings hold even if these assumptions are severely violated. Our theory highlights a fundamental connection between contrastive learning, generative modeling, and nonlinear independent component analysis, thereby furthering our understanding of the learned representations and providing a theoretical foundation for deriving more effective contrastive losses.
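
For readers unfamiliar with the InfoNCE family mentioned above, the following is a minimal sketch of one common member, not necessarily the exact formulation analyzed in the paper. It assumes in-batch negatives, cosine similarity, and a temperature of 0.1; the function name `info_nce_loss` and all parameter names are hypothetical and chosen only for illustration.

```python
import numpy as np

def info_nce_loss(z_anchor, z_positive, temperature=0.1):
    """InfoNCE-style loss with in-batch negatives (illustrative sketch).

    Row i of z_anchor and row i of z_positive form a positive pair;
    every other row of z_positive serves as a negative for anchor i.
    """
    # Assumption of this sketch: L2-normalize so that dot products
    # are cosine similarities.
    za = z_anchor / np.linalg.norm(z_anchor, axis=1, keepdims=True)
    zp = z_positive / np.linalg.norm(z_positive, axis=1, keepdims=True)

    # (N, N) similarity matrix; the diagonal holds the positive pairs.
    logits = za @ zp.T / temperature

    # Numerically stable row-wise log-softmax.
    logits = logits - logits.max(axis=1, keepdims=True)
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))

    # Average negative log-probability assigned to the positive pair.
    return -np.mean(np.diag(log_prob))

# Usage: four mutually orthogonal embeddings paired with themselves.
z = np.eye(4)
loss = info_nce_loss(z, z)
```

In this toy usage each anchor is far more similar to its own positive than to the three negatives, so the loss is close to zero. The loss grows as positives become harder to tell apart from negatives.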




The Power of Contrast for Feature Learning: A Theoretical Analysis

Contrastive learning has achieved state-of-the-art performance in variou...

Understanding Contrastive Learning Requires Incorporating Inductive Biases

Contrastive learning is a popular form of self-supervised learning that ...

Revisiting Contrastive Learning through the Lens of Neighborhood Component Analysis: an Integrated Framework

As a seminal tool in self-supervised representation learning, contrastiv...

Hybrid Discriminative-Generative Training via Contrastive Learning

Contrastive learning and supervised learning have both seen significant ...

Understanding the Role of Nonlinearity in Training Dynamics of Contrastive Learning

While the empirical success of self-supervised learning (SSL) heavily re...

GeomCA: Geometric Evaluation of Data Representations

Evaluating the quality of learned representations without relying on a d...

Contrastive learning of strong-mixing continuous-time stochastic processes

Contrastive learning is a family of self-supervised methods where a mode...