Contrastive Learning as Kernel Approximation

In standard supervised machine learning, every input in the training data must be annotated with a label. While raw data in many application domains is easily obtainable on the Internet, manually labelling this data is prohibitively expensive. To circumvent this issue, contrastive learning methods produce low-dimensional vector representations (also called features) of high-dimensional inputs from large unlabelled datasets. This is done by training with a contrastive loss function, which enforces that similar inputs have high inner product and dissimilar inputs have low inner product in the feature space. Rather than annotating each input individually, it suffices to define a means of sampling pairs of similar and dissimilar inputs. The resulting contrastive features can then be fed as inputs to supervised learning systems trained on much smaller labelled datasets, yielding high accuracy on end tasks of interest. The goal of this thesis is to provide an overview of the current theoretical understanding of contrastive learning, specifically as it pertains to the minimizers of contrastive loss functions and their relationship to prior methods for learning features from unlabelled data. We highlight popular contrastive loss functions whose minimizers implicitly approximate a positive semidefinite (PSD) kernel. PSD kernels are well-studied objects in functional analysis and learning theory that formalize a notion of similarity between elements of a space, and they implicitly define features through the theory of reproducing kernel Hilbert spaces.
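To make the loss concrete, below is a minimal NumPy sketch (ours, not from the thesis) of an InfoNCE-style contrastive loss, one of the popular losses studied in this literature. The function name, temperature value, and toy data are illustrative assumptions; the sketch only shows how inner products of feature vectors enter the loss, so that minimizing it pushes similar pairs toward high inner product and dissimilar pairs toward low inner product. In the kernel-approximation view, at a minimizer the learned feature map f satisfies <f(x), f(x')> ~ k(x, x') for a PSD kernel k.

    # A minimal sketch of an InfoNCE-style contrastive loss (assumed
    # example, not code from the thesis). The anchor/positive pair is
    # "similar"; the negatives are "dissimilar". Minimizing the loss
    # raises the positive inner product and lowers the negative ones.
    import numpy as np

    def info_nce_loss(anchor, positive, negatives, temperature=0.1):
        """anchor, positive: shape (d,); negatives: shape (k, d)."""
        pos = anchor @ positive / temperature      # similar pair: want high
        negs = negatives @ anchor / temperature    # dissimilar pairs: want low
        logits = np.concatenate(([pos], negs))
        # Softmax cross-entropy with the positive pair as the target class:
        return -pos + np.log(np.exp(logits).sum())

    # Toy usage with unit-norm features in dimension d = 8.
    rng = np.random.default_rng(0)
    unit = lambda v: v / np.linalg.norm(v)
    anchor = unit(rng.normal(size=8))
    positive = unit(anchor + 0.1 * rng.normal(size=8))  # perturbed, hence similar
    negatives = np.stack([unit(rng.normal(size=8)) for _ in range(16)])
    print(info_nce_loss(anchor, positive, negatives))

The temperature hyperparameter only rescales the inner products; the essential quantity is the inner product of feature vectors itself, which is exactly what the kernel-approximation perspective analyzes.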
