Sharp Learning Bounds for Contrastive Unsupervised Representation Learning

10/06/2021
by Han Bao, et al.

Contrastive unsupervised representation learning (CURL) encourages a learned representation to place semantically similar pairs closer together than randomly drawn negative samples, and has been successful in domains such as vision, language, and graphs. Recent theoretical studies have attempted to explain this success by upper-bounding a downstream classification loss with the contrastive loss, but the resulting bounds are not sharp enough to explain an experimental fact: a larger number of negative samples improves classification performance. This study establishes a downstream classification loss bound whose intercept is tight in the negative sample size. By regarding the contrastive loss as an estimator of the downstream loss, our theory not only improves the existing learning bounds substantially but also explains why downstream classification empirically improves with more negative samples: the estimation variance of the downstream loss decays as the negative sample size grows. We verify that our theory is consistent with experiments on synthetic, vision, and language datasets.
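To make the variance intuition concrete, here is a minimal sketch, not the paper's exact construction: it evaluates a generic InfoNCE-style contrastive loss with K negative samples on synthetic Gaussian data and checks by Monte Carlo that the loss fluctuation across random draws of the negatives shrinks as K grows. The function name `infonce_loss` and all data choices below are illustrative assumptions, not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def infonce_loss(anchor, positive, negatives):
    """Generic InfoNCE-style contrastive loss for a single anchor.

    anchor, positive: (d,) representation vectors.
    negatives: (K, d) representations of K negative samples.
    Returns the negative log softmax probability of the positive
    among the K+1 candidates.
    """
    pos_sim = anchor @ positive            # scalar similarity to the positive
    neg_sim = negatives @ anchor           # (K,) similarities to the negatives
    logits = np.concatenate(([pos_sim], neg_sim))
    # log-sum-exp with max subtraction for numerical stability
    m = logits.max()
    return -(pos_sim - m) + np.log(np.exp(logits - m).sum())

# Monte Carlo check of the variance claim: for a fixed (anchor, positive)
# pair, the loss fluctuation across random draws of the negatives shrinks
# as the negative sample size K grows.
d, trials = 32, 2000
anchor = rng.normal(size=d)
anchor /= np.linalg.norm(anchor)
positive = anchor + 0.1 * rng.normal(size=d)

for K in (1, 8, 64, 512):
    losses = [
        infonce_loss(anchor, positive, rng.normal(size=(K, d)) / np.sqrt(d))
        for _ in range(trials)
    ]
    print(f"K={K:4d}  mean={np.mean(losses):.3f}  var={np.var(losses):.5f}")
```

Under these assumptions, the printed variance column decreases as K increases even though the mean loss itself grows (roughly like log K), which is the qualitative behavior the paper's estimator view predicts.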

Related research

05/03/2022
Do More Negative Samples Necessarily Hurt in Contrastive Learning?
Recent investigations in noise contrastive estimation suggest, both empi...

07/01/2020
Debiased Contrastive Learning
A prominent technique for self-supervised representation learning has be...

02/25/2019
A Theoretical Analysis of Contrastive Unsupervised Representation Learning
Recent empirical works have successfully used unlabeled data to learn fe...

12/15/2020
Understanding the Behaviour of Contrastive Loss
Unsupervised contrastive learning has achieved outstanding success, whil...

06/18/2021
Investigating the Role of Negatives in Contrastive Representation Learning
Noise contrastive learning is a popular technique for unsupervised repre...

05/24/2023
SUVR: A Search-based Approach to Unsupervised Visual Representation Learning
Unsupervised learning has grown in popularity because of the difficulty ...

06/03/2022
Rethinking Positive Sampling for Contrastive Learning with Kernel
Data augmentation is a crucial component in unsupervised contrastive lea...
