Contrastive learning unifies t-SNE and UMAP

06/03/2022
by Sebastian Damrich, et al.

Neighbor embedding methods t-SNE and UMAP are the de facto standard for visualizing high-dimensional datasets. They appear to use very different loss functions with different motivations, and the exact relationship between them has been unclear. Here we show that UMAP is effectively negative sampling applied to the t-SNE loss function. We explain the difference between negative sampling and noise-contrastive estimation (NCE), which has been used to optimize t-SNE under the name NCVis. We prove that, unlike NCE, negative sampling learns a scaled data distribution. When applied in the neighbor embedding setting, it yields more compact embeddings with increased attraction, explaining differences in appearance between UMAP and t-SNE. Further, we generalize the notion of negative sampling and obtain a spectrum of embeddings, encompassing visualizations similar to t-SNE, NCVis, and UMAP. Finally, we explore the connection between representation learning in the SimCLR setting and neighbor embeddings, and show that (i) t-SNE can be optimized using the InfoNCE loss, including in a parametric setting; (ii) various contrastive losses with only a few noise samples can yield competitive performance in the SimCLR setup.
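To make the structural difference between the three estimators mentioned in the abstract concrete, the sketch below compares per-pair negative-sampling, NCE, and InfoNCE losses for a single positive pair with m noise samples, using the Cauchy similarity kernel shared by t-SNE and UMAP. This is a minimal illustration, not the authors' code: the function names, the treatment of the uniform noise density as a constant in the NCE term, and the use of kernel values directly as unnormalized scores in the InfoNCE term are all simplifying assumptions.

```python
import numpy as np

def cauchy_sim(sq_dist):
    """Cauchy (Student-t, one degree of freedom) kernel used by t-SNE and UMAP:
    q = 1 / (1 + ||y_i - y_j||^2)."""
    return 1.0 / (1.0 + sq_dist)

def contrastive_losses(y_i, y_j, y_neg):
    """Losses of one positive pair (y_i, y_j) against m negative samples y_neg
    (shape m x d). Illustrative sketch only, not the paper's implementation."""
    q_pos = cauchy_sim(np.sum((y_i - y_j) ** 2))            # similarity of the positive pair
    q_neg = cauchy_sim(np.sum((y_i - y_neg) ** 2, axis=1))  # similarities to the m negatives
    m = len(q_neg)

    # Negative sampling (UMAP-style): kernel values act as Bernoulli probabilities;
    # the positive pair is pulled together, each negative is pushed apart independently.
    neg_sampling = -np.log(q_pos) - np.sum(np.log(1.0 - q_neg))

    # Noise-contrastive estimation (NCVis-style): classify each sample as "data" vs.
    # "noise"; the uniform noise density is absorbed into the constant m here.
    nce = -np.log(q_pos / (q_pos + m)) - np.sum(np.log(m / (q_neg + m)))

    # InfoNCE (SimCLR-style): softmax cross-entropy of the positive pair against the
    # pooled negatives, treating kernel values as unnormalized scores (an assumption).
    info_nce = -np.log(q_pos / (q_pos + np.sum(q_neg)))

    return neg_sampling, nce, info_nce

# Toy usage with random 2D embedding coordinates and m = 5 negatives.
rng = np.random.default_rng(0)
y_i, y_j = rng.normal(size=2), rng.normal(size=2)
y_neg = rng.normal(size=(5, 2))
print(contrastive_losses(y_i, y_j, y_neg))
```

The sketch is only meant to show how the noise samples enter each loss; varying that treatment (for example, interpolating between the negative-sampling and NCE forms above) is one way to read the "spectrum of embeddings" the abstract describes.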

Related research

05/03/2022 · Do More Negative Samples Necessarily Hurt in Contrastive Learning?
Recent investigations in noise contrastive estimation suggest, both empi...

07/17/2020 · A Unifying Perspective on Neighbor Embeddings along the Attraction-Repulsion Spectrum
Neighbor embeddings are a family of methods for visualizing complex high...

08/31/2022 · Supervised Contrastive Learning with Hard Negative Samples
Unsupervised contrastive learning (UCL) is a self-supervised learning te...

01/27/2022 · Ranking Info Noise Contrastive Estimation: Boosting Contrastive Learning via Ranked Positives
This paper introduces Ranking Info Noise Contrastive Estimation (RINCE),...

05/22/2018 · Adversarial Training of Word2Vec for Basket Completion
In recent years, the Word2Vec model trained with the Negative Sampling l...

05/09/2018 · Adversarial Contrastive Estimation
Learning by contrasting positive and negative samples is a general strat...

10/23/2022 · Tail Batch Sampling: Approximating Global Contrastive Losses as Optimization over Batch Assignments
Contrastive Learning has recently achieved state-of-the-art performance ...