Similarity and Generalization: From Noise to Corruption

01/30/2022
by Nayara Fonseca, et al.

Contrastive learning aims to extract distinctive features from data by finding an embedding representation in which similar samples are close to each other and different ones are far apart. We study generalization in contrastive learning, focusing on its simplest representative: Siamese Neural Networks (SNNs). We show that Double Descent also appears in SNNs and is exacerbated by noise. We point out that SNNs can be affected by two distinct sources of noise: Pair Label Noise (PLN) and Single Label Noise (SLN). The effect of SLN is asymmetric, but it preserves similarity relations, while PLN is symmetric but breaks transitivity. We show that the dataset topology crucially affects generalization. While sparse datasets show the same performance under SLN and PLN for an equal amount of noise, in dense datasets SLN outperforms PLN in the overparametrized region. Indeed, in this regime, PLN similarity violations become macroscopic, corrupting the dataset to the point where complete overfitting cannot be achieved. We call this phenomenon Density-Induced Break of Similarity (DIBS). We also probe the equivalence between online optimization and offline generalization for similarity tasks. We observe that the online/offline correspondence in similarity learning can be affected by both the network architecture and label noise.
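To make the distinction between the two noise models concrete, here is a minimal sketch in plain NumPy (the function names apply_sln, apply_pln, and transitivity_violations are illustrative, not from the paper) showing why pair labels derived after SLN corruption remain transitive while PLN can break transitivity:

```python
import numpy as np
from itertools import combinations

rng = np.random.default_rng(0)
n, n_classes, p = 12, 3, 0.2

# Toy dataset: n samples, each carrying a single class label.
y = rng.integers(0, n_classes, size=n)

def derive_pairs(labels):
    # Pair label for an SNN: 1 if the two samples share a class, else 0.
    return {(i, j): int(labels[i] == labels[j])
            for i, j in combinations(range(len(labels)), 2)}

def apply_sln(labels, p):
    # Single Label Noise: flip individual class labels with prob. p.
    # Pair labels re-derived from the noisy labels stay mutually
    # consistent, so "same class" remains an equivalence relation.
    noisy = labels.copy()
    flip = rng.random(len(noisy)) < p
    noisy[flip] = rng.integers(0, n_classes, size=int(flip.sum()))
    return noisy

def apply_pln(pairs, p):
    # Pair Label Noise: flip each similar/dissimilar pair label
    # independently with prob. p. Flips are symmetric across classes
    # but can break transitivity (i~j and j~k without i~k).
    return {k: (1 - v if rng.random() < p else v) for k, v in pairs.items()}

def transitivity_violations(pairs, n):
    # Count triples where exactly two of the three pair labels are
    # "similar": the signature of a broken similarity relation.
    return sum(pairs[(i, j)] + pairs[(j, k)] + pairs[(i, k)] == 2
               for i, j, k in combinations(range(n), 3))

sln_pairs = derive_pairs(apply_sln(y, p))
pln_pairs = apply_pln(derive_pairs(y), p)
print(transitivity_violations(sln_pairs, n))  # always 0
print(transitivity_violations(pln_pairs, n))  # typically > 0
```

Under SLN the noisy pair labels are still induced by a single (corrupted) labeling, so the similarity graph remains a union of cliques; under PLN no consistent underlying labeling need exist, and in dense datasets these violations become macroscopic, which is the DIBS effect described above.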
