Sample-Specific Debiasing for Better Image-Text Models

04/25/2023
by   Peiqi Wang, et al.

Self-supervised representation learning on image-text data facilitates crucial medical applications, such as image classification, visual grounding, and cross-modal retrieval. One common approach involves contrasting semantically similar (positive) and dissimilar (negative) pairs of data points. Drawing negative samples uniformly from the training data set introduces false negatives, i.e., samples that are treated as dissimilar but belong to the same class. In healthcare data, the underlying class distribution is nonuniform, implying that false negatives occur at a highly variable rate. To improve the quality of learned representations, we develop a novel approach that corrects for false negatives. Our method can be viewed as a variant of debiased contrastive learning that uses estimated sample-specific class probabilities. We provide theoretical analysis of the objective function and demonstrate the proposed approach on both image and paired image-text data sets. Our experiments demonstrate empirical advantages of sample-specific debiasing.
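The debiasing idea described above can be sketched in code. The snippet below is a minimal NumPy illustration, not the authors' implementation: it follows the debiased contrastive estimator of Chuang et al. (2020), but replaces the single constant class prior with a per-sample estimate `tau_plus[i]`, which is the sample-specific variant the abstract describes. The function name, argument names, and the temperature default are assumptions for illustration.

```python
import numpy as np

def debiased_contrastive_loss(sim_pos, sim_neg, tau_plus, t=0.5):
    """Sketch of a debiased InfoNCE-style loss with sample-specific priors.

    sim_pos:  (n,)   similarity of each anchor to its positive pair
    sim_neg:  (n, m) similarities to m negatives drawn uniformly from the data
    tau_plus: (n,)   per-sample estimate of the probability that a uniformly
                     drawn "negative" actually shares the anchor's class
    t:        softmax temperature
    Returns:  (n,)   per-sample loss values
    """
    pos = np.exp(sim_pos / t)                      # (n,)
    neg = np.exp(sim_neg / t)                      # (n, m)
    m = sim_neg.shape[1]
    # Correct the negative term: subtract the expected contribution of
    # false negatives, then clip at the estimator's theoretical minimum
    # exp(-1/t) so the argument of the log stays positive.
    g = (neg.mean(axis=1) - tau_plus * pos) / (1.0 - tau_plus)
    g = np.maximum(g, np.exp(-1.0 / t))
    return -np.log(pos / (pos + m * g))
```

Setting `tau_plus = 0` recovers the standard (biased) contrastive loss; a larger `tau_plus[i]` removes more of the positive-like mass from sample *i*'s negative term, which is how a nonuniform class distribution is accommodated per sample.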

