Towards Fine-grained Visual Representations by Combining Contrastive Learning with Image Reconstruction and Attention-weighted Pooling

04/09/2021
by Jonas Dippel, et al.

This paper presents Contrastive Reconstruction (ConRec), a self-supervised learning algorithm that obtains image representations by jointly optimizing a contrastive loss and a self-reconstruction loss. We show that state-of-the-art contrastive learning methods (e.g. SimCLR) fall short in capturing fine-grained visual features in their representations. ConRec extends the SimCLR framework by adding (1) a self-reconstruction task and (2) an attention mechanism within the contrastive learning task. This is accomplished with a simple encoder-decoder architecture with two heads. We show that both extensions contribute to an improved vector representation for images with fine-grained visual features. Combining both concepts, ConRec outperforms SimCLR and SimCLR with Attention-Pooling on fine-grained classification datasets.
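To make the idea of a jointly optimized objective concrete, the following is a minimal sketch in plain Python and not the authors' implementation. It assumes the contrastive term is the NT-Xent loss used by SimCLR, that reconstruction is scored with mean squared error, that the two terms are combined as a weighted sum, and that attention-weighted pooling is a softmax-weighted average over spatial locations. The function names, the `rec_weight` parameter, and the pairing convention (`z[2k]` and `z[2k+1]` are two views of image `k`) are hypothetical choices made for illustration.

```python
import math


def attention_pool(features, scores):
    """Pool per-location feature vectors using softmax(scores) as weights.

    Assumption: one scalar attention score per spatial location.
    """
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(features[0])
    return [sum(w * f[d] for w, f in zip(weights, features)) for d in range(dim)]


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def nt_xent(z, temperature=0.5):
    """NT-Xent loss over 2N embeddings; z[2k] and z[2k+1] form a positive pair."""
    n = len(z)
    loss = 0.0
    for i in range(n):
        j = i ^ 1  # index of the other view of the same image
        logits = [cosine(z[i], z[k]) / temperature for k in range(n) if k != i]
        positive = cosine(z[i], z[j]) / temperature
        m = max(logits)
        log_denominator = m + math.log(sum(math.exp(l - m) for l in logits))
        loss += log_denominator - positive
    return loss / n


def mse(x, x_hat):
    """Mean squared error between a flattened image and its reconstruction."""
    return sum((a - b) ** 2 for a, b in zip(x, x_hat)) / len(x)


def joint_loss(z, images, reconstructions, rec_weight=1.0, temperature=0.5):
    """Weighted sum of the contrastive and reconstruction terms (assumed form)."""
    rec = sum(mse(x, r) for x, r in zip(images, reconstructions)) / len(images)
    return nt_xent(z, temperature) + rec_weight * rec
```

In this sketch, the embeddings passed to `nt_xent` would come from the contrastive head (for example, the output of `attention_pool` applied to encoder features), while `reconstructions` would come from the decoder head. How the paper actually weights the two terms is not stated in the abstract, so `rec_weight` is only a placeholder.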


research
02/13/2020

A Simple Framework for Contrastive Learning of Visual Representations

This paper presents SimCLR: a simple framework for contrastive learning ...
research
05/09/2022

Visual Encoding and Debiasing for CTR Prediction

Extracting expressive visual features is crucial for accurate Click-Thro...
research
01/19/2023

Semantic-aware Contrastive Learning for More Accurate Semantic Parsing

Since the meaning representations are detailed and accurate annotations ...
research
11/11/2021

Unsupervised Part Discovery from Contrastive Reconstruction

The goal of self-supervised visual representation learning is to learn s...
research
08/17/2022

SelF-Eval: Self-supervised Fine-grained Dialogue Evaluation

This paper introduces a novel Self-supervised Fine-grained Dialogue Eval...
research
08/18/2022

MvDeCor: Multi-view Dense Correspondence Learning for Fine-grained 3D Segmentation

We propose to utilize self-supervised techniques in the 2D domain for fi...
research
01/11/2023

Generative-Contrastive Learning for Self-Supervised Latent Representations of 3D Shapes from Multi-Modal Euclidean Input

We propose a combined generative and contrastive neural architecture for...
