Learning Where to Learn in Cross-View Self-Supervised Learning

03/28/2022
by Lang Huang, et al.

Self-supervised learning (SSL) has made enormous progress and largely narrowed the gap with supervised learning, with representation learning mainly guided by a projection into an embedding space. During the projection, current methods simply aggregate pixels uniformly to form the embedding; however, this risks including object-irrelevant nuisances and introducing spatial misalignment between different augmentations. In this paper, we present a new approach, Learning Where to Learn (LEWEL), which adaptively aggregates the spatial information of features so that the projected embeddings can be exactly aligned and thus better guide feature learning. Concretely, we reinterpret the projection head in SSL as a per-pixel projection and use this weight-sharing projection head to predict a set of spatial alignment maps from the original features. A spectrum of aligned embeddings is then obtained by aggregating the features with spatial weighting according to these alignment maps. As a result of this adaptive alignment, we observe substantial improvements on image-level prediction and dense prediction at the same time: LEWEL improves on MoCov2, with reported gains of 1.6 and 1.3 among results on benchmarks that include Pascal VOC semantic segmentation and object detection.
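The aggregation idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes the projection head is a single linear map applied at every pixel, and that each alignment map is normalized with a softmax over spatial positions. All function names (`alignment_maps`, `aligned_embeddings`, `uniform_embedding`) are hypothetical and chosen only for illustration.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def alignment_maps(features, weights):
    """Apply a shared linear projection at every pixel (assumption: a
    linear head), then normalize each output channel over the spatial
    positions so that it acts as a set of spatial weights.

    features: N pixels, each a list of C floats (flattened H*W grid).
    weights:  K rows of C floats, one row per alignment map.
    Returns K maps, each a list of N non-negative weights summing to 1.
    """
    logits = [
        [sum(w_c * f_c for w_c, f_c in zip(w, f)) for f in features]
        for w in weights
    ]
    return [softmax(row) for row in logits]


def aggregate(features, spatial_weights):
    """Spatially weighted sum of per-pixel features (C floats)."""
    channels = len(features[0])
    return [
        sum(a * f[c] for a, f in zip(spatial_weights, features))
        for c in range(channels)
    ]


def uniform_embedding(features):
    """Baseline: uniform aggregation, i.e. global average pooling."""
    n = len(features)
    return aggregate(features, [1.0 / n] * n)


def aligned_embeddings(features, weights):
    """One aggregated embedding per predicted alignment map."""
    return [aggregate(features, m) for m in alignment_maps(features, weights)]


if __name__ == "__main__":
    # Toy 2x2 feature map with 2 channels, flattened to 4 pixels.
    feats = [[1.0, 0.0], [0.0, 1.0], [2.0, 2.0], [0.5, 1.5]]
    # Two hypothetical projection rows; the all-zero row gives a uniform map.
    proj = [[0.3, -0.2], [0.0, 0.0]]
    print(aligned_embeddings(feats, proj))
    print(uniform_embedding(feats))
```

In this sketch, a projection row of all zeros yields a uniform alignment map, so uniform pooling appears as a special case of the weighted aggregation; non-zero rows shift weight toward the pixels they respond to most strongly.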

research
07/10/2022

Self-supervised Learning with Local Contrastive Loss for Detection and Semantic Segmentation

We present a self-supervised learning (SSL) method suitable for semi-glo...
research
07/19/2021

Exploring Set Similarity for Dense Self-supervised Representation Learning

By considering the spatial correspondence, dense self-supervised represe...
research
11/21/2021

HoughCL: Finding Better Positive Pairs in Dense Self-supervised Learning

Recently, self-supervised methods show remarkable achievements in image-...
research
01/28/2023

Deciphering the Projection Head: Representation Evaluation Self-supervised Learning

Self-supervised learning (SSL) aims to learn intrinsic features without ...
research
06/09/2023

FLSL: Feature-level Self-supervised Learning

Current self-supervised learning (SSL) methods (e.g., SimCLR, DINO, VICR...
research
01/17/2022

AutoAlign: Pixel-Instance Feature Aggregation for Multi-Modal 3D Object Detection

Object detection through either RGB images or the LiDAR point clouds has...
research
11/24/2021

ViCE: Self-Supervised Visual Concept Embeddings as Contextual and Pixel Appearance Invariant Semantic Representations

This work presents a self-supervised method to learn dense semantically ...
