CrOC: Cross-View Online Clustering for Dense Visual Representation Learning

03/23/2023
by Thomas Stegmüller, et al.

Learning dense visual representations without labels is an arduous task, and more so from scene-centric data. We tackle this challenging problem with a Cross-view consistency objective combined with an Online Clustering mechanism (CrOC) to discover and segment the semantics of the views. In the absence of hand-crafted priors, the resulting method is more generalizable and does not require a cumbersome pre-processing step. More importantly, the clustering algorithm operates jointly on the features of both views, thereby elegantly bypassing the issue of content not represented in both views and the ambiguous matching of objects from one crop to the other. We demonstrate excellent performance on linear and unsupervised segmentation transfer tasks on various datasets, and similarly for video object segmentation. Our code and pre-trained models are publicly available at https://github.com/stegmuel/CrOC.
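
To make the idea of clustering both views jointly more concrete, here is a minimal Python sketch. It is not the authors' implementation (that is in the repository linked above): plain k-means stands in for the paper's online clustering mechanism, the cosine-distance loss is only one possible consistency objective, and the names `joint_kmeans`, `view_centroids`, and `cross_view_loss` are hypothetical. The point it illustrates is that a single clustering run over the concatenated features of both crops yields cluster indices that are already aligned across views, and that a cluster absent from one view can simply be skipped.

```python
import numpy as np


def joint_kmeans(feats_a, feats_b, k, iters=10, seed=0):
    """Cluster the dense features of two views in one shared k-means run.

    Assumption: k-means is a stand-in for the paper's online clustering.
    feats_a: (Na, D) features of view A, feats_b: (Nb, D) features of view B.
    Returns one cluster index per feature, split back into the two views.
    """
    rng = np.random.default_rng(seed)
    feats = np.concatenate([feats_a, feats_b], axis=0)  # (Na + Nb, D)
    centroids = feats[rng.choice(len(feats), size=k, replace=False)]
    for _ in range(iters):
        # Squared Euclidean distance of every feature to every centroid.
        dists = ((feats[:, None, :] - centroids[None, :, :]) ** 2).sum(-1)
        assign = dists.argmin(axis=1)
        for c in range(k):
            members = feats[assign == c]
            if len(members) > 0:
                centroids[c] = members.mean(axis=0)
    return assign[: len(feats_a)], assign[len(feats_a):]


def view_centroids(feats, assign, k):
    """Mean feature per cluster within one view; NaN rows mark absent clusters."""
    out = np.full((k, feats.shape[1]), np.nan)
    for c in range(k):
        members = feats[assign == c]
        if len(members) > 0:
            out[c] = members.mean(axis=0)
    return out


def cross_view_loss(cent_a, cent_b):
    """Mean cosine distance between same-index centroids present in both views."""
    shared = ~np.isnan(cent_a).any(axis=1) & ~np.isnan(cent_b).any(axis=1)
    if not shared.any():
        return 0.0
    a = cent_a[shared] / np.linalg.norm(cent_a[shared], axis=1, keepdims=True)
    b = cent_b[shared] / np.linalg.norm(cent_b[shared], axis=1, keepdims=True)
    return float((1.0 - (a * b).sum(axis=1)).mean())


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    feats_a = rng.normal(size=(16, 8))  # e.g. 16 patch features of crop A
    feats_b = rng.normal(size=(12, 8))  # e.g. 12 patch features of crop B
    assign_a, assign_b = joint_kmeans(feats_a, feats_b, k=3)
    loss = cross_view_loss(
        view_centroids(feats_a, assign_a, 3),
        view_centroids(feats_b, assign_b, 3),
    )
    print(assign_a, assign_b, loss)
```

Because both views share one set of cluster indices, no separate step is needed to match objects from one crop to the other, which is the property the abstract highlights.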


research
04/25/2022

Unsupervised Hierarchical Semantic Segmentation with Multiview Cosegmentation and Clustering Transformers

Unsupervised semantic segmentation aims to discover groupings within and...
research
06/08/2023

Image Clustering via the Principle of Rate Reduction in the Age of Pretrained Models

The advent of large pre-trained models has brought about a paradigm shif...
research
09/19/2022

NeRF-SOS: Any-View Self-supervised Object Segmentation from Complex Real-World Scenes

Neural volumetric representations have shown the potential that Multi-la...
research
12/02/2021

DenseCLIP: Language-Guided Dense Prediction with Context-Aware Prompting

Recent progress has shown that large-scale pre-training using contrastiv...
research
05/30/2022

Self-Supervised Visual Representation Learning with Semantic Grouping

In this paper, we tackle the problem of learning visual representations ...
research
04/19/2023

Investigating the Nature of 3D Generalization in Deep Neural Networks

Visual object recognition systems need to generalize from a set of 2D tr...
research
03/21/2023

Sample4Geo: Hard Negative Sampling For Cross-View Geo-Localisation

Cross-View Geo-Localisation is still a challenging task where additional...
