MOST: Multiple Object localization with Self-supervised Transformers for object discovery

04/11/2023
by Sai Saketh Rambhatla, et al.

We tackle the challenging task of unsupervised object localization. Recently, transformers trained with self-supervised learning have been shown to exhibit object localization properties without being trained for this task. In this work, we present Multiple Object localization with Self-supervised Transformers (MOST), which uses features of transformers trained with self-supervised learning to localize multiple objects in real-world images. MOST analyzes the similarity maps of the features using box counting, a fractal analysis tool, to identify tokens lying on foreground patches. The identified tokens are then clustered, and the tokens of each cluster are used to generate bounding boxes on foreground regions. Unlike recent state-of-the-art object localization methods, MOST can localize multiple objects per image, and it outperforms SOTA algorithms on several object localization and discovery benchmarks on the PASCAL-VOC 07, PASCAL-VOC 12, and COCO20k datasets. Additionally, we show that MOST can be used for self-supervised pre-training of object detectors, and that it yields consistent improvements on fully supervised and semi-supervised object detection and on unsupervised region proposal generation.
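To make the box counting step more concrete, the sketch below shows generic box counting on a small binary grid. It is not the authors' implementation, and the abstract does not specify how MOST thresholds its similarity maps or which box sizes it uses. The grid stands in for a thresholded similarity map, and the names `box_count` and `box_counting_dimension`, along with the default box sizes, are hypothetical choices made for illustration.

```python
import math


def box_count(mask, size):
    """Count the size x size boxes that contain at least one true cell."""
    rows, cols = len(mask), len(mask[0])
    count = 0
    for r in range(0, rows, size):
        for c in range(0, cols, size):
            if any(
                mask[i][j]
                for i in range(r, min(r + size, rows))
                for j in range(c, min(c + size, cols))
            ):
                count += 1
    return count


def box_counting_dimension(mask, sizes=(1, 2, 4, 8)):
    """Estimate the box counting dimension of a binary grid.

    The estimate is the least-squares slope of log N(s) against log(1/s),
    where N(s) is the number of occupied boxes of side s.
    The default sizes are an assumption suited to an 8 x 8 toy grid.
    """
    xs, ys = [], []
    for s in sizes:
        n = box_count(mask, s)
        if n > 0:
            xs.append(math.log(1.0 / s))
            ys.append(math.log(n))
    if len(xs) < 2:
        return 0.0  # not enough occupied scales to fit a slope
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    den = sum((x - mean_x) ** 2 for x in xs)
    return num / den


# Toy stand-ins for thresholded similarity maps (hypothetical data).
filled = [[True] * 8 for _ in range(8)]                          # compact region
diagonal = [[i == j for j in range(8)] for i in range(8)]        # thin line
single = [[(i, j) == (3, 3) for j in range(8)] for i in range(8)]  # one cell

print(box_counting_dimension(filled))
print(box_counting_dimension(diagonal))
print(box_counting_dimension(single))
```

On these toy grids the filled square behaves as a two-dimensional set, the diagonal as a one-dimensional set, and the single cell as a point. This illustrates how the spatial structure of a similarity map can be summarized by a single number.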


research
02/16/2021

Instance Localization for Self-supervised Detection Pretraining

Prior research on self-supervised learning has led to considerable progr...
research
09/08/2023

Unsupervised Object Localization with Representer Point Selection

We propose a novel unsupervised object localization method that allows u...
research
11/06/2020

Self Supervised Learning for Object Localisation in 3D Tomographic Images

While a lot of work is dedicated to self-supervised learning, most of it...
research
06/13/2023

Is Anisotropy Inherent to Transformers?

The representation degeneration problem is a phenomenon that is widely o...
research
06/08/2021

DETReg: Unsupervised Pretraining with Region Priors for Object Detection

Unsupervised pretraining has recently proven beneficial for computer vis...
research
10/24/2022

Foreground Guidance and Multi-Layer Feature Fusion for Unsupervised Object Discovery with Transformers

Unsupervised object discovery (UOD) has recently shown encouraging progr...
research
09/29/2021

Localizing Objects with Self-Supervised Transformers and no Labels

Localizing objects in image collections without supervision can help to ...
