Learning Explicit Object-Centric Representations with Vision Transformers

10/25/2022
by Oscar Vikström, et al.

With the recent successful adaptation of transformers to the vision domain, particularly when trained in a self-supervised fashion, it has been shown that vision transformers can learn impressive object-reasoning-like behaviour and features expressive for the task of object segmentation in images. In this paper, we build on the self-supervision task of masked autoencoding and explore its effectiveness for explicitly learning object-centric representations with transformers. To this end, we design an object-centric autoencoder using transformers only and train it end-to-end to reconstruct full images from unmasked patches. We show that the model efficiently learns to decompose simple scenes as measured by segmentation metrics on several multi-object benchmarks.
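
The masked-autoencoding setup mentioned in the abstract (reconstructing a full image from its unmasked patches) can be illustrated with a minimal NumPy sketch of the input pipeline. This is not the authors' implementation: the function names `patchify` and `random_mask`, the patch size of 8, and the mask ratio of 0.75 are assumptions chosen for illustration, and the abstract does not state which values or masking scheme the paper uses.

```python
import numpy as np


def patchify(image, patch_size):
    """Split an (H, W, C) image into flattened, non-overlapping patches.

    Returns an array of shape (num_patches, patch_size * patch_size * C),
    with patches ordered row by row over the patch grid.
    """
    h, w, c = image.shape
    assert h % patch_size == 0 and w % patch_size == 0
    gh, gw = h // patch_size, w // patch_size
    patches = image.reshape(gh, patch_size, gw, patch_size, c)
    patches = patches.transpose(0, 2, 1, 3, 4)  # (gh, gw, p, p, c)
    return patches.reshape(gh * gw, patch_size * patch_size * c)


def random_mask(num_patches, mask_ratio, rng):
    """Pick a random subset of patches to hide.

    Returns (visible_idx, masked_idx), both sorted.
    """
    num_masked = int(round(num_patches * mask_ratio))
    perm = rng.permutation(num_patches)
    masked_idx = np.sort(perm[:num_masked])
    visible_idx = np.sort(perm[num_masked:])
    return visible_idx, masked_idx


# Hypothetical example values (assumptions, not taken from the paper).
rng = np.random.default_rng(0)
image = rng.random((32, 32, 3))
patches = patchify(image, patch_size=8)
visible_idx, masked_idx = random_mask(len(patches), mask_ratio=0.75, rng=rng)

encoder_input = patches[visible_idx]   # only unmasked patches are encoded
reconstruction_target = patches        # the decoder must predict every patch
```

In a setup like this, the encoder only sees `encoder_input`, while the reconstruction loss is computed against `reconstruction_target`, so the model has to infer the hidden patches from the visible ones.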

research
06/01/2023

Affinity-based Attention in Self-supervised Transformers Predicts Dynamics of Object Grouping in Humans

The spreading of attention has been proposed as a mechanism for how huma...
research
10/27/2022

PatchRot: A Self-Supervised Technique for Training Vision Transformers

Vision transformers require a huge amount of labeled data to outperform ...
research
12/29/2022

AttEntropy: Segmenting Unknown Objects in Complex Scenes using the Spatial Attention Entropy of Semantic Segmentation Transformers

Vision transformers have emerged as powerful tools for many computer vis...
research
03/11/2022

Towards Self-Supervised Learning of Global and Object-Centric Representations

Self-supervision allows learning meaningful representations of natural i...
research
04/28/2023

Representation Matters: The Game of Chess Poses a Challenge to Vision Transformers

While transformers have gained the reputation as the "Swiss army knife o...
research
12/12/2021

Magnifying Networks for Images with Billions of Pixels

The shift towards end-to-end deep learning has brought unprecedented adv...
research
01/31/2023

Real Estate Property Valuation using Self-Supervised Vision Transformers

The use of Artificial Intelligence (AI) in the real estate market has be...
