Siamese DETR

03/31/2023
by Zeren Chen, et al.

Recent self-supervised methods are mainly designed for representation learning with base models, e.g., ResNets or ViTs. They cannot be easily transferred to DETR, which has task-specific Transformer modules. In this work, we present Siamese DETR, a Siamese self-supervised pretraining approach for the Transformer architecture in DETR. We consider learning view-invariant and detection-oriented representations simultaneously through two complementary tasks, i.e., localization and discrimination, in a novel multi-view learning framework. Two self-supervised pretext tasks are designed: (i) Multi-View Region Detection aims at learning to localize regions of interest between augmented views of the input, and (ii) Multi-View Semantic Discrimination attempts to improve object-level discrimination for each region. The proposed Siamese DETR achieves state-of-the-art transfer performance on COCO and PASCAL VOC detection across different DETR variants in all setups. Code is available at https://github.com/Zx55/SiameseDETR.
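
As a rough illustration of what "localizing regions of interest between augmented views" can involve, the sketch below computes the region shared by two crops of the same image and expresses it in each crop's own coordinate frame as a normalized (cx, cy, w, h) box, the format DETR-style models usually predict. This is a minimal sketch under stated assumptions, not the authors' implementation: the helper names `intersect` and `to_view_cxcywh`, the crop coordinates, and the restriction to plain crops (no flips or resizing) are all hypothetical and not taken from the abstract or the released code.

```python
from typing import Optional, Tuple

# (x1, y1, x2, y2) in pixels of the original, un-augmented image
Box = Tuple[float, float, float, float]


def intersect(a: Box, b: Box) -> Optional[Box]:
    """Return the overlap of two boxes, or None if they do not overlap."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    if x2 <= x1 or y2 <= y1:
        return None
    return (x1, y1, x2, y2)


def to_view_cxcywh(region: Box, view: Box) -> Box:
    """Express `region` in the frame of crop `view` as normalized (cx, cy, w, h)."""
    vw, vh = view[2] - view[0], view[3] - view[1]
    cx = ((region[0] + region[2]) / 2 - view[0]) / vw
    cy = ((region[1] + region[3]) / 2 - view[1]) / vh
    return (cx, cy, (region[2] - region[0]) / vw, (region[3] - region[1]) / vh)


# Two hypothetical crops ("views") of the same image.
view_a: Box = (0.0, 0.0, 200.0, 200.0)
view_b: Box = (100.0, 50.0, 300.0, 250.0)

# The region visible in both views; a localization target could be drawn from it.
shared = intersect(view_a, view_b)

# The same physical region has different box coordinates in each view.
target_in_a = to_view_cxcywh(shared, view_a)
target_in_b = to_view_cxcywh(shared, view_b)
print(shared, target_in_a, target_in_b)
```

In this toy setup the shared region sits toward the right of the first crop and the left of the second, so a model that is given the region's appearance from one view has to predict a different box to locate it in the other.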


