SeqCo-DETR: Sequence Consistency Training for Self-Supervised Object Detection with Transformers

03/15/2023
by Guoqiang Jin, et al.

Self-supervised pre-training and transformer-based networks have significantly improved the performance of object detection. However, most current self-supervised object detection methods are built on convolution-based architectures. We believe that the sequence characteristics of transformers should be considered when designing a transformer-based self-supervised method for the object detection task. To this end, we propose SeqCo-DETR, a novel Sequence Consistency-based self-supervised method for object DEtection with TRansformers. SeqCo-DETR defines a simple but effective pretext task: it minimizes the discrepancy between the output sequences that the transformer produces for different views of an image, and it leverages bipartite matching to find the most relevant sequence pairs, which improves sequence-level self-supervised representation learning. Furthermore, we provide a mask-based augmentation strategy that is combined with the sequence consistency strategy to extract more representative contextual information about the object for the object detection task. Our method achieves state-of-the-art results on MS COCO (45.8 AP) and PASCAL VOC (64.1 AP), demonstrating the effectiveness of our approach.
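As a rough illustration of the pretext task described in the abstract, the sketch below pairs the output sequences of two views by bipartite matching and then averages the discrepancy over the matched pairs. It is not the authors' implementation. The cosine-distance cost, the brute-force matcher, and the names `cosine_distance`, `match_sequences`, and `sequence_consistency_loss` are all assumptions made for this example; the abstract does not specify the distance or the solver. A practical version would likely use a Hungarian-style solver such as `scipy.optimize.linear_sum_assignment` rather than enumerating permutations.

```python
import itertools
import math


def cosine_distance(u, v):
    """Assumed discrepancy measure: 1 - cosine similarity of two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def match_sequences(seq_a, seq_b):
    """Bipartite matching by brute force (only feasible for tiny sequences).

    Returns (perm, cost), where seq_a[i] is paired with seq_b[perm[i]]
    and the pairing minimizes the total cost.
    """
    n = len(seq_a)
    cost = [[cosine_distance(a, b) for b in seq_b] for a in seq_a]
    best = min(
        itertools.permutations(range(n)),
        key=lambda p: sum(cost[i][p[i]] for i in range(n)),
    )
    return list(best), cost


def sequence_consistency_loss(seq_a, seq_b):
    """Mean discrepancy over the matched sequence pairs."""
    perm, cost = match_sequences(seq_a, seq_b)
    loss = sum(cost[i][j] for i, j in enumerate(perm)) / len(perm)
    return loss, perm


# Toy "output sequences" for two views of one image (hypothetical values).
view_a = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
view_b = [[0.0, 2.0], [2.0, 2.0], [3.0, 0.0]]

loss, perm = sequence_consistency_loss(view_a, view_b)
print(perm, loss)
```

In this toy case every vector in `view_a` has a parallel counterpart in `view_b`, so the matching recovers the reordering and the loss is close to zero. During training, a loss of this kind would be minimized with respect to the network parameters so that matched sequences from different views agree.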


