TransCenter: Transformers with Dense Queries for Multiple-Object Tracking

03/28/2021
by Yihong Xu, et al.
MIT
Inria

Transformer networks have proven extremely powerful for a wide variety of tasks since they were introduced. Computer vision is no exception: transformers have become very popular in the vision community in recent years. Despite this wave, multiple-object tracking (MOT) has so far shown some incompatibility with transformers. We argue that the standard representation, bounding boxes, is not well suited to learning transformers for MOT. Inspired by recent research, we propose TransCenter, the first transformer-based architecture for tracking the centers of multiple targets. Methodologically, we propose the use of dense queries in a double-decoder network to robustly infer the heatmap of targets' centers and associate them through time. TransCenter outperforms the current state of the art in multiple-object tracking on both MOT17 and MOT20. Our ablation study demonstrates the advantage of the proposed architecture over more naive alternatives. The code will be made publicly available.
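
To make the idea of a center heatmap concrete, the sketch below shows one common way to turn a dense heatmap into a discrete set of target centers: keep every cell that clears a score threshold and is the maximum of its 3x3 neighbourhood. This is an illustrative assumption, not the TransCenter implementation; the abstract does not say how centers are read out of the heatmap, and the function name `extract_centers`, the 3x3 window, and the 0.5 threshold are all hypothetical choices.

```python
from typing import List, Tuple


def extract_centers(
    heatmap: List[List[float]], threshold: float = 0.5
) -> List[Tuple[int, int, float]]:
    """Return (row, col, score) for each local peak of a 2D heatmap.

    Assumption for illustration only: a cell counts as a center if its
    score is at least `threshold` and no cell in its 3x3 neighbourhood
    has a higher score.
    """
    height = len(heatmap)
    width = len(heatmap[0]) if height else 0
    centers = []
    for r in range(height):
        for c in range(width):
            score = heatmap[r][c]
            if score < threshold:
                continue
            # Collect the (up to 8) neighbours, clipping at the borders.
            neighbours = [
                heatmap[rr][cc]
                for rr in range(max(0, r - 1), min(height, r + 2))
                for cc in range(max(0, c - 1), min(width, c + 2))
                if (rr, cc) != (r, c)
            ]
            if all(score >= n for n in neighbours):
                centers.append((r, c, score))
    return centers


# Toy 4x4 heatmap with two well-separated peaks.
toy_heatmap = [
    [0.1, 0.2, 0.1, 0.0],
    [0.2, 0.9, 0.2, 0.0],
    [0.1, 0.2, 0.1, 0.6],
    [0.0, 0.0, 0.3, 0.7],
]
centers = extract_centers(toy_heatmap)
print(centers)
```

In this toy example the cell scoring 0.6 is suppressed because its neighbour scores 0.7, so only one center survives per peak. Associating the surviving centers across frames is a separate step that this sketch does not cover.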

10/24/2022

Strong-TransCenter: Improved Multi-Object Tracking based on Transformers with Dense Representations

Transformer networks have been a focus of research in many fields in rec...
02/23/2023

Transformers in Single Object Tracking: An Experimental Survey

Single object tracking is a well-known and challenging research topic in...
12/17/2021

Efficient Visual Tracking with Exemplar Transformers

The design of more complex and powerful neural network models has signif...
10/17/2022

Track Targets by Dense Spatio-Temporal Position Encoding

In this work, we propose a novel paradigm to encode the position of targ...
12/10/2021

Visual Transformers with Primal Object Queries for Multi-Label Image Classification

Multi-label image classification is about predicting a set of class labe...
05/19/2022

Training Vision-Language Transformers from Captions Alone

We show that Vision-Language Transformers can be learned without human l...
11/24/2022

On designing light-weight object trackers through network pruning: Use CNNs or transformers?

Object trackers deployed on low-power devices need to be light-weight, h...

Code Repositories

TransCenter

This is a placeholder for the official implementation of TransCenter. The code will be made publicly available soon.

