Joint Feature Learning and Relation Modeling for Tracking: A One-Stream Framework

03/22/2022
by Botao Ye, et al.

The currently popular two-stream, two-stage tracking framework extracts the template and search-region features separately and then performs relation modeling; as a result, the extracted features lack awareness of the target and have limited target-background discriminability. To tackle this issue, we propose a novel one-stream tracking (OSTrack) framework that unifies feature learning and relation modeling by bridging the template-search image pairs with bidirectional information flows. In this way, discriminative target-oriented features can be dynamically extracted by mutual guidance. Since no extra heavy relation modeling module is needed and the implementation is highly parallelized, the proposed tracker runs at a fast speed. To further improve inference efficiency, an in-network candidate early elimination module is proposed based on the strong similarity prior calculated in the one-stream framework. As a unified framework, OSTrack achieves state-of-the-art performance on multiple benchmarks. In particular, it shows impressive results on the one-shot tracking benchmark GOT-10k, achieving 73.7% AO and improving on the existing best result (SwinTrack) by 4.3%. Besides, our method maintains a good performance-speed trade-off and shows faster convergence. The code and models will be available at https://github.com/botaoye/OSTrack.
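To make the one-stream idea concrete, the following is a minimal sketch, not the authors' implementation. It assumes that the "bidirectional information flows" can be modeled as a single self-attention pass over the concatenated template and search tokens, and that the "similarity prior" used for candidate early elimination is the attention that template tokens pay to each search token. The function names, the identity Q/K/V projections, and the `keep_ratio` parameter are all hypothetical choices made for illustration.

```python
import math
import random


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def joint_attention(template, search):
    """One self-attention pass over the concatenated template + search tokens.

    Identity Q/K/V projections are assumed for brevity; a real layer would
    learn them. Returns the updated tokens and the full attention matrix.
    """
    tokens = template + search  # single stream: both inputs share one sequence
    dim = len(tokens[0])
    scale = math.sqrt(dim)
    attn = [
        softmax([sum(a * b for a, b in zip(q, k)) / scale for k in tokens])
        for q in tokens
    ]
    out = [
        [sum(w * v[d] for w, v in zip(row, tokens)) for d in range(dim)]
        for row in attn
    ]
    return out, attn


def eliminate_candidates(search, attn, n_template, keep_ratio=0.5):
    """Keep the search tokens that the template tokens attend to most.

    The score of a search token is the mean attention weight it receives
    from all template tokens (an assumed stand-in for the similarity prior).
    """
    n_search = len(search)
    scores = [
        sum(attn[t][n_template + j] for t in range(n_template)) / n_template
        for j in range(n_search)
    ]
    n_keep = max(1, math.ceil(keep_ratio * n_search))
    ranked = sorted(range(n_search), key=lambda j: scores[j], reverse=True)
    kept = sorted(ranked[:n_keep])  # restore spatial order of survivors
    return [search[j] for j in kept], kept


random.seed(0)
dim = 8
template = [[random.gauss(0, 1) for _ in range(dim)] for _ in range(4)]
search = [[random.gauss(0, 1) for _ in range(dim)] for _ in range(16)]

out, attn = joint_attention(template, search)
updated_search = out[len(template):]
kept_tokens, kept_idx = eliminate_candidates(
    updated_search, attn, n_template=len(template), keep_ratio=0.5
)
print(len(kept_tokens), kept_idx)
```

Because template and search tokens sit in the same sequence, every search token is updated with template information and vice versa in one pass, so no separate relation-modeling module appears in the sketch. Dropping low-scoring search tokens shortens the sequence seen by any later layers, which is where a speed-up would come from under these assumptions.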


