Robust Object Modeling for Visual Tracking

08/09/2023
by Yidong Cai, et al.

Object modeling has become a core part of recent tracking frameworks. Current popular trackers use Transformer attention to extract the template feature either separately from or interactively with the search region. However, separate template learning lacks communication between the template and search regions, which makes it difficult to extract discriminative target-oriented features. Interactive template learning, on the other hand, produces hybrid template features, which may introduce distractors from cluttered search regions into the template. To combine the merits of both methods, we propose a robust object modeling framework for visual tracking (ROMTrack), which simultaneously models the inherent template and the hybrid template features. As a result, harmful distractors can be suppressed by combining the inherent features of target objects with the search regions' guidance. Target-related features can also be extracted using the hybrid template, resulting in a more robust object modeling framework. To further enhance robustness, we present novel variation tokens that depict the ever-changing appearance of target objects. Variation tokens adapt to object deformation and appearance variations, which can boost overall performance with negligible computation. Experiments show that ROMTrack sets a new state of the art on multiple benchmarks.
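The distinction between inherent and hybrid template features can be illustrated with a toy attention example. The sketch below is not the ROMTrack implementation: it assumes single-head scaled dot-product attention without learned projections, and the names `attention`, `template`, `search`, `inherent`, `hybrid`, and `search_out`, as well as the toy token values, are hypothetical. It only shows that a template attending to itself stays free of search-region content, while a template attending to template plus search tokens absorbs some of it.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention(queries, keys, values):
    """Single-head scaled dot-product attention over lists of vectors.

    Assumption: no learned projections, purely for illustration.
    """
    dim = len(keys[0])
    out = []
    for q in queries:
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(dim) for k in keys
        ]
        weights = softmax(scores)
        out.append(
            [
                sum(w * v[j] for w, v in zip(weights, values))
                for j in range(len(values[0]))
            ]
        )
    return out


# Toy tokens: the first axis stands for "target-like", the second for "distractor-like".
template = [[1.0, 0.0], [0.9, 0.1]]
search = [[1.0, 0.0], [0.0, 1.0], [0.1, 0.9]]

# Inherent template: template tokens attend only to template tokens.
inherent = attention(template, template, template)

# Hybrid template: template tokens attend to both template and search tokens.
hybrid = attention(template, template + search, template + search)

# Assumed combination step: search tokens attend to both template variants and themselves.
context = inherent + hybrid + search
search_out = attention(search, context, context)
```

In this toy setup the inherent features keep a small distractor component because they are mixtures of template tokens only, whereas the hybrid features pick up part of the distractor-like search tokens. How the two are actually combined in ROMTrack, and how variation tokens enter the attention, is described in the full paper rather than here.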

research
02/04/2019

End-to-end feature fusion siamese network for adaptive visual tracking

According to observations, different visual objects have different salie...
research
03/22/2022

Joint Feature Learning and Relation Modeling for Tracking: A One-Stream Framework

The current popular two-stream, two-stage tracking framework extracts th...
research
09/06/2023

Towards Efficient Training with Negative Samples in Visual Tracking

Current state-of-the-art (SOTA) methods in visual object tracking often ...
research
05/10/2021

tFold-TR: Combining Deep Learning Enhanced Hybrid Potential Energy for Template-Based Modelling Structure Refinement

Proteins structure prediction has long been a grand challenge over the p...
research
03/24/2013

A Diffusion Process on Riemannian Manifold for Visual Tracking

Robust visual tracking for long video sequences is a research area that ...
research
11/25/2020

CRACT: Cascaded Regression-Align-Classification for Robust Visual Tracking

High quality object proposals are crucial in visual tracking algorithms ...
research
08/21/2012

Shape Tracking With Occlusions via Coarse-To-Fine Region-Based Sobolev Descent

We present a method to track the precise shape of an object in video bas...
