Cross-Modal Object Tracking: Modality-Aware Representations and A Unified Benchmark

11/08/2021
by Chenglong Li, et al.

In many visual systems, visual tracking is often based on RGB image sequences, in which some targets become indiscernible in low-light conditions, and tracking performance degrades significantly as a result. Introducing other modalities such as depth and infrared data is an effective way to handle the imaging limitations of individual sources, but multi-modal imaging platforms usually require elaborate designs and cannot be applied in many real-world applications at present. Near-infrared (NIR) imaging has become an essential part of many surveillance cameras, whose imaging is switchable between RGB and NIR based on the light intensity. These two modalities are heterogeneous, with very different visual properties, and thus bring big challenges for visual tracking. However, existing works have not studied this challenging problem. In this work, we address the cross-modal object tracking problem and contribute a new video dataset comprising 654 cross-modal image sequences with over 481K frames in total and an average video length of more than 735 frames. To promote research and development of cross-modal object tracking, we propose a new algorithm that learns a modality-aware target representation to mitigate the appearance gap between the RGB and NIR modalities during tracking. It is plug-and-play and can thus be flexibly embedded into different tracking frameworks. We conduct extensive experiments on the dataset and demonstrate the effectiveness of the proposed algorithm in two representative tracking frameworks against 17 state-of-the-art tracking methods. We will release the dataset for free academic usage; the dataset download link and code will be released soon.

research
07/09/2023

Cross-modal Orthogonal High-rank Augmentation for RGB-Event Transformer-trackers

This paper addresses the problem of cross-modal object tracking from RGB...
research
08/12/2019

Learning Target-oriented Dual Attention for Robust RGB-T Tracking

RGB-Thermal object tracking attempts to locate a target object using comple...
research
01/23/2022

Visual Object Tracking on Multi-modal RGB-D Videos: A Review

The development of visual object tracking has continued for decades. Rec...
research
03/21/2022

3D Multi-Object Tracking Using Graph Neural Networks with Cross-Edge Modality Attention

Online 3D multi-object tracking (MOT) has witnessed significant research...
research
06/15/2023

Cross-Modal Video to Body-joints Augmentation for Rehabilitation Exercise Quality Assessment

Exercise-based rehabilitation programs have been shown to enhance qualit...
research
12/08/2020

Multi-modal Visual Tracking: Review and Experimental Comparison

Visual object tracking, as a fundamental task in computer vision, has dr...
research
07/16/2022

Cross Vision-RF Gait Re-identification with Low-cost RGB-D Cameras and mmWave Radars

Human identification is a key requirement for many applications in every...
