RGBT Tracking via Multi-Adapter Network with Hierarchical Divergence Loss

by Andong Lu, et al.

RGBT tracking has attracted increasing attention because RGB and thermal infrared data have strong complementary advantages, which could enable trackers to work all day and in all weather conditions. However, how to effectively represent RGBT data for visual tracking has not been well studied. Existing works usually focus on extracting either modality-shared or modality-specific information, but the potential of these two cues has not been well explored or exploited in RGBT tracking. In this paper, we propose a novel multi-adapter network to jointly perform modality-shared, modality-specific and instance-aware target representation learning for RGBT tracking. To this end, we design three kinds of adapters within an end-to-end deep learning framework. Specifically, we use a modified VGG-M as the generality adapter to extract the modality-shared target representations. To extract the modality-specific features while reducing the computational complexity, we design a modality adapter, which adds a small block to the generality adapter in each layer and for each modality in a parallel manner. Such a design can learn multilevel modality-specific representations with a modest number of parameters, as the vast majority of parameters are shared with the generality adapter. We also design an instance adapter to capture the appearance properties and temporal variations of a specific target. Moreover, to enhance the shared and specific features, we employ a multiple kernel maximum mean discrepancy loss to measure the distribution divergence between the features of different modalities and integrate it into each layer for more robust representation learning. Extensive experiments on two RGBT tracking benchmark datasets demonstrate the outstanding performance of the proposed tracker against state-of-the-art methods.
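The abstract mentions a multiple kernel maximum mean discrepancy loss for measuring the divergence between features of the two modalities. The following is a minimal sketch of that general idea only, not the authors' implementation: it assumes Gaussian kernels, equal kernel weights, hand-picked bandwidths, and the simple biased estimator, none of which are stated in the abstract. The function names and the toy feature vectors are hypothetical.

```python
import math

def gaussian_kernel(a, b, bandwidth):
    # Gaussian (RBF) kernel between two feature vectors.
    sq_dist = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))

def multi_kernel(a, b, bandwidths, weights):
    # Weighted sum of Gaussian kernels with different bandwidths.
    return sum(w * gaussian_kernel(a, b, s) for s, w in zip(bandwidths, weights))

def mk_mmd_squared(xs, ys, bandwidths=(0.5, 1.0, 2.0), weights=None):
    # Biased estimate of the squared MMD between two sample sets.
    # Assumption: equal kernel weights unless the caller supplies others.
    if weights is None:
        weights = [1.0 / len(bandwidths)] * len(bandwidths)

    def mean_kernel(us, vs):
        total = sum(multi_kernel(u, v, bandwidths, weights) for u in us for v in vs)
        return total / (len(us) * len(vs))

    return mean_kernel(xs, xs) + mean_kernel(ys, ys) - 2.0 * mean_kernel(xs, ys)

# Toy stand-ins for per-modality features (hypothetical values).
rgb_feats = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.2]]
thermal_feats = [[1.0, 1.1], [0.9, 1.2], [1.1, 0.8]]

same = mk_mmd_squared(rgb_feats, rgb_feats)
different = mk_mmd_squared(rgb_feats, thermal_feats)
print(same, different)
```

In this sketch the divergence is zero when both sample sets are identical and grows as the two feature distributions move apart, which is the property a divergence loss of this kind relies on.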




Multi-Adapter RGBT Tracking

The task of RGBT tracking aims to take the complementary advantages from...

Challenge-Aware RGBT Tracking

RGB and thermal source data suffer from both shared and specific challen...

Specificity-preserving RGB-D Saliency Detection

RGB-D saliency detection has attracted increasing attention, due to its ...

FANet: Quality-Aware Feature Aggregation Network for RGB-T Tracking

This paper investigates how to perform robust visual tracking in adverse...

Duality-Gated Mutual Condition Network for RGBT Tracking

Low-quality modalities contain not only a lot of noisy information but a...

Cooperative Cross-Stream Network for Discriminative Action Representation

Spatial and temporal stream model has gained great success in video acti...

Dense Feature Aggregation and Pruning for RGBT Tracking

How to perform effective information fusion of different modalities is a...
