Multi-Modal Fusion for End-to-End RGB-T Tracking

08/30/2019
by Lichao Zhang, et al.

We propose an end-to-end tracking framework for fusing the RGB and TIR modalities in RGB-T tracking. Our baseline tracker is DiMP (Discriminative Model Prediction), which employs a carefully designed target prediction network trained end-to-end with a discriminative loss. We analyze the effectiveness of modality fusion in each of the main components of DiMP, i.e., the feature extractor, the target estimation network, and the classifier. We consider several fusion mechanisms acting at different levels of the framework, including pixel-level, feature-level, and response-level fusion. Our tracker is trained in an end-to-end manner, enabling the components to learn how to fuse the information from both modalities. To train our model, we generate a large-scale RGB-T dataset by taking an annotated RGB tracking dataset (GOT-10k) and synthesizing paired TIR images with an image-to-image translation approach. We perform extensive experiments on the VOT-RGBT2019 and RGBT210 datasets, evaluating each type of modality fusion on each model component. The results show that the proposed fusion mechanisms improve the performance over the single-modality counterparts. We obtain our best results when fusing at the feature level in both the IoU-Net and the model predictor, obtaining an EAO score of 0.391 on the VOT-RGBT2019 dataset. With this fusion mechanism, we achieve state-of-the-art performance on the RGBT210 dataset.
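To make the three fusion levels mentioned in the abstract concrete, the following Python sketch illustrates each mechanism in isolation: pixel-level fusion stacks the raw RGB and TIR images channel-wise, feature-level fusion concatenates backbone feature maps and projects them back to the original channel count before the DiMP-style model predictor and IoU-Net, and response-level fusion combines per-modality classifier score maps by a weighted sum. The module names, channel counts, and the 1x1-convolution projection are illustrative assumptions based only on the abstract, not the paper's exact implementation.

import torch
import torch.nn as nn


class FeatureFusion(nn.Module):
    """Illustrative feature-level fusion (assumption, not the paper's exact design):
    concatenate RGB and TIR backbone feature maps along the channel axis, then
    project back to the original channel count with a 1x1 convolution so downstream
    components (model predictor, IoU-Net) can consume the fused map unchanged."""

    def __init__(self, channels: int):
        super().__init__()
        self.project = nn.Sequential(
            nn.Conv2d(2 * channels, channels, kernel_size=1, bias=False),
            nn.BatchNorm2d(channels),
            nn.ReLU(inplace=True),
        )

    def forward(self, feat_rgb: torch.Tensor, feat_tir: torch.Tensor) -> torch.Tensor:
        return self.project(torch.cat([feat_rgb, feat_tir], dim=1))


def pixel_fusion(img_rgb: torch.Tensor, img_tir: torch.Tensor) -> torch.Tensor:
    """Illustrative pixel-level fusion: stack the RGB image (3 channels) and the
    TIR image (1 channel) into a single 4-channel input; the backbone's first
    convolution would need to be adapted to accept the extra channel."""
    return torch.cat([img_rgb, img_tir], dim=1)


def response_fusion(score_rgb: torch.Tensor, score_tir: torch.Tensor,
                    weight_rgb: float = 0.5) -> torch.Tensor:
    """Illustrative response-level fusion: each modality is processed by its own
    classifier and the resulting score maps are combined by a weighted sum."""
    return weight_rgb * score_rgb + (1.0 - weight_rgb) * score_tir


if __name__ == "__main__":
    fuse = FeatureFusion(channels=256)
    feat_rgb = torch.randn(2, 256, 18, 18)   # RGB backbone features
    feat_tir = torch.randn(2, 256, 18, 18)   # paired TIR backbone features
    print(fuse(feat_rgb, feat_tir).shape)    # torch.Size([2, 256, 18, 18])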

Related research

- 04/09/2023, RGB-T Tracking Based on Mixed Attention: RGB-T tracking involves the use of images from both visible and thermal ...
- 06/04/2018, Synthetic data generation for end-to-end thermal infrared tracking: The usage of both off-the-shelf and end-to-end trained deep networks hav...
- 01/21/2022, Exploring Fusion Strategies for Accurate RGBT Visual Object Tracking: We address the problem of multi-modal object tracking in video and explo...
- 07/24/2019, Dense Feature Aggregation and Pruning for RGBT Tracking: How to perform effective information fusion of different modalities is a...
- 04/27/2023, Adaptive-Mask Fusion Network for Segmentation of Drivable Road and Negative Obstacle With Untrustworthy Features: Segmentation of drivable roads and negative obstacles is critical to the...
- 07/04/2020, Jointly Modeling Motion and Appearance Cues for Robust RGB-T Tracking: In this study, we propose a novel RGB-T tracking framework by jointly mo...
- 03/12/2021, Siamese Infrared and Visible Light Fusion Network for RGB-T Tracking: Due to the different photosensitive properties of infrared and visible l...
