Translation, Scale and Rotation: Cross-Modal Alignment Meets RGB-Infrared Vehicle Detection

09/28/2022
by   Maoxun Yuan, et al.

Integrating multispectral data in object detection, especially visible and infrared images, has received great attention in recent years. Since visible (RGB) and infrared (IR) images provide complementary information for handling light variations, paired images are used in many fields, such as multispectral pedestrian detection, RGB-IR crowd counting and RGB-IR salient object detection. Compared with natural RGB-IR images, we find that detection in aerial RGB-IR images suffers from a cross-modal weak misalignment problem, which manifests as position, size and angle deviations of the same object across the two modalities. In this paper, we mainly address the challenge of cross-modal weak misalignment in aerial RGB-IR images. Specifically, we first explain and analyze the cause of the weak misalignment problem. Then, we propose a Translation-Scale-Rotation Alignment (TSRA) module to address the problem by calibrating the feature maps from the two modalities. The module predicts the deviation between the objects in the two modalities through an alignment process and uses a Modality-Selection (MS) strategy to improve alignment performance. Finally, a two-stream feature alignment detector (TSFADet) based on the TSRA module is constructed for RGB-IR object detection in aerial images. With comprehensive experiments on the public DroneVehicle dataset, we verify that our method reduces the effect of cross-modal misalignment and achieves robust detection results.
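To make the three kinds of deviation concrete, the following is a minimal sketch, not the TSRA module itself. It assumes an object is described by an oriented box (center, width, height, angle) and shows how a translation, scale and rotation deviation between the RGB and IR views of the same object could be measured and then applied. The names `OrientedBox`, `deviation_between` and `apply_deviation` are hypothetical and do not come from the paper, and the box values are made up for illustration.

```python
import math
from dataclasses import dataclass


@dataclass
class OrientedBox:
    """Hypothetical oriented box: center (cx, cy), size (w, h), angle in radians."""
    cx: float
    cy: float
    w: float
    h: float
    angle: float


def deviation_between(src: OrientedBox, dst: OrientedBox):
    """Return (dx, dy, sw, sh, dtheta) describing how dst deviates from src.

    Translation is an additive offset, scale is a multiplicative ratio,
    and rotation is an additive angle difference.
    """
    return (
        dst.cx - src.cx,
        dst.cy - src.cy,
        dst.w / src.w,
        dst.h / src.h,
        dst.angle - src.angle,
    )


def apply_deviation(box: OrientedBox, dx, dy, sw, sh, dtheta) -> OrientedBox:
    """Shift, rescale and rotate a box by the given deviation."""
    return OrientedBox(
        cx=box.cx + dx,
        cy=box.cy + dy,
        w=box.w * sw,
        h=box.h * sh,
        angle=box.angle + dtheta,
    )


# Illustrative values only: the same vehicle as seen in each modality.
rgb_box = OrientedBox(cx=100.0, cy=50.0, w=40.0, h=20.0, angle=0.10)
ir_box = OrientedBox(cx=103.0, cy=48.0, w=44.0, h=19.0, angle=0.15)

dev = deviation_between(rgb_box, ir_box)
aligned = apply_deviation(rgb_box, *dev)
```

In this sketch the deviation is computed from two known boxes; in the paper's setting the deviation is predicted by the module and used to calibrate feature maps, which this example does not attempt to reproduce.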

research
06/28/2023

𝐂^2Former: Calibrated and Complementary Transformer for RGB-Infrared Object Detection

Object detection on visible (RGB) and infrared (IR) images, as an emergi...
research
01/11/2022

Drone Object Detection Using RGB/IR Fusion

Object detection using aerial drone imagery has received a great deal of...
research
04/21/2022

Weakly Aligned Feature Fusion for Multimodal Object Detection

To achieve accurate and robust object detection in the real-world scenar...
research
10/30/2018

Cross-Modal Attentional Context Learning for RGB-D Object Detection

Recognizing objects from simultaneously sensed photometric (RGB) and dep...
research
10/19/2022

Spatio-channel Attention Blocks for Cross-modal Crowd Counting

Crowd counting research has made significant advancements in real-world ...
research
12/06/2021

Cross-Modality Attentive Feature Fusion for Object Detection in Multispectral Remote Sensing Imagery

Cross-modality fusing complementary information of multispectral remote ...
research
05/23/2023

Flare-Aware Cross-modal Enhancement Network for Multi-spectral Vehicle Re-identification

Multi-spectral vehicle re-identification aims to address the challenge o...