
Target-aware Dual Adversarial Learning and a Multi-scenario Multi-Modality Benchmark to Fuse Infrared and Visible for Object Detection

by   Jinyuan Liu, et al.
Dalian University of Technology

This study addresses the problem of fusing infrared and visible images, which differ markedly in appearance, for object detection. Aiming to generate an image of high visual quality, previous approaches discover commonalities underlying the two modalities and fuse within that common space, either by iterative optimization or deep networks. These approaches neglect that modality differences, which carry the complementary information, are extremely important for both fusion and the subsequent detection task. This paper proposes a bilevel optimization formulation for the joint problem of fusion and detection, which is then unrolled into a target-aware Dual Adversarial Learning (TarDAL) network for fusion and a commonly used detection network. The fusion network, with one generator and dual discriminators, seeks commonalities while learning from differences: it preserves the structural information of targets from the infrared and the textural details from the visible. Furthermore, we build a synchronized imaging system with calibrated infrared and optical sensors and collect what is currently the most comprehensive benchmark, covering a wide range of scenarios. Extensive experiments on several public datasets and our benchmark demonstrate that our method produces not only visually appealing fused images but also higher detection mAP than state-of-the-art approaches.
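To make the one-generator, dual-discriminator idea concrete, here is a minimal numpy sketch of the data flow. Everything below is an illustrative assumption, not the paper's implementation: the generator is replaced by a fixed weighted blend, the two discriminators (`d_target`, judging target structure against the infrared input, and `d_detail`, judging texture against the visible input) are replaced by toy scoring functions, and a least-squares-style GAN loss stands in for the paper's actual objective.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-ins for a registered infrared/visible pair (H x W, values in [0, 1]).
ir = rng.random((8, 8))
vis = rng.random((8, 8))

def fuse(ir, vis, w=0.5):
    """Generator stand-in: in TarDAL this is a learned network;
    a fixed weighted blend just illustrates the data flow."""
    return w * ir + (1.0 - w) * vis

def lsq_gan_loss(d_real, d_fake):
    """Least-squares-style adversarial loss (an assumed form, for
    illustration): push real scores toward 1 and fake scores toward 0."""
    return np.mean((np.asarray(d_real) - 1.0) ** 2) + np.mean(np.asarray(d_fake) ** 2)

# Dual "discriminators": hypothetical fixed critics, one per modality.
def d_target(img):
    # Stand-in for a structure critic that compares against infrared targets.
    return img.mean()

def d_detail(img):
    # Stand-in for a texture critic that compares against visible details.
    return np.abs(np.diff(img, axis=1)).mean()

fused = fuse(ir, vis)
loss_ir = lsq_gan_loss(d_target(ir), d_target(fused))    # infrared-side discriminator
loss_vis = lsq_gan_loss(d_detail(vis), d_detail(fused))  # visible-side discriminator
total_d_loss = loss_ir + loss_vis
```

The point of the two separate critics is that each modality supervises a different property of the fused output: the infrared branch penalizes loss of target structure, while the visible branch penalizes loss of texture, so the generator must satisfy both rather than regress toward a bland average.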



