Multispectral Pedestrian Detection via Reference Box Constrained Cross Attention and Modality Balanced Optimization

02/01/2023
by   Yinghui Xing, et al.

Multispectral pedestrian detection is an important task for many around-the-clock applications, since the visible and thermal modalities provide complementary information, especially under low-light conditions. To reduce the influence of hand-designed components in existing multispectral pedestrian detectors, we propose a MultiSpectral pedestrian DEtection TRansformer (MS-DETR), which extends Deformable DETR to the multi-modal paradigm. To facilitate multi-modal learning, we first introduce a Reference box Constrained Cross-Attention (RCCA) module into the multi-modal Transformer decoder; it uses the fusion branch together with the reference boxes as intermediaries to enable interaction between the visible and thermal modalities. To further balance the contributions of the different modalities, we design a modality-balanced optimization strategy, which aligns the slots of the decoders by adaptively adjusting the instance-level weights of the three branches. Our end-to-end MS-DETR shows superior performance on the challenging KAIST and CVC-14 benchmark datasets.
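
The abstract does not give the weighting rule used by the modality-balanced optimization strategy, so the following is only a minimal sketch of what instance-level weighting across three branches could look like. It assumes the three branches are visible, thermal, and fusion. The softmax rule, the `temperature` parameter, and the function names `instance_weights` and `balanced_loss` are all hypothetical illustrations and are not taken from the paper.

```python
import math


def instance_weights(branch_losses, temperature=1.0):
    """Turn one instance's per-branch losses into weights that sum to 1.

    ASSUMPTION: a softmax over the losses, so the branch with the larger
    loss receives the larger weight. This is an illustrative rule only,
    not the formula used by MS-DETR.
    """
    peak = max(branch_losses)  # subtract the max for numerical stability
    exps = [math.exp((loss - peak) / temperature) for loss in branch_losses]
    total = sum(exps)
    return [e / total for e in exps]


def balanced_loss(per_instance_losses, temperature=1.0):
    """Average, over instances, of the weighted sum of the branch losses."""
    if not per_instance_losses:
        return 0.0
    total = 0.0
    for losses in per_instance_losses:
        weights = instance_weights(losses, temperature)
        total += sum(w * l for w, l in zip(weights, losses))
    return total / len(per_instance_losses)


# Hypothetical per-instance losses, ordered (visible, thermal, fusion).
example = [(0.2, 1.5, 0.4), (0.9, 0.3, 0.5)]
print(balanced_loss(example))
```

In this sketch each matched pedestrian instance gets its own weight vector, so a branch that struggles on one instance (for example, the visible branch at night) can be emphasized there without changing its weight on other instances.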

research
08/07/2020

Improving Multispectral Pedestrian Detection by Addressing Modality Imbalance Problems

Multispectral pedestrian detection is capable of adapting to insufficien...
research
07/23/2021

Multi-Modal Pedestrian Detection with Large Misalignment Based on Modal-Wise Regression and Multi-Modal IoU

The combined use of multiple modalities enables accurate pedestrian dete...
research
04/15/2023

MA-ViT: Modality-Agnostic Vision Transformers for Face Anti-Spoofing

The existing multi-modal face anti-spoofing (FAS) frameworks are designe...
research
02/17/2023

Cascaded information enhancement and cross-modal attention feature fusion for multispectral pedestrian detection

Multispectral pedestrian detection is a technology designed to detect an...
research
06/27/2021

Accelerated Multi-Modal MR Imaging with Transformers

Accelerating multi-modal magnetic resonance (MR) imaging is a new and ef...
research
05/26/2021

Spatio-Contextual Deep Network Based Multimodal Pedestrian Detection For Autonomous Driving

Pedestrian Detection is the most critical module of an Autonomous Drivin...
research
12/04/2021

BAANet: Learning Bi-directional Adaptive Attention Gates for Multispectral Pedestrian Detection

Thermal infrared (TIR) image has proven effectiveness in providing tempe...
