Learning Cross-Modal Deep Representations for Robust Pedestrian Detection

04/08/2017
by   Dan Xu, et al.

This paper presents a novel method for detecting pedestrians under adverse illumination conditions. Our approach relies on a novel cross-modality learning framework organized in two main phases. First, given a multimodal dataset, a deep convolutional network is employed to learn a non-linear mapping that models the relations between RGB and thermal data. Then, the learned feature representations are transferred to a second deep network, which receives an RGB image as input and outputs the detection results. In this way, features that are both discriminative and robust to poor illumination conditions are learned. Importantly, at test time only the second pipeline is used and no thermal data are required. Our extensive evaluation demonstrates that the proposed approach outperforms the state-of-the-art on the challenging KAIST multispectral pedestrian dataset and is competitive with previous methods on the popular Caltech dataset.
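The two-phase framework described in the abstract can be sketched as follows. This is an illustrative outline only, not the authors' implementation: the module names (`RGBToThermalNet`, `DetectorNet`), layer sizes, and the L2 reconstruction loss are assumptions chosen to make the idea concrete.

```python
import torch
import torch.nn as nn

# Phase 1: a convolutional network learns a non-linear mapping from
# RGB images to thermal images (cross-modal reconstruction objective).
class RGBToThermalNet(nn.Module):  # hypothetical name
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(      # learns cross-modal features
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(),
        )
        self.decoder = nn.Conv2d(64, 1, 3, padding=1)  # 1-channel thermal

    def forward(self, rgb):
        feats = self.encoder(rgb)
        thermal = self.decoder(feats)      # reconstructed thermal image
        return feats, thermal

# Phase 2: the learned representation is transferred into a detection
# network that takes only an RGB image; no thermal input is needed.
class DetectorNet(nn.Module):  # hypothetical name
    def __init__(self, pretrained_encoder):
        super().__init__()
        self.encoder = pretrained_encoder  # transferred from phase 1
        self.head = nn.Conv2d(64, 2, 1)    # toy pedestrian/background scores

    def forward(self, rgb):
        return self.head(self.encoder(rgb))

# Phase 1 training step: L2 reconstruction loss on RGB-thermal pairs.
xmod = RGBToThermalNet()
rgb = torch.randn(2, 3, 64, 64)
thermal_gt = torch.randn(2, 1, 64, 64)
_, thermal_pred = xmod(rgb)
recon_loss = nn.functional.mse_loss(thermal_pred, thermal_gt)

# Phase 2: the detector reuses the encoder; at test time it sees only RGB.
detector = DetectorNet(xmod.encoder)
scores = detector(rgb)
print(scores.shape)  # torch.Size([2, 2, 64, 64])
```

In this sketch the reconstruction task forces the encoder to capture thermal-like cues from RGB alone, which is what makes the transferred features robust to bad illumination at detection time.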


Related research

- Illumination-aware Faster R-CNN for Robust Multispectral Pedestrian Detection (03/14/2018)
- The Cross-Modality Disparity Problem in Multispectral Pedestrian Detection (01/09/2019)
- Cross-Modality Proposal-guided Feature Mining for Unregistered RGB-Thermal Pedestrian Detection (08/23/2023)
- Unsupervised Cross-Modal Distillation for Thermal Infrared Tracking (07/31/2021)
- Deep Perceptual Mapping for Cross-Modal Face Recognition (01/20/2016)
- Robust Environment Perception for Automated Driving: A Unified Learning Pipeline for Visual-Infrared Object Detection (06/08/2022)
- A Multimodal Learning Framework for Comprehensive 3D Mineral Prospectivity Modeling with Jointly Learned Structure-Fluid Relationships (09/06/2023)
