A Comparison of Deep Saliency Map Generators on Multispectral Data in Object Detection

08/26/2021
by Jens Bayer, et al.

Deep neural networks, especially convolutional deep neural networks, are state-of-the-art methods to classify, segment, or even generate images, movies, or sounds. However, there is a lack of good semantic understanding of what happens inside these methods. The question of why a COVID-19 detector has classified a stack of lung CT images as positive is sometimes more interesting than the overall specificity and sensitivity, especially when human domain expert knowledge disagrees with the given output. This way, human domain experts could also be advised to reconsider their choice in light of the information pointed out by the system. In addition, the deep learning model can be checked, and an existing dataset bias can be found. Currently, most explainable AI methods in the computer vision domain are used purely for image classification, where the images are ordinary images in the visible spectrum. As a result, there is no comparison of how the methods behave with multimodal image data, and most methods have not been investigated for object detection. This work tries to close these gaps. First, we investigate three saliency map generator methods and how their maps differ across the different spectra. This is achieved via accurate and systematic training. Second, we examine how they behave when used for object detection. As a practical problem, we chose object detection in the infrared and visual spectrum for autonomous driving. The dataset used in this work is the Multispectral Object Detection Dataset, where each scene is available in the FIR, MIR, and NIR as well as the visual spectrum. The results show that there are differences between the infrared and visual activation maps. Further, advanced training with both infrared and visual data not only improves the network's output but also leads to more focused spots in the saliency maps.


