Top-down Neural Attention by Excitation Backprop

08/01/2016
by   Jianming Zhang, et al.

We aim to model the top-down attention of a Convolutional Neural Network (CNN) classifier in order to generate task-specific attention maps. Inspired by a top-down human visual attention model, we propose a new backpropagation scheme, called Excitation Backprop, that passes top-down signals down the network hierarchy via a probabilistic Winner-Take-All process. Furthermore, we introduce the concept of contrastive attention to make the top-down attention maps more discriminative. In experiments, we demonstrate the accuracy and generalizability of our method in weakly supervised localization tasks on the MS COCO, PASCAL VOC07 and ImageNet datasets. The usefulness of our method is further validated in the text-to-region association task. On the Flickr30k Entities dataset, we achieve promising performance in phrase localization by leveraging the top-down attention of a CNN model that has been trained on weakly labeled web images.
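
To make the probabilistic Winner-Take-All idea concrete, here is a minimal NumPy sketch of one backward step through a single fully connected layer. The abstract does not give the update rule, so the details are assumptions for illustration: input activations are non-negative, only excitatory (non-negative) weights pass signal, and each parent unit splits its probability among its children in proportion to activation times weight. The function names `excitation_backprop_layer` and `contrastive_attention` are hypothetical, and the contrastive step (subtracting the map obtained with negated class weights, then truncating at zero) is likewise one assumed reading of "contrastive attention", not a confirmed reproduction of the authors' implementation.

```python
import numpy as np


def excitation_backprop_layer(activations, weights, parent_prob, eps=1e-12):
    """One assumed Excitation Backprop step through a linear layer.

    activations: (n_in,) non-negative inputs to the layer
    weights:     (n_out, n_in) layer weights
    parent_prob: (n_out,) winning probabilities of the layer's outputs
    returns:     (n_in,) winning probabilities of the layer's inputs
    """
    w_pos = np.clip(weights, 0.0, None)         # keep excitatory connections only
    contrib = w_pos * activations[None, :]      # activation * weight per connection
    norm = contrib.sum(axis=1, keepdims=True)   # per-parent normalizer
    cond = contrib / np.maximum(norm, eps)      # P(child | parent); rows sum to 1 or 0
    return cond.T @ parent_prob                 # marginalize over parents


def contrastive_attention(activations, class_weights, eps=1e-12):
    """Assumed contrastive map: target-class map minus the map of its negation."""
    top = np.array([1.0])                       # all probability on the chosen class
    target = excitation_backprop_layer(activations, class_weights[None, :], top, eps)
    dual = excitation_backprop_layer(activations, -class_weights[None, :], top, eps)
    return np.clip(target - dual, 0.0, None)    # keep only positive evidence


if __name__ == "__main__":
    acts = np.array([1.0, 2.0, 0.0])
    w = np.array([[1.0, 1.0, 5.0], [-1.0, 2.0, 0.0]])
    print(excitation_backprop_layer(acts, w, np.array([1.0, 0.0])))
    print(contrastive_attention(np.array([1.0, 2.0, 1.0]), np.array([1.0, -1.0, 0.0])))
```

Under these assumptions the probability mass is conserved as it moves down a layer (as long as every parent has at least one excitatory input), so applying the step layer by layer down to an intermediate feature map would yield a map that can be read as an attention distribution.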

research
11/24/2016

Weakly Supervised Cascaded Convolutional Networks

Object detection is a challenging task in visual understanding domain, a...
research
12/24/2015

Learning Transferrable Knowledge for Semantic Segmentation with Deep Convolutional Neural Network

We propose a novel weakly-supervised semantic segmentation algorithm bas...
research
12/17/2020

Weakly-Supervised Action Localization and Action Recognition using Global-Local Attention of 3D CNN

3D Convolutional Neural Network (3D CNN) captures spatial and temporal i...
research
02/27/2018

Tell Me Where to Look: Guided Attention Inference Network

Weakly supervised learning with only coarse labels can obtain visual exp...
research
09/25/2020

In-sample Contrastive Learning and Consistent Attention for Weakly Supervised Object Localization

Weakly supervised object localization (WSOL) aims to localize the target...
research
07/01/2020

M3d-CAM: A PyTorch library to generate 3D data attention maps for medical deep learning

M3d-CAM is an easy to use library for generating attention maps of CNN-b...
research
07/13/2020

Deep Reinforced Attention Learning for Quality-Aware Visual Recognition

In this paper, we build upon the weakly-supervised generation mechanism ...
