UAN: Unified Attention Network for Convolutional Neural Networks

01/16/2019
by   Tony Joseph, et al.

We propose a new architecture that learns to attend to different Convolutional Neural Network (CNN) layers (i.e., different levels of abstraction) and different spatial locations (i.e., specific locations within a given feature map) in a sequential manner to perform the task at hand. Specifically, at each Recurrent Neural Network (RNN) timestep, a CNN layer is selected and its output is processed by a spatial soft-attention mechanism. We refer to this architecture as the Unified Attention Network (UAN), since it combines the "what" and "where" aspects of attention, i.e., "what" level of abstraction to attend to and "where" the network should look. We demonstrate the effectiveness of this approach on two computer vision tasks: (i) image-based camera pose and orientation regression and (ii) indoor scene classification. We evaluate our method on standard benchmarks for camera localization (Cambridge, 7-Scene, and TUM-LSI datasets) and for scene classification (MIT-67 indoor dataset), and show that it improves upon the results of previous methods. Empirically, we show that combining the "what" and "where" aspects of attention improves network performance on both tasks.
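The per-timestep loop described above (select a CNN layer, then apply spatial soft attention to its output) can be sketched in a few lines of plain Python. This is only an illustrative sketch, not the authors' implementation: the abstract does not specify the scoring functions or the RNN cell, so the dot-product scores, the mean-pooled layer summary, the argmax layer choice, the toy `tanh` state update, and every name here (`uan_step`, `layers`, `hidden`) are assumptions. It also assumes that all layer outputs have already been projected to the same channel dimension.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def weighted_sum(weights, vectors):
    """Sum of vectors, each scaled by its weight."""
    dim = len(vectors[0])
    return [sum(w * v[d] for w, v in zip(weights, vectors)) for d in range(dim)]


def mean_vector(vectors):
    n = len(vectors)
    return weighted_sum([1.0 / n] * n, vectors)


def uan_step(hidden, layers):
    """One timestep: pick "what" (a layer), then "where" (its locations).

    hidden: current state, a list of D floats.
    layers: one entry per CNN layer; each entry is a list of
            D-dimensional feature vectors, one per spatial location.
    """
    # "What": score each layer by its mean-pooled feature (assumed scoring).
    layer_weights = softmax([dot(hidden, mean_vector(f)) for f in layers])
    chosen = max(range(len(layers)), key=lambda i: layer_weights[i])

    # "Where": soft attention over the chosen layer's spatial locations.
    spatial_weights = softmax([dot(hidden, v) for v in layers[chosen]])
    context = weighted_sum(spatial_weights, layers[chosen])

    # Toy recurrent update, standing in for a real RNN cell (assumption).
    new_hidden = [math.tanh(0.5 * (h + c)) for h, c in zip(hidden, context)]
    return new_hidden, chosen, spatial_weights


# Two hypothetical layers with different spatial sizes, both with D = 2.
layers = [
    [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0]],  # 4 locations
    [[2.0, 0.0], [0.0, 0.5]],                          # 2 locations
]
hidden = [1.0, 0.0]
for t in range(3):
    hidden, chosen, spatial_weights = uan_step(hidden, layers)
    print(t, chosen, [round(w, 3) for w in spatial_weights])
```

In a trained model, the final state would feed a task head (a pose regressor or a scene classifier), and the hard argmax used here would need a differentiable or otherwise trainable selection rule; the sketch leaves both out.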


