Recurrent Attention Models with Object-centric Capsule Representation for Multi-object Recognition

10/11/2021
by Hossein Adeli, et al.

The visual system processes a scene using a sequence of selective glimpses, each driven by spatial and object-based attention. These glimpses reflect what is relevant to the ongoing task and are selected through recurrent processing and recognition of the objects in the scene. In contrast, most models treat attention selection and recognition as separate stages in a feedforward process. Here we show that using capsule networks to create an object-centric hidden representation in an encoder-decoder model with iterative glimpse attention yields effective integration of attention and recognition. We evaluate our model on three multi-object recognition tasks: highly overlapping digits, digits among distracting clutter, and house numbers. We show that it learns to move its glimpse window effectively and to recognize and reconstruct the objects, using only classification labels as supervision. Our work takes a step toward a general architecture for integrating recurrent object-centric representations into the planning of attentional glimpses.
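The notion of a glimpse window moved over an image can be illustrated with a small sketch. This is not the authors' implementation: the function name `extract_glimpse`, the zero-padding at the borders, the odd window size, and the fixed list of glimpse locations are all assumptions made for illustration. In the model described above, the locations would be chosen by the recurrent network rather than fixed in advance.

```python
import numpy as np

def extract_glimpse(image, center, size):
    """Crop a size x size window centred at (row, col), zero-padding at the borders.

    Hypothetical helper for illustration; assumes a 2-D image and an odd `size`.
    """
    half = size // 2
    padded = np.pad(image, half, mode="constant")
    r, c = center
    # After padding by `half`, original pixel (r, c) sits at (r + half, c + half),
    # so the slice starting at (r, c) in padded coordinates is centred on it.
    return padded[r:r + size, c:c + size]

# Toy 6x6 "image" and a fixed sequence of glimpse centres (assumed, not learned).
image = np.arange(36, dtype=float).reshape(6, 6)
locations = [(0, 0), (2, 3), (5, 5)]
glimpses = [extract_glimpse(image, loc, 3) for loc in locations]
```

Each entry of `glimpses` is a 3x3 patch that a recurrent encoder could consume one step at a time; patches near the border are filled with zeros where the window extends past the image.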


research
06/12/2017

Enriched Deep Recurrent Visual Attention Model for Multiple Object Recognition

We design an Enriched Deep Recurrent Visual Attention Model (EDRAM) - an...
research
09/12/2019

Recurrent Connectivity Aids Recognition of Partly Occluded Objects

Feedforward convolutional neural networks are the prevalent model of cor...
research
05/04/2017

Recurrent Soft Attention Model for Common Object Recognition

We propose the Recurrent Soft Attention Model, which integrates the visu...
research
09/27/2022

Reconstruction-guided attention improves the robustness and shape processing of neural networks

Many visual phenomena suggest that humans use top-down generative or rec...
research
02/04/2020

Selective Segmentation Networks Using Top-Down Attention

Convolutional neural networks model the transformation of the input sens...
research
12/08/2020

Canonical Capsules: Unsupervised Capsules in Canonical Pose

We propose an unsupervised capsule architecture for 3D point clouds. We ...
research
07/23/2020

Sequential Routing Framework: Fully Capsule Network-based Speech Recognition

Capsule networks (CapsNets) have recently gotten attention as alternativ...
