GAMR: A Guided Attention Model for (visual) Reasoning

06/10/2022
by Mohit Vaishnav, et al.

Humans continue to outperform modern AI systems in their ability to flexibly parse and understand complex visual scenes. Here, we present a novel module for visual reasoning, the Guided Attention Model for (visual) Reasoning (GAMR), which instantiates an active vision theory positing that the brain solves complex visual reasoning problems dynamically, via sequences of attention shifts that select and route task-relevant visual information into memory. Experiments on an array of visual reasoning tasks and datasets demonstrate GAMR's ability to learn visual routines in a robust and sample-efficient manner. In addition, GAMR is shown to be capable of zero-shot generalization on completely novel reasoning tasks. Overall, our work provides computational support for cognitive theories that postulate the need for a critical interplay between attention and memory to dynamically maintain and manipulate task-relevant visual information when solving complex visual reasoning tasks.

research
06/26/2023

PhD Thesis: Exploring the role of (self-)attention in cognitive and computer vision architecture

We investigate the role of attention and memory in complex reasoning tas...
research
03/16/2018

A dataset and architecture for visual reasoning with a working memory

A vexing problem in artificial intelligence is reasoning about events th...
research
08/08/2021

Understanding the computational demands underlying visual reasoning

Visual understanding requires comprehending complex visual relations bet...
research
09/15/2020

Gravitational Models Explain Shifts on Human Visual Attention

Visual attention refers to the human brain's ability to select relevant ...
research
08/18/2021

Active Observer Visual Problem-Solving Methods are Dynamically Hypothesized, Deployed and Tested

The STAR architecture was designed to test the value of the full Selecti...
research
09/29/2022

Zero-shot visual reasoning through probabilistic analogical mapping

Human reasoning is grounded in an ability to identify highly abstract co...
research
11/24/2021

One-shot Visual Reasoning on RPMs with an Application to Video Frame Prediction

Raven's Progressive Matrices (RPMs) are frequently used in evaluating hu...
