Learning to reason over visual objects

by Shanka Subhra Mondal, et al.
Princeton University

A core component of human intelligence is the ability to identify abstract patterns inherent in complex, high-dimensional perceptual data, as exemplified by visual reasoning tasks such as Raven's Progressive Matrices (RPM). Motivated by the goal of designing AI systems with this capacity, recent work has focused on evaluating whether neural networks can learn to solve RPM-like problems. Previous work has generally found that strong performance on these problems requires the incorporation of inductive biases that are specific to the RPM problem format, raising the question of whether such models might be more broadly useful. Here, we investigated the extent to which a general-purpose mechanism for processing visual scenes in terms of objects might help promote abstract visual reasoning. We found that a simple model, consisting only of an object-centric encoder and a transformer reasoning module, achieved state-of-the-art results on two challenging RPM-like benchmarks (PGM and I-RAVEN), as well as on a novel benchmark with greater visual complexity (CLEVR-Matrices). These results suggest that an inductive bias for object-centric processing may be a key component of abstract visual reasoning, obviating the need for problem-specific inductive biases.

