One-shot Visual Reasoning on RPMs with an Application to Video Frame Prediction

11/24/2021
by Wentao He, et al.

Raven's Progressive Matrices (RPMs) are frequently used to evaluate human visual reasoning ability. Researchers have made considerable efforts to develop systems that solve RPM problems automatically, often through a black-box, end-to-end Convolutional Neural Network (CNN) that handles both visual recognition and logical reasoning. To obtain a highly explainable solution, we propose a One-shot Human-Understandable ReaSoner (Os-HURS), a two-step framework consisting of a perception module and a reasoning module, which address the challenges of real-world visual recognition and subsequent logical reasoning, respectively. For the reasoning module, we propose a "2+1" formulation that is easier for humans to understand and significantly reduces model complexity. As a result, a precise reasoning rule can be deduced from a single RPM sample, which is not feasible with existing methods. The proposed reasoning module can also yield a set of reasoning rules that precisely model the human knowledge used in solving RPM problems. To validate the proposed method on real-world applications, we construct an RPM-like One-shot Frame-prediction (ROF) dataset, in which visual reasoning is conducted on RPMs built from real-world video frames instead of synthetic images. Experimental results on various RPM-like datasets demonstrate that the proposed Os-HURS achieves a significant and consistent performance gain over state-of-the-art models.

research
06/06/2018

Progressive Reasoning by Module Composition

Humans learn to solve tasks of increasing complexity by building on top ...
research
03/09/2021

A Data Augmentation Method by Mixing Up Negative Candidate Answers for Solving Raven's Progressive Matrices

Raven's Progressive Matrices (RPMs) are frequently-used in testing human...
research
07/23/2020

Few-shot Visual Reasoning with Meta-analogical Contrastive Learning

While humans can solve a visual puzzle that requires logical reasoning b...
research
06/10/2022

GAMR: A Guided Attention Model for (visual) Reasoning

Humans continue to outperform modern AI systems in their ability to flex...
research
09/22/2019

Analyzing Recurrent Neural Network by Probabilistic Abstraction

Neural network is becoming the dominant approach for solving many real-w...
research
05/02/2023

Visual Reasoning: from State to Transformation

Most existing visual reasoning tasks, such as CLEVR in VQA, ignore an im...
research
11/26/2020

Transformation Driven Visual Reasoning

This paper defines a new visual reasoning paradigm by introducing an imp...
