A Neuro-Symbolic ASP Pipeline for Visual Question Answering

05/16/2022
by Thomas Eiter, et al.

We present a neuro-symbolic visual question answering (VQA) pipeline for CLEVR, a well-known dataset of pictures showing scenes with objects, together with questions about those scenes. Our pipeline covers (i) training neural networks for object classification and bounding-box prediction on the CLEVR scenes, (ii) a statistical analysis of the distribution of the networks' prediction values to determine a threshold for high-confidence predictions, and (iii) a translation of CLEVR questions and of the network predictions that pass the confidence threshold into logic programs, so that the answers can be computed with an ASP solver. By exploiting choice rules, we consider deterministic and non-deterministic scene encodings. Our experiments show that, in contrast to the deterministic approach, the non-deterministic scene encoding achieves good results even if the neural networks are trained rather poorly. This is important for building robust VQA systems when network predictions are less than perfect. Furthermore, we show that restricting non-determinism to reasonable choices allows for more efficient implementations than related neuro-symbolic approaches without losing much accuracy. This work is under consideration for acceptance in TPLP.
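
The difference between the deterministic and the non-deterministic scene encoding can be illustrated with a minimal Python sketch. It is not the authors' encoding: the predicate name `obj/2`, the function `encode_object`, the class labels, and the fixed threshold of 0.3 are all assumptions made for illustration (the abstract derives its threshold from a statistical analysis that is not reproduced here). The sketch only builds ASP rule strings in clingo-style syntax; it does not call a solver.

```python
from typing import Dict, List


def encode_object(obj_id: int, scores: Dict[str, float],
                  threshold: float, deterministic: bool) -> List[str]:
    """Return ASP rules (as strings) for one detected object.

    `scores` maps hypothetical class labels to network confidence values.
    """
    best = max(scores, key=scores.get)
    if deterministic:
        # Deterministic encoding: commit to the single top prediction.
        return [f"obj({obj_id},{best})."]

    # Non-deterministic encoding: keep every label that passes the threshold.
    candidates = [label for label, s in scores.items() if s >= threshold]
    if not candidates:
        # Assumed fallback: if nothing passes, keep the top prediction.
        candidates = [best]
    if len(candidates) == 1:
        return [f"obj({obj_id},{candidates[0]})."]

    # Choice rule: exactly one of the candidate labels holds per answer set.
    atoms = "; ".join(f"obj({obj_id},{c})" for c in candidates)
    return [f"1 {{ {atoms} }} 1."]


if __name__ == "__main__":
    uncertain = {"cube": 0.55, "cylinder": 0.40, "sphere": 0.05}
    print(encode_object(0, uncertain, 0.3, deterministic=True))
    print(encode_object(0, uncertain, 0.3, deterministic=False))
```

With the uncertain prediction above, the deterministic variant emits a single fact, while the non-deterministic variant emits a choice rule over the two labels that pass the threshold, leaving the solver to pick a label consistent with the rest of the program. Restricting the choice to labels above the threshold keeps the number of candidate answer sets small.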

