An Empirical Study on Leveraging Scene Graphs for Visual Question Answering

07/28/2019
by Cheng Zhang, et al.

Visual question answering (Visual QA) has attracted significant attention in recent years. While a variety of algorithms have been proposed, most are built on different combinations of image and language features together with multi-modal attention and fusion. In this paper, we investigate an alternative approach inspired by conventional QA systems that operate on knowledge graphs. Specifically, we investigate the use of scene graphs derived from images for Visual QA: an image is abstractly represented by a graph whose nodes correspond to object entities and whose edges correspond to object relationships. We adapt the recently proposed graph network (GN) to encode the scene graph and perform structured reasoning according to the input question. Our empirical studies demonstrate that scene graphs already capture essential information about images, and that graph networks have the potential to outperform state-of-the-art Visual QA algorithms with a much cleaner architecture. By analyzing the features generated by GNs, we can further interpret the reasoning process, suggesting a promising direction towards explainable Visual QA.
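To make the scene-graph representation concrete, the following is a minimal Python sketch of an image stored as a graph of object nodes and relationship edges. It is an illustration only, not the authors' implementation: the class name `SceneGraph`, the helper `objects_related_to`, and the example scene (a man riding a horse and wearing a hat) are all assumptions made up for this sketch, and the graph network encoding step described in the abstract is not shown.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class SceneGraph:
    """Toy scene graph: nodes are object entities, edges are relationships.

    Hypothetical structure for illustration; not taken from the paper.
    """
    nodes: List[str] = field(default_factory=list)
    # Each edge is (subject node index, relationship label, object node index).
    edges: List[Tuple[int, str, int]] = field(default_factory=list)

    def add_node(self, name: str) -> int:
        """Append an object entity and return its node index."""
        self.nodes.append(name)
        return len(self.nodes) - 1

    def add_edge(self, subj: int, relation: str, obj: int) -> None:
        """Record a directed relationship between two existing nodes."""
        self.edges.append((subj, relation, obj))

    def triples(self) -> List[Tuple[str, str, str]]:
        """Return edges as (subject name, relation, object name) triples."""
        return [(self.nodes[s], r, self.nodes[o]) for s, r, o in self.edges]


def build_example() -> SceneGraph:
    """Build a made-up scene: a man riding a horse and wearing a hat."""
    g = SceneGraph()
    man = g.add_node("man")
    horse = g.add_node("horse")
    hat = g.add_node("hat")
    g.add_edge(man, "riding", horse)
    g.add_edge(man, "wearing", hat)
    return g


def objects_related_to(graph: SceneGraph, subject: str, relation: str) -> List[str]:
    """Answer a simple lookup question such as 'What is the man wearing?'."""
    return [o for s, r, o in graph.triples() if s == subject and r == relation]


if __name__ == "__main__":
    scene = build_example()
    print(scene.triples())
    print(objects_related_to(scene, "man", "wearing"))
```

The lookup in `objects_related_to` is a hand-written stand-in for reasoning over the graph. In the approach the abstract describes, a learned graph network conditioned on the question would take the place of this exact-match query.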


research
07/13/2021

Graphhopper: Multi-Hop Scene Graph Reasoning for Visual Question Answering

Visual Question Answering (VQA) is concerned with answering free-form qu...
research
03/22/2023

GrapeQA: GRaph Augmentation and Pruning to Enhance Question-Answering

Commonsense question-answering (QA) methods combine the power of pre-tra...
research
12/05/2018

Explainable and Explicit Visual Reasoning over Scene Graphs

We aim to dismantle the prevalent black-box neural architectures used in...
research
09/11/2018

The Visual QA Devil in the Details: The Impact of Early Fusion and Batch Norm on CLEVR

Visual QA is a pivotal challenge for higher-level reasoning, requiring u...
research
10/07/2021

GNN is a Counter? Revisiting GNN for Question Answering

Question Answering (QA) has been a long-standing research topic in AI an...
research
04/20/2021

GraphVQA: Language-Guided Graph Neural Networks for Graph-based Visual Question Answering

Images are more than a collection of objects or attributes – they repres...
research
11/25/2021

Scene Graph Generation with Geometric Context

Scene Graph Generation has gained much attention in computer vision rese...
