GraphVQA: Language-Guided Graph Neural Networks for Graph-based Visual Question Answering

04/20/2021
by   Weixin Liang, et al.

Images are more than a collection of objects or attributes: they represent a web of relationships among interconnected objects. Scene graphs have emerged as a new modality that provides a structured graphical representation of images. A scene graph encodes objects as nodes connected via pairwise relations as edges. To support question answering on scene graphs, we propose GraphVQA, a language-guided graph neural network framework that translates and executes a natural language question as multiple iterations of message passing among graph nodes. We explore the design space of the GraphVQA framework and discuss the trade-offs of different design choices. Our experiments on the GQA dataset show that GraphVQA outperforms the state-of-the-art accuracy by a large margin (88.43% vs. 94.78%).
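To make the abstract's two central ideas concrete (objects as nodes with relations as edges, and a question executed as several rounds of message passing), here is a minimal Python sketch. It is not the authors' architecture. The names `SceneGraph`, `message_passing_step`, and `run_instructions` are hypothetical, and the per-relation scalar weights are an assumed stand-in for the learned, question-derived instruction vectors a real model would use.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class SceneGraph:
    # Objects are nodes; each node carries a small feature vector.
    nodes: dict[str, list[float]]
    # Pairwise relations are directed edges: (subject, relation, object).
    edges: list[tuple[str, str, str]] = field(default_factory=list)


def message_passing_step(graph: SceneGraph, relation_weights: dict[str, float]) -> SceneGraph:
    """One round: each object node adds weighted features from its subject nodes.

    relation_weights is a hypothetical stand-in for one language-derived
    instruction: it says how strongly each relation type is followed.
    """
    updated = {name: list(feat) for name, feat in graph.nodes.items()}
    for subj, rel, obj in graph.edges:
        weight = relation_weights.get(rel, 0.0)  # unmentioned relations send nothing
        for i, value in enumerate(graph.nodes[subj]):
            updated[obj][i] += weight * value
    return SceneGraph(updated, graph.edges)


def run_instructions(graph: SceneGraph, instructions: list[dict[str, float]]) -> SceneGraph:
    """Execute a question as a sequence of message-passing rounds, one per instruction."""
    for relation_weights in instructions:
        graph = message_passing_step(graph, relation_weights)
    return graph


# Toy scene: a cup on a table, and the table in a kitchen (one-hot node features).
scene = SceneGraph(
    nodes={"cup": [1.0, 0.0, 0.0], "table": [0.0, 1.0, 0.0], "kitchen": [0.0, 0.0, 1.0]},
    edges=[("cup", "on", "table"), ("table", "in", "kitchen")],
)
# Two assumed instructions: follow "on" first, then follow "in".
result = run_instructions(scene, [{"on": 1.0}, {"in": 1.0}])
print(result.nodes)
```

In this toy run, information from `cup` reaches `kitchen` only after two rounds, which illustrates why multi-hop questions need multiple iterations of message passing.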


