Visual Query Answering by Entity-Attribute Graph Matching and Reasoning

03/16/2019
by   Peixi Xiong, et al.
0

Visual Query Answering (VQA) is of great significance in offering people convenience: one can raise a question for details of objects, or high-level understanding about the scene, over an image. This paper proposes a novel method to address the VQA problem. In contrast to prior works, our method that targets single scene VQA, replies on graph-based techniques and involves reasoning. In a nutshell, our approach is centered on three graphs. The first graph, referred to as inference graph GI , is constructed via learning over labeled data. The other two graphs, referred to as query graph Q and entity-attribute graph GEA, are generated from natural language query Qnl and image Img, that are issued from users, respectively. As GEA often does not take sufficient information to answer Q, we develop techniques to infer missing information of GEA with GI . Based on GEA and Q, we provide techniques to find matches of Q in GEA, as the answer of Qnl in Img. Unlike commonly used VQA methods that are based on end-to-end neural networks, our graph-based method shows well-designed reasoning capability, and thus is highly interpretable. We also create a dataset on soccer match (Soccer-VQA) with rich annotations. The experimental results show that our approach outperforms the state-of-the-art method and has high potential for future investigation.

READ FULL TEXT

page 1

page 5

research
05/23/2022

VQA-GNN: Reasoning with Multimodal Semantic Graph for Visual Question Answering

Visual understanding requires seamless integration between recognition a...
research
02/17/2020

CQ-VQA: Visual Question Answering on Categorized Questions

This paper proposes CQ-VQA, a novel 2-level hierarchical but end-to-end ...
research
01/14/2021

Understanding the Role of Scene Graphs in Visual Question Answering

Visual Question Answering (VQA) is of tremendous interest to the researc...
research
09/02/2021

Lightweight Visual Question Answering using Scene Graphs

Visual question answering (VQA) is a challenging problem in machine perc...
research
08/24/2022

Symbolic Replay: Scene Graph as Prompt for Continual Learning on VQA Task

VQA is an ambitious task aiming to answer any image-related question. Ho...
research
07/09/2019

Learning by Abstraction: The Neural State Machine

We introduce the Neural State Machine, seeking to bridge the gap between...
research
07/14/2020

A Graph-based Interactive Reasoning for Human-Object Interaction Detection

Human-Object Interaction (HOI) detection devotes to learn how humans int...

Please sign up or login with your details

Forgot password? Click here to reset