Supplementing Missing Visions via Dialog for Scene Graph Generations

04/23/2022
by Ye Zhu, et al.

Most current AI systems rely on the premise that the input visual data are sufficient to achieve competitive performance in various computer vision tasks. However, the classic task setup rarely considers the challenging yet common practical situations in which the complete visual data are inaccessible for various reasons (e.g., a restricted view range or occlusions). To this end, we investigate a computer vision task setting with incomplete visual input data. Specifically, we study the Scene Graph Generation (SGG) task with various levels of visual data missingness in the input. While insufficient visual input intuitively leads to a performance drop, we propose to supplement the missing visions via natural language dialog interactions to better accomplish the task objective. We design a model-agnostic Supplementary Interactive Dialog (SI-Dial) framework that can be jointly learned with most existing models, endowing current AI systems with the ability to conduct question-answer interactions in natural language. Through extensive experiments and analysis, we demonstrate the feasibility of this task setting with missing visual input and the effectiveness of our proposed dialog module as a supplementary information source, achieving promising performance improvements over multiple baselines.

