Generative Graph Perturbations for Scene Graph Prediction

07/11/2020
by Boris Knyazev, et al.

Inferring objects and their relationships from an image is useful in many applications at the intersection of vision and language. The task is challenging because the data distribution is long-tailed, so zero-shot compositions of objects and relationships inevitably appear at test time. Current models often fail to understand a scene in such cases, because during training they observe only a tiny fraction of the distribution, corresponding to the most frequent compositions. This motivates us to study whether increasing the diversity of the training distribution, by generating replacements for parts of real scene graphs, can lead to better generalization. We employ generative adversarial networks (GANs) conditioned on scene graphs to generate augmented visual features. To increase the diversity of these features, we propose several strategies for perturbing the conditioning. One of them is to use a language model, such as BERT, to synthesize plausible yet still unlikely scene graphs. Evaluating our model on Visual Genome yields both positive and negative results, which prompt several observations that can potentially lead to further improvements.
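
To make the idea of perturbing the conditioning concrete, here is a minimal Python sketch of one simple strategy: replacing a fraction of the object nodes in a real scene graph with other categories while leaving the relationships untouched. It is an illustration under stated assumptions, not the authors' implementation. The names `SceneGraph`, `perturb_nodes`, and `triplets`, the `fraction` parameter, and the toy vocabulary are all hypothetical, and uniform random replacement stands in for the language-model-based (e.g. BERT) selection mentioned in the abstract.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class SceneGraph:
    """Hypothetical container: node labels plus (subject, predicate, object) edges."""
    nodes: tuple  # object category per node, e.g. ("person", "horse")
    edges: tuple  # (subject_index, predicate, object_index) triples


def perturb_nodes(graph, vocabulary, fraction=0.5, seed=None):
    """Replace a fraction of node labels with different categories.

    Assumption: replacements are drawn uniformly at random from `vocabulary`.
    A language model could instead rank candidates by plausibility.
    """
    rng = random.Random(seed)
    n = len(graph.nodes)
    k = max(1, round(fraction * n)) if n else 0
    chosen = rng.sample(range(n), k)  # indices of the nodes to replace
    nodes = list(graph.nodes)
    for i in chosen:
        candidates = [c for c in vocabulary if c != nodes[i]]
        if candidates:
            nodes[i] = rng.choice(candidates)
    # Edges are kept as-is, so only the object categories change.
    return SceneGraph(tuple(nodes), graph.edges)


def triplets(graph):
    """Expand index-based edges into (subject, predicate, object) label triples."""
    return [(graph.nodes[s], p, graph.nodes[o]) for s, p, o in graph.edges]


# Toy example with a made-up vocabulary.
vocab = ["person", "horse", "hat", "dog", "bench", "surfboard"]
real = SceneGraph(
    nodes=("person", "horse", "hat"),
    edges=((0, "riding", 1), (0, "wearing", 2)),
)
perturbed = perturb_nodes(real, vocab, fraction=0.5, seed=0)
print(triplets(real))
print(triplets(perturbed))
```

In a pipeline like the one the abstract describes, a perturbed graph of this kind would serve as the conditioning input to the GAN that generates visual features. Keeping the edges fixed is a design choice of this sketch, so that the resulting compositions differ from the original only in their object categories.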

research
04/01/2021

Exploiting Relationship for Complex-scene Image Generation

The significant progress on Generative Adversarial Networks (GANs) has f...
research
05/17/2020

Graph Density-Aware Losses for Novel Compositions in Scene Graph Generation

Scene graph generation (SGG) aims to predict graph-structured descriptio...
research
07/08/2022

GEMS: Scene Expansion using Generative Models of Graphs

Applications based on image retrieval require editing and associating in...
research
05/10/2023

Incorporating Structured Representations into Pretrained Vision Language Models Using Scene Graphs

Vision and Language (VL) models have demonstrated remarkable zero-shot p...
research
03/23/2023

Visually-Prompted Language Model for Fine-Grained Scene Graph Generation in an Open World

Scene Graph Generation (SGG) aims to extract <subject, predicate, object...
research
04/25/2019

Scene Graph Prediction with Limited Labels

Visual knowledge bases such as Visual Genome power numerous applications...
