Iterative Scene Graph Generation with Generative Transformers

11/30/2022
by Sanjoy Kundu, et al.

Scene graphs provide a rich, structured representation of a scene by encoding the entities (objects) and their spatial relationships in a graphical format. This representation has proven useful in several tasks, such as question answering, captioning, and even object detection. Current methods follow a generation-by-classification approach, in which the scene graph is generated by labeling all possible edges between objects in a scene, which adds computational overhead. This work introduces a generative transformer-based approach to generating scene graphs beyond link prediction. Using two transformer-based components, we first sample a possible scene graph structure from detected objects and their visual features. We then perform predicate classification on the sampled edges to generate the final scene graph. This approach allows us to efficiently generate scene graphs from images with minimal inference overhead. Extensive experiments on the Visual Genome dataset demonstrate the efficiency of the proposed approach. Without bells and whistles, we obtain, on average, 20.7% mean recall (mR@100) across different settings for scene graph generation (SGG), outperforming state-of-the-art SGG approaches while offering competitive performance to unbiased SGG approaches.
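
The contrast between labeling every possible edge and labeling only a sampled graph structure can be illustrated with a toy sketch. This is not the authors' implementation: the function names, the `edge_score` and `predicate_fn` callables, and the threshold are all hypothetical stand-ins for the two transformer-based components described above, and random scores replace learned ones.

```python
import itertools
import random


def all_candidate_edges(num_objects):
    """Every ordered (subject, object) pair, i.e. the edge set that a
    generation-by-classification pipeline would have to label."""
    return list(itertools.permutations(range(num_objects), 2))


def sample_graph_structure(num_objects, edge_score, threshold=0.5):
    """Stage 1 stand-in (hypothetical): keep only the pairs whose edge
    score clears the threshold. A learned model would supply edge_score."""
    return [
        (s, o)
        for s, o in all_candidate_edges(num_objects)
        if edge_score(s, o) >= threshold
    ]


def classify_predicates(edges, predicate_fn):
    """Stage 2 stand-in (hypothetical): label only the sampled edges,
    producing (subject, predicate, object) triples."""
    return [(s, predicate_fn(s, o), o) for s, o in edges]


if __name__ == "__main__":
    num_objects = 5
    rng = random.Random(0)
    # Random scores stand in for a learned edge-existence model.
    scores = {pair: rng.random() for pair in all_candidate_edges(num_objects)}

    edges = sample_graph_structure(
        num_objects, lambda s, o: scores[(s, o)], threshold=0.8
    )
    triples = classify_predicates(edges, lambda s, o: "near")

    print("edges labeled exhaustively:", len(all_candidate_edges(num_objects)))
    print("edges labeled after sampling:", len(triples))
```

For N detected objects the exhaustive edge set has N(N-1) ordered pairs, so the number of predicate classifications grows quadratically, whereas the second stage in this sketch only touches the edges that survive sampling.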


research
07/12/2021

Scenes and Surroundings: Scene Graph Generation using Relation Transformer

Identifying objects in an image and their mutual relationships as a scen...
research
07/16/2018

Visual Graphs from Motion (VGfM): Scene understanding with object geometry reasoning

Recent approaches on visual scene understanding attempt to build a scene...
research
03/03/2021

Energy-Based Learning for Scene Graph Generation

Traditional scene graph generation methods are trained using cross-entro...
research
02/15/2018

Mapping Images to Scene Graphs with Permutation-Invariant Structured Prediction

Structured prediction is concerned with predicting multiple inter-depend...
research
12/18/2021

Exploiting Long-Term Dependencies for Generating Dynamic Scene Graphs

Structured video representation in the form of dynamic scene graphs is a...
research
04/02/2023

Learning Similarity between Scene Graphs and Images with Transformers

Scene graph generation is conventionally evaluated by (mean) Recall@K, w...
research
07/27/2022

Iterative Scene Graph Generation

The task of scene graph generation entails identifying object entities a...
