Semantic Compositional Learning for Low-shot Scene Graph Generation

08/19/2021
by Tao He, et al.

Scene graphs provide valuable information for many downstream tasks. Many scene graph generation (SGG) models train only on the limited set of annotated relation triples, which leads to poor performance in low-shot (few-shot and zero-shot) scenarios, especially on rare predicates. To address this problem, we propose a novel semantic compositional learning strategy that makes it possible to construct additional, realistic relation triples with objects from different images. Specifically, our strategy decomposes a relation triple by identifying and removing the unessential component, then composes a new relation triple by fusing the remainder with a semantically or visually similar object from a visual components dictionary, whilst ensuring that the newly composed triple remains realistic. Notably, our strategy is generic and can be combined with existing SGG models to significantly improve their performance. We performed a comprehensive evaluation on the benchmark dataset Visual Genome. For three recent SGG models, adding our strategy improves their performance by close to 50%, and all of them substantially exceed the current state-of-the-art.
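The decompose-and-compose idea can be illustrated with a toy sketch. This is not the authors' implementation: the abstract does not say how the unessential component is identified or how realism is enforced, so here the caller simply names the slot to replace, and a cosine-similarity threshold stands in for the realism check. The function `compose_triple`, the `min_similarity` parameter, and the hand-written three-dimensional feature vectors are all hypothetical.

```python
import math
from typing import Dict, List, Optional, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)


def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def compose_triple(
    triple: Triple,
    replace: str,
    dictionary: Dict[str, List[float]],
    min_similarity: float = 0.8,  # assumed stand-in for a realism check
) -> Optional[Triple]:
    """Swap one slot of `triple` for its most similar dictionary entry.

    `replace` is "subject" or "object" and is chosen by the caller here;
    how the unessential component is really identified is not modelled.
    Returns None when no other entry is similar enough.
    """
    if replace not in ("subject", "object"):
        raise ValueError("replace must be 'subject' or 'object'")
    subj, pred, obj = triple
    removed = subj if replace == "subject" else obj
    query = dictionary[removed]

    best_label, best_sim = None, min_similarity
    for label, features in dictionary.items():
        if label == removed:
            continue  # never substitute a component with itself
        sim = cosine(query, features)
        if sim >= best_sim:
            best_label, best_sim = label, sim

    if best_label is None:
        return None
    if replace == "subject":
        return (best_label, pred, obj)
    return (subj, pred, best_label)


# Toy "visual components dictionary" with made-up feature vectors.
components = {
    "horse": [0.90, 0.10, 0.00],
    "zebra": [0.85, 0.15, 0.05],
    "car": [0.00, 0.20, 0.95],
    "man": [0.10, 0.90, 0.10],
}

new_triple = compose_triple(("man", "riding", "horse"), "object", components)
no_match = compose_triple(("car", "parked on", "street"), "subject", components)
```

In this toy dictionary "zebra" is the only entry close to "horse", so the first call yields a new triple with "zebra" as the object, while the second call returns `None` because nothing resembles "car" closely enough. In practice the feature vectors would come from a visual or language model rather than being written by hand.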


