Towards Unseen Triples: Effective Text-Image-joint Learning for Scene Graph Generation

06/23/2023
by Qianji Di, et al.

Scene Graph Generation (SGG) aims to represent the objects in an image and their connections in a structured, comprehensive way, which can significantly benefit scene understanding and related downstream tasks. Existing SGG models often struggle with the long-tailed problem caused by biased datasets. Moreover, even when these models fit specific datasets better, they may still fail on unseen triples, that is, triples that do not appear in the training set. Most methods take a whole triple as input and learn its overall features through statistical machine learning. Such models have difficulty predicting unseen triples because the test set recombines objects and predicates from the training set into novel triples. In this work, we propose a Text-Image-joint Scene Graph Generation (TISGG) model to resolve unseen triples and improve the generalization capability of SGG models. We propose a Joint Feature Learning (JFL) module and a Factual Knowledge based Refinement (FKR) module that learn object and predicate categories separately at the feature level and align them with the corresponding visual features, so that the model is no longer limited to triple matching. In addition, since we observe that the long-tailed problem also affects generalization ability, we design a novel balanced learning strategy, comprising a Character Guided Sampling (CGS) module and an Informative Re-weighting (IR) module, which provides a tailor-made learning method for each predicate according to its characteristics. Extensive experiments show that our model achieves state-of-the-art performance. In more detail, TISGG boosts the performances by 11.7 sub-task on the Visual Genome dataset.
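To make the notion of an unseen triple concrete, the following minimal Python sketch lists the test triples that never occur in training even though each of their components does. It is an illustration only and not part of TISGG: the function name `unseen_triples`, the `(subject, predicate, object)` tuple layout, and the toy data are all assumptions introduced here.

```python
from typing import Iterable, List, Set, Tuple

# Assumed layout for illustration: (subject, predicate, object).
Triple = Tuple[str, str, str]


def unseen_triples(train: Iterable[Triple], test: Iterable[Triple]) -> List[Triple]:
    """Return test triples absent from training whose parts were all seen in training."""
    train_set: Set[Triple] = set(train)
    # Object categories seen in training, in either the subject or the object slot.
    seen_objects = {s for s, _, _ in train_set} | {o for _, _, o in train_set}
    seen_predicates = {p for _, p, _ in train_set}

    result: List[Triple] = []
    emitted: Set[Triple] = set()
    for triple in test:
        subj, pred, obj = triple
        if triple in train_set or triple in emitted:
            continue  # already seen in training, or already reported
        if subj in seen_objects and obj in seen_objects and pred in seen_predicates:
            result.append(triple)
            emitted.add(triple)
    return result


# Toy data, purely hypothetical.
train = [("person", "riding", "horse"), ("dog", "on", "bench")]
test = [
    ("person", "riding", "horse"),  # seen triple
    ("dog", "riding", "horse"),     # novel combination of seen parts
    ("cat", "on", "bench"),         # contains an object category never seen in training
]
novel = unseen_triples(train, test)
print(novel)
```

In this toy split only `("dog", "riding", "horse")` qualifies: every component appears in training, but never in that combination.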


