Image Semantic Relation Generation

10/19/2022
by Mingzhe Du, et al.

Scene graphs provide structured semantic understanding beyond images. For downstream tasks such as image retrieval, visual question answering, visual relationship detection, and even autonomous vehicle technology, scene graphs can not only distil complex image information but also correct the bias of visual models using semantic-level relations, which gives them broad application prospects. However, the heavy labour cost of constructing graph annotations may hinder the application of panoptic scene graph generation (PSG) in practical scenarios. Inspired by the observation that people usually identify the subject and object first and then determine the relationship between them, we propose to decouple the scene graph generation task into two sub-tasks: 1) an image segmentation task to pick out the qualified objects, and 2) a restricted auto-regressive text generation task to generate the relation between the given objects. In this work, we therefore introduce image semantic relation generation (ISRG), a simple but effective image-to-text model, which achieves 31 points on the OpenPSG dataset and outperforms strong baselines by 16 points (ResNet-50) and 5 points (CLIP), respectively.
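
To make the second sub-task concrete, the following is a minimal sketch of one way a "restricted" auto-regressive decoder could work: at every step, only tokens that keep the output a prefix of some allowed relation are considered. This is an illustration under stated assumptions, not the authors' implementation; the abstract does not say how the restriction is enforced. The relation list `RELATIONS`, the `<eos>` token, and the functions `allowed_next_tokens`, `restricted_greedy_decode`, and `toy_scores` are all hypothetical, and the scorer stands in for a real image-conditioned model.

```python
from typing import Callable, Dict, List, Sequence, Set, Tuple

# Hypothetical relation vocabulary (assumption): each relation is a token tuple.
# The actual OpenPSG predicate list is not reproduced here.
RELATIONS: List[Tuple[str, ...]] = [
    ("sitting", "on"),
    ("standing", "on"),
    ("holding",),
    ("looking", "at"),
]
EOS = "<eos>"  # hypothetical end-of-sequence token


def allowed_next_tokens(
    prefix: Sequence[str], relations: Sequence[Tuple[str, ...]]
) -> Set[str]:
    """Tokens that keep `prefix` a prefix of at least one allowed relation."""
    allowed: Set[str] = set()
    n = len(prefix)
    for rel in relations:
        if tuple(rel[:n]) == tuple(prefix):
            # Either continue the relation or, if it is complete, allow EOS.
            allowed.add(rel[n] if n < len(rel) else EOS)
    return allowed


def restricted_greedy_decode(
    score_fn: Callable[[Sequence[str]], Dict[str, float]],
    relations: Sequence[Tuple[str, ...]],
    max_len: int = 8,
) -> List[str]:
    """Greedy decoding where disallowed tokens are never selected."""
    prefix: List[str] = []
    for _ in range(max_len):
        allowed = allowed_next_tokens(prefix, relations)
        if not allowed:
            break
        scores = score_fn(prefix)
        # sorted() makes tie-breaking deterministic.
        best = max(sorted(allowed), key=lambda t: scores.get(t, float("-inf")))
        if best == EOS:
            break
        prefix.append(best)
    return prefix


def toy_scores(prefix: Sequence[str]) -> Dict[str, float]:
    """Stand-in for a model; it prefers 'riding', which is not an allowed relation."""
    return {"riding": 5.0, "standing": 3.0, "sitting": 2.0, "on": 1.0,
            "holding": 0.5, "looking": 0.1, "at": 0.1, EOS: 0.0}


relation = restricted_greedy_decode(toy_scores, RELATIONS)
print(" ".join(relation))
```

In this toy run the unrestricted best token ("riding") is never emitted because it does not start any allowed relation; the decoder falls back to the best-scoring allowed continuation at each step. In a full pipeline, the subject and object would come from the segmentation sub-task and the scorer would be conditioned on the image and that pair.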

research
12/01/2019

Interpreting Context of Images using Scene Graphs

Understanding a visual scene incorporates objects, relationships, and co...
research
05/30/2023

Fine-Grained is Too Coarse: A Novel Data-Centric Approach for Efficient Scene Graph Generation

Learning to compose visual relationships from raw images in the form of ...
research
08/19/2021

Semantic Compositional Learning for Low-shot Scene Graph Generation

Scene graphs provide valuable information to many downstream tasks. Many...
research
02/22/2022

Relation Regularized Scene Graph Generation

Scene graph generation (SGG) is built on top of detected objects to pred...
research
09/19/2019

Triplet-Aware Scene Graph Embeddings

Scene graphs have become an important form of structured knowledge for t...
research
08/10/2023

Informative Scene Graph Generation via Debiasing

Scene graph generation aims to detect visual relationship triplets, (sub...
research
05/23/2023

Coarse-to-Fine Contrastive Learning in Image-Text-Graph Space for Improved Vision-Language Compositionality

Contrastively trained vision-language models have achieved remarkable pr...
