Segmentation-grounded Scene Graph Generation

04/29/2021
by Siddhesh Khandelwal, et al.

Scene graph generation has emerged as an important problem in computer vision. While scene graphs provide a grounded representation of objects, their locations, and their relations in an image, they do so only at the granularity of proposal bounding boxes. In this work, we propose what is, to our knowledge, the first framework for pixel-level segmentation-grounded scene graph generation. Our framework is agnostic to the underlying scene graph generation method and addresses the lack of segmentation annotations in target scene graph datasets (e.g., Visual Genome) through transfer and multi-task learning from, and with, an auxiliary dataset (e.g., MS COCO). Specifically, each target object being detected is endowed with a segmentation mask, which is expressed as a lingual-similarity weighted linear combination over categories that have annotations present in an auxiliary dataset. These inferred masks, along with a novel Gaussian attention mechanism that grounds the relations at the pixel level within the image, allow for improved relation prediction. The entire framework is end-to-end trainable and is learned in a multi-task manner with both target and auxiliary datasets.
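The mask construction described in the abstract, a lingual-similarity weighted linear combination over auxiliary categories, can be illustrated with a small sketch. This is not the authors' implementation. The two-dimensional word vectors, the toy 2x2 masks, the function names, and the use of a softmax over cosine similarities to normalize the weights are all assumptions made for illustration; the abstract does not specify how the similarity is computed or normalized.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def similarity_weights(target_vec, aux_vecs):
    """Turn cosine similarities into weights that sum to 1.

    The softmax normalization is an assumption of this sketch.
    """
    sims = {name: cosine(target_vec, vec) for name, vec in aux_vecs.items()}
    peak = max(sims.values())  # subtract the max for numerical stability
    exps = {name: math.exp(s - peak) for name, s in sims.items()}
    total = sum(exps.values())
    return {name: e / total for name, e in exps.items()}


def combine_masks(weights, aux_masks):
    """Weighted per-pixel sum of the auxiliary category masks."""
    names = list(aux_masks)
    rows = len(aux_masks[names[0]])
    cols = len(aux_masks[names[0]][0])
    return [
        [sum(weights[n] * aux_masks[n][r][c] for n in names) for c in range(cols)]
        for r in range(rows)
    ]


# Hypothetical word embeddings for two auxiliary categories that have masks.
aux_vecs = {"horse": [1.0, 0.0], "car": [0.0, 1.0]}
# Hypothetical embedding for a target category without mask annotations.
zebra_vec = [0.9, 0.1]
# Hypothetical per-category mask probabilities predicted inside one box.
aux_masks = {
    "horse": [[0.9, 0.8], [0.1, 0.0]],
    "car": [[0.1, 0.2], [0.7, 0.9]],
}

weights = similarity_weights(zebra_vec, aux_vecs)
zebra_mask = combine_masks(weights, aux_masks)
```

In this toy setup the "zebra" vector lies closer to "horse" than to "car", so the inferred mask leans toward the horse mask. In a real system the category embeddings would come from a pretrained language model and the masks from a segmentation head trained on the auxiliary dataset.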


