Scene Graph Prediction with Limited Labels

04/25/2019
∙
by   Vincent S. Chen, et al.
∙
6
∙

Visual knowledge bases such as Visual Genome power numerous applications in computer vision, including visual question answering and captioning, but suffer from sparse, incomplete relationships. All scene graph models to date are limited to training on a small set of visual relationships that have thousands of training labels each. Hiring human annotators is expensive, and using textual knowledge base completion methods are incompatible with visual data. In this paper, we introduce a semi-supervised method that assigns probabilistic relationship labels to a large number of unlabeled images using few labeled examples. We analyze visual relationships to suggest two types of image-agnostic features that are used to generate noisy heuristics, whose outputs are aggregated using a factor graph-based generative model. With as few as 10 labeled relationship examples, the generative model creates enough training data to train any existing state-of-the-art scene graph model. We demonstrate that our method for generating training data outperforms all baseline approaches by 5.16 recall@100. Since we only use a few labels, we define a complexity metric for relationships that serves as an indicator (R^2 = 0.778) for conditions under which our method succeeds over transfer learning, the de-facto approach for training with limited labels.

READ FULL TEXT

page 4

page 5

page 6

research
∙ 06/12/2019

Visual Relationships as Functions: Enabling Few-Shot Scene Graph Prediction

Scene graph prediction --- classifying the set of objects and predicates...
research
∙ 12/21/2022

Improving Narrative Relationship Embeddings by Training with Additional Inverse-Relationship Constraints

We consider the problem of embedding character-entity relationships from...
research
∙ 03/29/2021

Visual Distant Supervision for Scene Graph Generation

Scene graph generation aims to identify objects and their relations in i...
research
∙ 10/27/2019

Leveraging Auxiliary Text for Deep Recognition of Unseen Visual Relationships

One of the most difficult tasks in scene understanding is recognizing in...
research
∙ 02/01/2019

Rethinking Visual Relationships for High-level Image Understanding

Relationships, as the bond of isolated entities in images, reflect the i...
research
∙ 03/09/2023

Knowledge-augmented Few-shot Visual Relation Detection

Visual Relation Detection (VRD) aims to detect relationships between obj...
research
∙ 07/11/2020

Generative Graph Perturbations for Scene Graph Prediction

Inferring objects and their relationships from an image is useful in man...

Please sign up or login with your details

Forgot password? Click here to reset