Large-Scale Visual Relationship Understanding

04/27/2018
by Ji Zhang, et al.

Large-scale visual understanding is challenging, as it requires a model to handle the widely spread and imbalanced distribution of <subject, relation, object> triples. In real-world scenarios with large numbers of objects and relations, some appear very frequently while others are barely seen. We develop a new relationship detection model that embeds objects and relations into two vector spaces where both discriminative capability and semantic affinity are preserved. We learn both a visual and a semantic module that map features from the two modalities into a shared space, where matched pairs of features must discriminate against unmatched ones while also staying close to semantically similar ones. As a result, our model achieves superior performance even when the visual entity categories scale up to more than 80,000, with an extremely skewed class distribution. We demonstrate the efficacy of our model on a large and imbalanced benchmark based on Visual Genome that comprises 53,000+ objects and 29,000+ relations, a scale at which no previous work has been evaluated. We show the superiority of our model over carefully designed baselines on Visual Genome, as well as competitive performance on the much smaller VRD dataset.
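The shared-space idea described above can be illustrated with a small sketch. The abstract does not state the loss function or the architecture, so everything here is an assumption made for illustration: the function names `l2_normalize` and `matching_loss`, the `temperature` parameter, the use of cosine similarity, and the softmax cross-entropy objective are all hypothetical choices, not the authors' method. The sketch only shows how matched visual/semantic pairs can be scored against unmatched ones in one space.

```python
import numpy as np


def l2_normalize(x, axis=-1, eps=1e-12):
    """Scale each vector to unit length so dot products are cosine similarities."""
    return x / (np.linalg.norm(x, axis=axis, keepdims=True) + eps)


def matching_loss(visual, semantic, temperature=0.1):
    """Softmax cross-entropy over cosine similarities (an assumed objective).

    Row i of `visual` is assumed to be the matched partner of row i of
    `semantic`; every other row of `semantic` acts as an unmatched candidate.
    """
    v = l2_normalize(visual)
    s = l2_normalize(semantic)
    logits = v @ s.T / temperature                       # (n, n) similarity matrix
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_probs)))           # matched pairs lie on the diagonal


# Toy data: 4 classes embedded in an 8-dimensional shared space.
rng = np.random.default_rng(0)
semantic = rng.normal(size=(4, 8))                    # stand-in for semantic (word) embeddings
aligned = semantic + 0.01 * rng.normal(size=(4, 8))   # visual embeddings near their partners
shuffled = aligned[::-1]                              # every visual row paired with the wrong class

loss_aligned = matching_loss(aligned, semantic)
loss_shuffled = matching_loss(shuffled, semantic)
print(loss_aligned < loss_shuffled)
```

In this toy setup the loss is lower when each visual embedding sits next to its own semantic embedding than when the pairing is scrambled, which is the discriminative half of the idea. The semantic-affinity half would, under the same assumptions, come from the semantic embeddings themselves: if similar classes have nearby semantic vectors, a visual feature pulled toward one of them also ends up near the others.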


