Tensor Composition Net for Visual Relationship Prediction

12/10/2020
by   Yuting Qiang, et al.
0

We present a novel Tensor Composition Network (TCN) to predict visual relationships in images. Visual Relationships in subject-predicate-object form provide a more powerful query modality than simple image tags. However Visual Relationship Prediction (VRP) also provides a more challenging test of image understanding than conventional image tagging, and is difficult to learn due to a large label-space and incomplete annotation. The key idea of our TCN is to exploit the low rank property of the visual relationship tensor, so as to leverage correlations within and across objects and relationships, and make a structured prediction of all objects and their relations in an image. To show the effectiveness of our method, we first empirically compare our model with multi-label classification alternatives on VRP, and show that our model outperforms state-of-the-art MLIC methods. We then show that, thanks to our tensor (de)composition layer, our model can predict visual relationships which have not been seen in training dataset. We finally show our TCN's image-level visual relationship prediction provides a simple and efficient mechanism for relation-based image retrieval.

READ FULL TEXT

page 5

page 7

research
07/31/2016

Visual Relationship Detection with Language Priors

Visual relationships capture a wide variety of interactions between pair...
research
11/22/2019

Visual Relationship Detection with Low Rank Non-Negative Tensor Decomposition

We address the problem of Visual Relationship Detection (VRD) which aims...
research
10/27/2020

Structured Visual Search via Composition-aware Learning

This paper studies visual search using structured queries. The structure...
research
11/27/2017

Separating Self-Expression and Visual Content in Hashtag Supervision

The variety, abundance, and structured nature of hashtags make them an i...
research
05/13/2020

Structured Query-Based Image Retrieval Using Scene Graphs

A structured query can capture the complexity of object interactions (e....
research
11/20/2021

Representing Prior Knowledge Using Randomly, Weighted Feature Networks for Visual Relationship Detection

The single-hidden-layer Randomly Weighted Feature Network (RWFN) introdu...
research
04/17/2020

CPARR: Category-based Proposal Analysis for Referring Relationships

The task of referring relationships is to localize subject and object en...

Please sign up or login with your details

Forgot password? Click here to reset