Shuffle-Then-Assemble: Learning Object-Agnostic Visual Relationship Features

08/01/2018
by Xu Yang, et al.

Because it is prohibitively expensive to annotate visual relationships completely, i.e., the (obj1, rel, obj2) triplets, relationship models are inevitably biased toward object classes with limited pairwise patterns, which leads to poor generalization to rare or unseen object combinations. We are therefore interested in learning object-agnostic visual features for more generalizable relationship models. By "agnostic", we mean that the feature is less likely to be biased toward the classes of the paired objects. To alleviate the bias, we propose a novel Shuffle-Then-Assemble pre-training strategy. First, we discard all the triplet relationship annotations in an image, leaving two unpaired object domains without obj1-obj2 alignment. Then, our feature learning task is to recover possible obj1-obj2 pairs. In particular, we design a cycle of residual transformations between the two domains to capture visual patterns that are shared rather than object-specific. Extensive experiments on two visual relationship benchmarks show that, with our pre-trained features, naive relationship models are consistently improved and even outperform other state-of-the-art relationship models. Code is available at: <https://github.com/yangxuntu/vrd>.
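
The two steps described above can be illustrated with a toy sketch. This is not the authors' implementation: the abstract does not specify the network architecture or the training objective, so everything here is an assumption made for illustration. The function names (`shuffle_domains`, `residual_transform`, `cycle_loss`), the single tanh layer inside the residual mapping, and the squared-error round-trip penalty are all hypothetical. The sketch only shows what "discarding the alignment" and "a cycle of residual transformations between the two domains" could look like on plain feature vectors.

```python
import math
import random


def shuffle_domains(pairs, rng):
    """Drop the obj1-obj2 alignment: return both feature sets in independent random orders."""
    obj1 = [p[0] for p in pairs]
    obj2 = [p[1] for p in pairs]
    rng.shuffle(obj1)
    rng.shuffle(obj2)
    return obj1, obj2


def residual_transform(x, W):
    """Residual mapping x + tanh(W x); the tanh layer is an assumed form."""
    return [xi + math.tanh(sum(w * xj for w, xj in zip(row, x)))
            for xi, row in zip(x, W)]


def cycle_loss(xs, W_fwd, W_bwd):
    """Mean squared error between each x and its round trip through both mappings."""
    total = 0.0
    for x in xs:
        back = residual_transform(residual_transform(x, W_fwd), W_bwd)
        total += sum((b - a) ** 2 for a, b in zip(x, back)) / len(x)
    return total / len(xs)


rng = random.Random(0)
dim = 4

# Toy "annotated" pairs: (obj1 feature, obj2 feature) for five relationships.
pairs = [([rng.gauss(0, 1) for _ in range(dim)],
          [rng.gauss(0, 1) for _ in range(dim)]) for _ in range(5)]

# Shuffle: keep the two object domains, discard which obj1 went with which obj2.
obj1_set, obj2_set = shuffle_domains(pairs, rng)

# Assemble: one residual mapping per direction (randomly initialised, untrained).
W_12 = [[0.1 * rng.gauss(0, 1) for _ in range(dim)] for _ in range(dim)]
W_21 = [[0.1 * rng.gauss(0, 1) for _ in range(dim)] for _ in range(dim)]

loss_1 = cycle_loss(obj1_set, W_12, W_21)  # obj1 -> obj2 -> obj1
loss_2 = cycle_loss(obj2_set, W_21, W_12)  # obj2 -> obj1 -> obj2
print(loss_1, loss_2)
```

In a real pre-training setup the two mappings would be trained rather than left at random initialisation, and the round-trip penalty would be only one part of the objective. The residual form means each mapping starts close to the identity, so whatever it learns is a correction on top of the input feature.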


