CogTree: Cognition Tree Loss for Unbiased Scene Graph Generation

by Jing Yu, et al.

Scene graphs are semantic abstractions of images that support visual understanding and reasoning. However, the performance of Scene Graph Generation (SGG) is unsatisfactory when faced with biased data in real-world scenarios. Conventional debiasing research mainly approaches the problem from the perspective of data representation, e.g. balancing the data distribution or learning unbiased models and representations, and ignores how humans accomplish this task. Inspired by the role of the prefrontal cortex (PFC) in hierarchical reasoning, we analyze this problem from a novel cognition perspective: learning a hierarchical cognitive structure of the highly biased relationships and navigating that hierarchy to locate the classes, so that the tail classes receive more attention in a coarse-to-fine manner. To this end, we propose a novel Cognition Tree (CogTree) loss for unbiased SGG. We first build a cognitive structure, the CogTree, to organize the relationships based on the predictions of a biased SGG model. The CogTree first distinguishes clearly different relationships and then focuses on a small set of easily confused ones. We then propose a hierarchical loss designed for this cognitive structure, which supports coarse-to-fine distinction of the correct relationships while progressively eliminating the interference of irrelevant ones. The loss is model-independent and can be applied to various SGG models without extra supervision. The proposed CogTree loss consistently boosts the performance of several state-of-the-art models on the Visual Genome benchmark.
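The coarse-to-fine idea can be illustrated with a small sketch. This is not the authors' implementation: the two-level tree, the group names, the choice to score a coarse group by the mean logit of its leaves, and the equal weighting of the two levels are all assumptions made for illustration. The sketch computes one cross-entropy term among coarse groups and one among the fine classes inside the correct group, so a tail class only competes with its easily confused siblings at the fine level.

```python
import math

# Hypothetical two-level tree: coarse group -> fine relationship classes.
# In the abstract the tree is built from a biased model's predictions;
# here it is hard-coded purely for illustration.
TREE = {
    "geometric": ["on", "under", "near"],
    "possessive": ["has", "wearing"],
}


def log_softmax_at(scores, index):
    """Log-probability of scores[index] under a softmax over scores."""
    m = max(scores)  # subtract the max for numerical stability
    log_z = m + math.log(sum(math.exp(s - m) for s in scores))
    return scores[index] - log_z


def tree_loss(logits, target, tree=TREE):
    """Coarse-to-fine cross-entropy along the path to the target class.

    logits: dict mapping each fine class name to a raw score.
    target: name of the ground-truth fine class.
    """
    groups = list(tree)
    # Assumption: an internal node is scored by the mean logit of its leaves.
    group_scores = [
        sum(logits[c] for c in tree[g]) / len(tree[g]) for g in groups
    ]
    target_group = next(g for g in groups if target in tree[g])

    # Coarse level: pick the right group among all groups.
    coarse = -log_softmax_at(group_scores, groups.index(target_group))

    # Fine level: pick the right class among its siblings only.
    siblings = tree[target_group]
    fine = -log_softmax_at(
        [logits[c] for c in siblings], siblings.index(target)
    )

    # Assumption: both levels are weighted equally.
    return (coarse + fine) / 2
```

With all logits equal, the loss for the target "on" reduces to (log 2 + log 3) / 2, since the model is uniform over two groups and then over three siblings; raising the logit of "on" lowers both terms.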


