Transformer-based Dual Relation Graph for Multi-label Image Recognition

10/10/2021
by Jiawei Zhao, et al.

Recognizing multiple objects in a single image simultaneously remains a challenging task because of several difficulties, including varying object scales, inconsistent appearances, and confusing inter-class relationships. Recent research mainly resorts to statistical label co-occurrences and linguistic word embeddings to enhance unclear semantics. In contrast to these approaches, we propose a novel Transformer-based Dual Relation learning framework that constructs complementary relationships by exploring two aspects of correlation: a structural relation graph and a semantic relation graph. The structural relation graph captures long-range correlations from object context through a cross-scale transformer-based architecture. The semantic relation graph dynamically models the semantic meanings of image objects under explicit semantic-aware constraints. In addition, we incorporate the learnt structural relationships into the semantic graph, constructing a joint relation graph for robust representations. With the collaborative learning of these two relation graphs, our approach achieves new state-of-the-art results on two popular multi-label recognition benchmarks, MS-COCO and VOC 2007.
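
To make the idea of a dynamically built relation graph over labels more concrete, here is a minimal sketch in plain Python. It is not the authors' implementation: the function names (`relation_graph`, `propagate`, `predict`), the dot-product affinity, the softmax normalisation, and the toy feature values are all assumptions chosen for illustration. It only shows the general pattern of building an image-specific graph from per-label features, mixing features along that graph, and scoring each label independently.

```python
import math

def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def softmax(row):
    """Numerically stable softmax over a list of floats."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def relation_graph(label_feats):
    """Hypothetical dynamic graph: row-normalised affinity between
    every pair of per-label feature vectors of one image."""
    return [softmax([dot(f, g) for g in label_feats]) for f in label_feats]

def propagate(adj, label_feats):
    """One message-passing step: each label feature becomes a
    weighted mix of all label features, weighted by the graph."""
    n, dim = len(label_feats), len(label_feats[0])
    return [
        [sum(adj[i][j] * label_feats[j][d] for j in range(n)) for d in range(dim)]
        for i in range(n)
    ]

def predict(label_feats, classifiers, threshold=0.5):
    """Multi-label output: an independent sigmoid score per label,
    so any number of labels can be marked present at once."""
    scores = [1.0 / (1.0 + math.exp(-dot(f, w)))
              for f, w in zip(label_feats, classifiers)]
    return scores, [s >= threshold for s in scores]

# Toy example: 3 labels, 2-dimensional features (made-up numbers).
feats = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
weights = [[1.0, -1.0], [-1.0, 1.0], [0.5, 0.5]]
graph = relation_graph(feats)
scores, present = predict(propagate(graph, feats), weights)
```

Because the graph is recomputed from each image's own features, it differs from a fixed label co-occurrence matrix estimated once over the training set. How the structural and semantic graphs are actually built and combined is described in the full text, not here.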

research
03/08/2022

Graph Attention Transformer Network for Multi-Label Image Classification

Multi-label classification aims to recognize multiple objects or attribu...
research
09/28/2019

Learning Category Correlations for Multi-label Image Recognition with Graph Networks

Multi-label image recognition is a task that predicts a set of object la...
research
04/21/2023

Semantic-Aware Graph Matching Mechanism for Multi-Label Image Recognition

Multi-label image recognition aims to predict a set of labels that prese...
research
08/28/2023

GKGNet: Group K-Nearest Neighbor based Graph Convolutional Network for Multi-Label Image Recognition

Multi-Label Image Recognition (MLIR) is a challenging task that aims to ...
research
07/15/2023

Semantic Contrastive Bootstrapping for Single-positive Multi-label Recognition

Learning multi-label image recognition with incomplete annotation is gai...
research
11/27/2022

Multi-Label Continual Learning using Augmented Graph Convolutional Network

Multi-Label Continual Learning (MLCL) builds a class-incremental framewo...
research
11/19/2022

Rethinking Batch Sample Relationships for Data Representation: A Batch-Graph Transformer based Approach

Exploring sample relationships within each mini-batch has shown great po...
