General Multi-label Image Classification with Transformers

11/27/2020
by Jack Lanchantin, et al.

Multi-label image classification is the task of predicting a set of labels corresponding to objects, attributes, or other entities present in an image. In this work we propose the Classification Transformer (C-Tran), a general framework for multi-label image classification that leverages Transformers to exploit the complex dependencies among visual features and labels. Our approach consists of a Transformer encoder trained to predict a set of target labels given two inputs: a set of masked labels and visual features from a convolutional neural network. A key ingredient of our method is a label mask training objective that uses a ternary encoding scheme to represent the state of each label as positive, negative, or unknown during training. Our model shows state-of-the-art performance on challenging datasets such as COCO and Visual Genome. Moreover, because our model explicitly represents the uncertainty of labels during training, it is also more general: it can produce improved results for images with partial or extra label annotations at inference time. We demonstrate this additional capability on the COCO, Visual Genome, News500, and CUB image datasets.
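
The ternary label encoding described above can be illustrated with a small sketch. This is not the authors' implementation: the integer codes, the function names `mask_labels` and `encode_partial_labels`, and the choice to mask a fixed number of labels are all assumptions made for illustration. The sketch only shows how each label can be put into one of three states (positive, negative, unknown), how some labels can be hidden during training, and how partial annotations could be encoded in the same way at inference.

```python
import random

# Assumed integer codes for the three label states (illustrative only).
UNKNOWN, NEGATIVE, POSITIVE = -1, 0, 1


def mask_labels(labels, num_masked, rng):
    """Hide `num_masked` randomly chosen labels.

    `labels` is a list of 0/1 ground-truth values. Returns the ternary
    state list fed to the model and the sorted indices that were hidden,
    which a training loop could use as prediction targets.
    """
    masked = sorted(rng.sample(range(len(labels)), num_masked))
    states = [POSITIVE if y else NEGATIVE for y in labels]
    for i in masked:
        states[i] = UNKNOWN
    return states, masked


def encode_partial_labels(num_labels, known):
    """Encode partial annotations: `known` maps label index -> 0/1.

    Every label that is not annotated is marked as unknown.
    """
    states = [UNKNOWN] * num_labels
    for i, y in known.items():
        states[i] = POSITIVE if y else NEGATIVE
    return states


if __name__ == "__main__":
    rng = random.Random(0)
    states, masked = mask_labels([1, 0, 0, 1, 0], num_masked=2, rng=rng)
    print("training states:", states, "hidden indices:", masked)
    print("inference states:", encode_partial_labels(4, {1: 1, 3: 0}))
```

With no annotations available, every label would simply be encoded as unknown, so the same input format covers both the regular and the partially labeled setting.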


