Learning to Predict Visual Attributes in the Wild

06/17/2021
by Khoi Pham, et al.

Visual attributes constitute a large portion of the information contained in a scene. Objects can be described using a wide variety of attributes that portray their visual appearance (color, texture), geometry (shape, size, posture), and other intrinsic properties (state, action). Existing work is mostly limited to the study of attribute prediction in specific domains. In this paper, we introduce a large-scale in-the-wild visual attribute prediction dataset consisting of over 927K attribute annotations for over 260K object instances. Formally, object attribute prediction is a multi-label classification problem in which all attributes that apply to an object must be predicted. Our dataset poses significant challenges to existing methods due to the large number of attributes, label sparsity, data imbalance, and object occlusion. To address these challenges systematically, we propose several techniques, including a base model that utilizes both low- and high-level CNN features with multi-hop attention, reweighting and resampling techniques, a novel negative label expansion scheme, and a novel supervised attribute-aware contrastive learning algorithm. Using these techniques, we achieve improvements of nearly 3.7 mAP and 5.7 overall F1 points over the current state of the art. Further details about the VAW dataset can be found at http://vawdataset.com/.
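
To make the multi-label formulation and the label-sparsity issue concrete, here is a minimal sketch of a binary cross-entropy loss that skips unannotated attributes. It is not the paper's method: the function name `masked_bce`, the convention of marking unannotated attributes with -1, and the optional `pos_weight` argument are assumptions chosen for illustration, and the paper's own reweighting, resampling, and negative label expansion schemes are not reproduced here.

```python
import math


def masked_bce(logits, labels, pos_weight=None):
    """Mean binary cross-entropy over annotated attributes only.

    logits:     one raw score per attribute for a single object.
    labels:     1 = positive, 0 = negative, -1 = unannotated (assumed
                convention for this sketch; such entries are ignored).
    pos_weight: optional per-attribute weight applied to positive terms
                (a hypothetical stand-in for class reweighting).
    """
    total, count = 0.0, 0
    for i, (z, y) in enumerate(zip(logits, labels)):
        if y == -1:
            continue  # sparse labels: no supervision for this attribute
        # Numerically stable softplus terms:
        #   -log(sigmoid(z))     = softplus(-z)
        #   -log(1 - sigmoid(z)) = softplus(z)
        tail = math.log1p(math.exp(-abs(z)))
        if y == 1:
            weight = pos_weight[i] if pos_weight is not None else 1.0
            total += weight * (max(-z, 0.0) + tail)
        else:
            total += max(z, 0.0) + tail
        count += 1
    return total / count if count else 0.0


# Three attributes: one positive, one negative, one unannotated.
loss = masked_bce([0.0, 0.0, 5.0], [1, 0, -1])
print(round(loss, 4))
```

Each attribute is scored independently, so any number of attributes can apply to the same object, and the unannotated entry contributes nothing to the loss no matter how large its logit is.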


research
11/10/2018

Multi-label Object Attribute Classification using a Convolutional Neural Network

Objects of different classes can be described using a limited number of ...
research
03/07/2022

GlideNet: Global, Local and Intrinsic based Dense Embedding NETwork for Multi-category Attributes Prediction

Attaching attributes (such as color, shape, state, action) to object cat...
research
07/10/2018

Deep Imbalanced Attribute Classification using Visual Attention Aggregation

For many computer vision applications such as image description and huma...
research
01/12/2015

From Visual Attributes to Adjectives through Decompositional Distributional Semantics

As automated image analysis progresses, there is increasing interest in ...
research
07/19/2017

Deep View-Sensitive Pedestrian Attribute Inference in an end-to-end Model

Pedestrian attribute inference is a demanding problem in visual surveill...
research
08/03/2020

PhraseCut: Language-based Image Segmentation in the Wild

We consider the problem of segmenting image regions given a natural lang...
research
02/12/2023

Contrastive Learning and the Emergence of Attributes Associations

In response to an object presentation, supervised learning schemes gener...
