Rethinking Crowdsourcing Annotation: Partial Annotation with Salient Labels for Multi-Label Image Classification

by Jianzhe Lin, et al.

Annotated images are required for both supervised model training and evaluation in image classification. Manually annotating images is arduous and expensive, especially for multi-label images. A recent trend is to crowdsource such laborious annotation tasks, with images annotated from scratch by volunteers or paid workers online (e.g., workers on Amazon Mechanical Turk). However, the quality of crowdsourced image annotations cannot be guaranteed, and incompleteness and incorrectness are the two major concerns. To address these concerns, we rethink crowdsourcing annotation: our simple hypothesis is that if annotators only partially annotate multi-label images with the salient labels they are confident in, there will be fewer annotation errors and annotators will spend less time on uncertain labels. As a pleasant surprise, we show that, with the same annotation budget, a multi-label image classifier supervised by images with salient annotations can outperform models supervised by fully annotated images. Our methodological contributions are twofold: an active learning approach is proposed to acquire salient labels for multi-label images, and a novel Adaptive Temperature Associated Model (ATAM) that specifically uses partial annotations is proposed for multi-label image classification. We conduct experiments on practical crowdsourcing data, namely the Open Street Map (OSM) dataset, and on the COCO 2014 benchmark dataset. When compared with state-of-the-art classification methods trained on fully annotated images, the proposed ATAM can achieve higher accuracy. The proposed idea is promising for crowdsourcing data annotation. Our code will be publicly available.




Learning a Deep ConvNet for Multi-label Classification with Partial Labels

Deep ConvNets have shown great performance for single-label image classi...

Learning From Noisy Singly-labeled Data

Supervised learning depends on annotated examples, which are taken to be...

Interface Design for Crowdsourcing Hierarchical Multi-Label Text Annotations

Human data labeling is an important and expensive task at the heart of s...

Active Multi-Label Crowd Consensus

Crowdsourcing is an economic and efficient strategy aimed at collecting ...

Revisiting Vicinal Risk Minimization for Partially Supervised Multi-Label Classification Under Data Scarcity

Due to the high human cost of annotation, it is non-trivial to curate a ...

Much Ado About Time: Exhaustive Annotation of Temporal Data

Large-scale annotated datasets allow AI systems to learn from and build ...

Utilizing supervised models to infer consensus labels and their quality from data with multiple annotators

Real-world data for classification is often labeled by multiple annotato...
