Active Learning for Open-set Annotation

01/18/2022
by   Kun-Peng Ning, et al.
17

Existing active learning studies typically work in the closed-set setting by assuming that all data examples to be labeled are drawn from known classes. However, in real annotation tasks, the unlabeled data usually contains a large amount of examples from unknown classes, resulting in the failure of most active learning methods. To tackle this open-set annotation (OSA) problem, we propose a new active learning framework called LfOSA, which boosts the classification performance with an effective sampling strategy to precisely detect examples from known classes for annotation. The LfOSA framework introduces an auxiliary network to model the per-example max activation value (MAV) distribution with a Gaussian Mixture Model, which can dynamically select the examples with highest probability from known classes in the unlabeled set. Moreover, by reducing the temperature T of the loss function, the detection model will be further optimized by exploiting both known and unknown supervision. The experimental results show that the proposed method can significantly improve the selection quality of known classes, and achieve higher classification accuracy with lower annotation cost than state-of-the-art active learning methods. To the best of our knowledge, this is the first work of active learning for open-set annotation.

READ FULL TEXT

page 1

page 3

page 6

page 7

research
07/11/2023

OpenAL: An Efficient Deep Active Learning Framework for Open-Set Pathology Image Classification

Active learning (AL) is an effective approach to select the most informa...
research
02/21/2018

Active Learning with Partial Feedback

In the large-scale multiclass setting, assigning labels often consists o...
research
02/27/2019

The Importance of Metric Learning for Robotic Vision: Open Set Recognition and Active Learning

State-of-the-art deep neural network recognition systems are designed fo...
research
08/29/2019

Active Learning for Domain Classification in a Commercial Spoken Personal Assistant

We describe a method for selecting relevant new training data for the LS...
research
10/13/2022

Meta-Query-Net: Resolving Purity-Informativeness Dilemma in Open-set Active Learning

Unlabeled data examples awaiting annotations contain open-set noise inev...
research
06/01/2011

Committee-Based Sample Selection for Probabilistic Classifiers

In many real-world learning tasks, it is expensive to acquire a sufficie...
research
06/30/2020

Similarity Search for Efficient Active Learning and Search of Rare Concepts

Many active learning and search approaches are intractable for industria...

Please sign up or login with your details

Forgot password? Click here to reset