Robust Assignment of Labels for Active Learning with Sparse and Noisy Annotations

07/25/2023
by Daniel Kałuża, et al.

Supervised classification algorithms are used to solve a growing number of real-life problems around the globe. Their performance is strictly connected with the quality of labels used in training. Unfortunately, acquiring good-quality annotations for many tasks is infeasible or too expensive to be done in practice. To tackle this challenge, active learning algorithms are commonly employed to select only the most relevant data for labeling. However, this is possible only when the quality and quantity of labels acquired from experts are sufficient. Unfortunately, in many applications a trade-off is necessary between having individual samples annotated by multiple annotators to increase label quality and annotating new samples to increase the total number of labeled instances. In this paper, we address the issue of faulty data annotations in the context of active learning. In particular, we propose two novel annotation unification algorithms that utilize unlabeled parts of the sample space. The proposed methods require little to no intersection between the samples annotated by different experts. Our experiments on four public datasets demonstrate the robustness and superiority of the proposed methods, in both the estimation of annotator reliability and the assignment of actual labels, over state-of-the-art algorithms and simple majority voting.
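For reference, below is a minimal sketch of the majority-voting baseline that the proposed annotation unification algorithms are compared against; it is not the paper's method, and the data layout and function names are illustrative assumptions for the sparse setting described above, where each sample is labeled by only a few annotators.

```python
from collections import Counter


def majority_vote(annotations):
    """Aggregate sparse, possibly conflicting annotations by majority vote.

    annotations: dict mapping sample_id -> {annotator_id: label}.
    Returns a dict mapping sample_id -> most frequent label
    (ties are broken arbitrarily by Counter ordering).
    """
    aggregated = {}
    for sample_id, votes in annotations.items():
        counts = Counter(votes.values())
        aggregated[sample_id] = counts.most_common(1)[0][0]
    return aggregated


if __name__ == "__main__":
    # Two samples labeled by largely disjoint sets of annotators,
    # mimicking the sparse-annotation setting discussed in the abstract.
    annotations = {
        0: {"a1": "cat", "a2": "cat", "a3": "dog"},
        1: {"a2": "dog", "a4": "dog"},
    }
    print(majority_vote(annotations))  # {0: 'cat', 1: 'dog'}
```

Note that plain majority voting ignores annotator reliability and uses no information from the unlabeled part of the sample space, which is precisely what the proposed unification algorithms exploit.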

