How to Allocate your Label Budget? Choosing between Active Learning and Learning to Reject in Anomaly Detection

01/07/2023
by Lorenzo Perini, et al.

Anomaly detection attempts to find examples that deviate from the expected behaviour. Usually, anomaly detection is tackled from an unsupervised perspective because anomalous labels are rare and difficult to acquire. However, the lack of labels leaves the anomaly detector highly uncertain in some regions, which usually results in poor predictive performance or low user trust in the predictions. One can reduce such uncertainty by collecting specific labels using Active Learning (AL), which targets examples close to the detector's decision boundary. Alternatively, one can increase user trust by allowing the detector to abstain from making highly uncertain predictions, which is called Learning to Reject (LR). One way to do this is to threshold the detector's uncertainty based on where its performance is low, which requires labels to evaluate that performance. Although both AL and LR need labels, they work with different types of labels: AL seeks strategic labels, which are evidently biased, while LR requires i.i.d. labels to evaluate the detector's performance and set the rejection threshold. Because one usually has a single label budget, deciding how to allocate it optimally is challenging. In this paper, we propose a mixed strategy that, given a budget of labels, decides in multiple rounds whether to use the budget to collect AL labels or LR labels. The strategy is based on a reward function that measures the expected gain when allocating the budget to either side. We evaluate our strategy on 18 benchmark datasets and compare it to some baselines.
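
To make the contrast between the two label types concrete, here is a minimal Python sketch. It is not the paper's method and does not implement its reward function. It only illustrates, under assumptions, how AL queries (examples near the decision boundary) differ from LR queries (a uniform random sample), and how an uncertainty threshold lets a detector abstain. The function names, the uncertainty measure `1 - 2|p - 0.5|`, and the use of `-1` for "abstain" are all hypothetical choices made for illustration.

```python
import numpy as np


def uncertainty(probs):
    """Assumed uncertainty score: 1.0 at p = 0.5, falling to 0.0 at p = 0 or p = 1."""
    probs = np.asarray(probs, dtype=float)
    return 1.0 - 2.0 * np.abs(probs - 0.5)


def select_al_queries(probs, k):
    """Active Learning: indices of the k examples closest to the decision boundary."""
    u = uncertainty(probs)
    # Sort by descending uncertainty; the stable sort keeps ties in index order.
    return np.argsort(-u, kind="stable")[:k]


def select_lr_queries(n_examples, k, rng):
    """Learning to Reject: indices of k examples drawn uniformly without replacement."""
    return rng.choice(n_examples, size=k, replace=False)


def predict_with_reject(probs, threshold):
    """Predict 1 (anomaly) or 0 (normal); abstain (-1) when uncertainty exceeds threshold."""
    probs = np.asarray(probs, dtype=float)
    preds = (probs >= 0.5).astype(int)
    preds[uncertainty(probs) > threshold] = -1
    return preds


if __name__ == "__main__":
    # Hypothetical anomaly probabilities from some detector.
    probs = [0.05, 0.48, 0.90, 0.55, 0.20]
    rng = np.random.default_rng(0)
    print("AL queries:", select_al_queries(probs, k=2))
    print("LR queries:", select_lr_queries(len(probs), k=2, rng=rng))
    print("Predictions with rejection:", predict_with_reject(probs, threshold=0.8))
```

The AL selection is deliberately biased towards uncertain examples, which is why those labels cannot be reused to estimate the detector's performance. The random LR sample is unbiased and can be used to set the rejection threshold.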

research
05/22/2023

Unsupervised Anomaly Detection with Rejection

Anomaly detection aims at detecting unexpected behaviours in the data. B...
research
09/17/2018

Active Anomaly Detection via Ensembles

In critical applications of anomaly detection including computer securit...
research
07/17/2019

Half a Percent of Labels is Enough: Efficient Animal Detection in UAV Imagery using Deep CNNs and Active Learning

We present an Active Learning (AL) strategy for re-using a deep Convolut...
research
01/25/2022

Little Help Makes a Big Difference: Leveraging Active Learning to Improve Unsupervised Time Series Anomaly Detection

Key Performance Indicators (KPI), which are essentially time series data...
research
07/08/2022

Active Learning-based Isolation Forest (ALIF): Enhancing Anomaly Detection in Decision Support Systems

The detection of anomalous behaviours is an emerging need in many applic...
research
06/06/2023

How to Select Which Active Learning Strategy is Best Suited for Your Specific Problem and Budget

In Active Learning (AL), a learner actively chooses which unlabeled exam...
research
10/19/2022

Estimating the Contamination Factor's Distribution in Unsupervised Anomaly Detection

Anomaly detection methods identify examples that do not follow the expec...
