Active Anomaly Detection via Ensembles: Insights, Algorithms, and Interpretability

01/23/2019
by Shubhomoy Das, et al.

The anomaly detection (AD) task is to identify the true anomalies in a given set of data instances. AD algorithms score the data instances and produce a ranked list of candidate anomalies, which a human analyst then examines to discover the true anomalies. This process can be laborious for the analyst when the number of false positives is very high. Therefore, in many real-world AD applications, including computer security and fraud prevention, the anomaly detector must be configurable by the human analyst to minimize the effort spent on false positives. In this paper, we study the problem of active learning to automatically tune an ensemble of anomaly detectors to maximize the number of true anomalies discovered. We make four main contributions towards this goal. First, we present an important insight that explains the practical successes of AD ensembles and why ensembles are naturally suited for active learning. Second, we present several algorithms for active learning with tree-based AD ensembles. These algorithms help improve the diversity of discovered anomalies, generate rule sets for improved interpretability of anomalous instances, and adapt to streaming data settings in a principled manner. Third, we present a novel algorithm called GLocalized Anomaly Detection (GLAD) for active learning with generic AD ensembles. GLAD allows end users to retain simple and understandable global anomaly detectors by automatically learning their local relevance to specific data instances from label feedback. Fourth, we present extensive experiments to evaluate our insights and algorithms. Our results show that, in addition to discovering significantly more anomalies than state-of-the-art unsupervised baselines, our active learning algorithms in the streaming-data setup are competitive with the batch setup.
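
To make the general idea of the abstract concrete, the following is a minimal sketch of a feedback loop over an ensemble of detectors: combine per-detector scores with a weight vector, show the analyst the top-ranked unlabeled instance, and adjust the weights from the label. It is not the paper's algorithm. The function name `active_anomaly_loop`, the perceptron-style update, the `lr` parameter, and the `oracle` callback standing in for the analyst are all assumptions made for illustration.

```python
import numpy as np


def active_anomaly_loop(scores, oracle, budget, lr=1.0):
    """Illustrative active-learning loop over an ensemble of detectors.

    scores : (n_instances, n_detectors) array, higher = more anomalous.
    oracle : callable taking an instance index and returning 1 for a
             true anomaly, 0 for a nominal (stands in for the analyst).
    budget : number of instances the analyst is willing to label.
    Returns the queried indices (in order) and the final weight vector.

    NOTE: the weight update is a simple perceptron-style rule chosen
    for this sketch; it is an assumption, not the method of the paper.
    """
    scores = np.asarray(scores, dtype=float)
    n, m = scores.shape
    w = np.ones(m) / np.sqrt(m)          # uniform unit-norm starting weights
    queried_mask = np.zeros(n, dtype=bool)
    queried = []

    for _ in range(min(budget, n)):
        combined = scores @ w            # ensemble score per instance
        combined[queried_mask] = -np.inf # never re-query a labeled instance
        i = int(np.argmax(combined))     # greedy: ask about the top candidate
        y = oracle(i)
        queried.append(i)
        queried_mask[i] = True

        # Move weights toward detectors that ranked a true anomaly highly,
        # and away from detectors that ranked a false positive highly.
        direction = 1.0 if y == 1 else -1.0
        w = w + lr * direction * scores[i]
        norm = np.linalg.norm(w)
        if norm > 0:
            w = w / norm                 # keep weights on the unit sphere

    return queried, w
```

In this sketch, a detector that keeps promoting false positives sees its weight shrink, so later queries lean on the detectors whose high scores have been confirmed by the analyst.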

research
09/17/2018

Active Anomaly Detection via Ensembles

In critical applications of anomaly detection including computer securit...
research
10/02/2018

GLAD: GLocalized Anomaly Detection via Active Feature Space Suppression

We propose an algorithm called GLAD (GLocalized Anomaly Detection) that ...
research
08/30/2017

Incorporating Feedback into Tree-based Anomaly Detection

Anomaly detectors are often used to produce a ranked list of statistical...
research
01/27/2021

Wisdom of the Contexts: Active Ensemble Learning for Contextual Anomaly Detection

In contextual anomaly detection (CAD), an object is only considered anom...
research
10/28/2022

Learning to Detect Interesting Anomalies

Anomaly detection algorithms are typically applied to static, unchanging...
research
03/01/2023

Implementing Active Learning in Cybersecurity: Detecting Anomalies in Redacted Emails

Research on email anomaly detection has typically relied on specially pr...
