PIDForest: Anomaly Detection via Partial Identification

12/08/2019
by Parikshit Gopalan, et al.

We consider the problem of detecting anomalies in a large dataset. We propose a framework called Partial Identification which captures the intuition that anomalies are easy to distinguish from the overwhelming majority of points by relatively few attribute values. Formalizing this intuition, we propose a geometric anomaly measure for a point that we call PIDScore, which measures the minimum density of data points over all subcubes containing the point. We present PIDForest: a random forest based algorithm that finds anomalies based on this definition. We show that it performs favorably in comparison to several popular anomaly detection methods, across a broad range of benchmarks. PIDForest also provides a succinct explanation for why a point is labelled anomalous, by providing a set of features and ranges for them which are relatively uncommon in the dataset.
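
To make the geometric measure concrete, here is a minimal brute-force sketch of the idea as the abstract words it: the minimum density of data points over subcubes containing a point. It is not the PIDForest algorithm and may differ from the paper's formal PIDScore definition. It assumes data scaled to the unit cube, restricts subcubes to boxes aligned with a coarse grid, and takes density to be point count divided by box volume. The function name `min_box_density` and the parameter `bins` are hypothetical, chosen only for this illustration.

```python
from itertools import product


def min_box_density(point, data, bins=8):
    """Toy score: minimum density over grid-aligned boxes in [0, 1]^d
    that contain `point`. Lower values suggest a more isolated point.
    Assumption: density = (points in box) / (box volume)."""

    def cell_of(x):
        # Grid cell index along one axis, clamped so x == 1.0 stays in range.
        return min(int(x * bins), bins - 1)

    point_cells = [cell_of(x) for x in point]
    data_cells = [[cell_of(x) for x in p] for p in data]

    # Per axis: every half-open cell interval [lo, hi) covering the point's cell.
    axis_ranges = [
        [(lo, hi) for lo in range(c + 1) for hi in range(c + 1, bins + 1)]
        for c in point_cells
    ]

    best = float("inf")
    for box in product(*axis_ranges):
        volume = 1.0
        for lo, hi in box:
            volume *= (hi - lo) / bins
        count = sum(
            all(lo <= c < hi for c, (lo, hi) in zip(cells, box))
            for cells in data_cells
        )
        best = min(best, count / volume)
    return best


# Illustrative data: a tight cluster near the centre plus one far-away point.
cluster = [(0.5 + 0.001 * i, 0.5 + 0.001 * i) for i in range(20)]
outlier = (0.95, 0.05)
data = cluster + [outlier]

outlier_score = min_box_density(outlier, data)
cluster_score = min_box_density(cluster[0], data)
```

In this toy setup the isolated point sits inside a large box that contains no other data, so its minimum density is much lower than that of a cluster point, for which every enclosing box also contains the cluster. The winning box also serves as a rough explanation: the ranges that define it are the attribute values that are uncommon in the data. The enumeration is exponential in the number of dimensions, so it is only useful as an illustration of the measure.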


research
01/09/2018

Precision and Recall for Range-Based Anomaly Detection

Classical anomaly detection is principally concerned with point-based an...
research
01/18/2022

Antimodes and Graphical Anomaly Exploration via Depth Quantile Functions

Depth quantile functions (DQF) encode geometric information about a poin...
research
01/28/2015

A Neural Network Anomaly Detector Using the Random Cluster Model

The random cluster model is used to define an upper bound on a distance ...
research
01/02/2019

Anomaly Detection in Networks with Application to Financial Transaction Networks

This paper is motivated by the task of detecting anomalies in networks o...
research
05/23/2019

Approximate String Matching for DNS Anomaly Detection

In this paper we propose a novel approach to identify anomalies in DNS t...
research
05/23/2022

PIXAL: Anomaly Reasoning with Visual Analytics

Anomaly detection remains an open challenge in many application areas. W...
research
04/22/2017

Robust, Deep and Inductive Anomaly Detection

PCA is a classical statistical technique whose simplicity and maturity h...
