On Predictive Explanation of Data Anomalies

10/18/2021
by Nikolaos Myrtakis, et al.

Numerous algorithms have been proposed for detecting anomalies (outliers, novelties) in an unsupervised manner. Unfortunately, it is generally not trivial to understand why a given sample (record) is labelled as an anomaly, and thus to diagnose its root causes. We propose the following reduced-dimensionality, surrogate-model approach to explaining detector decisions: approximate the detection model with another model that employs only a small subset of features. Samples can then be visualized in this low-dimensional space for human understanding. To this end, we develop PROTEUS, an AutoML pipeline that produces the surrogate model and is specifically designed for feature selection on imbalanced datasets. The PROTEUS surrogate model can explain not only the training data but also out-of-sample (unseen) data. In other words, PROTEUS produces predictive explanations by approximating the decision surface of an unsupervised detector. PROTEUS is designed to return an accurate estimate of out-of-sample predictive performance, which serves as a metric of the quality of the approximation. Computational experiments confirm the efficacy of PROTEUS in producing predictive explanations for different families of detectors and in reliably estimating their predictive performance on unseen data. Unlike several ad-hoc feature importance methods, PROTEUS is robust to high-dimensional data.
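
The surrogate idea described in the abstract can be illustrated with a small, self-contained sketch. This is not the PROTEUS pipeline: the synthetic data, the stand-in detector (a sum of squared robust z-scores), the mean-gap feature ranking, and the nearest-centroid surrogate are all assumptions chosen for brevity. The sketch only shows the general pattern: label data with an unsupervised detector, fit a supervised model on a few features to mimic those labels, and measure on unseen data how well the small model agrees with the detector.

```python
import random
import statistics

random.seed(0)
N_FEATURES, SHIFTED = 10, (0, 1)  # assumption: anomalies differ only in features 0 and 1

def make_data(n_normal, n_anomalous):
    """Gaussian noise in every feature; anomalous rows are shifted in SHIFTED only."""
    rows = [[random.gauss(0, 1) for _ in range(N_FEATURES)]
            for _ in range(n_normal + n_anomalous)]
    for row in rows[n_normal:]:
        for j in SHIFTED:
            row[j] += 6.0
    return rows

train, unseen = make_data(285, 15), make_data(190, 10)

# 1) Stand-in unsupervised detector over ALL features (hypothetical, not from the paper):
#    sum of squared robust z-scores, flagging the top 5% of training scores.
cols = list(zip(*train))
med = [statistics.median(c) for c in cols]
mad = [1.4826 * statistics.median([abs(v - m) for v in c]) for c, m in zip(cols, med)]

def detector_score(row):
    return sum(((v - m) / s) ** 2 for v, m, s in zip(row, med, mad))

n_flag = len(train) // 20
threshold = sorted(detector_score(r) for r in train)[-n_flag]

def detect(row):
    return detector_score(row) >= threshold

y_train = [detect(r) for r in train]

# 2) Feature selection: keep the two features with the largest scaled gap
#    between the means of flagged and unflagged training rows.
def mean_col(rows, j):
    return sum(r[j] for r in rows) / len(rows)

flagged = [r for r, y in zip(train, y_train) if y]
normal = [r for r, y in zip(train, y_train) if not y]
gap = [abs(mean_col(flagged, j) - mean_col(normal, j)) / mad[j]
       for j in range(N_FEATURES)]
selected = sorted(sorted(range(N_FEATURES), key=lambda j: -gap[j])[:2])

# 3) Surrogate: nearest-centroid classifier trained on the detector's labels,
#    using only the selected features.
c_in = [mean_col(normal, j) for j in selected]
c_out = [mean_col(flagged, j) for j in selected]

def surrogate_score(row):
    x = [row[j] for j in selected]
    d_in = sum((a - b) ** 2 for a, b in zip(x, c_in))
    d_out = sum((a - b) ** 2 for a, b in zip(x, c_out))
    return d_in - d_out  # positive: closer to the flagged centroid

# 4) Out-of-sample fidelity: AUC of the surrogate score against the
#    detector's own decisions on unseen data.
def auc(scores, labels):
    pos = [s for s, y in zip(scores, labels) if y]
    neg = [s for s, y in zip(scores, labels) if not y]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

y_unseen = [detect(r) for r in unseen]
fidelity = auc([surrogate_score(r) for r in unseen], y_unseen)
print("selected features:", selected)
print("out-of-sample AUC vs. detector:", round(fidelity, 3))
```

Because the surrogate uses only two features, the unseen samples could be drawn on a 2-D scatter plot coloured by the detector's decision, which is the kind of low-dimensional view the abstract refers to. The AUC here is measured against the detector's output, not against ground truth, so it quantifies how faithful the approximation is rather than how good the detector is.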


