Probabilistic Distance-Based Outlier Detection

05/16/2023
by David Muhr, et al.

The scores of distance-based outlier detection methods are difficult to interpret, which makes it hard to choose a cut-off threshold between normal and outlier data points without additional context. We describe a generic transformation of distance-based outlier scores into interpretable, probabilistic estimates. The transformation is ranking-stable and increases the contrast between normal and outlier data points. Identifying nearest-neighbor relationships requires computing distances between data points, yet most of the computed distances are typically discarded. We show that the distances to other data points can be used to model distance probability distributions, and that these distributions can then be used to turn distance-based outlier scores into outlier probabilities. Experiments on numerous tabular and image benchmark datasets show that the probabilistic transformation does not affect detection performance while yielding interpretable outlier scores with increased contrast between normal and outlier samples. Our work generalizes to a wide range of distance-based outlier detection methods, and because it reuses existing distance computations, it adds no significant computational overhead.
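To make the general idea concrete, the following is a minimal sketch and not the authors' method: the abstract does not specify how the distance distributions are modeled, so this example assumes an empirical CDF over all pairwise distances and a k-th-nearest-neighbor distance as the outlier score. The function names `knn_scores` and `scores_to_probabilities` are hypothetical.

```python
import numpy as np

def knn_scores(X, k):
    """Return k-th nearest-neighbor distances and the full distance matrix."""
    diff = X[:, None, :] - X[None, :, :]
    D = np.sqrt((diff ** 2).sum(axis=-1))
    # After sorting each row, column 0 is the zero self-distance,
    # so column k holds the distance to the k-th nearest neighbor.
    scores = np.sort(D, axis=1)[:, k]
    return scores, D

def scores_to_probabilities(scores, D):
    """Map scores through the empirical CDF of all pairwise distances.

    ASSUMPTION: an empirical CDF stands in for the distance probability
    distribution; the paper may model it differently.
    """
    upper = np.triu_indices_from(D, k=1)  # each pair counted once
    pairwise = np.sort(D[upper])
    # Fraction of already-computed pairwise distances <= each score.
    return np.searchsorted(pairwise, scores, side="right") / pairwise.size

# Toy data: 50 points from a standard normal plus one distant point.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(size=(50, 2)), [[8.0, 8.0]]])

scores, D = knn_scores(X, k=5)
probs = scores_to_probabilities(scores, D)
```

Because an empirical CDF is non-decreasing, this mapping never reorders points, although it can introduce ties. It also reuses the distance matrix that the nearest-neighbor search already produced, which mirrors the abstract's point about avoiding extra distance computations.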
