Probabilistic Outlier Detection and Generation

12/22/2020
by Stefano Giovanni Rizzo, et al.

We introduce a new method for outlier detection and generation that lifts data into the space of probability distributions. These distributions are not analytically expressible, but samples can be drawn from them using a neural generator. Given a mixture of unknown latent inlier and outlier distributions, a Wasserstein double autoencoder is used to both detect and generate inliers and outliers. The proposed method, named WALDO (Wasserstein Autoencoder for Learning the Distribution of Outliers), is evaluated on classical data sets including MNIST, CIFAR10 and KDD99 for detection accuracy and robustness. We give an example of outlier detection on a real retail sales data set and an example of outlier generation for simulating intrusion attacks, and we foresee many other application scenarios where WALDO can be used. To the best of our knowledge, this is the first work that studies outlier detection and generation together.
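The abstract relies on comparing distributions that are available only through samples. As a rough illustration of that idea, and not of WALDO itself, the sketch below estimates the 1-Wasserstein distance between two one-dimensional samples of equal size by matching sorted values. The function name `wasserstein_1d`, the Gaussian toy data, and the sample sizes are all assumptions made for this example; none of them come from the paper.

```python
import random


def wasserstein_1d(xs, ys):
    """Empirical 1-Wasserstein distance between two equal-size 1-D samples.

    Hypothetical helper for illustration only; it is not part of WALDO.
    """
    if len(xs) != len(ys):
        raise ValueError("this sketch assumes equal sample sizes")
    if not xs:
        raise ValueError("samples must be non-empty")
    # In one dimension, the optimal transport plan pairs the i-th smallest
    # value of one sample with the i-th smallest value of the other.
    paired = zip(sorted(xs), sorted(ys))
    return sum(abs(a - b) for a, b in paired) / len(xs)


# Toy data (assumed): two draws from the same "inlier" distribution
# and one draw from a shifted "outlier" distribution.
rng = random.Random(0)
inliers_a = [rng.gauss(0.0, 1.0) for _ in range(2000)]
inliers_b = [rng.gauss(0.0, 1.0) for _ in range(2000)]
outliers = [rng.gauss(6.0, 1.0) for _ in range(2000)]

d_same = wasserstein_1d(inliers_a, inliers_b)  # small: same distribution
d_diff = wasserstein_1d(inliers_a, outliers)   # large: shifted distribution
print(f"inlier vs inlier:  {d_same:.3f}")
print(f"inlier vs outlier: {d_diff:.3f}")
```

Two samples from the same distribution should give a distance near zero, while the shifted sample should give a distance close to the size of the shift. The distance needs only samples, never a density formula, which is why it suits distributions defined by a neural generator.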


research
06/16/2021

Comparison of Outlier Detection Techniques for Structured Data

An outlier is an observation or a data point that is far from rest of th...
research
06/07/2017

Outlier Detection Using Distributionally Robust Optimization under the Wasserstein Metric

We present a Distributionally Robust Optimization (DRO) approach to outl...
research
04/03/2023

Improving Autoencoder-based Outlier Detection with Adjustable Probabilistic Reconstruction Error and Mean-shift Outlier Scoring

Autoencoders were widely used in many machine learning tasks thanks to t...
research
06/05/2020

Generating Artificial Outliers in the Absence of Genuine Ones – a Survey

By definition, outliers are rarely observed in reality, making them diff...
research
03/01/2020

Clarifying the Hubble constant tension with a Bayesian hierarchical model of the local distance ladder

Estimates of the Hubble constant, $H_0$, from the local distance ladder ...
research
01/11/2023

ODIM: an efficient method to detect outliers via inlier-memorization effect of deep generative models

Identifying whether a given sample is an outlier or not is an important ...
research
05/12/2021

Autoencoding Under Normalization Constraints

Likelihood is a standard estimate for outlier detection. The specific ro...
