Cluster Purging: Efficient Outlier Detection based on Rate-Distortion Theory

02/22/2023
by Maximilian B. Toller, et al.

Outlier detection based on rate-distortion theory builds on the rationale that a good data compression scheme will encode outliers with unique symbols. Following this rationale, we propose Cluster Purging, an extension of clustering-based outlier detection. The extension allows one to assess the representivity of clusterings and to find data that are best represented by individual, unique clusters. We propose two efficient algorithms for performing Cluster Purging: one is parameter-free, while the other has a parameter that controls representivity estimations, allowing it to be tuned in supervised setups. In an experimental evaluation, we show that Cluster Purging improves upon outliers detected from raw clusterings and that it competes strongly against state-of-the-art alternatives.
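
The rationale in the abstract can be illustrated with a small sketch of the raw-clustering baseline that Cluster Purging extends. This is not the Cluster Purging algorithm itself, whose details are not given here. It assumes that a clustering has already been computed by some other method, uses squared Euclidean distance to the cluster centroid as the distortion measure, and ranks points by that distortion; points with high distortion are the ones a compression scheme would gain most from encoding with their own symbol. The function names `centroid`, `distortion_scores`, and `top_outliers` and the toy data are hypothetical.

```python
from math import dist


def centroid(points):
    """Component-wise mean of a non-empty list of equal-length tuples."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def distortion_scores(points, labels):
    """Squared distance from each point to the centroid of its own cluster.

    Assumption: squared Euclidean distance stands in for the distortion
    measure; the paper may use a different one.
    """
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    centroids = {lab: centroid(members) for lab, members in clusters.items()}
    return [dist(p, centroids[lab]) ** 2 for p, lab in zip(points, labels)]


def top_outliers(points, labels, k=1):
    """Indices of the k points with the largest distortion."""
    scores = distortion_scores(points, labels)
    return sorted(range(len(points)), key=lambda i: scores[i], reverse=True)[:k]


# Toy data: two tight groups plus one stray point that was assigned to cluster 0.
points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (5.0, 5.0), (5.1, 5.0), (3.0, 0.0)]
labels = [0, 0, 0, 1, 1, 0]

print(top_outliers(points, labels, k=1))  # index of the stray point: [5]
```

In this toy case the stray point is poorly represented by the centroid of cluster 0, so it receives the highest score. Giving it a unique cluster of its own would reduce its distortion to zero at the cost of one extra symbol, which is the trade-off the abstract describes.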

research
01/05/2018

Clustering with Outlier Removal

Cluster analysis and outlier detection are strongly coupled tasks in dat...
research
03/03/2017

Outlier Cluster Formation in Spectral Clustering

Outlier detection and cluster number estimation is an important issue fo...
research
04/22/2011

Robust Clustering Using Outlier-Sparsity Regularization

Notwithstanding the popularity of conventional clustering algorithms suc...
research
10/15/2022

D.MCA: Outlier Detection with Explicit Micro-Cluster Assignments

How can we detect outliers, both scattered and clustered, and also expli...
research
06/09/2021

Deep Clustering based Fair Outlier Detection

In this paper, we focus on the fairness issues regarding unsupervised ou...
research
05/22/2017

Size Matters: Cardinality-Constrained Clustering and Outlier Detection via Conic Optimization

Plain vanilla K-means clustering is prone to produce unbalanced clusters...
research
01/31/2019

Generalized Dirichlet-process-means for f-separable distortion measures

DP-means clustering was obtained as an extension of K-means clustering. ...
