Multi-level conformal clustering: A distribution-free technique for clustering and anomaly detection

10/17/2019
by Ilia Nouretdinov, et al.

In this work we present a clustering technique called multi-level conformal clustering (MLCC). The technique is hierarchical in nature because it can be performed at multiple significance levels, which yields greater insight into the data than performing it at a single level. We describe the theoretical underpinnings of MLCC, compare and contrast it with the hierarchical clustering algorithm, and then apply it to real-world datasets to assess its performance. MLCC has several advantages over more classical clustering techniques. Once a significance level has been set, MLCC automatically selects the number of clusters. Furthermore, thanks to the conformal prediction framework, the resulting clustering model has a clear statistical meaning without any assumptions about the distribution of the data. This statistical robustness also allows us to perform clustering and anomaly detection simultaneously. Moreover, due to the flexibility of the conformal prediction framework, our algorithm can be used on top of many other machine learning algorithms.
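To make the role of the significance level concrete, here is a minimal sketch of conformal clustering on one-dimensional toy data. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not specify a nonconformity measure or how clusters are formed, so the k-nearest-neighbour score, the integer grid of candidate points, the rule that consecutive retained grid points form a cluster, and all function names below are hypothetical choices.

```python
def nonconformity(points, i, k=2):
    """Assumed nonconformity measure: mean distance from points[i]
    to its k nearest other points (larger means stranger)."""
    dists = sorted(abs(points[i] - q) for j, q in enumerate(points) if j != i)
    return sum(dists[:k]) / k


def p_value(data, z, k=2):
    """Conformal p-value of candidate z: the fraction of the augmented
    sample (data plus z) that is at least as nonconforming as z."""
    augmented = list(data) + [z]
    scores = [nonconformity(augmented, i, k) for i in range(len(augmented))]
    return sum(s >= scores[-1] for s in scores) / len(augmented)


def conformal_clusters(data, grid, epsilon, k=2):
    """Keep grid points with p-value > epsilon and group consecutive
    retained grid points into clusters (assumed connectivity rule)."""
    clusters, current = [], []
    for g in grid:
        if p_value(data, g, k) > epsilon:
            current.append(g)
        elif current:
            clusters.append(current)
            current = []
    if current:
        clusters.append(current)
    return clusters


def summarize(data, grid, epsilon, k=2):
    """Return cluster spans and the data points left outside every cluster."""
    clusters = conformal_clusters(data, grid, epsilon, k)
    covered = {g for cluster in clusters for g in cluster}
    spans = [(cluster[0], cluster[-1]) for cluster in clusters]
    anomalies = [x for x in data if x not in covered]
    return spans, anomalies


# Toy data (hypothetical): two tight groups and one isolated point.
data = [0, 1, 2, 3, 10, 11, 12, 13, 30]
grid = list(range(-3, 35))

# Running the same procedure at two significance levels gives two
# levels of the hierarchy; the number of clusters is never supplied.
for epsilon in (0.15, 0.25):
    spans, anomalies = summarize(data, grid, epsilon)
    print(f"epsilon={epsilon}: clusters={spans}, anomalies={anomalies}")
```

On this toy input the lower significance level merges everything into a single cluster, while the higher level separates the two groups and leaves the isolated point 30 outside every cluster, so the same run yields both a clustering and an anomaly flag. The sketch recomputes all scores for every grid point, which is fine for illustration but would not scale to large datasets.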


