A Multivariate Extreme Value Theory Approach to Anomaly Clustering and Visualization

07/17/2019
by Maël Chiapino, et al.

In a wide variety of situations, anomalies in the behaviour of a complex system, whose health is monitored through the observation of a random vector X = (X_1, ..., X_d) valued in R^d, correspond to the simultaneous occurrence of extreme values for certain subgroups α ⊂ {1, ..., d} of variables X_j. Under the heavy-tail assumption, which is precisely appropriate for modeling these phenomena, statistical methods relying on multivariate extreme value theory have been developed in the past few years for identifying such events/subgroups. This paper takes this approach much further by means of a novel mixture model that describes the distribution of extremal observations and in which the anomaly type α is viewed as a latent variable. One may then take advantage of the model by assigning to any extreme point a posterior probability for each anomaly type α, which implicitly defines a similarity measure between anomalies. It is explained at length how this similarity measure makes it possible to cluster extreme observations and to obtain an informative planar representation of anomalies using standard graph-mining tools. The relevance and usefulness of the resulting clustering and 2-d visual display are illustrated on simulated datasets and on real observations from the aeronautics application domain.
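To make the idea of a posterior-based similarity concrete, here is a minimal sketch in Python. It is not the authors' method: the abstract does not state which similarity measure is used, so the inner product of posterior vectors below is an assumption chosen for illustration, and the function names `posterior_similarity` and `hard_clusters` as well as the toy posterior matrix are hypothetical. Under that assumption, the similarity between two extreme points is the probability that they share the same anomaly type if their types were drawn independently from their respective posteriors.

```python
import numpy as np


def posterior_similarity(post):
    """Pairwise similarity from posterior probabilities (illustrative assumption).

    post: array of shape (n_points, n_types); row i holds the posterior
    probabilities of each anomaly type for extreme point i (rows sum to 1).
    Returns an (n_points, n_points) symmetric matrix whose (i, j) entry is
    sum_k post[i, k] * post[j, k].
    """
    post = np.asarray(post, dtype=float)
    return post @ post.T


def hard_clusters(post):
    """Assign each point to its most probable anomaly type."""
    return np.argmax(np.asarray(post, dtype=float), axis=1)


# Toy posteriors for three extreme points and three anomaly types (made up).
post = np.array([
    [0.9, 0.1, 0.0],
    [0.8, 0.2, 0.0],
    [0.0, 0.1, 0.9],
])

S = posterior_similarity(post)   # weighted adjacency matrix of a similarity graph
labels = hard_clusters(post)     # crude clustering by most probable type
```

The matrix `S` can be read as the weighted adjacency matrix of a graph over extreme points, which is the kind of object that standard graph-mining tools (community detection, graph layout for a planar display) take as input.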


research
03/23/2021

Anomaly detection using principles of human perception

In the fields of statistics and unsupervised machine learning a fundamen...
research
03/31/2016

Sparse Representation of Multivariate Extremes with Applications to Anomaly Ranking

Extremes play a special role in Anomaly Detection. Beyond inference and ...
research
04/25/2020

Sparse Structures for Multivariate Extremes

Extreme value statistics provides accurate estimates for the small occur...
research
08/12/2019

Anomaly Detection in High Dimensional Data

The HDoutliers algorithm is a powerful unsupervised algorithm for detect...
research
09/16/2020

Discovering causal factors of drought in Ethiopia

Drought is a costly natural hazard, many aspects of which remain poorly ...
research
11/19/2020

Evaluation of investigational paradigms for the discovery of non-canonical astrophysical phenomena

Non-canonical phenomena - defined here as observables which are either i...
research
08/13/2020

Informative Clusters for Multivariate Extremes

Capturing the dependence structure of multivariate extreme data is a maj...
