A Multivariate Extreme Value Theory Approach to Anomaly Clustering and Visualization

by Maël Chiapino, et al.

In a wide variety of situations, anomalies in the behaviour of a complex system, whose health is monitored through the observation of a random vector X = (X_1, ..., X_d) valued in R^d, correspond to the simultaneous occurrence of extreme values for certain subgroups α ⊂ {1, ..., d} of variables X_j. Under the heavy-tail assumption, which is precisely appropriate for modeling these phenomena, statistical methods relying on multivariate extreme value theory have been developed in the past few years for identifying such events/subgroups. This paper takes the approach much further by means of a novel mixture model that describes the distribution of extremal observations and in which the anomaly type α is viewed as a latent variable. One may then take advantage of the model by assigning to any extreme point a posterior probability for each anomaly type α, which implicitly defines a similarity measure between anomalies. It is explained at length how this similarity measure makes it possible to cluster extreme observations and to obtain an informative planar representation of anomalies using standard graph-mining tools. The relevance and usefulness of the resulting clustering and 2-d visual display are illustrated on simulated datasets and on real observations from the aeronautics application domain.
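As a rough illustration of how posterior probabilities over anomaly types can induce a similarity measure and a clustering, here is a minimal Python sketch. It is not the paper's method: the abstract does not give the similarity definition, so the sketch assumes S[i, j] = sum over α of p_i(α) p_j(α), that is, the probability that two points share the same latent type when their types are drawn independently from their posteriors. The posterior values, the function names, and the threshold are all hypothetical, and connected components of a thresholded graph stand in for the "standard graph-mining tools" mentioned above.

```python
import numpy as np

def posterior_similarity(posteriors):
    """Assumed similarity: S[i, j] = sum_k p_ik * p_jk."""
    P = np.asarray(posteriors, dtype=float)
    return P @ P.T

def threshold_components(similarity, threshold=0.5):
    """Label connected components of the graph whose edges link
    pairs of points with similarity >= threshold (depth-first search)."""
    n = similarity.shape[0]
    labels = [-1] * n
    current = 0
    for start in range(n):
        if labels[start] != -1:
            continue
        labels[start] = current
        stack = [start]
        while stack:
            i = stack.pop()
            for j in range(n):
                if labels[j] == -1 and similarity[i, j] >= threshold:
                    labels[j] = current
                    stack.append(j)
        current += 1
    return labels

# Hypothetical posteriors: 4 extreme points, 3 anomaly types (rows sum to 1).
posteriors = np.array([
    [0.9, 0.1, 0.0],
    [0.8, 0.2, 0.0],
    [0.0, 0.1, 0.9],
    [0.1, 0.0, 0.9],
])

S = posterior_similarity(posteriors)
labels = threshold_components(S, threshold=0.5)
print(labels)
```

With these made-up posteriors, the first two points end up in one group and the last two in another, because each pair concentrates its posterior mass on the same anomaly type. In practice the similarity matrix could instead be passed to a spectral embedding or a graph-layout routine to obtain a planar display.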




Anomaly detection using principles of human perception

In the fields of statistics and unsupervised machine learning a fundamen...

Sparse Representation of Multivariate Extremes with Applications to Anomaly Ranking

Extremes play a special role in Anomaly Detection. Beyond inference and ...

Sparse Structures for Multivariate Extremes

Extreme value statistics provides accurate estimates for the small occur...

Anomaly Detection in High Dimensional Data

The HDoutliers algorithm is a powerful unsupervised algorithm for detect...

Discovering causal factors of drought in Ethiopia

Drought is a costly natural hazard, many aspects of which remain poorly ...

Evaluation of investigational paradigms for the discovery of non-canonical astrophysical phenomena

Non-canonical phenomena - defined here as observables which are either i...

Informative Clusters for Multivariate Extremes

Capturing the dependence structure of multivariate extreme data is a maj...
