Copula-based anomaly scoring and localization for large-scale, high-dimensional continuous data

by   Gabor Horvath, et al.

The anomaly detection method presented by this paper has a special feature: it does not only indicate whether an observation is anomalous or not but also tells what exactly makes an anomalous observation unusual. Hence, it provides support to localize the reason of the anomaly. The proposed approach is model-based; it relies on the multivariate probability distribution associated with the observations. Since the rare events are present in the tails of the probability distributions, we use copula functions, that are able to model the fat-tailed distributions well. The presented procedure scales well; it can cope with a large number of high-dimensional samples. Furthermore, our procedure can cope with missing values, too, which occur frequently in high-dimensional data sets. In the second part of the paper, we demonstrate the usability of the method through a case study, where we analyze a large data set consisting of the performance counters of a real mobile telecommunication network. Since such networks are complex systems, the signs of sub-optimal operation can remain hidden for a potentially long time. With the proposed procedure, many such hidden issues can be isolated and indicated to the network operator.


page 12

page 16

page 17

page 20

page 21

page 23


Anomaly Detection for High-Dimensional Data Using Large Deviations Principle

Most current anomaly detection methods suffer from the curse of dimensio...

Discovering Multiple Constraints that are Frequently Approximately Satisfied

Some high-dimensional data.sets can be modelled by assuming that there a...

Testing for Typicality with Respect to an Ensemble of Learned Distributions

Methods of performing anomaly detection on high-dimensional data sets ar...

Skip-GANomaly: Skip Connected and Adversarially Trained Encoder-Decoder Anomaly Detection

Despite inherent ill-definition, anomaly detection is a research endeavo...

Anomaly Detection in High Dimensional Data

The HDoutliers algorithm is a powerful unsupervised algorithm for detect...

Anomaly Detection and Localization based on Double Kernelized Scoring and Matrix Kernels

Anomaly detection is necessary for proper and safe operation of large-sc...

Beauty and Brains: Detecting Anomalous Pattern Co-Occurrences

Our world is filled with both beautiful and brainy people, but how often...

Please sign up or login with your details

Forgot password? Click here to reset