Anomaly Detection for High-Dimensional Data Using Large Deviations Principle

09/28/2021
by Sreelekha Guggilam, et al.

Most current anomaly detection methods suffer from the curse of dimensionality when dealing with high-dimensional data. We propose an anomaly detection algorithm that scales to high-dimensional data using concepts from the theory of large deviations. The proposed Large Deviations Anomaly Detection (LAD) algorithm is shown to outperform state-of-the-art anomaly detection methods on a variety of large and high-dimensional benchmark data sets. Exploiting the algorithm's ability to scale to high-dimensional data, we propose an online anomaly detection method that identifies anomalies in a collection of multivariate time series. We demonstrate the applicability of the online algorithm by identifying counties in the United States with anomalous trends in COVID-19-related cases and deaths. Several of the identified anomalous counties correlate with counties that have a documented poor response to the COVID-19 pandemic.


