Expected Similarity Estimation for Large-Scale Batch and Streaming Anomaly Detection

01/25/2016
by Markus Schneider, et al.

We present a novel algorithm for anomaly detection on very large datasets and data streams. The method, named EXPected Similarity Estimation (EXPoSE), is kernel-based and efficiently computes the similarity between new data points and the distribution of regular data. The estimator is formulated as an inner product with a reproducing kernel Hilbert space embedding and makes no assumption about the type or shape of the underlying data distribution. We show that offline (batch) learning with EXPoSE can be done in linear time, and that online (incremental) learning takes constant time per instance and model update. Furthermore, EXPoSE makes predictions in constant time while requiring only constant memory. In addition, we propose different methodologies for concept drift adaptation on evolving data streams. On several real datasets, we demonstrate that our approach can compete with state-of-the-art algorithms for anomaly detection while being an order of magnitude faster than most other approaches.
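The idea of scoring a point by an inner product with an embedding of the regular data can be illustrated with a minimal sketch. The abstract does not say how the embedding is represented in constant memory, so the sketch below makes its own assumptions: a Gaussian kernel, approximated with random Fourier features, and a running mean of the feature vectors as the model. The class name `ExposeSketch` and all parameter names are hypothetical and are not taken from the paper.

```python
import math
import random


class ExposeSketch:
    """Illustrative only: running mean of random Fourier features.

    Assumption (not from the abstract): the kernel is
    k(x, y) = exp(-gamma * ||x - y||^2), approximated by random features.
    """

    def __init__(self, dim, n_features=256, gamma=1.0, seed=0):
        rng = random.Random(seed)
        # Frequencies ~ N(0, 2 * gamma), phases ~ Uniform[0, 2 * pi).
        std = math.sqrt(2.0 * gamma)
        self.weights = [[rng.gauss(0.0, std) for _ in range(dim)]
                        for _ in range(n_features)]
        self.phases = [rng.uniform(0.0, 2.0 * math.pi)
                       for _ in range(n_features)]
        self.scale = math.sqrt(2.0 / n_features)
        self.mean = [0.0] * n_features  # approximate mean embedding
        self.count = 0

    def features(self, x):
        # Fixed-size feature map, so cost does not depend on points seen.
        return [self.scale * math.cos(sum(wj * xj for wj, xj in zip(w, x)) + b)
                for w, b in zip(self.weights, self.phases)]

    def update(self, x):
        # Incremental mean: one fixed-cost update per new instance.
        self.count += 1
        step = 1.0 / self.count
        phi = self.features(x)
        self.mean = [m + step * (p - m) for m, p in zip(self.mean, phi)]

    def score(self, x):
        # Inner product of the point's features with the mean embedding;
        # a low score suggests the point is unlike the regular data.
        return sum(p * m for p, m in zip(self.features(x), self.mean))


if __name__ == "__main__":
    rng = random.Random(1)
    model = ExposeSketch(dim=2)
    for _ in range(200):
        model.update([rng.gauss(0.0, 0.5), rng.gauss(0.0, 0.5)])
    print("near the data:", model.score([0.0, 0.0]))
    print("far from data:", model.score([5.0, 5.0]))
```

In this sketch the model is a single fixed-length vector, so memory use, the cost of an update, and the cost of a prediction do not grow with the number of points seen. Batch training is one pass of `update` over the data. Replacing the `1 / count` step with a fixed step size would weight recent points more heavily, which is one simple way to follow a drifting stream; whether that corresponds to the drift methods proposed in the paper cannot be determined from the abstract.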


research
06/09/2022

Adaptive Model Pooling for Online Deep Anomaly Detection from a Complex Evolving Data Stream

Online anomaly detection from a data stream is critical for the safety a...
research
10/11/2022

InQMAD: Incremental Quantum Measurement Anomaly Detection

Streaming anomaly detection refers to the problem of detecting anomalous...
research
07/20/2016

Supervised Anomaly Detection in Uncertain Pseudoperiodic Data Streams

Uncertain data streams have been widely generated in many Web applicatio...
research
06/07/2021

MemStream: Memory-Based Anomaly Detection in Multi-Aspect Streams with Concept Drift

Given a stream of entries over time in a multi-aspect data setting where...
research
07/27/2022

Concept Drift Challenge in Multimedia Anomaly Detection: A Case Study with Facial Datasets

Anomaly detection in multimedia datasets is a widely studied area. Yet, ...
research
02/05/2021

Exact Optimization of Conformal Predictors via Incremental and Decremental Learning

Conformal Predictors (CP) are wrappers around ML methods, providing erro...
research
05/20/2020

An Incremental Clustering Method for Anomaly Detection in Flight Data

Safety is a top priority for civil aviation. Data mining in digital Flig...
