Anomaly Detection in Big Data

03/03/2022
by Chandresh Kumar Maurya, et al.

An anomaly is defined as a state of a system that does not conform to normal behavior. For example, the emission of neutrons in a nuclear reactor channel above a specified threshold is an anomaly. Big data refers to data sets that are high-volume, streaming, heterogeneous, distributed, and often sparse. Big data is not uncommon these days. For example, as per Internet Live Stats, the number of tweets posted per day has exceeded 500 million. Due to the data explosion in data-laden domains, traditional anomaly detection techniques developed for small data sets scale poorly on large-scale data sets. Therefore, we take an alternative approach to tackle anomaly detection in big data. Essentially, there are two ways to scale anomaly detection in big data: the first is based on online learning, and the second is based on distributed learning. Our aim in this thesis is to tackle big data problems while detecting anomalies efficiently. To that end, we first address the streaming aspect of big data and propose the Passive-Aggressive GMEAN (PAGMEAN) algorithms. Although online learning algorithms scale well over large numbers of data points and dimensions, they cannot process data that is distributed across multiple locations, which is quite common these days. Therefore, we propose an anomaly detection algorithm that is inherently distributed, using the alternating direction method of multipliers (ADMM). Finally, we present a case study on anomaly detection in nuclear power plant data.
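To make the online-learning idea concrete, here is a minimal sketch of two standard building blocks that the name PAGMEAN suggests: the G-mean metric (geometric mean of sensitivity and specificity, often preferred over accuracy when anomalies are rare) and one generic passive-aggressive (PA-I) update for a linear classifier. This is not the PAGMEAN algorithm from the thesis, which the abstract does not specify; the function names `gmean` and `pa_update`, the label convention (+1 for anomaly, -1 for normal), and the aggressiveness parameter `C` are assumptions made for illustration.

```python
import math

def gmean(y_true, y_pred):
    """G-mean = sqrt(sensitivity * specificity), labels assumed in {+1, -1}."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == -1 and p == -1)
    pos = sum(1 for t in y_true if t == 1)
    neg = sum(1 for t in y_true if t == -1)
    if pos == 0 or neg == 0:
        return 0.0  # undefined without both classes; report 0 by convention here
    return math.sqrt((tp / pos) * (tn / neg))

def pa_update(w, x, y, C=1.0):
    """One generic PA-I step on the hinge loss (illustrative, not PAGMEAN)."""
    margin = y * sum(wi * xi for wi, xi in zip(w, x))
    loss = max(0.0, 1.0 - margin)
    norm_sq = sum(xi * xi for xi in x)
    if loss == 0.0 or norm_sq == 0.0:
        return list(w)  # "passive": the example is already classified with margin
    tau = min(C, loss / norm_sq)  # "aggressive": step just far enough, capped by C
    return [wi + tau * y * xi for wi, xi in zip(w, x)]
```

Each call to `pa_update` touches a single example, so the model can be trained on a stream without holding the data set in memory, which is the property the abstract relies on for scaling to many data points.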

research
10/12/2017

On the Runtime-Efficacy Trade-off of Anomaly Detection Techniques for Real-Time Streaming Data

Ever growing volume and velocity of data coupled with decreasing attenti...
research
04/18/2017

Anomaly detection and motif discovery in symbolic representations of time series

The advent of the Big Data hype and the consistent recollection of event...
research
01/15/2023

Efficient anomaly detection method for rooftop PV systems using big data and permutation entropy

The number of rooftop photovoltaic (PV) systems has significantly increa...
research
11/11/2019

RAD: On-line Anomaly Detection for Highly Unreliable Data

Classification algorithms have been widely adopted to detect anomalies f...
research
04/09/2018

Anomaly Detection for Industrial Big Data

As the Industrial Internet of Things (IIoT) grows, systems are increasin...
research
12/19/2018

Correlated Anomaly Detection from Large Streaming Data

Correlated anomaly detection (CAD) from streaming data is a type of grou...
research
06/20/2017

Arrays of (locality-sensitive) Count Estimators (ACE): High-Speed Anomaly Detection via Cache Lookups

Anomaly detection is one of the frequent and important subroutines deplo...
