Enhancing Robustness of On-line Learning Models on Highly Noisy Data

03/19/2021
by   Zilong Zhao, et al.

Classification algorithms have been widely adopted to detect anomalies in various systems, e.g., IoT, cloud and face recognition, under the common assumption that the data source is clean, i.e., features and labels are correctly set. However, data collected in the wild can be unreliable due to careless annotation or malicious data transformations intended to cause incorrect anomaly detection. In this paper, we extend a two-layer on-line data selection framework, Robust Anomaly Detector (RAD), with a newly designed ensemble prediction in which both layers contribute to the final anomaly detection decision. To adapt to the on-line nature of anomaly detection, we consider the additional features of conflicting opinions of classifiers, repetitive cleaning, and oracle knowledge. We learn on-line from incoming data streams and continuously cleanse the data, so as to adapt to the increasing learning capacity afforded by the larger accumulated data set. Moreover, we explore the concept of oracle learning, which provides additional information about the true labels of difficult data points. We specifically focus on three use cases: (i) detecting 10 classes of IoT attacks, (ii) predicting 4 classes of task failures of big data jobs, and (iii) recognising 100 celebrities' faces. Our evaluation results show that RAD can robustly improve the accuracy of anomaly detection, to reach up to 98.95% [...] failures (i.e., +14% [...]) [...] reach up to 77.51% [...]. The proposed RAD and its extensions are general and can be applied to different anomaly detection algorithms.
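As a rough illustration of the two-layer idea described in the abstract, the Python sketch below processes labelled batches in arrival order, keeps only the samples whose given label the first layer agrees with, retrains both layers on the accumulated cleaned set, and lets both layers contribute to the final prediction. It is not the authors' implementation. The toy classifiers, the agreement-based filter, the score-summing ensemble rule, and every name in the code are assumptions made for illustration, and repetitive cleaning and oracle labels are left out.

```python
import math
from collections import defaultdict


class CentroidClassifier:
    """Toy layer-1 model (assumed): nearest class centroid."""

    def fit(self, X, y):
        groups = defaultdict(list)
        for x, label in zip(X, y):
            groups[label].append(x)
        self.centroids = {
            label: [sum(col) / len(pts) for col in zip(*pts)]
            for label, pts in groups.items()
        }
        return self

    def scores(self, x):
        # Negative distance, so a higher score means a better match.
        return {c: -math.dist(x, m) for c, m in self.centroids.items()}


class NearestNeighbour:
    """Toy layer-2 model (assumed): 1-nearest-neighbour."""

    def fit(self, X, y):
        self.points = list(zip(X, y))
        return self

    def scores(self, x):
        best = {}
        for p, label in self.points:
            s = -math.dist(x, p)
            best[label] = max(best.get(label, -math.inf), s)
        return best


def predict(model, x):
    s = model.scores(x)
    return max(s, key=s.get)


def ensemble_predict(layer1, layer2, x):
    # Assumed combination rule: add the two layers' per-class scores.
    s1, s2 = layer1.scores(x), layer2.scores(x)
    total = {c: s1[c] + s2.get(c, -math.inf) for c in s1}
    return max(total, key=total.get)


def run_stream(clean_X, clean_y, batches):
    """Process batches in order, keeping samples whose label layer 1 agrees with."""
    X, y = list(clean_X), list(clean_y)
    layer1 = CentroidClassifier().fit(X, y)
    layer2 = NearestNeighbour().fit(X, y)
    for batch_X, batch_y in batches:
        for x, label in zip(batch_X, batch_y):
            if predict(layer1, x) == label:  # label judged clean
                X.append(x)
                y.append(label)
        # Retrain both layers on the enlarged cleaned set.
        layer1.fit(X, y)
        layer2.fit(X, y)
    return layer1, layer2, X, y


# Small clean seed set, then one batch containing a mislabelled point.
seed_X = [(0.0, 0.0), (0.2, 0.1), (5.0, 5.0), (5.1, 4.9)]
seed_y = ["normal", "normal", "attack", "attack"]
stream = [([(0.1, 0.3), (4.8, 5.2), (0.3, 0.2)], ["normal", "attack", "attack"])]

l1, l2, kept_X, kept_y = run_stream(seed_X, seed_y, stream)
print(len(kept_X))                            # 6: the mislabelled point is dropped
print(ensemble_predict(l1, l2, (4.5, 4.7)))   # attack
```

In this toy run the point (0.3, 0.2), labelled "attack", lies next to the "normal" cluster, so the first layer rejects it and it never reaches the training set of either layer. Any pair of real classifiers exposing per-class scores could be substituted for the two toy models.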
