Out-Of-Bag Anomaly Detection

09/20/2020
by Egor Klevak, et al.

Data anomalies are ubiquitous in real-world datasets and can adversely affect machine learning (ML) systems such as automated home valuation. Detecting anomalies can make ML applications more responsible and trustworthy. However, the lack of labels for anomalies and the complex nature of real-world datasets make anomaly detection a challenging unsupervised learning problem. In this paper, we propose a novel model-based anomaly detection method, which we call Out-of-Bag anomaly detection, that handles multi-dimensional datasets consisting of numerical and categorical features. The proposed method decomposes the unsupervised problem into the training of a set of ensemble models. Out-of-Bag estimates are leveraged to derive an effective measure for anomaly detection. We not only demonstrate the state-of-the-art performance of our method through comprehensive experiments on benchmark datasets, but also show, via a case study on home valuation, that our model can improve the accuracy and reliability of an ML system when used as a data pre-processing step.
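
The abstract gives only a high-level description of the method, so the following is a minimal sketch of one plausible reading, not the authors' algorithm. It assumes purely numeric features, uses bagged linear least-squares models as a stand-in for the ensemble models, and aggregates scaled out-of-bag residuals into a score. The function name `oob_anomaly_scores`, its parameters, and the median-based scaling are all hypothetical choices made for illustration.

```python
import numpy as np

def oob_anomaly_scores(X, n_estimators=50, seed=0):
    """Score each row of a numeric matrix X by its out-of-bag prediction error.

    Illustrative assumption: for every feature j, a bagged ensemble of linear
    least-squares models predicts column j from the remaining columns. A row
    is scored only by models whose bootstrap sample did not contain it.
    """
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    n, d = X.shape
    scores = np.zeros(n)
    for j in range(d):
        y = X[:, j]
        # Design matrix: all other features plus an intercept column.
        A = np.column_stack([np.delete(X, j, axis=1), np.ones(n)])
        pred_sum = np.zeros(n)
        pred_cnt = np.zeros(n)
        for _ in range(n_estimators):
            boot = rng.integers(0, n, size=n)   # bootstrap sample, with replacement
            oob = np.ones(n, dtype=bool)
            oob[boot] = False                   # rows never drawn are out-of-bag
            coef, *_ = np.linalg.lstsq(A[boot], y[boot], rcond=None)
            pred_sum[oob] += A[oob] @ coef
            pred_cnt[oob] += 1
        seen = pred_cnt > 0                     # rows with at least one OOB prediction
        resid = np.zeros(n)
        resid[seen] = np.abs(y[seen] - pred_sum[seen] / pred_cnt[seen])
        # Scale by the median residual so features with different units are comparable.
        scores += resid / (np.median(resid[seen]) + 1e-12)
    return scores / d

# Synthetic demo: x2 is roughly 2 * x1, and row 0 is made to violate that relationship.
rng = np.random.default_rng(42)
x1 = rng.normal(size=200)
x2 = 2.0 * x1 + rng.normal(scale=0.1, size=200)
X = np.column_stack([x1, x2])
X[0] = [0.0, 5.0]
scores = oob_anomaly_scores(X)
print("most anomalous row:", int(np.argmax(scores)))
```

Because each row is scored only by models that never saw it during fitting, an anomalous row cannot pull the fit toward itself, which is the property that makes out-of-bag estimates attractive here. Handling categorical features, as the abstract describes, would require classifiers for those columns and a different error measure; that part is not sketched.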


