Understanding the Effect of Bias in Deep Anomaly Detection

05/16/2021
by Ziyu Ye, et al.

Anomaly detection presents a unique challenge in machine learning, due to the scarcity of labeled anomaly data. Recent work attempts to mitigate such problems by augmenting training of deep anomaly detection models with additional labeled anomaly samples. However, the labeled data often does not align with the target distribution and introduces harmful bias to the trained model. In this paper, we aim to understand the effect of a biased anomaly set on anomaly detection. Concretely, we view anomaly detection as a supervised learning task where the objective is to optimize the recall at a given false positive rate. We formally study the relative scoring bias of an anomaly detector, defined as the difference in performance with respect to a baseline anomaly detector. We establish the first finite sample rates for estimating the relative scoring bias for deep anomaly detection, and empirically validate our theoretical results on both synthetic and real-world datasets. We also provide an extensive empirical study on how a biased training anomaly set affects the anomaly score function and therefore the detection performance on different anomaly classes. Our study demonstrates scenarios in which the biased anomaly set can be useful or problematic, and provides a solid benchmark for future research.
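
To make the two quantities in the abstract concrete, the following is a minimal sketch of recall at a fixed false positive rate and of a relative scoring bias computed as a difference in that recall against a baseline detector. It assumes that higher scores mean "more anomalous" and that the threshold is taken as an empirical quantile of the normal scores; the function names `recall_at_fpr` and `relative_scoring_bias` are hypothetical, and the paper's formal definition may differ in detail.

```python
import numpy as np


def recall_at_fpr(normal_scores, anomaly_scores, target_fpr):
    """Empirical recall at a given false positive rate.

    Assumption: higher score = more anomalous. The threshold is the
    (1 - target_fpr) empirical quantile of the normal scores, so roughly
    a target_fpr fraction of normal points is flagged.
    """
    normal_scores = np.asarray(normal_scores, dtype=float)
    anomaly_scores = np.asarray(anomaly_scores, dtype=float)
    threshold = np.quantile(normal_scores, 1.0 - target_fpr)
    # Recall: fraction of true anomalies scored above the threshold.
    return float(np.mean(anomaly_scores > threshold))


def relative_scoring_bias(baseline, candidate, target_fpr):
    """Difference in recall at target_fpr between two detectors.

    Each argument is a (normal_scores, anomaly_scores) pair produced by
    one detector on the same evaluation data. A positive value means the
    candidate recalls more anomalies than the baseline at this FPR.
    """
    return (recall_at_fpr(*candidate, target_fpr)
            - recall_at_fpr(*baseline, target_fpr))


# Toy scores (made up for illustration, not taken from the paper).
normal = np.arange(100)              # scores assigned to normal points
baseline_anom = [50, 60, 95, 99]     # baseline detector's anomaly scores
candidate_anom = [90, 95, 96, 100]   # candidate detector's anomaly scores

bias = relative_scoring_bias(
    (normal, baseline_anom), (normal, candidate_anom), target_fpr=0.05
)
print(bias)
```

Evaluating this difference separately for each anomaly class would show where training on a biased anomaly set helps and where it hurts, which is the kind of comparison the abstract describes.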

research
11/19/2019

Deep Anomaly Detection with Deviation Networks

Although deep learning has been applied to successfully address many dat...
research
08/31/2023

Deep Semi-Supervised Anomaly Detection for Finding Fraud in the Futures Market

Modern financial electronic exchanges are an exciting and fast-paced mar...
research
10/05/2020

Deep Anomaly Detection by Residual Adaptation

Deep anomaly detection is a difficult task since, in high dimensions, it...
research
10/06/2022

Env-Aware Anomaly Detection: Ignore Style Changes, Stay True to Content!

We introduce a formalization and benchmark for the unsupervised anomaly ...
research
04/14/2019

Should I Raise The Red Flag? A comprehensive survey of anomaly scoring methods toward mitigating false alarms

A general Intrusion Detection System (IDS) fundamentally acts based on a...
research
12/12/2020

Filtering DDoS Attacks from Unlabeled Network Traffic Data Using Online Deep Learning

DDoS attacks are simple, effective, and still pose a significant threat ...
research
03/27/2023

Disruption Precursor Onset Time Study Based on Semi-supervised Anomaly Detection

The full understanding of plasma disruption in tokamaks is currently lac...
