LOSDD: Leave-Out Support Vector Data Description for Outlier Detection

12/27/2022
by Daniel Boiar, et al.

Support Vector Machines have been used successfully for one-class classification (OCSVM, SVDD) when trained on clean data, but they perform much worse on dirty data: outliers present in the training data tend to become support vectors and are hence considered "normal". In this article, we improve outlier detection on dirty training data with a leave-out strategy: by temporarily omitting one candidate at a time, each point can be judged using the remaining data only. We show that this scores the outlierness of points more effectively than the slack term of existing SVM-based approaches. Identified outliers can then be removed from the data so that outliers hidden by other outliers can be identified, which reduces the problem of masking. Naively, this approach would require training N individual SVMs (and O(N^2) SVMs when iteratively removing the worst outliers one at a time), which is prohibitively expensive. We discuss how only support vectors need to be considered in each step and how reusing SVM parameters and weights substantially accelerates this incremental retraining. Removing candidates in batches further reduces the processing time, although it remains more costly than training a single SVM.
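
The naive version of the leave-out idea can be sketched in a few lines of Python. This is only an illustration under strong assumptions: the function names (`fit_center`, `leave_out_scores`, `iterative_removal`) and the toy data are hypothetical, and the "model" is a plain centroid standing in for an SVDD, not an SVM solver. It shows the control flow only (one refit per candidate, then removal of the worst point and rescoring) and none of the acceleration described in the abstract.

```python
import math


def fit_center(points):
    """Stand-in 'data description': the centroid of the points.
    Assumption for illustration only; this is NOT an SVDD/OCSVM solver."""
    n = len(points)
    dim = len(points[0])
    return [sum(p[d] for p in points) / n for d in range(dim)]


def leave_out_scores(points):
    """Score each point by its distance to a model fitted WITHOUT that point."""
    scores = []
    for i, p in enumerate(points):
        rest = points[:i] + points[i + 1:]  # temporarily omit candidate i
        center = fit_center(rest)           # one refit per candidate: N fits
        scores.append(math.dist(p, center))
    return scores


def iterative_removal(points, n_remove):
    """Repeatedly drop the highest-scoring point and rescore the remainder,
    so that outliers masked by other outliers can surface in later rounds."""
    remaining = list(range(len(points)))    # original indices still in play
    removed = []
    for _ in range(n_remove):
        current = [points[i] for i in remaining]
        scores = leave_out_scores(current)
        worst = max(range(len(scores)), key=scores.__getitem__)
        removed.append(remaining.pop(worst))
    return removed


# Toy data: five points near the origin and two far-away points (indices 5, 6).
data = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (-0.1, 0.0), (0.0, -0.1),
        (5.0, 5.0), (5.2, 5.1)]
removed = iterative_removal(data, n_remove=2)
print(removed)
```

Each round refits once per remaining point, so removing outliers one at a time costs on the order of N^2 fits in total, which matches the cost the abstract calls prohibitively expensive for real SVMs.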


