Outlier Detection Ensemble with Embedded Feature Selection

01/15/2020
by Li Cheng, et al.

Feature selection plays an important role in improving the performance of outlier detection, especially for noisy data. Existing methods usually perform feature selection and outlier scoring separately, so the selected feature subsets may not be optimal for outlier detection, which leads to unsatisfactory performance. In this paper, we propose an outlier detection ensemble framework with embedded feature selection (ODEFS) to address this issue. Specifically, for each learning component, which is built on a random sub-sample, ODEFS unifies feature selection and outlier detection into a pairwise ranking formulation to learn feature subsets that are tailored to the outlier detection method. Moreover, we adopt thresholded self-paced learning to simultaneously optimize feature selection and example selection, which helps improve the reliability of the training set. We then design an alternating algorithm with proven convergence to solve the resulting optimization problem. In addition, we analyze the generalization error bound of the proposed framework, which provides a theoretical guarantee for the method as well as practical guidance. Comprehensive experimental results on 12 real-world datasets from diverse domains validate the superiority of the proposed ODEFS.
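To make the ingredients named in the abstract more concrete, the following is a minimal sketch of a single learning component, not the authors' algorithm. It assumes a linear scorer, a pairwise hinge loss between outlier candidates and inlier candidates, an L1 penalty as the embedded feature selector, and the standard hard-threshold self-paced rule for example selection. All function names, the hyperparameters (`lam`, `l1`, `lr`, `n_iter`), and the way the two candidate sets are obtained are hypothetical; the abstract does not specify them.

```python
import numpy as np

def pairwise_hinge_losses(w, X_out, X_in):
    """Hinge loss for every (outlier candidate, inlier candidate) pair.

    A pair counts as correctly ranked when the outlier candidate scores at
    least 1 higher than the inlier candidate under the scorer x -> w @ x.
    """
    margins = (X_out @ w)[:, None] - (X_in @ w)[None, :]
    return np.maximum(0.0, 1.0 - margins)

def self_paced_weights(losses, lam):
    """Hard self-paced rule (assumed): keep only pairs with loss below lam."""
    return (losses < lam).astype(float)

def soft_threshold(w, tau):
    """Proximal step for an L1 penalty; sets small weights to exactly zero."""
    return np.sign(w) * np.maximum(np.abs(w) - tau, 0.0)

def fit_component(X_out, X_in, lam=2.0, l1=0.05, lr=0.1, n_iter=50):
    """Alternate between pair selection (v) and a sparse scorer update (w)."""
    w = np.zeros(X_out.shape[1])
    for _ in range(n_iter):
        # Step 1: with w fixed, select the pairs that look reliable.
        losses = pairwise_hinge_losses(w, X_out, X_in)
        v = self_paced_weights(losses, lam)
        # Step 2: with v fixed, take one proximal subgradient step on w.
        active = v * (losses > 0)  # selected pairs that violate the margin
        grad = -(active.sum(axis=1) @ X_out - active.sum(axis=0) @ X_in)
        grad /= max(active.sum(), 1.0)
        w = soft_threshold(w - lr * grad, lr * l1)
    return w

# Toy data (made up): only feature 0 separates the two candidate sets.
rng = np.random.default_rng(0)
X_in = rng.normal(size=(40, 5))
X_out = rng.normal(size=(10, 5))
X_out[:, 0] += 4.0
w = fit_component(X_out, X_in)
print(np.round(w, 3))  # features with a zero weight are effectively dropped
```

In this sketch the nonzero entries of `w` play the role of the selected feature subset, and `w @ x` is the outlier score. An ensemble in the spirit of the abstract would fit one such component per random sub-sample and combine the resulting scores, for example by averaging; that combination rule is also an assumption here.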


