Distance Rank Score: Unsupervised filter method for feature selection on imbalanced dataset

05/31/2023
by Katarina Firdova, et al.

This paper presents a new filter method for unsupervised feature selection. The method is particularly effective on imbalanced multi-class datasets, such as clusters of different anomaly types. Existing methods usually rely on the variance of the features, which is unsuitable when the different types of observations are not represented equally. Our method, based on Spearman's rank correlation between distances computed on the observations and on the feature values, avoids this drawback. Its performance is measured on several clustering problems and compared with existing filter methods suitable for unsupervised data.
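The abstract does not spell out how the score is computed, so the following is only a minimal sketch of one plausible reading, not the authors' implementation. It assumes that "distances on the observations" means pairwise Euclidean distances over all features, that "distances on feature values" means pairwise absolute differences within a single feature, and that the feature's score is the Spearman correlation between the two. The function name `distance_rank_scores` and the toy data are hypothetical.

```python
import numpy as np
from scipy.spatial.distance import pdist
from scipy.stats import spearmanr


def distance_rank_scores(X):
    """Score each feature by rank agreement with the full-space distances.

    Assumption (not from the paper): the score of feature j is Spearman's
    rho between all pairwise Euclidean distances in the full feature space
    and all pairwise absolute differences in feature j alone.
    """
    X = np.asarray(X, dtype=float)
    # Condensed vector of pairwise distances between observations.
    full_dist = pdist(X, metric="euclidean")
    scores = np.empty(X.shape[1])
    for j in range(X.shape[1]):
        # On a single column, Euclidean distance reduces to |x_a - x_b|.
        feat_dist = pdist(X[:, [j]], metric="euclidean")
        # A constant feature has no rank variation and yields NaN here.
        rho, _ = spearmanr(full_dist, feat_dist)
        scores[j] = rho
    return scores


if __name__ == "__main__":
    # Toy imbalanced data: one large cluster and two rare ones, separated
    # only in the first two features; the last two features are pure noise.
    rng = np.random.default_rng(0)
    centers = np.array([[0.0, 0.0], [8.0, 8.0], [-8.0, 8.0]])
    sizes = [200, 30, 10]
    informative = np.vstack(
        [rng.normal(c, 1.0, size=(n, 2)) for c, n in zip(centers, sizes)]
    )
    noise = rng.normal(0.0, 1.0, size=(sum(sizes), 2))
    X = np.hstack([informative, noise])
    print(distance_rank_scores(X))
```

Because the score depends on ranks of distances rather than on raw variance, a feature that separates a small cluster from the bulk can still rank highly under this reading, which matches the motivation given in the abstract. Computing all pairwise distances is quadratic in the number of observations, so subsampling may be needed on larger datasets.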


