Feature Selection for multi-labeled variables via Dependency Maximization

02/10/2019
by   Salimeh Yasaei Sekeh, et al.

Feature selection and dimensionality reduction are essential steps in data analysis. In this work, we propose a new criterion for feature selection, formulated as a conditional information-theoretic measure between features given the labeled variable. Instead of using the standard mutual information measure based on Kullback-Leibler divergence, we use our proposed criterion to filter out redundant features. This approach yields an efficient and fast non-parametric implementation of feature selection, since the criterion can be estimated directly using a geometric measure of dependency: the global Friedman-Rafsky (FR) multivariate run test statistic, constructed from a global minimal spanning tree (MST). We demonstrate the advantages of our proposed feature selection approach through simulation. In addition, the proposed feature selection method is applied to the MNIST data set and compared with [1].
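
To make the geometric ingredient concrete: the classical Friedman-Rafsky statistic pools two samples, builds a Euclidean MST over the pooled points, and counts the edges that join points from different samples. The following is a minimal sketch of that count in plain Python, not the authors' implementation. The function names `mst_edges` and `fr_cross_count` are hypothetical, and the step that turns this count into the paper's conditional dependency criterion is not shown here.

```python
import math


def mst_edges(points):
    """Return the edges (i, j) of a Euclidean minimum spanning tree.

    Uses Prim's algorithm on the complete graph over `points`
    (a list of equal-length coordinate tuples). Runs in O(n^2) time.
    """
    n = len(points)
    if n < 2:
        return []
    in_tree = [False] * n
    best_dist = [math.inf] * n   # distance from each vertex to the tree
    best_from = [-1] * n         # tree vertex realising that distance
    in_tree[0] = True
    for j in range(1, n):
        best_dist[j] = math.dist(points[0], points[j])
        best_from[j] = 0
    edges = []
    for _ in range(n - 1):
        # Attach the vertex outside the tree that is closest to it.
        k = min((j for j in range(n) if not in_tree[j]),
                key=lambda j: best_dist[j])
        edges.append((best_from[k], k))
        in_tree[k] = True
        # Update the distances of the remaining vertices to the grown tree.
        for j in range(n):
            if not in_tree[j]:
                d = math.dist(points[k], points[j])
                if d < best_dist[j]:
                    best_dist[j] = d
                    best_from[j] = k
    return edges


def fr_cross_count(sample_a, sample_b):
    """Count MST edges over the pooled sample that join the two samples."""
    pooled = list(sample_a) + list(sample_b)
    labels = [0] * len(sample_a) + [1] * len(sample_b)
    return sum(1 for i, j in mst_edges(pooled) if labels[i] != labels[j])


# Well-separated samples share few cross edges; interleaved ones share many.
separated = fr_cross_count([(0, 0), (0, 1), (1, 0)],
                           [(100, 100), (100, 101), (101, 100)])
interleaved = fr_cross_count([(0,), (2,), (4,)], [(1,), (3,), (5,)])
```

A small cross count suggests the two samples come from different distributions, while a large one suggests they are well mixed. The quadratic-time Prim loop is chosen here for readability only; a practical implementation would likely use a library MST routine.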

research
06/23/2017

Efficient Approximate Solutions to Mutual Information Based Global Feature Selection

Mutual Information (MI) is often used for feature selection when develop...
research
03/03/2022

Parallel feature selection based on the trace ratio criterion

The growth of data today poses a challenge in management and inference. ...
research
06/18/2012

Copula-based Kernel Dependency Measures

The paper presents a new copula based method for measuring dependence be...
research
04/28/2023

A feature selection method based on Shapley values robust to concept shift in regression

Feature selection is one of the most relevant processes in any methodolo...
research
12/13/2020

Active Feature Selection for the Mutual Information Criterion

We study active feature selection, a novel feature selection setting in ...
research
01/30/2020

TCMI: a non-parametric mutual-dependence estimator for multivariate continuous distributions

The identification of relevant features, i.e., the driving variables tha...
research
06/15/2023

A Hybrid Feature Selection and Construction Method for Detection of Wind Turbine Generator Heating Faults

Preprocessing of information is an essential step for the effective desi...