PPFS: Predictive Permutation Feature Selection

10/20/2021
by Atif Hassan, et al.

We propose Predictive Permutation Feature Selection (PPFS), a novel wrapper-based feature selection method built on the concept of the Markov Blanket (MB). Unlike previous MB methods, PPFS is a universal feature selection technique: it works for both classification and regression tasks on datasets containing categorical and/or continuous features. We also propose Predictive Permutation Independence (PPI), a new Conditional Independence (CI) test that allows PPFS to be categorised as a wrapper feature selection method. This contrasts with current filter-based MB feature selection techniques, which cannot harness advances in supervised algorithms such as Gradient Boosting Machines (GBM). The PPI test is based on the knockoff framework and uses supervised algorithms to measure the association between an individual feature or a set of features and the target variable. In addition, we propose a novel MB aggregation step that addresses the issue of sample inefficiency. Empirical evaluations and comparisons on a large number of datasets demonstrate that PPFS outperforms state-of-the-art Markov blanket discovery algorithms as well as well-known wrapper methods. We also provide a sketch of the proof of correctness of our method. An implementation of this work is available at <https://github.com/atif-hassan/PyImpetus>.
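
The abstract describes PPI only at a high level, so the following is a loose illustration of the general idea of using a supervised model and feature permutation to test whether a feature is associated with the target. It is not the authors' PPI test or the PyImpetus API. It assumes a small k-nearest-neighbour regressor standing in for a GBM, synthetic data, no conditioning set, and an empirical permutation p-value; the names `knn_predict`, `validation_mse`, `permutation_p_value`, and `make_data` are hypothetical.

```python
import random
import statistics


def knn_predict(train_X, train_y, row, k=5):
    """Average the targets of the k training rows closest to `row`."""
    nearest = sorted(
        (sum((a - b) ** 2 for a, b in zip(train_row, row)), target)
        for train_row, target in zip(train_X, train_y)
    )[:k]
    return statistics.fmean(target for _, target in nearest)


def validation_mse(train_X, train_y, val_X, val_y):
    """Mean squared error of the k-NN model on the validation rows."""
    return statistics.fmean(
        (knn_predict(train_X, train_y, row) - target) ** 2
        for row, target in zip(val_X, val_y)
    )


def permutation_p_value(train_X, train_y, val_X, val_y, col, n_perm=30, seed=0):
    """Empirical p-value for 'feature `col` carries no predictive signal'.

    The feature is shuffled in the validation set only; a permutation
    counts against the feature when the loss does not get worse.
    """
    rng = random.Random(seed)
    baseline = validation_mse(train_X, train_y, val_X, val_y)
    not_worse = 0
    for _ in range(n_perm):
        shuffled = [row[col] for row in val_X]
        rng.shuffle(shuffled)
        permuted = [
            row[:col] + [value] + row[col + 1:]
            for row, value in zip(val_X, shuffled)
        ]
        if validation_mse(train_X, train_y, permuted, val_y) <= baseline:
            not_worse += 1
    # Add-one correction keeps the estimate strictly positive.
    return (not_worse + 1) / (n_perm + 1)


def make_data(n, rng):
    """Synthetic rows: column 0 drives the target, column 1 is pure noise."""
    X = [[rng.uniform(-1, 1), rng.uniform(-1, 1)] for _ in range(n)]
    y = [3.0 * x0 + rng.gauss(0, 0.1) for x0, _ in X]
    return X, y


rng = random.Random(42)
train_X, train_y = make_data(150, rng)
val_X, val_y = make_data(60, rng)

p_informative = permutation_p_value(train_X, train_y, val_X, val_y, col=0)
p_noise = permutation_p_value(train_X, train_y, val_X, val_y, col=1)
print(f"informative feature p-value: {p_informative:.3f}")
print(f"noise feature p-value:       {p_noise:.3f}")
```

A small p-value suggests that shuffling the feature reliably hurts predictions, so the feature is a candidate to keep. A wrapper method in the spirit of the abstract would repeat such a check with a stronger learner and with other candidate features held in the model.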

research
02/16/2018

A Unified View of Causal and Non-causal Feature Selection

In this paper, we unify causal and non-causal feature selection ...
research
11/01/2018

The Holdout Randomization Test: Principled and Easy Black Box Feature Selection

We consider the problem of feature selection using black box predictive ...
research
07/26/2023

Using Markov Boundary Approach for Interpretable and Generalizable Feature Selection

Predictive power and generalizability of models depend on the quality of...
research
01/28/2019

Testing Conditional Predictive Independence in Supervised Learning Algorithms

We propose a general test of conditional independence. The conditional p...
research
11/10/2016

Feature Selection with the R Package MXM: Discovering Statistically-Equivalent Feature Subsets

The statistically equivalent signature (SES) algorithm is a method for f...
research
10/31/2019

Sobolev Independence Criterion

We propose the Sobolev Independence Criterion (SIC), an interpretable de...
research
08/14/2023

Radiomics-Informed Deep Learning for Classification of Atrial Fibrillation Sub-Types from Left-Atrium CT Volumes

Atrial Fibrillation (AF) is characterized by rapid, irregular heartbeats...
