Using Markov Boundary Approach for Interpretable and Generalizable Feature Selection

07/26/2023
by Anwesha Bhattacharyya, et al.

The predictive power and generalizability of a model depend on the quality of the features selected for it. Machine learning (ML) models in banks consider a large number of features, which are often correlated or dependent. Incorporating all of these features may hinder model stability, and prior feature screening can improve the long-term performance of the models. A Markov boundary (MB) is the minimum set of features such that, given the boundary, no other potential predictor affects the target, while maximal predictive accuracy is preserved. Identifying the Markov boundary is straightforward under assumptions of Gaussianity on the features and linear relationships between them. This paper outlines common problems associated with identifying the Markov boundary in structured data when relationships are non-linear and predictors are of mixed data types. We propose a multi-group forward-backward selection strategy that not only handles continuous features but also addresses some of the issues with MB identification in a mixed-data setup, and we demonstrate its capabilities on simulated and real datasets.
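To make the "straightforward" Gaussian, linear case concrete, here is a minimal sketch of a generic forward-backward search that uses Fisher z-tests on partial correlations as the conditional independence test. It is not the multi-group strategy proposed in the paper; the function names, the `alpha` threshold, and the toy data are assumptions chosen purely for illustration.

```python
import math
import numpy as np


def partial_corr_pvalue(data, i, j, cond):
    """Two-sided Fisher z-test p-value for corr(X_i, X_j | X_cond) = 0.

    Assumes jointly Gaussian columns with linear relationships.
    """
    idx = [i, j] + list(cond)
    corr = np.corrcoef(data[:, idx], rowvar=False)
    prec = np.linalg.pinv(corr)  # precision matrix of the selected columns
    r = -prec[0, 1] / math.sqrt(prec[0, 0] * prec[1, 1])
    r = min(max(r, -0.999999), 0.999999)  # keep the log below finite
    n = data.shape[0]
    z = 0.5 * math.log((1 + r) / (1 - r)) * math.sqrt(n - len(cond) - 3)
    return math.erfc(abs(z) / math.sqrt(2))


def forward_backward_mb(data, target, alpha=0.01):
    """Greedy forward-backward estimate of the Markov boundary of `target`."""
    candidates = [c for c in range(data.shape[1]) if c != target]
    mb = []
    # Forward phase: add the most strongly dependent remaining feature
    # until none is significantly dependent on the target given `mb`.
    while True:
        rest = [c for c in candidates if c not in mb]
        if not rest:
            break
        pvals = {c: partial_corr_pvalue(data, target, c, mb) for c in rest}
        best = min(pvals, key=pvals.get)
        if pvals[best] >= alpha:
            break
        mb.append(best)
    # Backward phase: drop features that became redundant given the others.
    for c in list(mb):
        others = [m for m in mb if m != c]
        if partial_corr_pvalue(data, target, c, others) >= alpha:
            mb.remove(c)
    return sorted(mb)


# Toy data (hypothetical): columns 0 and 1 drive the target (column 4),
# column 2 is a noisy copy of column 0, column 3 is pure noise.
rng = np.random.default_rng(0)
n = 2000
x0 = rng.normal(size=n)
x1 = rng.normal(size=n)
x2 = x0 + 0.5 * rng.normal(size=n)
x3 = rng.normal(size=n)
y = 2.0 * x0 - x1 + rng.normal(size=n)
data = np.column_stack([x0, x1, x2, x3, y])

mb = forward_backward_mb(data, target=4)
print(mb)
```

In this toy setup the two true drivers (columns 0 and 1) should be recovered, while the correlated proxy in column 2 is expected to be screened out once column 0 is in the boundary. Because each test is run at level `alpha`, an occasional false inclusion remains possible. With non-linear relationships or mixed data types, the partial-correlation test no longer applies, which is the setting the paper targets.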

research
10/17/2019

Dropping forward-backward algorithms for feature selection

In this era of big data, feature selection techniques, which have long b...
research
10/20/2021

PPFS: Predictive Permutation Feature Selection

We propose Predictive Permutation Feature Selection (PPFS), a novel wrap...
research
09/12/2023

Learning Minimalistic Tsetlin Machine Clauses with Markov Boundary-Guided Pruning

A set of variables is the Markov blanket of a random variable if it cont...
research
02/27/2017

Nearly Maximally Predictive Features and Their Dimensions

Scientific explanation often requires inferring maximally predictive fea...
research
12/29/2022

On the utility of feature selection in building two-tier decision trees

Nowadays, feature selection is frequently used in machine learning when ...
research
06/15/2021

Employing an Adjusted Stability Measure for Multi-Criteria Model Fitting on Data Sets with Similar Features

Fitting models with high predictive accuracy that include all relevant b...
research
05/07/2022

Accuracy Convergent Field Predictors

Several predictive algorithms are described. Highlighted are variants th...
