Finding Optimal Diverse Feature Sets with Alternative Feature Selection

07/21/2023
by Jakob Bach, et al.

Feature selection is popular for obtaining small, interpretable, yet highly accurate prediction models. Conventional feature-selection methods typically yield one feature set only, which might not suffice in some scenarios. For example, users might be interested in finding alternative feature sets with similar prediction quality, offering different explanations of the data. In this article, we introduce alternative feature selection and formalize it as an optimization problem. In particular, we define alternatives via constraints and enable users to control the number and dissimilarity of alternatives. Next, we analyze the complexity of this optimization problem and show NP-hardness. Further, we discuss how to integrate conventional feature-selection methods as objectives. Finally, we evaluate alternative feature selection with 30 classification datasets. We observe that alternative feature sets may indeed have high prediction quality, and we analyze several factors influencing this outcome.
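
To make the idea of defining alternatives via constraints more concrete, the following is a minimal sketch and not the formulation from the article. It assumes a univariate objective (the sum of hypothetical per-feature quality scores), a fixed feature-set size `k`, and the Dice dissimilarity as the measure that a user-chosen threshold `tau` is applied to. The function names and parameters are illustrative. The search enumerates all feature sets of size `k`, so it is only practical for a small number of features.

```python
from itertools import combinations


def dice_dissimilarity(a, b):
    """Dice dissimilarity between two feature sets (0 = identical, 1 = disjoint)."""
    a, b = set(a), set(b)
    return 1 - 2 * len(a & b) / (len(a) + len(b))


def sequential_alternatives(qualities, k, num_alternatives, tau):
    """Find feature sets one after another by exhaustive search.

    qualities: per-feature quality scores (assumed univariate objective).
    k: number of features in each set.
    num_alternatives: how many alternatives to find after the first set.
    tau: minimum dissimilarity each new set must have to all earlier sets.
    """
    n = len(qualities)
    selected = []
    for _ in range(num_alternatives + 1):
        best_set, best_value = None, float("-inf")
        for candidate in combinations(range(n), k):
            # Constraint: sufficiently dissimilar to every set found so far.
            if all(dice_dissimilarity(candidate, s) >= tau for s in selected):
                value = sum(qualities[j] for j in candidate)
                if value > best_value:
                    best_set, best_value = candidate, value
        if best_set is None:
            break  # no feasible alternative remains
        selected.append(best_set)
    return selected


if __name__ == "__main__":
    scores = [9, 8, 7, 6, 5, 4]  # made-up quality scores for six features
    for feature_set in sequential_alternatives(scores, k=2, num_alternatives=2, tau=1.0):
        print(feature_set, sum(scores[j] for j in feature_set))
```

In this sketch, raising `tau` forces alternatives to overlap less with earlier sets, and raising `num_alternatives` requests more of them, which mirrors the two user-controlled parameters mentioned above. The objective value can only stay equal or drop from one alternative to the next, because each new set faces more constraints.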
