A feature selection method based on Shapley values robust to concept shift in regression

04/28/2023
by Carlos Sebastián, et al.

Feature selection is one of the most important steps in building a statistical learning model. Existing algorithms generally establish a criterion for selecting the most influential variables and discard those that contribute no relevant information to the model. This approach makes sense in a classical static setting, where the joint distribution of the data does not vary over time. With real data, however, it is common to encounter dataset shift and, specifically, changes in the relationships between variables (concept shift). In that case, the influence of a variable cannot be the only indicator of its quality as a regressor, since the relationship learned in the training phase may not correspond to the current situation. We therefore propose a new feature selection methodology for regression problems that takes this into account, using Shapley values to study the effect that each variable has on the predictions. Five examples are analysed: four correspond to typical situations in which the method matches the state of the art, and one concerns electricity price forecasting in the Iberian market, where a concept shift has occurred. In this last case, the proposed algorithm improves the results significantly.
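The argument that influence alone can mislead under concept shift can be illustrated with a small synthetic sketch. This is not the authors' algorithm; it is a toy example under stated assumptions: a linear model fitted by ordinary least squares, independent features (so the Shapley value of feature j for a sample x reduces to w_j * (x_j - mean_j)), and a hypothetical shift in which the coefficient of the first feature flips sign between training and test data. All names (`make_data`, `fit_ols`, `influence`) are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_data(n, coef):
    """Two independent standard-normal features and a noisy linear target."""
    X = rng.normal(size=(n, 2))
    y = X @ coef + 0.1 * rng.normal(size=n)
    return X, y

# Assumption: the relationship between x0 and y reverses after training.
X_train, y_train = make_data(500, np.array([2.0, 1.0]))
X_test, y_test = make_data(500, np.array([-2.0, 1.0]))

def fit_ols(X, y):
    """Least-squares fit with intercept; returns (weights, intercept)."""
    A = np.column_stack([X, np.ones(len(X))])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    return beta[:-1], beta[-1]

def mse(X, y, w, b):
    return np.mean((X @ w + b - y) ** 2)

w, b = fit_ols(X_train, y_train)

# Exact Shapley values of a linear model under feature independence.
phi = w * (X_train - X_train.mean(axis=0))
influence = np.abs(phi).mean(axis=0)  # mean |Shapley value| per feature

# Compare the full model with one that drops the shifted feature x0.
full_test_mse = mse(X_test, y_test, w, b)
w1, b1 = fit_ols(X_train[:, [1]], y_train)
reduced_test_mse = mse(X_test[:, [1]], y_test, w1, b1)

print("influence per feature:", influence)
print("test MSE, both features:", full_test_mse)
print("test MSE, x1 only:", reduced_test_mse)
```

In this setup the first feature has the larger mean absolute Shapley value on the training data, yet the model that discards it generalises better to the shifted test data, which is the kind of situation an influence-only selection criterion cannot detect.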

research
02/10/2019

Feature Selection for multi-labeled variables via Dependency Maximization

Feature selection and reducing the dimensionality of data is an essentia...
research
08/29/2016

Relevant based structure learning for feature selection

Feature selection is an important task in many problems occurring in pat...
research
11/07/2016

Reinforcement Learning Approach for Parallelization in Filters Aggregation Based Feature Selection Algorithms

One of the classical problems in machine learning and data mining is fea...
research
01/16/2013

Feature Selection and Dualities in Maximum Entropy Discrimination

Incorporating feature selection into a classification or regression meth...
research
09/04/2017

Random Subspace with Trees for Feature Selection Under Memory Constraints

Dealing with datasets of very high dimension is a major challenge in mac...
research
06/27/2017

Unsupervised Feature Selection Based on Space Filling Concept

The paper deals with the adaptation of a new measure for the unsupervise...
