Feature and Variable Selection in Classification
The amount of information in the form of features and variables avail- able to machine learning algorithms is ever increasing. This can lead to classifiers that are prone to overfitting in high dimensions, high di- mensional models do not lend themselves to interpretable results, and the CPU and memory resources necessary to run on high-dimensional datasets severly limit the applications of the approaches. Variable and feature selection aim to remedy this by finding a subset of features that in some way captures the information provided best. In this paper we present the general methodology and highlight some specific approaches.
READ FULL TEXT