What is Feature Selection?
Feature selection is one of the two processes of feature reduction, the other being feature extraction. It is the process by which a subset of relevant features, or variables, is selected from a larger data set for constructing models. Feature selection is also known as variable selection, attribute selection, or variable subset selection. Its main focus is to choose features that represent the data set well by excluding redundant and irrelevant data. This contrasts with feature extraction, in which new features are created as functions of the original features. What the two have in common is that both ensure the machine learning model uses the most relevant, non-redundant data possible.
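To make the idea concrete, here is a minimal sketch of feature selection in plain Python: irrelevant, near-constant features are excluded from a toy data set. The feature names, data values, and variance threshold are illustrative assumptions, not part of any specific library or method from the text.

```python
# Minimal sketch of feature selection: drop irrelevant (near-constant)
# features from a small toy data set by measuring their variance.

def variance(values):
    """Population variance of a list of numbers."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)

def select_features(data, threshold=0.01):
    """Keep only features whose variance exceeds the threshold.

    data: dict mapping feature name -> list of observed values.
    Returns the list of selected feature names.
    """
    return [name for name, values in data.items()
            if variance(values) > threshold]

# Toy data set: 'room_count' varies, 'has_roof' is constant and
# therefore carries no information for the model.
houses = {
    "room_count": [2, 3, 5, 4],
    "has_roof":   [1, 1, 1, 1],   # zero variance -> excluded
}

print(select_features(houses))   # ['room_count']
```

Real toolkits offer many richer criteria (correlation, mutual information, model-based scores), but the shape of the operation is the same: score each feature, then keep the subset that passes.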
Why is this Useful?
Feature selection is useful because it simplifies learning models, making the model and its results easier for the user to interpret. Another benefit is reduced processing time: because only the relevant subset of the data is used, training is faster. Feature selection can also help avoid the curse of dimensionality, a phenomenon in which a data set is described in so many dimensions (or by so many features) that the data points become sparse and approach statistical insignificance. By reducing the number of dimensions, feature selection can make the data dense enough to yield statistically significant results.
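The sparsity behind the curse of dimensionality can be illustrated with a bit of arithmetic: if each feature axis is divided into a fixed number of bins, the number of cells the data points must populate grows exponentially with the number of features. The bin count of 10 is an illustrative assumption.

```python
# Illustrative arithmetic for the curse of dimensionality: with 10
# bins per feature axis, the number of grid cells grows exponentially
# with the number of features (dimensions).

def cells_to_cover(dimensions, bins_per_axis=10):
    """Number of grid cells in a space with the given dimensions."""
    return bins_per_axis ** dimensions

for d in (1, 2, 3, 10):
    print(d, cells_to_cover(d))

# With 10 features there are 10**10 cells, so even millions of data
# points leave almost every cell empty -- the data becomes sparse.
```

Cutting the feature count from 10 to 3 shrinks the space from ten billion cells to a thousand, which is why removing irrelevant features can restore statistical significance.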
Practical Uses of Feature Selection
Bag-of-Words – A technique for natural language processing that extracts the words (features) used in a sentence, document, website, etc., and classifies them by frequency of use. Feature selection is used to target specific words for the learning model's vocabulary. This technique can also be applied to image processing.
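The bag-of-words step above can be sketched in a few lines: count word frequencies across a set of documents, then use frequency as the selection criterion to keep only the most common words as the vocabulary. The example documents and the cutoff `top_k` are illustrative assumptions.

```python
# Hedged sketch of a bag-of-words vocabulary with frequency-based
# feature selection: only the top_k most frequent words are kept.

from collections import Counter

def build_vocabulary(documents, top_k=3):
    """Count word frequencies across documents and select the
    top_k most frequent words as the model's vocabulary."""
    counts = Counter()
    for doc in documents:
        counts.update(doc.lower().split())
    return [word for word, _ in counts.most_common(top_k)]

docs = [
    "the cat sat on the mat",
    "the dog sat on the log",
]

print(build_vocabulary(docs))   # ['the', 'sat', 'on']
```

In practice, very frequent filler words ("the", "on") are often excluded as well, so real systems typically combine a frequency cutoff with a stop-word list; the frequency ranking shown here is the simplest form of the selection step.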