Feature reduction

What is Feature Reduction?

Feature reduction, also known as dimensionality reduction, is the process of reducing the number of features in a resource-heavy computation while preserving as much important information as possible. Fewer features means fewer variables for the computer to process, which makes its work easier and faster. Feature reduction can be divided into two processes: feature selection, which keeps a subset of the original features, and feature extraction, which derives a smaller set of new features from the original ones. Many techniques exist for feature reduction. Some of the most popular are generalized discriminant analysis, autoencoders, non-negative matrix factorization, and principal component analysis.
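
The difference between feature selection and feature extraction can be sketched in a few lines of Python. This is a minimal illustration, not a production implementation: the data set and the function names (select_features, extract_feature and so on) are made up for the example, and the extraction step approximates the first principal component with simple power iteration rather than calling a library.

```python
import math

# Hypothetical data set: 5 samples, 3 features.
# Column 1 is roughly twice column 0; column 2 barely varies.
rows = [
    [1.0, 2.1, 0.5],
    [2.0, 3.9, 0.4],
    [3.0, 6.2, 0.6],
    [4.0, 7.8, 0.5],
    [5.0, 10.1, 0.4],
]


def select_features(data, keep):
    """Feature selection: keep only the original columns listed in `keep`."""
    return [[row[i] for i in keep] for row in data]


def center(data):
    """Subtract each column's mean so every feature has mean zero."""
    n = len(data)
    means = [sum(col) / n for col in zip(*data)]
    return [[x - m for x, m in zip(row, means)] for row in data]


def covariance(centered):
    """Sample covariance matrix of already-centered data."""
    n, d = len(centered), len(centered[0])
    return [
        [sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
        for i in range(d)
    ]


def first_component(cov, iterations=100):
    """Power iteration: approximates the direction of greatest variance.

    Assumes the all-ones start vector is not orthogonal to that direction.
    """
    v = [1.0] * len(cov)
    for _ in range(iterations):
        w = [sum(c * x for c, x in zip(cov_row, v)) for cov_row in cov]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v


def extract_feature(data):
    """Feature extraction: replace all columns with one new derived feature."""
    centered = center(data)
    cov = covariance(centered)
    component = first_component(cov)
    projected = [sum(x * c for x, c in zip(row, component)) for row in centered]
    d = len(cov)
    # Share of the total variance that lies along the chosen direction.
    along = sum(component[i] * cov[i][j] * component[j]
                for i in range(d) for j in range(d))
    total = sum(cov[i][i] for i in range(d))
    return projected, component, along / total


selected = select_features(rows, keep=[0, 2])          # 3 features -> 2 original ones
projected, component, explained = extract_feature(rows)  # 3 features -> 1 new one

print("selected:", selected)
print("projected:", [round(p, 3) for p in projected])
print("share of variance kept:", round(explained, 4))
```

Selection keeps columns exactly as they were measured, so the result stays easy to interpret. Extraction produces a new feature that mixes all of the original columns, which usually keeps more of the variation in the data but is harder to explain.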

Why is this Useful?

The purpose of feature reduction is to reduce the number of features (or variables) that the computer must process to perform its function. With fewer features, computations and tasks need fewer resources: less computation time and less storage capacity, which leaves the computer free to do more work. In machine learning, feature reduction can also reduce multicollinearity (strong correlation among features), which can improve the machine learning model in use.
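
One simple way to see how feature reduction can address multicollinearity is to drop any feature that is almost perfectly correlated with a feature already kept. The sketch below assumes a small made-up data set and a hypothetical helper, drop_correlated, with an arbitrary threshold of 0.95; real projects usually choose the threshold and the feature to drop with more care.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists.

    Assumes neither list is constant (a constant list would divide by zero).
    """
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


def drop_correlated(columns, threshold=0.95):
    """Keep a feature only if it is not strongly correlated with one already kept.

    `columns` maps a feature name to its list of values. Features are
    visited in insertion order, so earlier features win ties.
    """
    kept = []
    for name, values in columns.items():
        if all(abs(pearson(values, columns[k])) < threshold for k in kept):
            kept.append(name)
    return kept


# Hypothetical data: height is recorded twice, in centimetres and in inches.
heights_cm = [150.0, 160.0, 170.0, 180.0, 190.0]
features = {
    "height_cm": heights_cm,
    "height_in": [h / 2.54 for h in heights_cm],  # same information, different unit
    "weight_kg": [70.0, 55.0, 80.0, 60.0, 75.0],
}

kept = drop_correlated(features)
print("kept features:", kept)
```

Here height_in carries no information beyond height_cm, so removing it shrinks the data set without losing anything the model could use.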

Another benefit of feature reduction is that it makes data easier for humans to visualize, particularly when the data is reduced to two or three dimensions, which can be easily displayed graphically. Feature reduction can also help with an interesting problem called the curse of dimensionality. This refers to a group of phenomena in which a problem has so many dimensions that the data becomes sparse: the volume of the feature space grows so quickly with each added dimension that the available data points cover only a tiny part of it. Feature reduction decreases the number of dimensions, making the data less sparse and the patterns found in it more statistically reliable for machine learning applications.
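
The sparsity behind the curse of dimensionality can be illustrated with a small experiment. The sketch below is only a demonstration under assumed settings: it scatters the same number of random points in spaces of 1, 3 and 6 dimensions, splits each axis into 10 bins, and measures what fraction of the resulting grid cells contain at least one point. The function name occupied_fraction and all of the numbers are chosen for the example.

```python
import random


def occupied_fraction(n_points, n_dims, bins_per_axis=10, seed=0):
    """Fraction of grid cells that contain at least one random point.

    Points are drawn uniformly from the unit cube [0, 1)^n_dims, and each
    axis is divided into `bins_per_axis` equal bins.
    """
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    occupied = set()
    for _ in range(n_points):
        cell = tuple(int(rng.random() * bins_per_axis) for _ in range(n_dims))
        occupied.add(cell)
    return len(occupied) / bins_per_axis ** n_dims


# Same amount of data, increasing number of dimensions.
fractions = {d: occupied_fraction(1000, d) for d in (1, 3, 6)}

for dims, fraction in fractions.items():
    print(f"{dims} dimension(s): {fraction:.4%} of cells occupied")
```

With six dimensions there are a million cells but only a thousand points, so at most 0.1% of the cells can be occupied. Reducing the number of dimensions packs the same points into far fewer cells, which is what makes the data less sparse.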