DPDR: A novel machine learning method for the Decision Process for Dimensionality Reduction

This paper discusses the critical decision process of choosing between feature extraction and feature selection in a supervised learning context. Finding a suitable dimensionality reduction method is often confusing: each approach has pros and cons that depend on the nature of the data and on the user's preferences. In particular, the user may want to favor either the integrity or the interpretability of the results, at a specific data resolution. This paper proposes a new method, DPDR, to choose the best dimensionality reduction method in a supervised learning context. It also helps to drop or reconstruct features until a target resolution is reached; this target can be user-defined or determined automatically by the method. The method applies a regression or a classification, evaluates the results, and gives a diagnosis of the best dimensionality reduction process for that specific supervised learning context. The main algorithms used are Random Forest (RF), Principal Component Analysis (PCA), and the multilayer perceptron (MLP) neural network. Six use cases are presented, each based on a well-known technique for generating synthetic data. The paper discusses each choice that can be made in the process, aiming to clarify the entire decision process of selecting or extracting features.
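The comparison described above can be illustrated with a minimal sketch. It is not the DPDR algorithm itself, whose details the abstract does not give. It assumes scikit-learn, uses `make_classification` as a stand-in synthetic data generator, treats the target resolution as a fixed number of retained features `k`, and uses a hypothetical helper `compare_reduction` that scores RF-based selection against PCA-based extraction with the same MLP classifier.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.preprocessing import StandardScaler


def compare_reduction(X, y, k, seed=0):
    """Hypothetical helper: score selection vs. extraction at resolution k.

    Returns the name of the better-scoring approach and both test accuracies.
    """
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.3, random_state=seed, stratify=y
    )
    # Standardize using training statistics only, so PCA and the MLP
    # see comparable feature scales without leaking test data.
    scaler = StandardScaler().fit(X_train)
    X_train, X_test = scaler.transform(X_train), scaler.transform(X_test)

    # Feature selection: keep the k original features that a Random Forest
    # ranks highest by impurity-based importance.
    rf = RandomForestClassifier(n_estimators=100, random_state=seed)
    rf.fit(X_train, y_train)
    top = np.argsort(rf.feature_importances_)[::-1][:k]

    # Feature extraction: project onto the first k principal components.
    pca = PCA(n_components=k, random_state=seed).fit(X_train)

    candidates = {
        "selection": (X_train[:, top], X_test[:, top]),
        "extraction": (pca.transform(X_train), pca.transform(X_test)),
    }

    # Evaluate both reduced datasets with the same MLP configuration.
    scores = {}
    for name, (train_k, test_k) in candidates.items():
        mlp = MLPClassifier(hidden_layer_sizes=(32,), max_iter=500,
                            random_state=seed)
        mlp.fit(train_k, y_train)
        scores[name] = mlp.score(test_k, y_test)

    best = max(scores, key=scores.get)
    return best, scores


# Synthetic data (an assumption; the paper's six generators are not listed here).
X, y = make_classification(n_samples=300, n_features=20, n_informative=5,
                           n_redundant=2, random_state=0)
best, scores = compare_reduction(X, y, k=5)
print(best, scores)
```

Selection keeps original, interpretable features, while extraction builds new components that may preserve more of the data's structure. A fuller version would also weigh that preference rather than rely on accuracy alone.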


