Feature selection in high-dimensional dataset using MapReduce

09/07/2017
by Claudio Reggiani, et al.

This paper describes a distributed MapReduce implementation of the minimum Redundancy Maximum Relevance (mRMR) algorithm, a popular feature selection method in bioinformatics and network inference problems. The proposed approach handles both tall/narrow datasets (many observations, few features) and wide/short datasets (many features, few observations). We further provide an open-source implementation based on Hadoop/Spark and illustrate its scalability on datasets involving millions of observations or features.
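For readers unfamiliar with the selection criterion, the following is a minimal single-machine sketch of greedy minimum Redundancy Maximum Relevance selection on discrete features. It is not the authors' MapReduce code: the function names (`mutual_information`, `mrmr`), the toy data, and the choice of the "relevance minus mean redundancy" scoring form are assumptions made for illustration. In a distributed setting, the per-feature scores computed inside the loop are the part one would expect to parallelize.

```python
from collections import Counter
from math import log


def mutual_information(xs, ys):
    """Empirical mutual information (in nats) between two discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum(
        (c / n) * log(c * n / (px[x] * py[y]))
        for (x, y), c in pxy.items()
    )


def mrmr(columns, target, k):
    """Greedy selection of k feature names (hypothetical helper).

    At each step, pick the candidate that maximizes its relevance to the
    target minus its mean redundancy with the features already selected.
    """
    relevance = {
        name: mutual_information(col, target) for name, col in columns.items()
    }
    selected = []
    candidates = set(columns)

    def score(name):
        if not selected:
            return relevance[name]
        redundancy = sum(
            mutual_information(columns[name], columns[s]) for s in selected
        ) / len(selected)
        return relevance[name] - redundancy

    while candidates and len(selected) < k:
        # sorted() makes tie-breaking deterministic (alphabetical order).
        best = max(sorted(candidates), key=score)
        selected.append(best)
        candidates.remove(best)
    return selected


# Toy data (assumed): the target is determined jointly by "a" and "c";
# "b" is an exact duplicate of "a" and therefore fully redundant with it.
a = [0, 0, 1, 1, 0, 0, 1, 1]
c = [0, 1, 0, 1, 0, 1, 0, 1]
features = {"a": a, "b": list(a), "c": c}
target = [2 * ai + ci for ai, ci in zip(a, c)]

selected = mrmr(features, target, 2)
print(selected)
```

On this toy input the sketch selects `a` first and then `c`, skipping the duplicate `b`: once `a` is chosen, the redundancy of `b` with `a` cancels its relevance, while `c` is independent of `a` and keeps its full score.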


research
08/15/2019

Maximum Relevance and Minimum Redundancy Feature Selection Methods for a Marketing Machine Learning Platform

In machine learning applications for online product offerings and market...
research
10/13/2016

An Information Theoretic Feature Selection Framework for Big Data under Apache Spark

With the advent of extremely high dimensional datasets, dimensionality r...
research
08/21/2022

Scalable mRMR feature selection to handle high dimensional datasets: Vertical partitioning based Iterative MapReduce framework

While building machine learning models, Feature selection (FS) stands ou...
research
11/17/2022

An Advantage Using Feature Selection with a Quantum Annealer

Feature selection is a technique in statistical prediction modeling that...
research
07/27/2023

MVMR-FS : Non-parametric feature selection algorithm based on Maximum inter-class Variation and Minimum Redundancy

How to accurately measure the relevance and redundancy of features is an...
research
11/16/2021

On the utility of power spectral techniques with feature selection techniques for effective mental task classification in noninvasive BCI

In this paper classification of mental task-root Brain-Computer Interfac...
research
05/26/2008

DimReduction - Interactive Graphic Environment for Dimensionality Reduction

Feature selection is a pattern recognition approach to choose important ...
