A Hybrid Two-layer Feature Selection Method Using Genetic Algorithm and Elastic Net

01/30/2020
by Fatemeh Amini, et al.

Feature selection, a critical pre-processing step for machine learning, aims to identify representative predictors in a high-dimensional feature space in order to improve prediction accuracy. However, as the dimensionality of the feature space grows relative to the number of observations, many existing feature selection methods struggle with both computational efficiency and prediction performance. This paper presents a new hybrid two-layer feature selection approach that combines a wrapper and an embedded method to construct an appropriate subset of predictors. In the first layer of the proposed method, a Genetic Algorithm (GA) is adopted as a wrapper to search for the optimal subset of predictors, with the aim of reducing both the number of predictors and the prediction error. As a meta-heuristic approach, GA is selected for its computational efficiency; however, GAs do not guarantee optimality. To address this issue, a second layer is added to the proposed method to eliminate any remaining redundant or irrelevant predictors and thereby improve prediction accuracy. Elastic Net (EN) is selected as the embedded method in the second layer because of its flexibility in adjusting the penalty terms in the regularization process and its time efficiency. This hybrid two-layer approach is applied to a maize genetic dataset from the NAM population, which consists of multiple subsets with different ratios of the number of predictors to the number of observations. The numerical results confirm the superiority of the proposed model.
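The two-layer idea in the abstract can be illustrated with a small, dependency-free Python sketch. It is not the authors' implementation: every function name, the weighted-sum fitness (validation error plus a per-feature penalty), the ridge-like model fitted inside the GA, the train/validation split, and all hyperparameter values are assumptions made for illustration only. Layer 1 runs a simple GA over binary feature masks; layer 2 fits an Elastic Net by coordinate descent on the surviving columns and keeps those with non-zero coefficients.

```python
import random

def soft_threshold(z, g):
    """Soft-thresholding operator that produces exact zeros for the L1 part."""
    if z > g:
        return z - g
    if z < -g:
        return z + g
    return 0.0

def elastic_net(X, y, cols, lam=0.1, alpha=0.5, n_iter=50):
    """Coordinate descent (no intercept) restricted to the columns in `cols`.

    Minimises (1/2n)||y - Xb||^2 + lam*(alpha*|b|_1 + (1-alpha)/2*|b|_2^2).
    Assumes alpha < 1 or non-zero columns, so the denominator is positive.
    """
    n = len(y)
    beta = {j: 0.0 for j in cols}
    resid = list(y)  # residual y - Xb, starting from b = 0
    sq = {j: sum(row[j] ** 2 for row in X) / n for j in cols}
    for _ in range(n_iter):
        for j in cols:
            rho = sum(X[i][j] * (resid[i] + X[i][j] * beta[j]) for i in range(n)) / n
            new = soft_threshold(rho, lam * alpha) / (sq[j] + lam * (1 - alpha))
            delta = new - beta[j]
            if delta:
                for i in range(n):
                    resid[i] -= X[i][j] * delta
                beta[j] = new
    return beta

def mse(X, y, beta):
    """Mean squared error of the linear model given as {column: coefficient}."""
    return sum((yi - sum(row[j] * b for j, b in beta.items())) ** 2
               for row, yi in zip(X, y)) / len(y)

def ga_select(X_tr, y_tr, X_va, y_va, pop_size=12, gens=10, mut=0.05,
              size_pen=0.01, rng=None):
    """Layer 1 (wrapper): GA over boolean masks. Needs at least 2 features."""
    rng = rng or random.Random(0)
    p = len(X_tr[0])

    def fitness(mask):
        cols = [j for j in range(p) if mask[j]]
        if not cols:
            return float("inf")
        # Assumption: a ridge-like fit (alpha=0) scores each candidate subset.
        beta = elastic_net(X_tr, y_tr, cols, lam=1e-3, alpha=0.0, n_iter=20)
        return mse(X_va, y_va, beta) + size_pen * len(cols)

    pop = [[rng.random() < 0.5 for _ in range(p)] for _ in range(pop_size)]
    for _ in range(gens):
        pop.sort(key=fitness)
        elite = pop[:pop_size // 2]            # keep the better half
        children = []
        while len(elite) + len(children) < pop_size:
            a, b = rng.sample(elite, 2)
            cut = rng.randrange(1, p)          # one-point crossover
            child = a[:cut] + b[cut:]
            children.append([(not g) if rng.random() < mut else g for g in child])
        pop = elite + children
    best = min(pop, key=fitness)
    return [j for j in range(p) if best[j]]

def two_layer_select(X, y, seed=0):
    """Run the GA wrapper, then prune its output with an Elastic Net fit."""
    rng = random.Random(seed)
    idx = list(range(len(y)))
    rng.shuffle(idx)
    cut = int(0.7 * len(idx))
    tr, va = idx[:cut], idx[cut:]
    X_tr, y_tr = [X[i] for i in tr], [y[i] for i in tr]
    X_va, y_va = [X[i] for i in va], [y[i] for i in va]
    layer1 = ga_select(X_tr, y_tr, X_va, y_va, rng=rng)
    beta = elastic_net(X_tr, y_tr, layer1, lam=0.1, alpha=0.5)
    layer2 = [j for j in layer1 if beta[j] != 0.0]  # layer 2 (embedded)
    return layer1, layer2

def make_toy_data(n=40, p=8, seed=0):
    """Hypothetical synthetic data: only columns 0 and 1 carry signal."""
    rng = random.Random(seed)
    X = [[rng.gauss(0, 1) for _ in range(p)] for _ in range(n)]
    y = [3 * r[0] - 2 * r[1] + rng.gauss(0, 0.1) for r in X]
    return X, y

X, y = make_toy_data()
layer1, layer2 = two_layer_select(X, y)
print("after GA:", layer1, "after Elastic Net:", layer2)
```

By construction, the second layer can only remove predictors from the set the GA returns, which mirrors the role the abstract assigns to it. A practical version would more likely rely on an established Elastic Net implementation and cross-validated penalty terms than on the fixed values used here.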


