Feature Selection integrated Deep Learning for Ultrahigh Dimensional and Highly Correlated Feature Space

09/15/2022
by Arkaprabha Ganguli, et al.

In recent years, deep learning has attracted interest in almost all disciplines because of its impressive empirical success in analyzing complex data sets, such as imaging, genetic, climate, and medical data. While most of these developments treat the model as a black box, there is increasing interest in interpretable, reliable, and robust deep learning models applicable to a broad class of applications. Deep learning with integrated feature selection has proven promising in this regard. However, recent developments do not address ultra-high dimensional and highly correlated feature spaces combined with a high noise level. In this article, we propose a novel screening and cleaning strategy that uses deep learning for the cluster-level discovery of highly correlated predictors with a controlled error rate. A thorough empirical evaluation over a wide range of simulated scenarios demonstrates the effectiveness of the proposed method, which achieves high power with a minimal number of false discoveries. Furthermore, we applied the algorithm to the riboflavin (vitamin B2) production dataset to understand the possible genetic association with riboflavin production. The gain of the proposed methodology is illustrated by a lower prediction error than other state-of-the-art methods.

research
09/04/2018

DeepPINK: reproducible feature selection in deep neural networks

Deep learning has become increasingly popular in both supervised and uns...
research
05/24/2019

Deep-gKnock: nonlinear group-feature selection with deep neural network

Feature selection is central to contemporary high-dimensional data analy...
research
02/21/2023

Stepdown SLOPE for Controlled Feature Selection

Sorted L-One Penalized Estimation (SLOPE) has shown the nice theoretical...
research
11/22/2018

Feature Selection for Survival Analysis with Competing Risks using Deep Learning

Deep learning models for survival analysis have gained significant atten...
research
10/13/2020

Neural Gaussian Mirror for Controlled Feature Selection in Neural Networks

Deep neural networks (DNNs) have become increasingly popular and achieve...
research
06/21/2022

BiometricBlender: Ultra-high dimensional, multi-class synthetic data generator to imitate biometric feature space

The lack of freely available (real-life or synthetic) high or ultra-high...
research
05/21/2021

Sheaves as a Framework for Understanding and Interpreting Model Fit

As data grows in size and complexity, finding frameworks which aid in in...
