FsNet: Feature Selection Network on High-dimensional Biological Data

01/23/2020
by Dinesh Singh, et al.

Biological data are generally high-dimensional and require efficient machine learning methods that generalize well and scale to discover their complex nonlinear patterns. Recent advances in artificial intelligence and machine learning can be attributed to deep neural networks (DNNs), which accomplish a variety of tasks in computer vision and natural language processing. However, standard DNNs are not suitable for high-dimensional data or for data with a small number of samples, because they require a large pool of computing resources as well as plenty of samples to learn a large number of parameters. In particular, although interpretability is important for high-dimensional biological data such as gene expression data, nonlinear feature selection algorithms for DNN models have not been fully investigated. In this paper, we propose a novel nonlinear feature selection method called the Feature Selection Network (FsNet), a scalable concrete neural network architecture designed for settings with high-dimensional data and a small number of samples. Specifically, our network consists of a selector layer that uses a concrete random variable for discrete feature selection and a supervised deep neural network regularized with a reconstruction loss. Because the large number of parameters in the selector and reconstruction layers can easily cause overfitting when samples are limited, we use two tiny networks to predict the large virtual weight matrices of the selector and reconstruction layers. Experimental results on several real-world high-dimensional biological datasets demonstrate the efficacy of the proposed approach.
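To make the idea of a selector layer built on a concrete random variable more tangible, the following is a minimal sketch in plain Python, not the authors' implementation. It assumes the standard Concrete (Gumbel-softmax) relaxation, in which each of K selector rows draws a relaxed one-hot vector over the D input features. The function names, the example logits, and the temperature values are all hypothetical. The sketch also stores the selector logits directly, whereas the abstract describes predicting the large selector weight matrix with a tiny network.

```python
import math
import random


def sample_concrete(logits, temperature, rng):
    """Draw one relaxed one-hot vector from a Concrete (Gumbel-softmax) distribution."""
    # Gumbel(0, 1) noise via inverse transform sampling; clamp u away from 0.
    gumbels = [-math.log(-math.log(max(rng.random(), 1e-12))) for _ in logits]
    scores = [(l + g) / temperature for l, g in zip(logits, gumbels)]
    # Numerically stable softmax over the perturbed, temperature-scaled logits.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def select_features(x, selector_logits, temperature, rng):
    """Each selector row mixes the D inputs into one softly selected feature."""
    selected = []
    for row in selector_logits:
        weights = sample_concrete(row, temperature, rng)
        selected.append(sum(w * v for w, v in zip(weights, x)))
    return selected


rng = random.Random(0)
x = [0.5, -1.2, 3.0, 0.1, 2.2]            # one sample with D = 5 features
logits = [[0.0, 0.0, 4.0, 0.0, 0.0],      # hypothetical logits, K = 2 selector rows
          [0.0, 0.0, 0.0, 0.0, 4.0]]
soft = select_features(x, logits, temperature=5.0, rng=rng)   # smooth mixtures
hard = select_features(x, logits, temperature=0.01, rng=rng)  # near one-hot picks
```

In this relaxation, a high temperature yields smooth mixtures of the input features, while a low temperature pushes each sampled weight vector toward a one-hot choice, so each output approaches a single input feature. That limiting behavior is what allows a continuous, differentiable layer to act as a discrete feature selector.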

