Ensembled sparse-input hierarchical networks for high-dimensional datasets

05/11/2020
by Jean Feng, et al.

Neural networks have seen limited use in prediction for high-dimensional data with small sample sizes, because they tend to overfit and require tuning many more hyperparameters than existing off-the-shelf machine learning methods. With small modifications to the network architecture and training procedure, we show that dense neural networks can be a practical data analysis tool in these settings. The proposed method, Ensemble by Averaging Sparse-Input Hierarchical networks (EASIER-net), appropriately prunes the network structure by tuning only two L1-penalty parameters, one that controls the input sparsity and another that controls the number of hidden layers and nodes. The method selects variables from the true support if the irrelevant covariates are only weakly correlated with the response; otherwise, it exhibits a grouping effect, where strongly correlated covariates are selected at similar rates. On a collection of real-world datasets with different sizes, EASIER-net selected network architectures in a data-adaptive manner and achieved higher prediction accuracy than off-the-shelf methods on average.
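To make the two-penalty idea in the abstract concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes a hypothetical layout in which each input is multiplied by a learnable gate, one L1 penalty (`lam_input`) acts on the gates to encourage input sparsity, a second L1 penalty (`lam_hidden`) acts on the remaining weights, and the ensemble prediction is the plain average over several networks. The function names, the gate parameterisation, and the exact penalty form are illustrative assumptions. The hierarchical connections and the training loop are omitted.

```python
import numpy as np


def init_network(n_inputs, hidden_sizes, rng):
    """Randomly initialise one network (hypothetical layout, scalar output)."""
    sizes = [n_inputs, *hidden_sizes, 1]
    return {
        # One multiplicative gate per input; a gate of zero drops that input.
        "gates": np.ones(n_inputs),
        "weights": [
            rng.normal(scale=0.1, size=(a, b))
            for a, b in zip(sizes[:-1], sizes[1:])
        ],
    }


def forward(net, X):
    """Gate the inputs, apply ReLU hidden layers, then a linear output."""
    h = X * net["gates"]
    for W in net["weights"][:-1]:
        h = np.maximum(h @ W, 0.0)
    return (h @ net["weights"][-1]).ravel()


def penalty(net, lam_input, lam_hidden):
    """Two L1 terms: one on the input gates, one on all other weights."""
    input_term = lam_input * np.abs(net["gates"]).sum()
    hidden_term = lam_hidden * sum(np.abs(W).sum() for W in net["weights"])
    return input_term + hidden_term


def ensemble_predict(nets, X):
    """Average the predictions of the individual networks."""
    return np.mean([forward(net, X) for net in nets], axis=0)


rng = np.random.default_rng(0)
X = rng.normal(size=(5, 20))
nets = [init_network(20, [8, 4], rng) for _ in range(3)]
# Stand-in for training: pretend the input penalty zeroed ten gates.
nets[0]["gates"][10:] = 0.0
preds = ensemble_predict(nets, X)
```

In a full training loop, `penalty` would be added to the prediction loss, and the two penalty parameters would be the only values tuned, for example by cross-validation.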

research
11/21/2017

Sparse-Input Neural Networks for High-dimensional Nonparametric Regression and Classification

Neural networks are usually not the tool of choice for nonparametric hig...
research
02/01/2013

Regression shrinkage and grouping of highly correlated predictors with HORSES

Identifying homogeneous subgroups of variables can be challenging in hig...
research
12/22/2018

Random Projection in Deep Neural Networks

This work investigates the ways in which deep learning methods can benef...
research
07/13/2021

For high-dimensional hierarchical models, consider exchangeability of effects across covariates instead of across datasets

Hierarchical Bayesian methods enable information sharing across multiple...
research
12/29/2021

Model Averaging for Support Vector Machine by J-fold Cross-Validation

Support vector machine (SVM) is a classical tool to deal with classifica...
research
01/16/2019

Smooth Adjustment for Correlated Effects

This paper considers a high dimensional linear regression model with cor...
