Statistical Learning using Sparse Deep Neural Networks in Empirical Risk Minimization

08/12/2021
by   Shujie Ma, et al.

We consider a sparse deep ReLU network (SDRN) estimator obtained from empirical risk minimization with a Lipschitz loss function in the presence of a large number of features. Our framework can be applied to a variety of regression and classification problems. The unknown target function to be estimated is assumed to lie in a Sobolev space with mixed derivatives. Functions in this space need only satisfy a smoothness condition rather than having a compositional structure. We develop non-asymptotic excess risk bounds for our SDRN estimator. We further show that the SDRN estimator can achieve the same minimax rate of estimation (up to logarithmic factors) as one-dimensional nonparametric regression when the dimension of the features is fixed, and that the estimator has a suboptimal rate when the dimension grows with the sample size. We show that the depth and the total number of nodes and weights of the ReLU network need to grow as the sample size increases to ensure good performance, and we also investigate how fast they should increase with the sample size. These results provide important theoretical guidance and a basis for empirical studies with deep neural networks.
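The setup in the abstract can be illustrated with a minimal sketch (not the authors' implementation): a small one-hidden-layer ReLU network fitted by minimizing an empirical risk under a Lipschitz loss (the Huber loss here), with soft-thresholding of the weights to encourage sparsity. The data-generating function, network width, step size, and sparsity level below are all illustrative choices, not values from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy regression data: a smooth target in d = 2 features
# (a hypothetical example, not the paper's simulation design).
n, d = 200, 2
X = rng.uniform(-1.0, 1.0, size=(n, d))
y = np.sin(np.pi * X[:, 0]) * np.cos(np.pi * X[:, 1])

def huber_risk(pred, y):
    """Empirical risk under the Huber loss, which is Lipschitz."""
    r = pred - y
    return float(np.mean(np.where(np.abs(r) <= 1.0,
                                  0.5 * r ** 2,
                                  np.abs(r) - 0.5)))

# One-hidden-layer ReLU network: a small stand-in for the deeper SDRN.
m = 32
W1 = rng.normal(0.0, 0.5, size=(d, m)); b1 = np.zeros(m)
w2 = rng.normal(0.0, 0.5, size=m);      b2 = 0.0

def forward(X):
    H = np.maximum(X @ W1 + b1, 0.0)        # ReLU activations
    return H, H @ w2 + b2

init_risk = huber_risk(forward(X)[1], y)

lr, lam = 0.05, 1e-3                        # step size, sparsity level
for _ in range(2000):
    H, pred = forward(X)
    g = np.clip(pred - y, -1.0, 1.0) / n    # Huber gradient, bounded by 1
    grad_w2, grad_b2 = H.T @ g, g.sum()
    gH = np.outer(g, w2) * (H > 0.0)        # backprop through the ReLU
    grad_W1, grad_b1 = X.T @ gH, gH.sum(axis=0)
    W1 -= lr * grad_W1; b1 -= lr * grad_b1
    w2 -= lr * grad_w2; b2 -= lr * grad_b2
    # Soft-threshold the weights toward zero to encourage sparsity.
    W1 = np.sign(W1) * np.maximum(np.abs(W1) - lr * lam, 0.0)
    w2 = np.sign(w2) * np.maximum(np.abs(w2) - lr * lam, 0.0)

final_risk = huber_risk(forward(X)[1], y)
print(init_risk, final_risk)
```

The bounded gradient of the Huber loss is what the Lipschitz condition buys in practice: each update step has controlled magnitude regardless of outliers in the residuals.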


Related research

- How do noise tails impact on deep ReLU networks? (03/20/2022)
- Nonregular and Minimax Estimation of Individualized Thresholds in High Dimension with Binary Responses (05/26/2019)
- On the minimax optimality and superiority of deep neural network learning over sparse parameter spaces (05/22/2019)
- Deep Nonparametric Estimation of Operators between Infinite Dimensional Spaces (01/01/2022)
- On the Shattering Coefficient of Supervised Learning Algorithms (11/13/2019)
- Bayesian Free Energy of Deep ReLU Neural Network in Overparametrized Cases (03/28/2023)
- Complexity, Statistical Risk, and Metric Entropy of Deep Nets Using Total Path Variation (02/02/2019)
