Data-dependent Sample Complexity of Deep Neural Networks via Lipschitz Augmentation

05/09/2019
by Colin Wei, et al.

Existing Rademacher complexity bounds for neural networks rely only on norm control of the weight matrices and depend exponentially on depth via a product of the matrix norms. Lower bounds show that this exponential dependence on depth is unavoidable when no additional properties of the training data are considered. We suspect that this conundrum arises because these bounds depend on the training data only through the margin. In practice, many data-dependent techniques such as Batchnorm improve generalization performance. We obtain tighter Rademacher complexity bounds by considering additional data-dependent properties of the network: the norms of the hidden layers of the network, and the norms of the Jacobians of each layer with respect to previous layers. Our bounds scale polynomially in depth when these empirical quantities are small, as is usually the case in practice. To obtain these bounds, we develop general tools for making a composition of functions Lipschitz by augmentation and then covering the augmented function. Inspired by our theory, we directly regularize the network's Jacobians during training and empirically demonstrate that this improves test performance.
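To make the depth dependence concrete, here is a schematic illustration (not the paper's exact bound): norm-based bounds control a network's Lipschitz constant by the product of per-layer operator norms, so for a depth-$d$ network $f = f_d \circ \cdots \circ f_1$ with 1-Lipschitz activations,

$$\mathrm{Lip}(f) \;\le\; \prod_{i=1}^{d} \|W_i\|_{\mathrm{op}},$$

and if every layer satisfies $\|W_i\|_{\mathrm{op}} = 1 + \varepsilon$, this product is $(1+\varepsilon)^d$, exponential in depth. Replacing this worst-case product with Jacobian norms measured on the training data is what allows polynomial scaling in depth.

The abstract's final remark, regularizing the network's Jacobians during training, also admits a short sketch. Below is a minimal, hypothetical PyTorch version, not the authors' implementation: it penalizes a Hutchinson-style estimate of the squared Frobenius norm of the Jacobian between consecutive hidden layers (for $v \sim \mathcal{N}(0, I)$, $\mathbb{E}\,\|v^\top J\|_2^2 = \|J\|_F^2$). The architecture, the penalty weight `lam`, and `num_proj` are illustrative choices.

```python
# Minimal sketch of Jacobian regularization (illustrative, not the
# paper's code). Penalizes an estimate of ||d h_{i+1} / d h_i||_F^2
# for consecutive layers, via random vector-Jacobian products.
import torch
import torch.nn as nn
import torch.nn.functional as F

def frob_jacobian_penalty(out, inp, num_proj=1):
    """Hutchinson-style estimate of ||d out / d inp||_F^2:
    for v ~ N(0, I), E ||v^T J||^2 = ||J||_F^2."""
    total = 0.0
    for _ in range(num_proj):
        v = torch.randn_like(out)
        # vector-Jacobian product v^T J, kept differentiable so the
        # penalty itself can be trained through
        (vjp,) = torch.autograd.grad(
            out, inp, grad_outputs=v, create_graph=True)
        total = total + vjp.pow(2).sum() / out.shape[0]  # batch average
    return total / num_proj

class MLP(nn.Module):
    def __init__(self, dims):
        super().__init__()
        self.linears = nn.ModuleList(
            [nn.Linear(a, b) for a, b in zip(dims[:-1], dims[1:])])

    def forward(self, x):
        hs = [x]  # keep intermediate activations for the penalty
        for i, lin in enumerate(self.linears):
            h = lin(hs[-1])
            if i + 1 < len(self.linears):
                h = F.relu(h)
            hs.append(h)
        return hs

def training_loss(model, x, y, lam=1e-2):
    x = x.clone().requires_grad_(True)  # so grads w.r.t. the input exist
    hs = model(x)
    loss = F.cross_entropy(hs[-1], y)
    for h_in, h_out in zip(hs[:-1], hs[1:]):
        loss = loss + lam * frob_jacobian_penalty(h_out, h_in)
    return loss
```

A hypothetical usage example, with random stand-in data:

```python
model = MLP([784, 256, 256, 10])
opt = torch.optim.SGD(model.parameters(), lr=0.1)
x_batch, y_batch = torch.randn(64, 784), torch.randint(0, 10, (64,))
loss = training_loss(model, x_batch, y_batch, lam=1e-2)
opt.zero_grad()
loss.backward()
opt.step()
```

The random-projection estimator avoids materializing full Jacobians, so the overhead per step is one extra backward pass per penalized layer pair.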


Related research

12/18/2017 · Size-Independent Sample Complexity of Neural Networks
We study the sample complexity of learning neural networks, by providing...

10/09/2019 · Improved Sample Complexities for Deep Networks and Robust Classification via an All-Layer Margin
For linear classifiers, the relationship between (normalized) output mar...

06/03/2023 · On Size-Independent Sample Complexity of ReLU Networks
We study the sample complexity of learning ReLU neural networks from the...

08/08/2022 · On Rademacher Complexity-based Generalization Bounds for Deep Learning
In this paper, we develop some novel bounds for the Rademacher complexit...

06/26/2017 · Spectrally-normalized margin bounds for neural networks
This paper presents a margin-based multiclass generalization bound for n...

09/28/2020 · Learning Deep ReLU Networks Is Fixed-Parameter Tractable
We consider the problem of learning an unknown ReLU network with respect...

10/22/2019 · Global Capacity Measures for Deep ReLU Networks via Path Sampling
Classical results on the statistical complexity of linear models have co...
