Towards Quantifying Intrinsic Generalization of Deep ReLU Networks

10/18/2019
by   Shaeke Salman, et al.
0

Understanding the underlying mechanisms that enable the empirical successes of deep neural networks is essential for further improving their performance and explaining such networks. Towards this goal, a specific question is how to explain the "surprising" behavior of the same over-parametrized deep neural networks that can generalize well on real datasets and at the same time "memorize" training samples when the labels are randomized. In this paper, we demonstrate that deep ReLU networks generalize from training samples to new points via piece-wise linear interpolation. We provide a quantified analysis on the generalization ability of a deep ReLU network: Given a fixed point x and a fixed direction in the input space S, there is always a segment such that any point on the segment will be classified the same as the fixed point x. We call this segment the generalization interval. We show that the generalization intervals of a ReLU network behave similarly along pairwise directions between samples of the same label in both real and random cases on the MNIST and CIFAR-10 datasets. This result suggests that the same interpolation mechanism is used in both cases. Additionally, for datasets using real labels, such networks provide a good approximation of the underlying manifold in the data, where the changes are much smaller along tangent directions than along normal directions. On the other hand, however, for datasets with random labels, generalization intervals along mid-lines of triangles with the same label are much smaller than those on the datasets with real labels, suggesting different behaviors along other directions. Our systematic experiments demonstrate for the first time that such deep neural networks generalize through the same interpolation and explain the differences between their performance on datasets with real and random labels.

READ FULL TEXT

page 1

page 2

page 3

page 4

research
03/19/2018

On the importance of single directions for generalization

Despite their ability to memorize large datasets, deep neural networks o...
research
02/23/2018

Sensitivity and Generalization in Neural Networks: an Empirical Study

In practice it is often found that large over-parameterized neural netwo...
research
03/22/2018

Gradient Descent Quantizes ReLU Network Features

Deep neural networks are often trained in the over-parametrized regime (...
research
07/15/2022

Error analysis for deep neural network approximations of parametric hyperbolic conservation laws

We derive rigorous bounds on the error resulting from the approximation ...
research
01/24/2019

Fine-Grained Analysis of Optimization and Generalization for Overparameterized Two-Layer Neural Networks

Recent works have cast some light on the mystery of why deep nets fit an...
research
05/26/2022

Training ReLU networks to high uniform accuracy is intractable

Statistical learning theory provides bounds on the necessary number of t...

Please sign up or login with your details

Forgot password? Click here to reset