Over-parametrized deep neural networks do not generalize well

12/09/2019
by Michael Kohler, et al.

Recently, it was shown in several papers that backpropagation is able to find the global minimum of the empirical risk on the training data when using over-parametrized deep neural networks. In this paper, a similar result is shown for deep neural networks with the sigmoidal squasher activation function in a regression setting, and a lower bound is presented which proves that these networks do not generalize well on new data, in the sense that they do not achieve the optimal minimax rate of convergence for the estimation of smooth regression functions.
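The failure mode behind such a lower bound can be illustrated with a minimal numpy sketch. Everything below is an illustrative assumption rather than the paper's construction: the target function m, the noise level sigma, the network width, and the shortcut of freezing the first layer and solving least squares for the output weights (which, since the risk is then convex in those weights, yields a global minimizer of the empirical risk, standing in for the gradient-descent convergence result). The over-parametrized sigmoid network interpolates the noisy training data exactly, yet its error against the noiseless regression function need not be small.

```python
import numpy as np

rng = np.random.default_rng(0)

# Smooth target regression function on [0, 1] (illustrative choice).
def m(x):
    return np.sin(2 * np.pi * x)

n, width, sigma = 50, 2000, 0.3   # n samples, width >> n, noise level

# Noisy training data from the regression model Y = m(X) + eps.
x_train = rng.uniform(0, 1, size=(n, 1))
y_train = m(x_train).ravel() + sigma * rng.standard_normal(n)

# One-hidden-layer network with sigmoid ("squasher") activation.
# The first-layer weights are random and frozen, so the empirical risk
# is convex in the output weights and the least-squares solve below
# reaches a global minimum -- a simplified stand-in for the
# backpropagation convergence results discussed in the abstract.
W = rng.standard_normal((1, width)) * 20.0
b = rng.uniform(-20, 20, size=width)
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

H_train = sigmoid(x_train @ W + b)            # (n, width) feature matrix
coef, *_ = np.linalg.lstsq(H_train, y_train, rcond=None)

train_mse = np.mean((H_train @ coef - y_train) ** 2)

# Test error measured against the noiseless regression function m.
x_test = np.linspace(0, 1, 1000).reshape(-1, 1)
pred = sigmoid(x_test @ W + b) @ coef
test_mse = np.mean((pred - m(x_test).ravel()) ** 2)

print(f"train MSE: {train_mse:.2e}  (near zero: the noisy data are interpolated)")
print(f"test  MSE: {test_mse:.2e}  (need not approach the minimax rate)")
```

For context, the classical benchmark here is Stone's minimax rate: for (p,C)-smooth regression functions in d dimensions, no estimator can converge faster than n^{-2p/(2p+d)} in expected L2 error, and the paper's lower bound shows that these interpolating networks fall short of that optimal rate.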
