Over-parametrized deep neural networks do not generalize well

12/09/2019
by Michael Kohler, et al.

Recently it was shown in several papers that backpropagation is able to find the global minimum of the empirical risk on the training data when using over-parametrized deep neural networks. In this paper a similar result is shown for deep neural networks with the sigmoidal squasher activation function in a regression setting, and a lower bound is presented which proves that these networks do not generalize well on new data, in the sense that they do not achieve the optimal minimax rate of convergence for the estimation of smooth regression functions.
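To make the lower-bound claim concrete, the following LaTeX sketch records the standard minimax framing it is measured against. The (p,C)-smooth function class and the benchmark rate n^{-2p/(2p+d)} are the classical nonparametric regression results (Stone, 1982); the exact constants and rate appearing in the paper's own lower bound are not reproduced here.

% Sketch of the standard minimax benchmark assumed above; this is the
% classical framing, not a reproduction of the paper's exact lower bound.
\documentclass{article}
\usepackage{amsmath,amssymb}
\begin{document}
Given i.i.d.\ data $(X_1,Y_1),\dots,(X_n,Y_n)$ with regression function
$m(x) = \mathbb{E}[Y \mid X = x]$, an estimate $m_n$ is judged by its
expected $L_2$ error
\[
  \mathbb{E} \int \bigl( m_n(x) - m(x) \bigr)^2 \, \mathbf{P}_X(dx).
\]
Over the class of $(p,C)$-smooth regression functions on $\mathbb{R}^d$,
the optimal minimax rate of convergence of this error is
\[
  n^{-\frac{2p}{2p+d}}.
\]
In this vocabulary, ``does not generalize well'' means: there exist
$(p,C)$-smooth regression functions for which the over-parametrized,
interpolating network estimates converge strictly slower than
$n^{-2p/(2p+d)}$.
\end{document}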


Related research

10/12/2020
Analysis of the rate of convergence of fully connected deep neural network regression estimates with smooth activation function
This article contributes to the current statistical theory of deep neura...

05/31/2021
Representation Learning Beyond Linear Prediction Functions
Recent papers on the theory of representation learning have shown the imp...

07/20/2021
Estimation of a regression function on a manifold by fully connected deep neural networks
Estimation of a regression function from independent and identically dis...

05/17/2018
DNN or k-NN: That is the Generalize vs. Memorize Question
This paper studies the relationship between the classification performed...

12/31/2022
Smooth Mathematical Function from Compact Neural Networks
This is a paper on smooth function approximation by neural networks (...

06/22/2021
The Rate of Convergence of Variation-Constrained Deep Neural Networks
Multi-layer feedforward networks have been used to approximate a wide ra...

02/15/2023
Excess risk bound for deep learning under weak dependence
This paper considers deep neural networks for learning weakly dependent ...
