Do Deeper Convolutional Networks Perform Better?

10/19/2020
by Eshaan Nichani et al.

Over-parameterization is a recent topic of much interest in the machine learning community. While over-parameterized neural networks are capable of perfectly fitting (interpolating) training data, these networks often perform well on test data, thereby contradicting classical learning theory. Recent work provided an explanation for this phenomenon by introducing the double descent curve, showing that increasing model capacity past the interpolation threshold can lead to a decrease in test error. In line with this, it was recently shown empirically and theoretically that increasing neural network capacity through width leads to double descent. In this work, we analyze the effect of increasing depth on test performance. In contrast to what is observed for increasing width, we demonstrate through a variety of classification experiments on CIFAR10 and ImageNet32 using ResNets and fully-convolutional networks that test performance worsens beyond a critical depth. We posit an explanation for this phenomenon by drawing intuition from the principle of minimum norm solutions in linear networks.
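
The minimum-norm intuition from the last sentence can be made concrete in the linear setting. The sketch below is illustrative only, not code from the paper: in an over-parameterized linear regression there are infinitely many weight vectors that interpolate the training data, and `numpy.linalg.lstsq` returns the one with minimum L2 norm (the pseudoinverse solution, which gradient descent also reaches from zero initialization).

```python
import numpy as np

# Over-parameterized linear regression: more features (d) than samples (n),
# so infinitely many weight vectors fit the training data exactly.
rng = np.random.default_rng(0)
n, d = 20, 100
X = rng.normal(size=(n, d))
y = rng.normal(size=n)

# On an underdetermined system, lstsq returns the minimum-L2-norm
# interpolating solution, equivalent to w = pinv(X) @ y.
w_min, *_ = np.linalg.lstsq(X, y, rcond=None)

# Adding any null-space direction of X leaves the fit unchanged but
# produces a different interpolant with strictly larger norm.
_, _, Vt = np.linalg.svd(X)
null_dir = Vt[-1]  # a right singular vector in the null space (since d > n)
w_other = w_min + 5.0 * null_dir

print("train residual (min-norm):", np.linalg.norm(X @ w_min - y))
print("train residual (other):   ", np.linalg.norm(X @ w_other - y))
print("norm of min-norm solution:", np.linalg.norm(w_min))
print("norm of other interpolant:", np.linalg.norm(w_other))
```

Both solutions fit the training data exactly, but only one has minimum norm. The abstract invokes this principle purely as intuition; the full text develops how it carries over to depth in convolutional networks.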

Related research

03/08/2021 · Asymptotics of Ridge Regression in Convolutional Models
Understanding generalization and estimation error of estimators for simp...

03/24/2023 · Double Descent Demystified: Identifying, Interpreting & Ablating the Sources of a Deep Learning Puzzle
Double descent is a surprising phenomenon in machine learning, in which ...

12/06/2021 · Multi-scale Feature Learning Dynamics: Insights for Double Descent
A key challenge in building theoretical foundations for deep learning is...

07/15/2023 · Does Double Descent Occur in Self-Supervised Learning?
Most investigations into double descent have focused on supervised model...

03/14/2022 · Phenomenology of Double Descent in Finite-Width Neural Networks
'Double descent' delineates the generalization behaviour of models depen...

08/17/2022 · Superior generalization of smaller models in the presence of significant label noise
The benefits of over-parameterization in achieving superior generalizati...

09/25/2019 · Benefit of Interpolation in Nearest Neighbor Algorithms
The over-parameterized models attract much attention in the era of data ...
