The Nonlinearity Coefficient - Predicting Overfitting in Deep Neural Networks

06/01/2018
by George Philipp, et al.

For a long time, designing neural architectures that exhibit high performance was considered a dark art requiring expert hand-tuning. One of the few well-known guidelines for architecture design is the avoidance of exploding gradients, though even this guideline has remained relatively vague and circumstantial. We introduce the nonlinearity coefficient (NLC), a measure of the complexity of the function computed by a neural network, based on the magnitude of the network's gradient. Via an extensive empirical study, we show that the NLC is a powerful predictor of test error and that attaining a right-sized NLC is essential for optimal performance. The NLC exhibits a range of intriguing and important properties. It is closely tied to the amount of information gained from computing a single network gradient. It is tied to the error incurred when replacing the nonlinearity operations in the network with linear operations. It is not susceptible to the confounders of multiplicative scaling, additive bias, and layer width. It is stable from layer to layer. Hence, we argue that the NLC is the first robust predictor of overfitting in deep networks.
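The abstract characterizes the NLC only informally, as a gradient-magnitude-based complexity measure. The sketch below shows one plausible way to estimate such a quantity in PyTorch for a feedforward model mapping a (batch, d_in) tensor to a (batch, d_out) tensor; the normalization used here (Jacobian-propagated input variance divided by output variance, with a diagonal-covariance approximation) is our illustrative assumption, not necessarily the paper's exact definition.

import torch
from torch.autograd.functional import jvp

def estimate_nlc(model, x, n_probes=8):
    """Monte-Carlo estimate of an NLC-style quantity for `model` on batch `x`.

    Rough idea (our reading of the abstract, not the paper's exact formula):
        NLC^2 ~ E_x[ tr(J(x) Cov_x J(x)^T) ] / tr(Cov_f)
    where J(x) is the input-output Jacobian, Cov_x the input covariance
    (approximated here as diagonal) and Cov_f the output covariance.
    A value near 1 means the network acts roughly linearly on this data;
    larger values indicate a more nonlinear computed function.
    """
    model.eval()
    with torch.no_grad():
        y = model(x)
        out_trace = y.var(dim=0, unbiased=False).sum()  # tr(Cov_f), diagonal estimate
        in_std = x.std(dim=0, unbiased=False)           # diag(Cov_x)^(1/2)

    # tr(J Cov_x J^T) = E_v ||J v||^2 with v ~ N(0, Cov_x); estimate it
    # with a few Jacobian-vector products instead of a full Jacobian.
    jac_trace = 0.0
    for _ in range(n_probes):
        v = torch.randn_like(x) * in_std                # diagonal-covariance probe
        _, jv = jvp(model, (x,), (v,))                  # jv = J(x) v, row by row
        jac_trace = jac_trace + jv.flatten(1).pow(2).sum(dim=1).mean() / n_probes

    return (jac_trace / out_trace).sqrt()

Under this illustrative definition, a purely linear network f(x) = Wx + b yields exactly 1, since Cov_f = W Cov_x W^T makes numerator and denominator coincide, and the value is unchanged by rescaling W or shifting b; this is consistent with the abstract's claim that the NLC is immune to the confounders of multiplicative scaling and additive bias.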

