An analytic theory of shallow networks dynamics for hinge loss classification

06/19/2020
by   Franco Pellegrini, et al.

Neural networks have been shown to perform incredibly well in classification tasks over structured high-dimensional datasets. However, the learning dynamics of such networks are still poorly understood. In this paper we study in detail the training dynamics of a simple type of neural network: a single hidden layer trained to perform a classification task. We show that in a suitable mean-field limit this case maps to a single-node learning problem with a time-dependent dataset determined self-consistently from the average node population. We specialize our theory to the prototypical case of a linearly separable dataset and a linear hinge loss, for which the dynamics can be explicitly solved. This allows us to address in a simple setting several phenomena appearing in modern networks, such as slowing down of training dynamics, crossover between rich and lazy learning, and overfitting. Finally, we assess the limitations of mean-field theory by studying the case of a large but finite number of nodes and training samples.
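The setting described in the abstract can be made concrete with a small numerical sketch. This is not the authors' code. It assumes a ReLU hidden layer, fixed random ±1 output weights, a mean-field 1/N output scaling, a hinge loss with unit margin, and full-batch gradient descent. The function names and all hyperparameters are hypothetical choices made for illustration.

```python
import numpy as np

def hinge_loss(margins):
    # Linear hinge loss max(0, 1 - y f(x)), averaged over samples.
    # The unit margin is an assumption, not taken from the paper.
    return np.maximum(0.0, 1.0 - margins).mean()

def train_shallow_hinge(X, y, n_hidden=50, lr=0.5, steps=200, seed=0):
    """Full-batch gradient descent on a one-hidden-layer ReLU network.

    Output: f(x) = (1/N) * sum_i a_i * relu(w_i . x / sqrt(d)),
    with a_i fixed to +/-1 (an illustrative simplification).
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    W = rng.normal(size=(n_hidden, d))            # first-layer weights
    a = rng.choice([-1.0, 1.0], size=n_hidden)    # fixed output signs
    losses = []
    for _ in range(steps):
        pre = X @ W.T / np.sqrt(d)                # pre-activations, shape (n, N)
        f = np.maximum(pre, 0.0) @ a / n_hidden   # network output, shape (n,)
        margins = y * f
        losses.append(hinge_loss(margins))
        # Only samples with margin below 1 contribute to the gradient.
        active = (margins < 1.0).astype(float)
        grad_pre = -(active * y)[:, None] * (pre > 0) * a[None, :] / n_hidden
        grad_W = grad_pre.T @ X / (np.sqrt(d) * n)
        # Learning rate scaled by N so each node moves at O(1) speed,
        # a common mean-field convention (assumed here).
        W -= lr * n_hidden * grad_W
    return W, a, losses

# Linearly separable toy data: labels given by a random teacher direction.
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 20))
y = np.sign(X @ rng.normal(size=20))
W, a, losses = train_shallow_hinge(X, y)
print(f"initial loss {losses[0]:.3f}, final loss {losses[-1]:.3f}")
```

Because the hinge gradient vanishes for every sample whose margin exceeds 1, the set of samples driving the update shrinks as training proceeds. In this sketch, that shrinking active set is the counterpart of the time-dependent dataset mentioned in the abstract.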

research
10/17/2021

A Riemannian Mean Field Formulation for Two-layer Neural Networks with Batch Normalization

The training dynamics of two-layer neural networks with batch normalizat...
research
10/05/2022

The Influence of Learning Rule on Representation Dynamics in Wide Neural Networks

It is unclear how changing the learning rule of a deep neural network al...
research
08/21/2020

A Dynamical Central Limit Theorem for Shallow Neural Networks

Recent theoretical work has characterized the dynamics of wide shallow n...
research
02/23/2021

Classifying high-dimensional Gaussian mixtures: Where kernel methods fail and neural networks succeed

A recent series of theoretical works showed that the dynamics of neural ...
research
12/02/2019

Capacity of the covariance perceptron

The classical perceptron is a simple neural network that performs a bina...
research
10/22/2020

Global optimality of softmax policy gradient with single hidden layer neural networks in the mean-field regime

We study the problem of policy optimization for infinite-horizon discoun...
research
03/19/2018

Comparing Dynamics: Deep Neural Networks versus Glassy Systems

We analyze numerically the training dynamics of deep neural networks (DN...
