SGD Distributional Dynamics of Three Layer Neural Networks

12/30/2020
by Victor Luo, et al.

With the rise of big data analytics, multi-layer neural networks have emerged as one of the most powerful machine learning methods. However, their theoretical mathematical properties are still not fully understood. Training a neural network requires optimizing a non-convex objective function, typically via stochastic gradient descent (SGD). In this paper, we extend the mean field results of Mei et al. (2018) from two-layer neural networks with one hidden layer to three-layer neural networks with two hidden layers. We show that the SGD dynamics are captured by a set of non-linear partial differential equations, and prove that the distributions of weights in the two hidden layers are independent. We also detail exploratory work based on simulations and real-world data.
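The training procedure the abstract refers to, SGD on a three-layer network with two hidden layers, can be sketched as follows. This is a minimal illustrative example, not the paper's formal setup: the layer widths, step size, ReLU activations, and synthetic regression data are all assumptions made for the sketch.

```python
import numpy as np

# Illustrative assumptions: input dimension, hidden widths, step size.
rng = np.random.default_rng(0)
d, n1, n2, n_samples = 5, 32, 32, 2000
lr = 0.01

# Synthetic regression data (an assumption): y = <w_star, x> + noise.
w_star = rng.normal(size=d)
X = rng.normal(size=(n_samples, d))
y = X @ w_star + 0.1 * rng.normal(size=n_samples)

# Three-layer network f(x) = a . relu(W2 @ relu(W1 @ x)),
# i.e. two hidden layers followed by a linear output layer.
W1 = rng.normal(size=(n1, d)) / np.sqrt(d)
W2 = rng.normal(size=(n2, n1)) / np.sqrt(n1)
a = rng.normal(size=n2) / np.sqrt(n2)

def forward(x):
    h1 = np.maximum(W1 @ x, 0.0)   # first hidden layer (ReLU)
    h2 = np.maximum(W2 @ h1, 0.0)  # second hidden layer (ReLU)
    return h1, h2, a @ h2          # scalar output

def mean_squared_loss():
    preds = np.array([forward(x)[2] for x in X])
    return np.mean((preds - y) ** 2)

loss_before = mean_squared_loss()

# One-pass SGD: each sample is visited once, matching the online
# (streaming) setting common in mean-field analyses of SGD.
for x, t in zip(X, y):
    h1, h2, out = forward(x)
    err = out - t                      # d/d(out) of 0.5*(out - t)^2
    g_a = err * h2                     # output-layer gradient
    d_z2 = err * a * (h2 > 0)          # backprop through second ReLU
    g_W2 = np.outer(d_z2, h1)
    d_z1 = (W2.T @ d_z2) * (h1 > 0)    # backprop through first ReLU
    g_W1 = np.outer(d_z1, x)
    a -= lr * g_a
    W2 -= lr * g_W2
    W1 -= lr * g_W1

loss_after = mean_squared_loss()
print(loss_before, loss_after)
```

In the mean-field picture studied by the paper, one tracks the empirical distribution of the rows of `W1` and `W2` (rather than individual weights) as the widths grow large; the PDEs describe how those distributions evolve under this per-sample update rule.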

