Multilevel Initialization for Layer-Parallel Deep Neural Network Training

12/19/2019
by   Eric C. Cyr, et al.
0

This paper investigates multilevel initialization strategies for training very deep neural networks with a layer-parallel multigrid solver. The scheme is based on the continuous interpretation of the training problem as a problem of optimal control, in which neural networks are represented as discretizations of time-dependent ordinary differential equations. A key goal is to develop a method able to intelligently initialize the network parameters for the very deep networks enabled by scalable layer-parallel training. To do this, we apply a refinement strategy across the time domain, that is equivalent to refining in the layer dimension. The resulting refinements create deep networks, with good initializations for the network parameters coming from the coarser trained networks. We investigate the effectiveness of such multilevel "nested iteration" strategies for network training, showing supporting numerical evidence of reduced run time for equivalent accuracy. In addition, we study whether the initialization strategies provide a regularizing effect on the overall training process and reduce sensitivity to hyperparameters and randomness in initial network parameters.

READ FULL TEXT

page 1

page 2

page 3

page 4

research
12/11/2018

Layer-Parallel Training of Deep Residual Neural Networks

Residual neural networks (ResNets) are a promising class of deep neural ...
research
11/11/2022

Multilevel-in-Layer Training for Deep Neural Network Regression

A common challenge in regression is that for many problems, the degrees ...
research
04/13/2020

Multilevel Minimization for Deep Residual Networks

We present a new multilevel minimization framework for the training of d...
research
02/17/2021

Multilevel Monte Carlo learning

In this work, we study the approximation of expected values of functiona...
research
12/22/2018

Random Projection in Deep Neural Networks

This work investigates the ways in which deep learning methods can benef...
research
03/27/2018

Incremental Training of Deep Convolutional Neural Networks

We propose an incremental training method that partitions the original n...
research
06/25/2018

Pushing the boundaries of parallel Deep Learning -- A practical approach

This work aims to assess the state of the art of data parallel deep neur...

Please sign up or login with your details

Forgot password? Click here to reset