AutoInit: Automatic Initialization via Jacobian Tuning

06/27/2022
by Tianyu He, et al.

Good initialization is essential for training Deep Neural Networks (DNNs). Such an initialization is often found by trial and error, which must be repeated whenever an architecture is substantially modified, or it is inherited from smaller networks, which leads to sub-optimal results. In this work we introduce a new, computationally cheap algorithm that finds a good initialization automatically for general feed-forward DNNs. The algorithm uses the Jacobian between adjacent network blocks to tune the network hyperparameters to criticality. We solve the dynamics of the algorithm for fully connected networks with ReLU activations and derive conditions for its convergence. We then extend the discussion to more general architectures with BatchNorm and residual connections. Finally, we apply our method to the ResMLP and VGG architectures, where the automatic one-shot initialization found by our method performs well on vision tasks.
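The Jacobian-tuning idea can be made concrete with a small experiment. Below is a minimal PyTorch sketch (our own illustration, not the authors' released code): for each Linear+ReLU block of a toy fully connected network, a random Jacobian-vector product estimates the average squared singular value of the block-to-block Jacobian, and the block's weights are rescaled so that this estimate is driven to 1, the criticality condition. The network widths, batch size, and single-step rescaling rule are illustrative assumptions.

```python
# Sketch of Jacobian-based initialization tuning (assumed details:
# toy widths, random probes, one multiplicative rescale per block).
import torch
import torch.nn as nn

torch.manual_seed(0)

# A toy fully connected ReLU network, one Linear+ReLU pair per "block".
widths = [512, 512, 512, 512]
blocks = nn.ModuleList(
    nn.Sequential(nn.Linear(widths[i], widths[i + 1]), nn.ReLU())
    for i in range(len(widths) - 1)
)

x = torch.randn(256, widths[0])  # a batch of probe inputs

for block in blocks:
    # Estimate the mean squared singular value of the block Jacobian
    # via a Jacobian-vector product against a random direction v:
    # ratio ~ E ||J v||^2 / ||v||^2.
    v = torch.randn_like(x)
    _, jvp = torch.autograd.functional.jvp(block, x, v)
    ratio = jvp.pow(2).sum() / v.pow(2).sum()

    # For a ReLU block the Jacobian is linear in the weights, so one
    # rescale by 1/sqrt(ratio) drives the estimate to criticality (= 1).
    with torch.no_grad():
        block[0].weight /= ratio.sqrt()

    # Propagate the probe batch to tune the next block.
    x = block(x).detach()
```

Rescaling by 1/sqrt(ratio) works in one step here because a positive rescale of a Linear layer's weights leaves the ReLU activation pattern unchanged while scaling the block Jacobian linearly; for blocks with BatchNorm or residual connections the dependence is no longer linear, and the tuning step generally needs to be iterated.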

