ZerO Initialization: Initializing Residual Networks with only Zeros and Ones

10/25/2021
by Jiawei Zhao, et al.

Deep neural networks are usually initialized with random weights, with the initial variance chosen to ensure stable signal propagation during training. However, there is no consensus on how to select this variance, and the choice becomes increasingly challenging as the number of layers grows. In this work, we replace the widely used random weight initialization with a fully deterministic initialization scheme, ZerO, which initializes residual networks with only zeros and ones. By augmenting the standard ResNet architectures with a few extra skip connections and Hadamard transforms, ZerO allows us to start training entirely from zeros and ones. This has many benefits, such as improving reproducibility (by reducing the variance across experimental runs) and enabling network training without batch normalization. Surprisingly, we find that ZerO achieves state-of-the-art performance on various image classification datasets, including ImageNet, which suggests that random weights may be unnecessary for modern network initialization.
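The core idea can be sketched in a few lines. The following is an illustrative, simplified sketch (not the paper's exact algorithm): a dimension-preserving or dimension-reducing layer is initialized as a partial identity (zeros and ones only), while a dimension-increasing layer uses a normalized Hadamard matrix (built here via the Sylvester construction) as a stand-in for the paper's Hadamard-transform step. The function names `hadamard` and `zero_init` are our own for this sketch.

```python
import numpy as np

def hadamard(n: int) -> np.ndarray:
    """Sylvester construction of an n x n Hadamard matrix (n a power of two)."""
    assert n > 0 and (n & (n - 1)) == 0, "n must be a power of two"
    H = np.array([[1.0]])
    while H.shape[0] < n:
        # Double the size: [[H, H], [H, -H]]
        H = np.block([[H, H], [H, -H]])
    return H

def zero_init(fan_out: int, fan_in: int) -> np.ndarray:
    """Deterministic, ZerO-style weight initialization (illustrative sketch).

    When fan_out <= fan_in, return a partial identity: the signal passes
    through unchanged and the weights contain only zeros and ones.
    When fan_out > fan_in, embed rows of a row-orthonormal Hadamard matrix,
    a simplified stand-in for the paper's dimension-increasing case.
    """
    if fan_out <= fan_in:
        W = np.zeros((fan_out, fan_in))
        W[:, :fan_out] = np.eye(fan_out)   # partial identity: zeros and ones only
        return W
    # Dimension-increasing layer: pad up to the next power of two for Sylvester.
    n = 1
    while n < fan_out:
        n *= 2
    H = hadamard(n) / np.sqrt(n)           # orthonormal rows preserve signal norm
    return H[:fan_out, :fan_in].copy()
```

Note the contrast with random schemes such as Kaiming initialization: every entry here is fixed in advance, so two training runs differ only through data ordering and other stochastic training components, not through the starting point.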


