Data-driven Weight Initialization with Sylvester Solvers

05/02/2021
by   Debasmit Das, et al.

In this work, we propose a data-driven scheme to initialize the parameters of a deep neural network. This is in contrast to traditional approaches, which randomly initialize parameters by sampling from transformed standard distributions and therefore do not use the training data to produce a more informed initialization. Our method uses a sequential layer-wise approach in which each layer is initialized using its input activations. The initialization is cast as an optimization problem that minimizes a combination of encoding and decoding losses on the input activations, further constrained by a user-defined latent code. The optimization problem is then restructured into the well-known Sylvester equation, which admits fast, gradient-free solutions. Our data-driven method achieves a boost in performance compared to random initialization methods, both before training begins and after it completes. We show that the proposed method is especially effective in few-shot and fine-tuning settings. We conclude the paper with analyses of time complexity and of the effect of different latent codes on recognition performance.
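The layer-wise step described above can be sketched as follows. This is an illustrative reconstruction, not the authors' released code: we assume the per-layer objective takes the form ||WA - Z||_F^2 + lambda*||W^T Z - A||_F^2, where A holds the layer's input activations (columns are samples) and Z is the user-defined latent code. Setting the gradient with respect to W to zero gives the Sylvester equation (lambda*Z*Z^T)W + W(A*A^T) = (1+lambda)*Z*A^T, which `scipy.linalg.solve_sylvester` solves without gradients. The function name `init_layer_weights` and the choice lambda = 1 are ours.

```python
import numpy as np
from scipy.linalg import solve_sylvester

def init_layer_weights(A, Z, lam=1.0):
    """Initialize a weight matrix W (m x n) from input activations A (n x N)
    and a user-defined latent code Z (m x N).

    Minimizes the assumed objective
        ||W A - Z||_F^2  +  lam * ||W^T Z - A||_F^2
    whose zero-gradient condition is the Sylvester equation
        (lam Z Z^T) W + W (A A^T) = (1 + lam) Z A^T.
    """
    P = lam * Z @ Z.T            # m x m, from the decoding term
    Q = A @ A.T                  # n x n, from the encoding term
    C = (1.0 + lam) * Z @ A.T    # m x n, right-hand side
    return solve_sylvester(P, Q, C)  # gradient-free closed-form solve

# Toy usage: 64-dim activations over 256 samples, 32-dim latent code.
rng = np.random.default_rng(0)
A = rng.standard_normal((64, 256))
Z = rng.standard_normal((32, 256))
W = init_layer_weights(A, Z)

# Verify the Sylvester equation holds (lam = 1 here).
resid = (Z @ Z.T) @ W + W @ (A @ A.T) - 2.0 * Z @ A.T
print(W.shape, float(np.abs(resid).max()))
```

In a full pipeline this solve would be repeated layer by layer: initialize layer 1 from the raw inputs, push the data through it to get the next layer's input activations, and so on, which is what makes the scheme sequential.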

Related research

02/05/2019
Linear Inequality Constraints for Neural Network Activations
We propose a method to impose linear inequality constraints on neural ne...

02/01/2017
PCA-Initialized Deep Neural Networks Applied To Document Image Analysis
In this paper, we present a novel approach for initializing deep neural ...

10/19/2017
Historical Document Image Segmentation with LDA-Initialized Deep Neural Networks
In this paper, we present a novel approach to perform deep neural networ...

06/03/2018
Minnorm training: an algorithm for training overcomplete deep neural networks
In this work, we propose a new training method for finding minimum weigh...

02/26/2021
Layer-Wise Interpretation of Deep Neural Networks Using Identity Initialization
The interpretability of neural networks (NNs) is a challenging but essen...

09/01/2011
An Efficient Codebook Initialization Approach for LBG Algorithm
In VQ based image compression technique has three major steps namely (i)...
