Weight Initialization of Deep Neural Networks (DNNs) using Data Statistics

10/29/2017
by Saiprasad Koturwar, et al.

Deep neural networks (DNNs) form the backbone of almost every state-of-the-art technique in fields such as computer vision, speech processing, and text analysis. Recent advances in computational technology have made the use of DNNs more practical. Yet despite the impressive performance of DNNs and these computational advances, very few researchers train their models from scratch; training DNNs remains a difficult and tedious job. The main challenges researchers face when training DNNs are the vanishing/exploding gradient problem and the highly non-convex nature of the objective function, which can have millions of variables. Optimizing such a high-dimensional loss function requires careful initialization of the network weights. The approaches suggested by He and Xavier mitigate the vanishing gradient problem by carefully scaling the initial weights. These approaches have been quite effective and achieve good results on standard datasets, but they do not work as well on more practical datasets. We believe this is because they make no use of data statistics when initializing the network weights. In this work, we propose a data-dependent initialization and analyze its performance against standard initialization techniques such as He and Xavier. We performed our experiments on several practical datasets, and the results show our algorithm's superior classification accuracy.
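For context, the He and Xavier schemes mentioned above set the variance of the initial weights from the layer's fan-in/fan-out alone, with no reference to the data. The sketch below shows both standard schemes (these formulas are well established), followed by one illustrative form of data-dependent initialization that rescales the weights until pre-activations computed on a real data batch have roughly unit variance, in the spirit of LSUV-style methods. The `data_dependent_init` function is a hypothetical sketch for illustration, not the authors' exact algorithm, which the abstract does not specify.

```python
import numpy as np

def xavier_init(fan_in, fan_out, rng):
    # Glorot/Xavier uniform: Var(W) = 2 / (fan_in + fan_out).
    limit = np.sqrt(6.0 / (fan_in + fan_out))
    return rng.uniform(-limit, limit, size=(fan_in, fan_out))

def he_init(fan_in, fan_out, rng):
    # He normal: Var(W) = 2 / fan_in, tailored to ReLU activations.
    return rng.normal(0.0, np.sqrt(2.0 / fan_in), size=(fan_in, fan_out))

def data_dependent_init(W, X_batch, tol=0.05, max_iter=10):
    # Hypothetical data-dependent variant (LSUV-style sketch, NOT the
    # paper's algorithm): rescale W until the pre-activations computed
    # on a batch of real data have approximately unit variance.
    for _ in range(max_iter):
        var = np.var(X_batch @ W)
        if abs(var - 1.0) < tol:
            break
        W = W / np.sqrt(var)
    return W

rng = np.random.default_rng(0)
X = rng.normal(size=(256, 784))                     # stand-in for a data batch
W = data_dependent_init(he_init(784, 128, rng), X)
print(np.var(X @ W))                                # close to 1.0
```

The key difference this illustrates: He and Xavier fix the weight scale from architecture alone, while a data-dependent scheme adapts the scale to the statistics of the inputs actually flowing through each layer.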


