Hamiltonian Deep Neural Networks Guaranteeing Non-vanishing Gradients by Design

05/27/2021
by Clara Lucía Galimberti, et al.

Training Deep Neural Networks (DNNs) can be difficult due to vanishing and exploding gradients during weight optimization through backpropagation. To address this problem, we propose a general class of Hamiltonian DNNs (H-DNNs) that stem from the discretization of continuous-time Hamiltonian systems and include several existing architectures based on ordinary differential equations. Our main result is that a broad set of H-DNNs ensures non-vanishing gradients by design for an arbitrary network depth. This is obtained by proving that, when a semi-implicit Euler discretization scheme is used, the backward sensitivity matrices involved in gradient computations are symplectic. We also provide an upper bound on the magnitude of the sensitivity matrices, and show that exploding gradients can be either controlled through regularization or avoided for special architectures. Finally, we enable distributed implementations of the backward and forward propagation algorithms in H-DNNs by characterizing appropriate sparsity constraints on the weight matrices. The good performance of H-DNNs is demonstrated on benchmark classification problems, including image classification with the MNIST dataset.
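Since the non-vanishing-gradient guarantee hinges on the semi-implicit (symplectic) Euler discretization, the following is a minimal sketch of one such layer, assuming a separable Hamiltonian H(p, q) = V(q) + K(p) with tanh nonlinearities; the weight names (Kq, Kp, bq, bp) and step size h are illustrative choices, not the paper's exact parameterization.

```python
# Minimal sketch of one semi-implicit (symplectic) Euler layer for a
# separable Hamiltonian H(p, q) = V(q) + K(p). The parameterization
# (weights Kq, Kp, biases bq, bp, tanh activation, step size h) is
# illustrative and not the paper's exact architecture.
import numpy as np

def symplectic_euler_layer(p, q, Kq, bq, Kp, bp, h=0.1):
    """One forward step: p is updated with the old q, then q with the
    new p. This staggered update is what makes the step symplectic."""
    # p_{j+1} = p_j - h * dV/dq(q_j), with V(q) = sum(log cosh(Kq @ q + bq))
    p_new = p - h * Kq.T @ np.tanh(Kq @ q + bq)
    # q_{j+1} = q_j + h * dK/dp(p_{j+1}), with K(p) = sum(log cosh(Kp @ p + bp))
    q_new = q + h * Kp.T @ np.tanh(Kp @ p_new + bp)
    return p_new, q_new

# Stacking such layers gives an arbitrarily deep network; per the paper's
# result, the symplectic structure of each step holds at any depth.
rng = np.random.default_rng(0)
n = 4
p, q = rng.standard_normal(n), rng.standard_normal(n)
for _ in range(64):
    Kq, Kp = rng.standard_normal((n, n)), rng.standard_normal((n, n))
    bq, bp = rng.standard_normal(n), rng.standard_normal(n)
    p, q = symplectic_euler_layer(p, q, Kq, bq, Kp, bp)
```

Because each update uses the freshly computed p_new when advancing q, the layer's Jacobian is symplectic, which is the property the paper exploits to rule out vanishing gradients.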


Related research

04/27/2021 · A unified framework for Hamiltonian deep neural networks
Training deep neural networks (DNNs) can be difficult due to the occurre...

03/21/2023 · Universal Approximation Property of Hamiltonian Deep Neural Networks
This paper investigates the universal approximation capabilities of Hami...

06/20/2021 · Better Training using Weight-Constrained Stochastic Dynamics
We employ constraints to control the parameter space of deep neural netw...

05/16/2016 · Alternating optimization method based on nonnegative matrix factorizations for deep neural networks
The backpropagation algorithm for calculating gradients has been widely ...

06/22/2020 · Bidirectional Self-Normalizing Neural Networks
The problem of exploding and vanishing gradients has been a long-standin...

10/29/2017 · Weight Initialization of Deep Neural Networks (DNNs) using Data Statistics
Deep neural networks (DNNs) form the backbone of almost every state-of-t...

03/22/2022 · On Robust Classification using Contractive Hamiltonian Neural ODEs
Deep neural networks can be fragile and sensitive to small input perturb...
