On the Locality of the Natural Gradient for Deep Learning

05/21/2020
by Nihat Ay, et al.

We study the natural gradient method for learning in deep Bayesian networks, including neural networks. There are two natural geometries associated with such learning systems consisting of visible and hidden units. One geometry is related to the full system, the other to the visible sub-system. These two geometries imply different natural gradients. As a first step, we demonstrate a great simplification of the natural gradient with respect to the first geometry, due to locality properties of the Fisher information matrix. This simplification does not directly translate to a corresponding simplification with respect to the second geometry. We develop the theory for studying the relation between the two versions of the natural gradient and outline a method for simplifying the natural gradient with respect to the second geometry based on the first one. This method suggests incorporating a recognition model as an auxiliary model for the efficient application of the natural gradient method in deep networks.
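To make the underlying update concrete: the generic natural gradient preconditions the ordinary gradient by the inverse Fisher information matrix, θ ← θ − η F(θ)⁻¹ ∇L(θ). The sketch below is not the paper's method; it is a minimal one-parameter illustration for a Bernoulli model in the logit parametrization, where the Fisher information reduces to the scalar p(1 − p). The function names are hypothetical.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def natural_gradient_step(theta, mean_data, lr=0.5):
    """One natural-gradient step for a Bernoulli model with logit parameter theta."""
    p = sigmoid(theta)
    # ordinary gradient of the average negative log-likelihood w.r.t. the logit
    grad = p - mean_data
    # Fisher information of the Bernoulli model in the logit parametrization
    fisher = p * (1.0 - p)
    # natural gradient: precondition the gradient by the inverse Fisher information
    return theta - lr * grad / fisher

data = [1, 1, 0, 1, 1, 0, 1, 1]   # coin flips, empirical mean 0.75
theta = 0.0
for _ in range(50):
    theta = natural_gradient_step(theta, sum(data) / len(data))
```

In this one-dimensional case the natural gradient step in the logit coincides, to first order, with a plain gradient step in the mean parameter, so sigmoid(theta) converges directly toward the empirical mean; this parametrization invariance is exactly what makes the natural gradient attractive, and what the paper's locality results make tractable for deep networks.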


