A Coordinate-Free Construction of Scalable Natural Gradient

08/30/2018 · Kevin Luk, et al.

Most neural networks are trained using first-order optimization methods, which are sensitive to the parameterization of the model. Natural gradient descent is invariant to smooth reparameterizations because it is defined in a coordinate-free way, but tractable approximations are typically defined in terms of coordinate systems, and hence may lose the invariance properties. We analyze the invariance properties of the Kronecker-Factored Approximate Curvature (K-FAC) algorithm by constructing the algorithm in a coordinate-free way. We explicitly construct a Riemannian metric under which the natural gradient matches the K-FAC update; invariance to affine transformations of the activations follows immediately. We extend our framework to analyze the invariance properties of K-FAC applied to convolutional networks and recurrent neural networks, as well as metrics other than the usual Fisher metric.
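To make the K-FAC update concrete, the following is a minimal NumPy sketch of the Kronecker-factored approximate natural-gradient step for a single fully connected layer. All variable names (`a`, `g`, `A`, `S`, `G`, `damping`) are illustrative stand-ins, not from the paper; the key point is the Kronecker identity that lets the update avoid forming the full Fisher matrix.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy fully connected layer: outputs = W @ activations
d_in, d_out, batch = 4, 3, 256
W = rng.standard_normal((d_out, d_in))

# Stand-ins for quantities gathered during a real training step:
# per-example input activations a and backpropagated output gradients g.
a = rng.standard_normal((batch, d_in))
g = rng.standard_normal((batch, d_out))

# K-FAC factors: A ~ E[a a^T], S ~ E[g g^T], with Tikhonov damping
# so both factors are safely invertible.
damping = 1e-2
A = a.T @ a / batch + damping * np.eye(d_in)
S = g.T @ g / batch + damping * np.eye(d_out)

# Mean weight gradient G ~ E[g a^T]
G = g.T @ a / batch

# K-FAC approximates the Fisher as F ~ A (x) S (Kronecker product).
# The identity (A (x) S)^{-1} vec(G) = vec(S^{-1} G A^{-1}) means the
# approximate natural-gradient update only needs the two small inverses:
update = np.linalg.solve(S, G) @ np.linalg.inv(A)
```

Because the update is expressed through the two Kronecker factors rather than the full `(d_in*d_out) x (d_in*d_out)` Fisher matrix, the cost scales with the layer dimensions rather than their product, which is what makes the approximation tractable for large networks.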


Related research

02/03/2016 · A Kronecker-factored approximate Fisher matrix for convolution layers
Second-order optimization methods such as natural gradient descent have ...

12/03/2014 · New insights and perspectives on the natural gradient method
Natural gradient descent is an optimization method traditionally motivat...

06/30/2022 · Invariance Properties of the Natural Gradient in Overparametrised Systems
The natural gradient field is a vector field that lives on a model equip...

03/04/2018 · Accelerating Natural Gradient with Higher-Order Invariance
An appealing property of the natural gradient is that it is invariant to...

03/04/2013 · Riemannian metrics for neural networks I: feedforward networks
We describe four algorithms for neural network training, each adapted to...

12/29/2010 · Affine-invariant geodesic geometry of deformable 3D shapes
Natural objects can be subject to various transformations yet still pres...

06/10/2022 · Diffeomorphic Counterfactuals with Generative Models
Counterfactuals can explain classification decisions of neural networks ...
