Eigenvalue-corrected Natural Gradient Based on a New Approximation

11/27/2020
by Kai-Xin Gao, et al.

Second-order optimization methods for training deep neural networks (DNNs) have attracted considerable research interest. A recently proposed method, Eigenvalue-corrected Kronecker Factorization (EKFAC) (George et al., 2018), interprets the natural gradient update as a diagonal method and corrects the inaccurate re-scaling factor in the Kronecker-factored eigenbasis. Gao et al. (2020) consider a new approximation to the natural gradient, which approximates the Fisher information matrix (FIM) as a constant multiplied by the Kronecker product of two matrices and keeps the trace equal before and after the approximation. In this work, we combine the ideas of these two methods and propose Trace-restricted Eigenvalue-corrected Kronecker Factorization (TEKFAC). The proposed method not only corrects the inexact re-scaling factor under the Kronecker-factored eigenbasis, but also adopts the new approximation and the effective damping technique proposed in Gao et al. (2020). We also discuss the differences and relationships among the Kronecker-factored approximations. Empirically, our method outperforms SGD with momentum, Adam, EKFAC and TKFAC on several DNNs.
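To make the two ingredients named in the abstract concrete, here is a minimal NumPy sketch for a single fully connected layer. It is an illustration under stated assumptions, not the authors' implementation. The function names (`kron_eigenbasis_stats`, `precondition`) are hypothetical. The trace-restricted factors are assumed to be the usual input and output-gradient second-moment matrices rescaled to unit trace, with the layer's Fisher trace as the constant. The plain additive `damping` is a stand-in, not the damping technique of Gao et al. (2020).

```python
import numpy as np


def kron_eigenbasis_stats(acts, grads):
    """Per-layer statistics for a Kronecker-factored eigenbasis (sketch).

    acts:  (n, d_in)  layer inputs, one row per example
    grads: (n, d_out) loss gradients w.r.t. layer outputs, one row per example
    """
    n = acts.shape[0]
    A = acts.T @ acts / n      # input second moment, (d_in, d_in)
    G = grads.T @ grads / n    # output-gradient second moment, (d_out, d_out)

    # Trace of the layer's empirical Fisher: mean of ||delta a^T||_F^2.
    fisher_trace = np.mean(np.sum(acts**2, axis=1) * np.sum(grads**2, axis=1))

    # Assumption: trace-restricted factors are A and G rescaled to unit trace.
    lam_A, U_A = np.linalg.eigh(A / np.trace(A))
    lam_G, U_G = np.linalg.eigh(G / np.trace(G))

    # Kronecker-product scaling: constant times product of factor eigenvalues.
    kron_scale = fisher_trace * np.outer(lam_G, lam_A)          # (d_out, d_in)

    # Eigenvalue-corrected scaling: second moment of the per-example
    # gradients after projecting them onto the Kronecker eigenbasis.
    proj_a = acts @ U_A
    proj_g = grads @ U_G
    corrected_scale = (proj_g**2).T @ (proj_a**2) / n           # (d_out, d_in)

    return U_A, U_G, kron_scale, corrected_scale, fisher_trace


def precondition(grad_W, U_A, U_G, scale, damping=1e-3):
    """Rescale a (d_out, d_in) gradient in the eigenbasis, then rotate back."""
    proj = U_G.T @ grad_W @ U_A
    return U_G @ (proj / (scale + damping)) @ U_A.T
```

In this sketch both scalings sum to the same Fisher trace, so the trace is unchanged by either approximation; they differ only in how that trace is distributed across the eigenbasis directions. Calling `precondition` with `corrected_scale` rather than `kron_scale` is what corresponds to the eigenvalue correction.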

