Scalable Stochastic Gradient Riemannian Langevin Dynamics in Non-Diagonal Metrics

03/09/2023
by Hanlin Yu, et al.

Stochastic-gradient sampling methods are often used to perform Bayesian inference on neural networks. Methods that incorporate notions of differential geometry have been observed to perform better, with the Riemannian metric improving posterior exploration by accounting for the local curvature. However, existing methods often resort to simple diagonal metrics to remain computationally efficient, which loses some of these gains. We propose two non-diagonal metrics that can be used in stochastic-gradient samplers to improve convergence and exploration while adding only a minor computational overhead over diagonal metrics. We show that these metrics can provide improvements for fully connected neural networks (NNs) with sparsity-inducing priors and for convolutional NNs with correlated priors. For some other choices, the posterior is easy enough that the simpler metrics also suffice.
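
To make the diagonal-metric baseline mentioned in the abstract concrete, here is a minimal sketch of a Langevin update preconditioned by a diagonal metric. It is not the non-diagonal metric proposed in the paper. It assumes an RMSprop-style running second-moment estimate as the diagonal metric, omits the curvature correction term that a full Riemannian Langevin update includes, and uses the exact gradient of a toy two-dimensional Gaussian instead of a mini-batch gradient. All names (`diag_langevin_step`, `run_demo`, `scales`) are hypothetical and chosen for illustration only.

```python
import math
import random


def diag_langevin_step(theta, grad, v, rng, step=1e-2, beta=0.99, eps=1e-5):
    """One Langevin step with a diagonal (per-coordinate) metric.

    theta: current parameters, grad: gradient of the log posterior at theta,
    v: running average of squared gradients (assumed RMSprop-style metric).
    The curvature correction term is omitted in this sketch.
    """
    new_theta, new_v = [], []
    for t, g, vi in zip(theta, grad, v):
        vi = beta * vi + (1.0 - beta) * g * g        # update second-moment estimate
        inv_metric = 1.0 / (math.sqrt(vi) + eps)     # diagonal entry of the inverse metric
        noise = rng.gauss(0.0, 1.0)
        # Drift: (step / 2) * G^{-1} * grad; noise: Normal(0, step * G^{-1}).
        t = t + 0.5 * step * inv_metric * g + math.sqrt(step * inv_metric) * noise
        new_theta.append(t)
        new_v.append(vi)
    return new_theta, new_v


def run_demo(n_steps=1000, seed=0):
    """Sample a toy Gaussian with very different scales per coordinate."""
    rng = random.Random(seed)
    scales = [1.0, 0.1]            # assumed standard deviations of the toy target
    theta = [0.0, 0.0]
    v = [1.0, 1.0]                 # start the metric at identity
    samples = []
    for _ in range(n_steps):
        # Exact gradient of log N(0, diag(scales^2)); a real sampler would
        # use a stochastic mini-batch estimate here.
        grad = [-t / (s * s) for t, s in zip(theta, scales)]
        theta, v = diag_langevin_step(theta, grad, v, rng)
        samples.append(theta)
    return samples
```

Calling `run_demo()` returns a list of two-dimensional samples. Because the metric is diagonal, each coordinate is rescaled independently, so correlations between parameters are ignored; that is the limitation the non-diagonal metrics in the paper are meant to address.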


