1 Introduction
Optimization of functions involving large datasets and high-dimensional models is today widely applicable in several data-driven fields in science and industry. Given the growing role of deep learning, in this paper we look at optimization problems arising in the training of neural networks. The training of these models can be cast as the minimization or maximization of a certain objective function with respect to the model parameters. Because of the complexity and computational requirements of the objective function, the data and the models, the common practice is to resort to iterative training procedures, such as gradient descent. Among the iterative methods, stochastic gradient descent (SGD) [61] emerged as one of the most effective and computationally efficient. SGD owes its performance gains to the adoption of an approximate version of the objective function at each iteration step, which, in turn, yields an approximate or noisy gradient. While SGD seems to benefit greatly (e.g., in terms of rate of convergence) from such approximation, it has also been shown that too much noise hurts the performance [83, 6]. This suggests that, to further improve over SGD, one could attempt to model the noise of the objective function. We consider the iteration-time-varying loss function used in SGD as a stochastic process obtained by adding the expected risk to zero-mean Gaussian noise. A powerful approach designed to handle estimation with such processes is Kalman filtering [36]. The idea of using Kalman filtering to train neural networks is not new [24]. However, the way to apply it to this task can vary vastly. Indeed, in our approach, which we call KaFiStO, we introduce a number of novel ideas that result in a practical and effective training algorithm. Firstly, we introduce drastic approximations of the estimated covariance of Kalman's dynamical state, so that the corresponding matrix is described by only a few scalar parameters. Secondly, we approximate intermediate Kalman filtering calculations so that more accuracy can be achieved. Thirdly, because of the way we model the objective function, we can also define a schedule for the optimization that behaves similarly to learning rate schedules used in SGD and other iterative methods [37].
We highlight the following contributions: 1) KaFiStO is designed to handle high-dimensional data and models, and large datasets; 2) the tuning of the algorithm is automated, but it is also possible to introduce a learning rate schedule similar to those in existing methods, albeit with a very different interpretation; 3) KaFiStO adapts automatically to the noise in the loss, which might vary depending on the settings of the training (e.g., the mini-batch size), and to the variation in the estimated weights over iteration time; 4) it can incorporate iteration-time dynamics of the model parameters, which are analogous to momentum [74]; 5) it is a framework that can be easily extended (we show a few variations of KaFiStO); 6) as shown in our experiments, KaFiStO is on par with state-of-the-art optimizers and can yield better minima in a number of problems ranging from image classification to generative adversarial networks (GANs) and natural language processing (NLP).
2 Prior Work
In this section, we review optimization methods that have found application in machine learning, in particular for large-scale problems. Most of the progress in the last decades has aimed at improving the efficiency and accuracy of optimization algorithms.
First-Order Methods.
First-order methods exploit only the gradient of the objective function. The main advantage of these methods lies in their speed and simplicity. Robbins and Monro [61] introduced the very first stochastic optimization method (SGD) in 1951. Since then, the SGD method has been thoroughly analyzed and extended [42, 67, 32, 73]. Some works considered restarting techniques for optimization purposes [46, 82].
However, a limitation of SGD is that the learning rate must be manually defined and the approximations in the computation of the gradient hurt the performance.
Second-Order Methods. To address the manual tuning of the learning rates in first-order methods and to improve the convergence rate, second-order methods rely on the Hessian matrix.
However, this matrix becomes very quickly unmanageable as it grows quadratically with the number of model parameters.
Thus, most work reduces the computational complexity by approximating the Hessian with a block-diagonal matrix [20, 5, 39]. A number of methods looked at combining second-order information in different ways. For example, Roux and Fitzgibbon [63] combined Newton's method and natural gradient. Sohl-Dickstein [70] combined SGD with the second-order curvature information leveraged by quasi-Newton methods. Yao [89] dynamically incorporated the curvature of the loss via adaptive estimates of the Hessian. Henriques [27] proposed a method that does not require storing the Hessian at all. In contrast with these methods, KaFiStO does not compute second-order derivatives, but focuses instead on modeling the noise in the objective function.
Adaptive Methods.
An alternative to using second-order derivatives is to design methods that automatically adjust the step-size during the optimization process.
The adaptive selection of the update step-size has been based on several principles, including: the local sharpness of the loss function [91], incorporating a line search approach [80, 53, 49], the gradient change speed [15], the Barzilai-Borwein method [76], a "belief" in the current gradient direction [100], the linearization of the loss [62], the per-component unweighted mean of all historical gradients [10], handling noise by preconditioning based on a covariance matrix [34], the adaptive and momental bounds [13], decorrelating the second moment and gradient terms [99], the importance weights [40], the layer-wise adaptation strategy [90], the gradient scale invariance [55], multiple learning rates [66], controlling the increase in effective learning [93], learning the update-step size [87], and looking ahead at the sequence of fast weights generated by another optimizer [98]. Among the widely adopted methods is the work of Duchi [16], who presented a new family of subgradient methods called AdaGrad. AdaGrad dynamically incorporates knowledge of the geometry of the data observed in earlier iterations. Tieleman [77] introduced RMSProp, further extended by Mukkamala [52] with logarithmic regret bounds for strongly convex functions. Zeiler [94] proposed a per-dimension learning rate method for gradient descent called AdaDelta. Kingma and Ba [37] introduced Adam, based on adaptive estimates of lower-order moments. A wide range of variations and extensions of the original Adam optimizer has also been proposed [44, 60, 28, 47, 78, 86, 14, 72, 8, 33, 43, 48, 84, 85, 41]. Recent work proposed to decouple the weight decay [22, 17]. Chen [7] introduced a partially adaptive momentum estimation method. Some recent work also focused on the role of gradient clipping [97, 96]. Another line of research focused on reducing the memory overhead for adaptive algorithms [1, 69, 56]. In most prior work, adaptivity comes from the introduction of extra hyperparameters that also require task-specific tuning. In our case, this property is a direct byproduct of the Kalman filtering framework.
Kalman Filtering.
The use of Kalman filtering theory and methods for the training of neural networks is not new. Haykin [35] edited a book collecting a wide range of techniques on this topic. More recently, Shashua [68] incorporates Kalman filtering for value approximation in reinforcement learning. Ollivier [54] recovers the exact extended Kalman filter equations from first principles in statistical learning: the Extended Kalman filter is equal to Amari's online natural gradient, applied in the space of trajectories of the system. Vilmarest [11] applies the Extended Kalman filter to linear and logistic regressions. Takenga [75] compared GD to methods based on either Kalman filtering or the decoupled Kalman filter. To summarize, all of these prior Kalman filtering approaches either focus on a specific non-general formulation or face difficulties when scaling to the high-dimensional parameter spaces of large-scale neural models.
3 Modeling Noise in Stochastic Optimization
In machine learning, we are interested in minimizing the expected risk
$L(\theta) = \mathbb{E}_{x \sim p(x)}\left[\ell(x, \theta)\right]$   (1)
with respect to some loss $\ell$ that is a function of both the data $x \in \mathbb{R}^d$, with $d$ the data dimensionality, and the model parameters $\theta \in \mathbb{R}^n$ (e.g., the weights of a neural network), where $n$ is the number of parameters in the model. We consider the case, which is of common interest today, where both $d$ and $n$ are very large. For notational simplicity, we do not distinguish the supervised and unsupervised learning cases, by concatenating all data into a single vector $x$ (e.g., in the case of image classification we stack in $x$ both the input image and the output label). In practice, we have access to only a finite set of samples and thus resort to optimizing the empirical risk

$\hat{L}(\theta) = \frac{1}{N}\sum_{i=1}^{N} \ell(x_i, \theta)$   (2)

where $x_i$, for $i = 1, \dots, N$, are our training dataset samples. Because of the non-convex nature of the loss function with respect to the model parameters, this risk is then optimized via an iterative method such as gradient descent.
Since $N$ can be very large in current datasets, the computation of the gradient of the empirical risk at each iteration is too demanding. To address this issue, the stochastic gradient descent (SGD) method [61] minimizes instead the following risk approximation
$L_t(\theta) = \frac{1}{|\mathcal{M}_t|}\sum_{i \in \mathcal{M}_t} \ell(x_i, \theta)$   (3)
where $\mathcal{M}_t \subset \{1, \dots, N\}$ is a sample set of the dataset indices that changes over the iteration time $t$. SGD then iteratively builds a sequence of parameters $\{\theta_t\}$ by recursively updating the parameters with a step in the opposite direction of the gradient of $L_t$, with some random initialization for $\theta_0$ and, for $t \geq 1$,
$\theta_t = \theta_{t-1} - \eta\, \nabla L_t(\theta_{t-1})$   (4)
where $\nabla L_t(\theta_{t-1})$ denotes the gradient of $L_t$ with respect to $\theta$ computed at $\theta_{t-1}$, and $\eta > 0$ is commonly referred to as the learning rate and regulates the speed of convergence.
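As a purely illustrative aside (not part of the paper), the SGD recursion of eq. (4) can be sketched on a toy least-squares problem; the data, mini-batch size and learning rate below are arbitrary choices:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: N samples, d features, generated from a known linear model.
N, d = 256, 4
X = rng.normal(size=(N, d))
theta_true = rng.normal(size=d)
y = X @ theta_true + 0.01 * rng.normal(size=N)

def minibatch_grad(theta, idx):
    """Gradient of the mini-batch least-squares risk L_t (cf. eq. (3))."""
    Xb, yb = X[idx], y[idx]
    return 2.0 / len(idx) * Xb.T @ (Xb @ theta - yb)

eta = 0.05           # learning rate
theta = np.zeros(d)  # initialization theta_0
for t in range(500):
    idx = rng.choice(N, size=32, replace=False)  # mini-batch M_t
    theta -= eta * minibatch_grad(theta, idx)    # eq. (4)

print(np.linalg.norm(theta - theta_true))  # small: close to the optimum
```

The gradients computed on each mini-batch are exactly the "noisy versions" of the full-risk gradient discussed next.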
While this approach is highly efficient, it is also affected by the training set sampling at each iteration. The gradients are computed on the time-varying objectives $L_t$ and can be seen as noisy versions of the gradient of the expected risk $L$. Due to the aleatoric nature of this optimization, it is necessary to apply a learning rate decay [4] to achieve convergence. There are also several methods to reduce noise in the gradients, which work on a dynamic sample size or a gradient aggregation strategy [6].
Rather than directly modeling the noise of the gradient, we adopt a different perspective on the minimization of the expected risk. Let us denote with $\theta^*$ the optimal set of model parameters and with $L^* = L(\theta^*)$ the expected risk at $\theta^*$. Then, we model

$L^* = L_t(\theta) - v_t$   (5)

where $\theta \sim \mathcal{N}(\theta^*, P)$ is a Gaussian random variable with the optimal parameters $\theta^*$ as the mean and covariance $P$, and we associate both the stochasticity of the sampling and of $\theta$ to the scalar noise variable $v_t$, which we assume to be zero-mean Gaussian with variance $R$. With eq. (5) we implicitly assume that $\theta$ and $v_t$ are statistically dependent, since $L^*$ is constant. Often, we know the value of $L^*$ up to some approximation. In the next sections we additionally show that it is possible to obtain an online estimate of $L^*$. The task now can be posed as that of identifying the parameters $\theta$ such that the observations (5) are satisfied. A natural way to tackle the identification of parameters given their noisy observations is to use Kalman filtering. As discussed in the prior work, there is an extensive literature on the application of Kalman filtering as a stochastic gradient descent algorithm. However, these methods differ from our approach in several ways. For instance, Vuckovic [81] uses the gradients as measurements. Thus, this method requires large matrix inversions, which are not scalable to the settings we consider in this paper and that are commonly used in deep learning. As we describe in the next section, we work instead directly with the scalar risks and introduce a number of computational approximations that make the training with large datasets and high-dimensional data feasible with our method.
3.1 Kalman Filtering for Stochastic Optimization
We assumed that $\theta$ is a random variable capturing the optimum $\theta^*$ up to some zero-mean Gaussian error (which represents our uncertainty about the parameters). Then, the values of the time-varying loss $L_k$ at samples of $\theta$ will be scattered close to $L^*$ (see eq. (5)). Thus, a possible system of equations for a sequence of samples $\theta_k$ of $\theta$ is

$\theta_k = \theta_{k-1} + w_k$   (6)
$L^* = L_k(\theta_k) - v_k$   (7)

Here, $w_k$ is modeled as a zero-mean Gaussian variable with covariance $Q$. The dynamical model (6) implies that the state does not change on average.
The equations (6) and (7) fit very well the equations used in Kalman filtering [36]. For completeness, we briefly recall here the general equations for an Extended Kalman filter

$x_k = f(x_{k-1}) + w_k$   (8)
$z_k = h(x_k) + v_k$   (9)

where $x_k$ is also called the hidden state, $z_k$ are the observations, and $f$ and $h$ are functions that describe the state transition and the measurement dynamics respectively. The Extended Kalman filter infers optimal estimates $\hat{x}_{k|k}$ of the state variables from the previous estimates $\hat{x}_{k-1|k-1}$ and the last observation $z_k$. Moreover, it also estimates the a posteriori covariance matrix $P_{k|k}$ of the state. This is done in two steps: Predict and Update, which we recall in Table 1.
Predict:  $\hat{x}_{k|k-1} = f(\hat{x}_{k-1|k-1})$,  $P_{k|k-1} = F_k P_{k-1|k-1} F_k^\top + Q$
Update:  $\hat{x}_{k|k} = \hat{x}_{k|k-1} + K_k y_k$,  $P_{k|k} = (I - K_k H_k) P_{k|k-1}$
with:  $y_k = z_k - h(\hat{x}_{k|k-1})$,  $S_k = H_k P_{k|k-1} H_k^\top + R$,  $K_k = P_{k|k-1} H_k^\top S_k^{-1}$,
where $F_k$ and $H_k$ denote the Jacobians of $f$ and $h$ evaluated at $\hat{x}_{k-1|k-1}$ and $\hat{x}_{k|k-1}$ respectively.
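To make the Predict and Update steps of Table 1 concrete, here is a minimal numpy sketch of one generic EKF iteration; the function names and the 1D random-walk usage example are our own illustrative assumptions, not part of the paper:

```python
import numpy as np

def ekf_step(x, P, z, f, F, h, H, Q, R):
    """One Extended Kalman filter iteration (Predict + Update) as in Table 1.

    x, P : previous state estimate and its covariance
    z    : new observation
    f, h : state-transition and measurement functions
    F, H : their Jacobians evaluated at the current estimates
    Q, R : state and measurement noise covariances
    """
    # Predict
    x_pred = f(x)
    Fk = F(x)
    P_pred = Fk @ P @ Fk.T + Q
    # Update
    Hk = np.atleast_2d(H(x_pred))
    y = np.atleast_1d(z - h(x_pred))       # innovation
    S = Hk @ P_pred @ Hk.T + R             # innovation covariance
    K = P_pred @ Hk.T @ np.linalg.inv(S)   # Kalman gain
    x_new = x_pred + K @ y
    P_new = (np.eye(len(x)) - K @ Hk) @ P_pred
    return x_new, P_new

# Usage on a 1D random-walk state observed directly (all constants arbitrary):
x, P = np.zeros(1), np.eye(1)
f = lambda x: x; F = lambda x: np.eye(1)
h = lambda x: x[0]; H = lambda x: np.array([[1.0]])
for z in [1.0, 1.1, 0.9, 1.0]:
    x, P = ekf_step(x, P, z, f, F, h, H, Q=0.01 * np.eye(1), R=0.1 * np.eye(1))
```

After a few observations the state estimate tracks the measurements while the covariance shrinks, which is the behavior the approximations below try to retain at scale.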
If we directly apply the equations in Table 1 to our equations (6) and (7), we would immediately find that the posterior covariance $P_{k|k}$ is an $n \times n$ matrix, which would be too large to store and update for values of $n$ used in practice. Hence, we approximate $P_{k|k}$ as a scaled identity matrix $\sigma^2_{k|k} I_n$. Since the update equation for the posterior covariance requires the computation of $\nabla L_k \nabla L_k^\top$, we need to approximate $\nabla L_k \nabla L_k^\top$ also with a scaled identity matrix. We do this by using its largest eigenvalue, $\|\nabla L_k\|^2$,

$\nabla L_k(\hat{\theta}_{k|k-1})\, \nabla L_k(\hat{\theta}_{k|k-1})^\top \simeq \|\nabla L_k(\hat{\theta}_{k|k-1})\|^2\, I_n$   (10)

where $I_n$ denotes the $n \times n$ identity matrix. Because we work with a scalar loss $L_k$, the innovation covariance $S_k$ is a scalar and thus it can be easily inverted. We call this first parameter estimation method the Vanilla Kalman algorithm, and summarize it in Algorithm 1.
$\hat{\theta}_{k|k-1} = \hat{\theta}_{k-1|k-1}, \qquad \sigma^2_{k|k-1} = \sigma^2_{k-1|k-1} + q$   (11)
$s_k = \sigma^2_{k|k-1}\, \|\nabla L_k(\hat{\theta}_{k|k-1})\|^2 + R$   (12)
$\hat{\theta}_{k|k} = \hat{\theta}_{k|k-1} - \frac{\sigma^2_{k|k-1}}{s_k}\left(L_k(\hat{\theta}_{k|k-1}) - L^*\right)\nabla L_k(\hat{\theta}_{k|k-1})$   (13)
$\sigma^2_{k|k} = \left(1 - \frac{\sigma^2_{k|k-1}\,\|\nabla L_k(\hat{\theta}_{k|k-1})\|^2}{s_k}\right)\sigma^2_{k|k-1}$   (14)
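A minimal sketch of one such Vanilla Kalman step, with the posterior covariance collapsed to a single scalar variance as described above; the constants q and R, the target L* = 0, and the toy quadratic loss are illustrative assumptions rather than the paper's settings:

```python
import numpy as np

def vanilla_kalman_step(theta, sigma2, loss_fn, grad_fn,
                        L_target=0.0, q=0.01, R=0.1):
    """One Vanilla Kalman parameter update under the scaled-identity
    approximation of the posterior covariance (a sketch of eqs. (11)-(14));
    q, R and L_target are illustrative constants."""
    sigma2_pred = sigma2 + q                     # predict step for the variance
    L = loss_fn(theta)
    g = grad_fn(theta)
    s = sigma2_pred * (g @ g) + R                # scalar innovation covariance
    theta = theta - sigma2_pred * (L - L_target) / s * g      # state update
    sigma2 = (1.0 - sigma2_pred * (g @ g) / s) * sigma2_pred  # posterior variance
    return theta, sigma2

# Usage on a toy quadratic loss with minimum value 0 at theta = [1, -2]:
opt = np.array([1.0, -2.0])
loss_fn = lambda th: float(np.sum((th - opt) ** 2))
grad_fn = lambda th: 2.0 * (th - opt)
theta, sigma2 = np.zeros(2), 1.0
for _ in range(200):
    theta, sigma2 = vanilla_kalman_step(theta, sigma2, loss_fn, grad_fn)
```

Note that the effective step size is driven by the variance estimate and the innovation, not by a hand-tuned learning rate: when the loss is far from the target, the variance keeps the updates large.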
3.2 Incorporating Momentum Dynamics
The framework introduced so far, which we call KaFiStO (as a shorthand notation for Kalman Filtering for Stochastic Optimization), is very flexible and allows several extensions. A first important change we introduce is the incorporation of Momentum [74]. Within our notation, this method could be written as
$u_k = \mu u_{k-1} - \eta\, \nabla L_k(\theta_{k-1})$   (15)
$\theta_k = \theta_{k-1} + u_k$   (16)

where $u_k$ are the so-called momenta or velocities, which accumulate the gradients from the past. The parameter $\mu$, commonly referred to as the momentum rate, controls the trade-off between current and past gradients. Such updates are reported to stabilize the training and prevent the parameters from getting stuck at local minima.
To incorporate the idea of Momentum within the KaFiStO framework, one can simply introduce the state velocities $u_k$ and define the following dynamics
$\theta_k = \theta_{k-1} + \gamma u_{k-1} + w_k^{\theta}$   (17)
$u_k = \gamma u_{k-1} + w_k^{u}$   (18)
$L^* = L_k(\theta_k) - v_k$   (19)

where $\gamma \in [0, 1]$ and $w_k = [w_k^{\theta}; w_k^{u}]$ is a zero-centered Gaussian random variable.
One can rewrite these equations again as Kalman filter equations by combining the parameters and the velocities into one state vector $x_k = [\theta_k; u_k]$, and similarly for the state noise $w_k$. This results in the following dynamical system

$x_k = A x_{k-1} + w_k$   (20)
$L^* = L_k(\Pi x_k) - v_k$   (21)

where $A = \begin{bmatrix} I_n & \gamma I_n \\ 0 & \gamma I_n \end{bmatrix}$ and $\Pi = [I_n \;\; 0]$ is the projection that extracts $\theta_k$ from $x_k$. Similarly to the Vanilla Kalman algorithm, we also aim to drastically reduce the dimensionality of the posterior covariance, which now is a $2n \times 2n$ matrix. We approximate $P_{k|k}$ with the following form $P_{k|k} = \begin{bmatrix} \sigma^2_{11,k} I_n & \sigma^2_{12,k} I_n \\ \sigma^2_{12,k} I_n & \sigma^2_{22,k} I_n \end{bmatrix}$, where $\sigma^2_{11,k}$, $\sigma^2_{12,k}$, $\sigma^2_{22,k}$ are scalars. In this formulation we have that $H_k = [\nabla L_k^\top \;\; 0]$ and thus our approximation for the Kalman update of the posterior covariance will use

$H_k P_{k|k-1} H_k^\top \simeq \sigma^2_{11,k|k-1}\, \|\nabla L_k(\hat{\theta}_{k|k-1})\|^2$   (22)
The remaining equations follow directly from the application of Table 1. We call this method the Kalman Dynamics algorithm.
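As a sketch of the momentum-augmented dynamics, the stacked state can be propagated with a block transition matrix; the specific block structure of A below follows our reading of the dynamics in this section, and gamma and the dimensions are arbitrary illustrations:

```python
import numpy as np

# Block transition matrix for the stacked state x = [theta; u]:
# theta is moved by gamma * u, and the velocity decays by gamma.
n, gamma = 3, 0.9
I = np.eye(n)
A = np.block([[I, gamma * I],
              [np.zeros((n, n)), gamma * I]])

theta = np.ones(n)
u = np.full(n, 0.5)
x = np.concatenate([theta, u])

x_next = A @ x
theta_next, u_next = x_next[:n], x_next[n:]
```

In the full algorithm this noiseless propagation is the Predict step; the Update step then corrects both the parameters and the velocities through the Kalman gain.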
3.3 Estimation of the Measurement and State Noise
In the KaFiStO framework we model the noise in the observations and the state transitions with zero-mean Gaussian variables with covariances $R$ and $Q$ respectively. So far, we assumed that these covariances were constant. However, they can also be estimated online, and lead to more accurate state and posterior covariance estimates. For $R$ we use the following running average

$R_k = (1 - \alpha_R)\, R_{k-1} + \alpha_R \left(L_k(\hat{\theta}_{k|k-1}) - L^*\right)^2$   (23)

where we set $\alpha_R$ to a fixed constant. Similarly, for the covariance $Q$, the online update for its scale $q_k$ is

$\tilde{q}_k = \frac{1}{n}\, \|\hat{\theta}_{k|k} - \hat{\theta}_{k-1|k-1}\|^2$   (24)
$q_k = (1 - \alpha_Q)\, q_{k-1} + \alpha_Q\, \tilde{q}_k$   (25)

where we set $\alpha_Q$ to a fixed constant. This adaptivity of the noise helps both to reduce the number of hyperparameters and to stabilize the training and convergence.
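The online noise estimates amount to exponential moving averages of the squared innovation (for R) and of the mean squared parameter change (for the scale of Q). A sketch, with illustrative EMA coefficients and variable names since the paper's exact values are not fixed here:

```python
def update_R(R_prev, loss_value, L_target, alpha=0.1):
    """EMA of the squared innovation, used as measurement variance R (cf. eq. (23))."""
    return (1.0 - alpha) * R_prev + alpha * (loss_value - L_target) ** 2

def update_q(q_prev, theta_new, theta_old, alpha=0.1):
    """EMA of the mean squared parameter change, used as state-noise scale q
    (cf. eqs. (24)-(25))."""
    n = len(theta_new)
    step2 = sum((a - b) ** 2 for a, b in zip(theta_new, theta_old)) / n
    return (1.0 - alpha) * q_prev + alpha * step2

# Usage: the estimates track the observed loss and parameter statistics.
R = 1.0
for L in [0.5, 0.4, 0.3]:
    R = update_R(R, L, L_target=0.0)
q = update_q(0.0, [1.0, 2.0], [0.9, 1.9])
```

Large innovations inflate R (distrusting the measurements), while large parameter moves inflate q (allowing the state to drift faster), which is what makes the filter self-calibrating.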
3.4 Learning Rate Scheduling
In both the Vanilla Kalman and the Kalman Dynamics algorithms, the update equation for the state estimate needs the target risk $L^*$ (see, e.g., eq. (13)). This term can in many cases be set to $0$, when we believe that this value can be achieved (e.g., in some image classification problems). Also, we have the option to change $L^*$ progressively with the iteration time $k$. For instance, we could set $L^*_k = (1 - \eta_k)\, L_k(\hat{\theta}_{k|k-1})$. By substituting this term in eq. (13) we obtain a learning rate that is $\eta_k$ times the learning rate with $L^* = 0$. By varying $\eta_k$ over the iteration time, we thus can define a learning rate schedule as in current SGD implementations [19, 46]. Notice, however, the very different interpretation of the schedule in the case of KaFiStO, where we are gradually decreasing the target expected risk.
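Assuming a target of the form L*_k = (1 − η_k)·L_k, the innovation (and hence the effective step) is scaled by η_k; a tiny sketch with arbitrary schedule values:

```python
def scheduled_target(loss_value, eta_k):
    """Target L*_k = (1 - eta_k) * L_k, so that L_k - L*_k = eta_k * L_k."""
    return (1.0 - eta_k) * loss_value

loss_value = 2.0
for eta_k in [1.0, 0.5, 0.1]:  # e.g., a decaying schedule
    innovation = loss_value - scheduled_target(loss_value, eta_k)
    # the innovation shrinks proportionally to eta_k
```

Stepping eta_k down over training thus mimics a conventional learning rate decay while keeping the Kalman interpretation of the update.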
3.5 Layer-wise Approximations
Let us consider the optimization problem specifically for large neural networks. Let us denote with $M$ the number of layers in a network. Next, we consider the observations to be $M$-dimensional vectors. The $l$-th entry in this observation vector is obtained by considering that only the parameters of the $l$-th layer are varying. Under these assumptions, the update equation (13) for both the Vanilla Kalman and the Kalman Dynamics algorithm will split into $M$ layer-wise equations, where each separate equation incorporates only the gradients with respect to the parameters of a specific layer. Additionally, now the matrix $H_k P_{k|k-1} H_k^\top$ also yields $M$ separate blocks (one per observation), each of which gets approximated by the corresponding largest block eigenvalue. Finally, the maximum of these approximations gives us the approximation of the whole matrix. That is

$s_k = \max_{l=1,\dots,M} s_k^{(l)}, \qquad s_k^{(l)} = \sigma^2_{k|k-1}\, \|\nabla_{\theta^{(l)}} L_k\|^2 + R$   (26)

where $\theta^{(l)}$ is the subset of parameters corresponding to the $l$-th layer and $s_k^{(l)}$ is the innovation covariance corresponding to only the $l$-th measurement. We observe that this procedure induces additional stability in training.
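A sketch of this layer-wise computation: one scalar innovation covariance per layer from that layer's gradient norm, then the maximum over layers; sigma2 and R are illustrative constants:

```python
import numpy as np

def layerwise_innovation(grads_per_layer, sigma2=0.1, R=0.1):
    """Per-layer innovation covariances and their maximum (cf. eq. (26));
    sigma2 and R are illustrative constants."""
    s_layers = [sigma2 * float(g @ g) + R for g in grads_per_layer]
    return max(s_layers), s_layers

# Usage with three hypothetical layer gradients:
grads = [np.array([0.1, -0.2]), np.array([1.0, 0.5]), np.array([0.0, 0.3])]
s_max, s_layers = layerwise_innovation(grads)
```

Taking the maximum is conservative: the layer with the largest gradient energy bounds the gain for all layers, which is consistent with the added training stability noted above.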
4 Ablations
In this section we ablate the following features and parameters of both the Vanilla Kalman and Kalman Dynamics algorithms: the dynamics of the weights and velocities, the initialization of the posterior covariance matrix, and the adaptivity of the measurement and state noise estimators. In some ablations we also separately test the Kalman Dynamics algorithm with adaptive $Q$, since it usually gives a large boost to performance. Furthermore, we show that our algorithm is relatively insensitive to different batch sizes and weight initialization techniques.
We evaluate our optimization methods by computing the test performance achieved by the model obtained with the estimated parameters. Although such performance may not uniquely correlate with the performance of our method, as it might also be affected by the data, model and regularization, it is a useful indicator. In all the ablations, we choose the classification task on CIFAR100 [38] with ResNet18 [25]. We train all the models for a fixed budget of epochs and decrease the learning rate by a constant factor on a step schedule. For the last two ablations and in the Experiments section, we use the Kalman Dynamics algorithm with $\gamma = 0.9$, adaptive $R$ and $Q$, and fixed initial posterior covariance parameters $\sigma^2_{11,0}$ and $\sigma^2_{22,0}$. We refer to this configuration as KaFiStO and have no need to tune it further.
Impact of the state dynamics. We compare the Vanilla Kalman algorithm (i.e., constant dynamics) to the Kalman Dynamics (i.e., with velocities). Additionally, we ablate $\gamma$, the decay rate of the velocities. The results are shown in Table 2. We observe that the use of velocities with a calibrated momentum rate has a positive impact on the estimated parameters, and that the adaptive state noise estimation provides a further substantial gain.
KaFiStO Variant  $\gamma$  Top1 Error  Top5 Error

Vanilla  —  34.16  12.02
Dynamics  0.50  33.52  11.34
Dynamics  0.90  33.63  12.13
Dynamics  0.99  diverge  diverge
Dynamics (adapt. $Q$)  0.50  28.25  8.91
Dynamics (adapt. $Q$)  0.90  23.39  6.50
Dynamics (adapt. $Q$)  0.99  33.63  9.82
Posterior covariance initialization. The KaFiStO framework requires initializing the posterior covariance matrix $P_{0|0}$. In the case of the Vanilla Kalman algorithm, we approximate the posterior covariance with a scaled identity matrix, i.e., $P_{0|0} = \sigma^2_0 I_n$, where $\sigma^2_0 > 0$. In the case of Kalman Dynamics, we approximate $P_{0|0}$ with a block-diagonal matrix, and we initialize it with

$P_{0|0} = \begin{bmatrix} \sigma^2_{11,0} I_n & 0 \\ 0 & \sigma^2_{22,0} I_n \end{bmatrix}$   (27)

where $\sigma^2_{11,0}, \sigma^2_{22,0} > 0$. In this section we ablate $\sigma^2_0$, $\sigma^2_{11,0}$ and $\sigma^2_{22,0}$ to show that the method quickly adapts to the observations and the initialization of $P_{0|0}$ does not have a significant impact on the final accuracy achieved with the estimated parameters. The results are given in Table 3 and in Figure 1.
Parameter  Value  KaFiStO Variant  Top1 Error  Top5 Error

$\sigma^2_0$  0.01  Vanilla  33.97  12.83
$\sigma^2_0$  0.10  Vanilla  34.16  12.02
$\sigma^2_0$  1.00  Vanilla  33.52  12.03

$\sigma^2_{11,0}$  0.01  Dynamics  33.42  12.28
$\sigma^2_{11,0}$  0.10  Dynamics  33.63  12.13
$\sigma^2_{11,0}$  1.00  Dynamics  33.16  12.16
$\sigma^2_{11,0}$  0.01  Dynamics (adapt. $Q$)  23.67  6.81
$\sigma^2_{11,0}$  0.10  Dynamics (adapt. $Q$)  23.39  6.50
$\sigma^2_{11,0}$  1.00  Dynamics (adapt. $Q$)  23.82  6.53

$\sigma^2_{22,0}$  0.01  Dynamics  33.28  11.88
$\sigma^2_{22,0}$  0.10  Dynamics  33.63  12.13
$\sigma^2_{22,0}$  1.00  Dynamics  34.73  13.36
$\sigma^2_{22,0}$  0.01  Dynamics (adapt. $Q$)  23.37  7.13
$\sigma^2_{22,0}$  0.10  Dynamics (adapt. $Q$)  23.39  6.50
$\sigma^2_{22,0}$  1.00  Dynamics (adapt. $Q$)  24.24  7.40

Noise adaptivity. We compare the performance obtained with a fixed measurement variance $R$ to the one with an online estimate based on the $k$-th mini-batch. Similarly, we ablate the adaptivity of the process noise $Q$. The results are shown in Table 4.
KaFiStO Variant  $R$  $Q$  Top1 Error  Top5 Error

Vanilla  adaptive  —  34.16  12.02
Vanilla  constant  —  diverge  diverge
Dynamics  adaptive  constant  33.63  12.13
Dynamics  constant  constant  diverge  diverge
Dynamics  adaptive  adaptive  23.39  6.50
Dynamics  constant  adaptive  diverge  diverge
We observe that the adaptivity of $R$ is essential for the model to converge, and the adaptivity of $Q$ helps to further improve the performance of the trained model. Moreover, with adaptive noise estimates there is no need to set initial values for them, which reduces the number of hyperparameters to tune.
Batch size. Usually one needs to adapt the learning rate to the chosen mini-batch size. In this experiment, we change the batch size in the range from 32 to 256 and show that KaFiStO adapts to it naturally. Table 5 shows that the accuracy of the model does not vary significantly with a varying batch size, which is a sign of stability.
Batch Size  Top1 Error  Top5 Error 

32  24.59  7.13 
64  23.11  6.93 
128  23.39  6.50 
256  24.34  7.59 
Weight initialization. Similarly to the batch size, here we use different initialization techniques to show that the algorithm is robust to them. We apply the same initializations to SGD for comparison. We test Kaiming Uniform [26], Orthogonal [65], Xavier Normal [18], Xavier Uniform [18]. The results are shown in Table 6.
Initialization  Optimizer  Top1 Error  Top5 Error 

XavierNormal  SGD  26.71  7.59 
KaFiStO  23.34  6.78  
XavierUniform  SGD  26.90  7.97 
KaFiStO  23.40  6.85  
KaimingUniform  SGD  27.82  7.95 
KaFiStO  23.35  6.76  
Orthogonal  SGD  26.83  7.59 
KaFiStO  23.27  6.63 
5 Experiments
In order to assess the efficiency of KaFiStO, we evaluate it on different tasks, including image classification (on CIFAR10, CIFAR100 and ImageNet [64]), generative learning and language modeling. For all these tasks, we report the quality metrics on the validation sets to compare KaFiStO to the optimizers commonly used in the training of existing models. We find that KaFiStO outperforms or is on par with the existing methods, while requiring fewer hyperparameters to tune.

Dataset  Architecture  Method  Top1 (100 ep)  Top5 (100 ep)  Top1 (200 ep)  Top5 (200 ep)
CIFAR10  ResNet18  SGD  5.60  0.16  7.53  0.29 
Adam  6.58  0.28  6.46  0.28  
KaFiStO  5.69  0.21  5.46  0.25  
ResNet50  SGD  6.37  0.19  8.10  0.27  
Adam  6.28  0.24  5.97  0.28  
KaFiStO  7.29  0.24  6.31  0.13  
WResNet502  SGD  6.08  0.15  7.60  0.24  
Adam  6.02  0.19  5.90  0.26  
KaFiStO  6.83  0.19  5.36  0.12  
CIFAR100  ResNet18  SGD  23.50  6.48  22.44  5.99 
Adam  26.30  7.85  25.61  7.74  
KaFiStO  23.38  6.70  22.22  6.13  
ResNet50  SGD  25.05  6.74  22.06  5.71  
Adam  24.95  6.96  24.44  6.81  
KaFiStO  22.34  5.96  21.03  5.33  
WResNet502  SGD  23.83  6.35  22.47  5.96  
Adam  23.73  6.64  24.04  7.06  
KaFiStO  21.25  5.35  20.73  5.08  
ImageNet32  ResNet50  SGD  34.07  13.38     
KaFiStO  34.99  14.06     
5.1 Classification
CIFAR10/100. We first evaluate KaFiStO on CIFAR10 and CIFAR100 using the popular ResNets [25] and WideResNets [92] for training. We compare our results with the ones obtained with commonly used existing optimization algorithms, such as SGD with Momentum and Adam. For SGD we set the momentum rate to 0.9, which is the default for many popular networks, and for Adam we use the default parameters ($\beta_1 = 0.9$, $\beta_2 = 0.999$). In all experiments on CIFARs, we use a fixed batch size and basic data augmentation (random horizontal flipping and random cropping with padding). For each configuration we have two runs, for 100 and 200 epochs respectively, with a suitable initial learning rate for each optimizer. For the 100-epoch runs we decrease the learning rate by a constant factor at regular epoch intervals. For 200 epochs on CIFAR10 we decrease the learning rate only once by the same factor, and for the 200-epoch training on CIFAR100 the learning rate is decreased at three epoch milestones. For all the algorithms, we additionally use weight decay. To show the benefit of using KaFiStO for training on classification tasks, we report the Top1 and Top5 errors on the validation set. For both the 100-epoch and 200-epoch configurations, we report the mean error among runs with different random seeds. The results are reported in Table 7. Figure 2 shows the behavior of the training loss, the validation loss and the Top1 error on the validation set, as well as the adaptive evolution of KaFiStO's "learning rate", i.e., the step size that scales the gradient in the update eq. (13).
ImageNet. Following [47], we train a ResNet50 [25] on downscaled 32×32 ImageNet images with the most common settings: a fixed number of epochs with a step learning rate decay and weight decay. We use random cropping and random horizontal flipping during training, and we report the validation accuracy on single center-crop images. As shown in Table 7, our model achieves a comparable accuracy to SGD, but without any task-specific hyperparameter tuning.
5.2 Generative Adversarial Networks Training
Generative Adversarial Networks (GAN) [21] are generative models trained to generate new samples from a given data distribution. A GAN consists of two networks, a generator and a discriminator, which are trained in an adversarial manner. The training alternates between these networks in a minimax game, which tends to be difficult to train. Algorithms like SGD struggle to find a good solution, and a common practice for training GANs is to use adaptive methods like Adam or RMSProp. Thus, a good performance on the training of GANs is a good indicator of stability and of the ability to handle complex loss functions.
Following [100], we test our method with one of the most popular models, the Wasserstein GAN [2] with gradient penalty (WGAN-GP) [23]. The objectives for both the generator and the discriminator in WGAN-GP are unbounded from below, which makes it difficult to apply our model directly. Indeed, our algorithm works under the assumption that the expected risk at the optimum is some given finite value. However, we can control the measurement equations in KaFiStO by adjusting the target $L^*$, as was done to obtain learning rate schedules. The simplest way to deal with unbounded losses is to set $L^*$ below the current estimate of the loss. That is, for a given mini-batch the target should be equal to $L_k(\hat{\theta}_{k|k-1}) - \delta$, for some constant $\delta > 0$, which we set before the training. In our experiments, we fix $\delta$ to a constant. We also set the remaining hyperparameters similarly to a common choice for Adam in GAN training.
We use a DCGAN [59] architecture with the WGAN-GP loss. For each optimizer, we train the model for a fixed number of epochs on CIFAR10 and evaluate the FID score [29], which captures the quality and diversity of the generated samples. We report the mean and standard deviation among runs with different random seeds. Usually GANs are trained in an alternating way, that is, the generator is updated once every few discriminator iterations. This should make the generator compete with a stronger discriminator and achieve convergence. We test KaFiStO on two settings for this update ratio. In both settings KaFiStO outperforms Adam in terms of FID score and is more stable. The results are shown in Table 8 and images sampled from the trained generator are reported in the Supplementary material.

Optimizer  FID (setting 1)  FID (setting 2)

Adam  78.21±15.91  79.05±7.89
KaFiStO  67.17±14.93  75.65±5.38
5.3 Language Modeling
Given the recent success of transfer learning in NLP with pretrained language models [57, 88, 12, 45, 9, 31], we trained both an LSTM [30] and a Transformer [79] for language modeling on the Penn TreeBank [50] and WikiText-2 [51] datasets. We use the default data splits for training and validation, and report the perplexity (lower is better) on the test set in Table 9. We used a two-layer LSTM with a fixed number of hidden neurons and a fixed input embedding size for our tiny-LSTM (the same settings as Bernstein [3]) and increased the sizes for the larger-LSTM experiment (the same as Zhang [95]). The learning rates for Adam and SGD were picked based on a grid search, and we used the default settings for our optimizer. In order to prevent overfitting, we used an aggressive dropout [71] rate and tied the input and output embeddings [58], which is a common practice in NLP. Since we are using small datasets, we use only a two-layer masked multi-head self-attention transformer with two heads, which performs worse than the LSTM. We find that even in these settings, KaFiStO is on par with other optimizers.

Model  Optimizer  PTB ppl  WikiText-2 ppl

tinyLSTM  SGD  207.7  216.62 
Adam  123.6  166.49  
Dynamics (adapt. $Q$)  110.1  124.69  
largerLSTM 
SGD  112.19  145.7 
Adam  81.27  94.39  
Dynamics (adapt. $Q$)  81.57  89.64  
tinyTransformer 
SGD  134.41  169.04 
Adam  140.13  189.66  
Dynamics (adapt. $Q$)  129.45  179.81  

6 Conclusions
We have introduced KaFiStO, a novel Kalman filtering-based approach to stochastic optimization. KaFiStO is suitable for training modern neural network models on current large-scale datasets with high-dimensional data. The method can self-tune and is quite robust to a wide range of training settings. Moreover, we design KaFiStO so that it can incorporate optimization dynamics such as those in Momentum and Adam, as well as learning rate schedules. The efficacy of this method is demonstrated in several experiments on image classification, image generation and language processing.
References
 [1] (2019) Memory efficient adaptive optimization. In Advances in Neural Information Processing Systems, External Links: Link Cited by: §2.
 [2] (2017) Wasserstein generative adversarial networks. In International conference on machine learning, Cited by: §5.2.
 [3] (2020) On the distance between two neural networks and the stability of learning. In Advances in Neural Information Processing Systems, H. Larochelle, M. Ranzato, R. Hadsell, M. F. Balcan, and H. Lin (Eds.), External Links: Link Cited by: §5.3.
 [4] (2000) Gradient convergence in gradient methods with errors. SIAM Journal on Optimization (3). Cited by: §3.
 [5] (2017) Practical Gauss-Newton optimisation for deep learning. In ICML, External Links: Link Cited by: §2.
 [6] (2018) Optimization methods for large-scale machine learning. SIAM Review (2). Cited by: §1, §3.

 [7] (2020) Closing the generalization gap of adaptive gradient methods in training deep neural networks. In Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence, IJCAI-20, Note: Main track External Links: Link Cited by: §2.
 [8] (2019) On the convergence of a class of Adam-type algorithms for non-convex optimization. In International Conference on Learning Representations, External Links: Link Cited by: §2.
[9] (2020) ELECTRA: pre-training text encoders as discriminators rather than generators. arXiv abs/2003.10555.
[10] (2021) Expectigrad: fast stochastic optimization with robust convergence properties.
[11] (2020) Stochastic online optimization using Kalman recursion. arXiv:2002.03636.
[12] (2019) BERT: pre-training of deep bidirectional transformers for language understanding. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers).
[13] (2019) An adaptive and momental bound method for stochastic learning. arXiv preprint arXiv:1910.12249.
[14] (2016) Incorporating Nesterov momentum into Adam.
[15] (2020) diffGrad: an optimization method for convolutional neural networks. IEEE Transactions on Neural Networks and Learning Systems.
[16] (2011) Adaptive subgradient methods for online learning and stochastic optimization. Journal of Machine Learning Research.
[17] (2020) Stochastic gradient methods with layer-wise adaptive moments for training of deep networks. arXiv:1905.11286.
[18] (2010) Understanding the difficulty of training deep feedforward neural networks. In Proceedings of the International Conference on Artificial Intelligence and Statistics (AISTATS'10).
[19] (1977) On convergence rates of subgradient optimization methods. Mathematical Programming 13, pp. 329–347.
[20] (2020) Practical quasi-Newton methods for training deep neural networks. In Advances in Neural Information Processing Systems.
[21] (2014) Generative adversarial nets. Advances in Neural Information Processing Systems.
[22] (2021) Beyond SGD: iterate averaged adaptive gradient method. arXiv:2003.01247.
[23] (2017) Improved training of Wasserstein GANs. In Advances in Neural Information Processing Systems.
[24] (2004) Kalman filtering and neural networks. John Wiley & Sons.
[25] (2016) Deep residual learning for image recognition. In 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[26] (2015) Delving deep into rectifiers: surpassing human-level performance on ImageNet classification. In Proceedings of the 2015 IEEE International Conference on Computer Vision (ICCV).
[27] (2019) Small steps and giant leaps: minimal Newton solvers for deep learning. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV).
[28] (2021) AdamP: slowing down the slowdown for momentum optimizers on scale-invariant weights. In International Conference on Learning Representations (ICLR).
[29] (2017) GANs trained by a two time-scale update rule converge to a local Nash equilibrium. In Advances in Neural Information Processing Systems.
[30] (1997) Long short-term memory. Neural Computation.
[31] (2018) Universal language model fine-tuning for text classification. In ACL.
[32] (2020) Biased stochastic first-order methods for conditional stochastic optimization and applications in meta learning. In Advances in Neural Information Processing Systems.
[33] (2019) Nostalgic Adam: weighting more of the past gradients when designing the adaptive learning rate. In Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence, IJCAI-19.
[34] (2017) Adaptive learning rate via covariance matrix based preconditioning for deep neural networks. In Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence, IJCAI-17.
[35] (2001) Kalman filtering and neural networks. Adaptive and Learning Systems for Signal Processing, Communications, and Control, Wiley, New York.
[36] (1960) A new approach to linear filtering and prediction problems. Transactions of the ASME – Journal of Basic Engineering (Series D).
[37] (2015) Adam: a method for stochastic optimization. In ICLR.
[38] (2009) Learning multiple layers of features from tiny images.
[39] (2012) Efficient backprop. In Neural Networks: Tricks of the Trade: Second Edition.
[40] (2018) Online adaptive methods, universality and acceleration. In Advances in Neural Information Processing Systems.
[41] (2020) AdaX: adaptive gradient descent with exponential long term memory.
[42] (2020) PAGE: a simple and optimal probabilistic gradient estimator for nonconvex optimization. arXiv:2008.10898.
[43] (2020) On the variance of the adaptive learning rate and beyond. In International Conference on Learning Representations.
[44] (2020) Adam with bandit sampling for deep learning. In Advances in Neural Information Processing Systems.
[45] (2019) RoBERTa: a robustly optimized BERT pretraining approach. arXiv abs/1907.11692.
[46] (2017) SGDR: stochastic gradient descent with warm restarts. In International Conference on Learning Representations (ICLR).
[47] (2019) Decoupled weight decay regularization. In International Conference on Learning Representations.
[48] (2019) Adaptive gradient methods with dynamic bound of learning rate. In International Conference on Learning Representations.
[49] (2015) Probabilistic line searches for stochastic optimization. In Advances in Neural Information Processing Systems.
[50] (1993) Building a large annotated corpus of English: the Penn Treebank. Computational Linguistics.
[51] (2017) Pointer sentinel mixture models. arXiv abs/1609.07843.
[52] (2017) Variants of RMSProp and Adagrad with logarithmic regret bounds. In Proceedings of the 34th International Conference on Machine Learning.
[53] (2020) Parabolic approximation line search for DNNs. In Advances in Neural Information Processing Systems.
[54] (2019) The extended Kalman filter is a natural gradient descent in trajectory space. arXiv:1901.00696.
[55] (2015) Scale-free algorithms for online linear optimization. In Algorithmic Learning Theory.
[56] (2020) The role of memory in stochastic optimization. In Proceedings of The 35th Uncertainty in Artificial Intelligence Conference.
[57] (2018) Deep contextualized word representations. In Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers).
[58] (2017) Using the output embedding to improve language models. In Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume 2, Short Papers, pp. 157–163.
[59] (2015) Unsupervised representation learning with deep convolutional generative adversarial networks. arXiv preprint arXiv:1511.06434.
[60] (2018) On the convergence of Adam and beyond. In International Conference on Learning Representations.
[61] (1951) A stochastic approximation method. The Annals of Mathematical Statistics.
[62] (2018) L4: practical loss-based stepsize adaptation for deep learning. In Advances in Neural Information Processing Systems.
[63] (2010) A fast natural Newton method. In ICML.
[64] (2015) ImageNet large scale visual recognition challenge. International Journal of Computer Vision 115 (3).
[65] (2014) Exact solutions to the nonlinear dynamics of learning in deep linear neural networks. In 2nd International Conference on Learning Representations, ICLR 2014.
[66] (2013) No more pesky learning rates. In ICML.
[67] (2018) VR-SGD: a simple stochastic variance reduction method for machine learning. arXiv:1802.09932.
[68] (2019) Trust region value optimization using Kalman filtering. arXiv:1901.07860.
[69] (2018) Adafactor: adaptive learning rates with sublinear memory cost. In ICML.
[70] (2014) Fast large-scale optimization by unifying stochastic gradient and quasi-Newton methods. In Proceedings of the 31st International Conference on Machine Learning.
[71] (2014) Dropout: a simple way to prevent neural networks from overfitting. Journal of Machine Learning Research, pp. 1929–1958.
[72] (2019) Adathm: adaptive gradient method based on estimates of third-order moments. In 2019 IEEE Fourth International Conference on Data Science in Cyberspace (DSC).
[73] (2020) SSGD: symmetrical stochastic gradient descent with weight noise injection for reaching flat minima. arXiv:2009.02479.
[74] (2013) On the importance of initialization and momentum in deep learning. In Proceedings of the 30th International Conference on Machine Learning.
[75] (2004) Comparison of gradient descent method, Kalman filtering and decoupled Kalman in training neural networks used for fingerprint-based positioning. In IEEE 60th Vehicular Technology Conference, VTC2004-Fall.
[76] (2016) Barzilai-Borwein step size for stochastic gradient descent. In Advances in Neural Information Processing Systems.
[77] (2012) Lecture 6.5 - RMSProp: divide the gradient by a running average of its recent magnitude.
[78] (2019) On the convergence proof of AMSGrad and a new version. IEEE Access.
[79] (2017) Attention is all you need. arXiv abs/1706.03762.
[80] (2019) Painless stochastic gradient: interpolation, line-search, and convergence rates. In Advances in Neural Information Processing Systems.
[81] (2018) Kalman gradient descent: adaptive variance reduction in stochastic optimization. arXiv:1810.12273.
[82] (2020) Scheduled restart momentum for accelerated stochastic gradient descent. arXiv preprint arXiv:2002.10583.
[83] (2013) Variance reduction for stochastic gradient optimization. Advances in Neural Information Processing Systems.
[84] (2019) SignADAM++: learning confidences for deep neural networks. In 2019 International Conference on Data Mining Workshops (ICDMW).
[85] (2020) SAdam: a variant of Adam for strongly convex functions. In International Conference on Learning Representations.
[86] (2019) HyperAdam: a learnable task-adaptive Adam for network training.
[87] (2020) WNGrad: learn the learning rate in gradient descent. arXiv:1803.02865.
[88] (2019) XLNet: generalized autoregressive pretraining for language understanding. In Advances in Neural Information Processing Systems.
[89] (2021) ADAHESSIAN: an adaptive second order optimizer for machine learning. AAAI.
[90] (2020) Large batch optimization for deep learning: training BERT in 76 minutes. In International Conference on Learning Representations.
[91] (2020) SALR: sharpness-aware learning rates for improved generalization. arXiv:2011.05348.
[92] (2016) Wide residual networks.
[93] (2018) Adaptive methods for nonconvex optimization. In Advances in Neural Information Processing Systems.
[94] (2012) ADADELTA: an adaptive learning rate method. CoRR.
[95] (2017) YellowFin and the art of momentum tuning. arXiv preprint arXiv:1706.03471.
[96] (2020) Why are adaptive methods good for attention models? In Advances in Neural Information Processing Systems.
[97] (2020) Why are adaptive methods good for attention models? In Advances in Neural Information Processing Systems.
[98] (2019) Lookahead optimizer: k steps forward, 1 step back. In Advances in Neural Information Processing Systems.
[99] (2019) AdaShift: decorrelation and convergence of adaptive learning rate methods. In International Conference on Learning Representations.
[100] (2020) AdaBelief optimizer: adapting stepsizes by the belief in observed gradients. In Advances in Neural Information Processing Systems.