SGB: Stochastic Gradient Bound Method for Optimizing Partition Functions

11/03/2020
by Jing Wang, et al.

This paper addresses the problem of optimizing partition functions in a stochastic learning setting. We propose a stochastic variant of the bound majorization algorithm that relies on upper-bounding the partition function with a quadratic surrogate. The update of the proposed method, which we refer to as Stochastic Partition Function Bound (SPFB), resembles scaled stochastic gradient descent, where the scaling factor relies on a second-order term that is, however, different from the Hessian. Similarly to quasi-Newton schemes, this term is constructed using a stochastic approximation of the value of the function and its gradient. We prove a sub-linear convergence rate for the proposed method and show the construction of its low-rank variant (LSPFB). Experiments on logistic regression demonstrate that the proposed schemes significantly outperform SGD. We also discuss how to use the quadratic partition function bound for efficient training of deep learning models and in non-convex optimization.
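The abstract describes the bound only at a high level. As a rough sketch (the notation below is introduced here for exposition and is not taken from the paper), bound-majorization methods of this kind replace the log-partition function with a quadratic surrogate around the current iterate \tilde{\theta}:

\[
\log Z(\theta) \;\le\; \log Z(\tilde{\theta}) \;+\; (\theta - \tilde{\theta})^{\top}\mu \;+\; \tfrac{1}{2}\,(\theta - \tilde{\theta})^{\top}\Sigma\,(\theta - \tilde{\theta}),
\]

where the vector \mu and the positive semi-definite matrix \Sigma are computed at \tilde{\theta}; how SPFB forms stochastic estimates of these quantities is the paper's contribution and is not reproduced here. Minimizing such a surrogate over a mini-batch yields an update of the form

\[
\theta_{t+1} \;=\; \theta_t \;-\; \eta\,\Sigma_t^{-1}\,\hat{g}_t,
\]

which corresponds to the abstract's description of scaled stochastic gradient descent with a second-order scaling term \Sigma_t that differs from the Hessian; a low-rank approximation of \Sigma_t (as in LSPFB) would keep the inversion tractable in high dimensions.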
