Accelerating Stochastic Gradient Descent Using Antithetic Sampling

10/07/2018
by   Jingchang Liu, et al.

(Mini-batch) Stochastic Gradient Descent is a popular optimization method that has been applied to many machine learning problems, but the high variance introduced by the stochastic gradient at each step can slow down convergence. In this paper, we propose an antithetic sampling strategy that reduces this variance by exploiting the internal structure of the dataset. Under the new strategy, the stochastic gradients within a mini-batch are no longer independent but are made as negatively correlated as possible, while the mini-batch stochastic gradient remains an unbiased estimator of the full gradient. For binary classification problems, the antithetic samples need only be computed once in advance and reused at every iteration, which makes the method practical. Experiments confirm the effectiveness of the proposed method.
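The core idea behind the paper, that an estimator built from negatively correlated (antithetic) samples stays unbiased while its variance drops, can be illustrated with a minimal Monte Carlo sketch. This is a generic textbook example, not the paper's gradient-pairing scheme: here each uniform draw `u` is paired with its antithetic partner `1 - u`, and because `f` is monotone the two function values are negatively correlated.

```python
import numpy as np

rng = np.random.default_rng(0)

def f(u):
    # Monotone function: f(u) and f(1 - u) are negatively correlated,
    # which is exactly what antithetic sampling needs.
    return np.exp(u)

n_pairs = 5000

# Plain Monte Carlo: 2 * n_pairs i.i.d. samples of f(U), U ~ Uniform(0, 1).
u = rng.random(2 * n_pairs)
plain = f(u)

# Antithetic sampling: each draw v is paired with 1 - v, and the pair
# is averaged. Both estimators are unbiased for E[f(U)] = e - 1.
v = rng.random(n_pairs)
anti = 0.5 * (f(v) + f(1.0 - v))

# Compare estimated variances of the two sample means
# (same total number of function evaluations in both cases).
print("plain mean:", plain.mean(), "var:", plain.var() / (2 * n_pairs))
print("anti  mean:", anti.mean(), "var:", anti.var() / n_pairs)
```

The antithetic estimator uses the same number of function evaluations but has a strictly smaller variance, because the negative covariance between the paired samples partially cancels the noise, which is the same mechanism the paper applies to stochastic gradients within a mini-batch.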


