Gradient Descent, Stochastic Optimization, and Other Tales

05/02/2022
by Jun Lu, et al.

The goal of this paper is to debunk and dispel the magic behind black-box optimizers and stochastic optimizers. It aims to build a solid foundation on how and why these techniques work, and it crystallizes this knowledge by deriving, from simple intuitions, the mathematics behind the strategies. This tutorial does not shy away from addressing both the formal and informal aspects of gradient descent and stochastic optimization methods. By doing so, it hopes to provide readers with a deeper understanding of these techniques as well as the when, the how, and the why of applying them. Gradient descent is one of the most popular algorithms for performing optimization and by far the most common way to optimize machine learning tasks. Its stochastic version has received attention in recent years, particularly for optimizing deep neural networks, where the gradient computed from a single sample or a batch of samples is used to save computational resources and to escape saddle points. In 1951, Robbins and Monro published "A Stochastic Approximation Method," one of the first modern treatments of stochastic optimization, which estimates local gradients with a new batch of samples. Stochastic optimization has since become a core technology in machine learning, largely due to the development of the backpropagation algorithm for fitting neural networks. The sole aim of this article is to give a self-contained introduction to the concepts and mathematical tools of gradient descent and stochastic optimization.
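The mini-batch idea mentioned in the abstract can be sketched in a few lines: at each step, the gradient is estimated from a random subset of samples rather than the full dataset. The following is a minimal illustration on a least-squares objective, not the paper's own code; all names (`X`, `y`, `lr`, `batch_size`) are illustrative assumptions.

```python
import numpy as np

# Minimal sketch of mini-batch stochastic gradient descent (SGD) on the
# least-squares objective f(w) = (1/2b) * ||X_b w - y_b||^2 per batch.
# Synthetic data; all parameter choices below are illustrative.
rng = np.random.default_rng(0)
n, d = 1000, 5
X = rng.normal(size=(n, d))
w_true = rng.normal(size=d)
y = X @ w_true  # noiseless labels, so the optimum is w_true

w = np.zeros(d)
lr, batch_size = 0.1, 32
for step in range(500):
    idx = rng.choice(n, size=batch_size, replace=False)   # sample a mini-batch
    grad = X[idx].T @ (X[idx] @ w - y[idx]) / batch_size  # stochastic gradient
    w -= lr * grad                                        # descent step
```

Each iteration touches only `batch_size` rows of `X`, which is the computational saving the abstract refers to; the randomness in the batch selection is also what helps the iterates move off saddle points in non-convex settings such as deep networks.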

