
First-order and second-order variants of the gradient descent: a unified framework

by Thomas Pierrot et al. (Sorbonne Université)

In this paper, we provide an overview of first-order and second-order variants of gradient descent commonly used in machine learning. We propose a general framework in which six of these methods can be interpreted as different instances of the same approach: vanilla gradient descent, the classical and generalized Gauss-Newton methods, natural gradient descent, the gradient covariance matrix approach, and Newton's method. Besides interpreting these methods within a single framework, we explain their specificities and show under which conditions some of them coincide.
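The unifying idea the abstract describes can be sketched concretely: each of these methods performs a preconditioned gradient step, theta ← theta − lr · M⁻¹ ∇L(theta), and they differ only in the choice of preconditioner M. The snippet below is a minimal illustration (not the paper's code) on a least-squares toy problem, where the Gauss-Newton matrix JᵀJ coincides with the Hessian; natural gradient descent and the gradient covariance approach would plug in a Fisher or gradient-covariance matrix in the same slot.

```python
import numpy as np

# Toy problem: least-squares loss L(theta) = 0.5 * ||X @ theta - y||^2.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
theta_true = np.array([1.0, -2.0, 0.5])
y = X @ theta_true

def gradient(theta):
    return X.T @ (X @ theta - y)

def preconditioner(method):
    d = X.shape[1]
    if method == "vanilla":       # vanilla gradient descent: M = I
        return np.eye(d)
    if method == "gauss_newton":  # classical Gauss-Newton: M = J^T J (here J = X)
        return X.T @ X
    if method == "newton":        # Newton's method: M = Hessian (= X^T X for least squares)
        return X.T @ X
    raise ValueError(method)

def step(theta, method, lr=1.0, damping=1e-8):
    # Small damping keeps M invertible; all methods share this update rule.
    M = preconditioner(method) + damping * np.eye(len(theta))
    return theta - lr * np.linalg.solve(M, gradient(theta))

# On a quadratic loss, a single Gauss-Newton/Newton step with lr=1
# jumps (up to damping error) to the minimizer.
theta = step(np.zeros(3), "gauss_newton")
print(np.allclose(theta, theta_true, atol=1e-6))
```

With `method="vanilla"` the same loop reduces to plain gradient descent and needs a small learning rate and many iterations instead of one exact step.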


