Walking in the Shadow: A New Perspective on Descent Directions for Constrained Minimization

06/15/2020
by Hassan Mortagy, et al.

Descent directions such as movement towards Frank-Wolfe vertices, away steps, in-face away steps, and pairwise directions have been an important design consideration in conditional gradient descent (CGD) variants. In this work, we attempt to demystify the impact of movement in these directions towards attaining constrained minimizers. The best local direction of descent is the directional derivative of the projection of the gradient, which we refer to as the shadow of the gradient. We show that the continuous-time dynamics of moving in the shadow are equivalent to those of projected gradient descent (PGD), but are non-trivial to discretize. By projecting gradients in PGD, one not only ensures feasibility but is also able to "wrap" around the convex region. We show that Frank-Wolfe (FW) vertices in fact recover the maximal wrap one can obtain by projecting gradients, thus providing a new perspective on these steps. We also claim that shadow steps give the best direction of descent emanating from the convex hull of all possible away-vertices. Opening up the PGD movements in terms of shadow steps yields linear convergence, albeit with a rate dependent on the number of faces. We combine these insights into a novel SHADOW-CG method that uses FW steps (i.e., wrapping around the polytope) and shadow steps (i.e., the optimal local descent direction), while enjoying linear convergence. Our analysis develops properties of directional derivatives of projections (which may be of independent interest), while providing a unifying view of various descent directions in the CGD literature.
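To make the two ingredients concrete, the following is a minimal, illustrative sketch (not the authors' implementation) of how one might approximate the shadow of the gradient by a small finite-difference step on the projection, and how a Shadow-CG-style loop could alternate between Frank-Wolfe directions and shadow directions. The helpers project_onto_P (Euclidean projection onto the feasible polytope P), fw_vertex (linear minimization oracle), and line_search (feasible step-size rule) are assumed user-supplied, and the switching rule between the two directions is a simplification rather than the criterion developed in the paper.

import numpy as np

def shadow(x, grad, project_onto_P, eps=1e-6):
    # Finite-difference approximation of the shadow of the gradient at x:
    # the directional derivative of the projection onto P along -grad,
    #   d(x) ~ (Pi_P(x - eps * grad) - x) / eps.
    return (project_onto_P(x - eps * grad) - x) / eps

def shadow_cg(x0, grad_f, project_onto_P, fw_vertex, line_search, iters=100):
    # Toy Shadow-CG-style loop (illustrative only): alternate between
    # Frank-Wolfe directions, which "wrap" around the polytope, and shadow
    # steps, which give the best local feasible direction of descent.
    x = np.asarray(x0, dtype=float)
    for _ in range(iters):
        g = grad_f(x)
        d_sh = shadow(x, g, project_onto_P)   # local descent direction
        v = fw_vertex(g)                      # assumed LMO: argmin over P of <g, v>
        d_fw = v - x                          # Frank-Wolfe direction
        # Simplified switching rule: take whichever direction is better
        # aligned with -g (the paper's actual criterion is more refined).
        d = d_fw if float(-g @ d_fw) >= float(-g @ d_sh) else d_sh
        gamma = line_search(x, d)             # step size keeping x + gamma * d in P
        x = x + gamma * d
    return x

Since the Euclidean projection onto a polytope is piecewise affine, the finite-difference quotient above coincides with the one-sided directional derivative once eps is small enough (in exact arithmetic); in floating point, eps trades off truncation error against round-off.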

Related research

05/18/2018  Blended Conditional Gradients: the unconditioning of conditional gradients
07/23/2023  Swarm-Based Optimization with Random Descent
09/29/2018  Directional Analysis of Stochastic Gradient Descent via von Mises-Fisher Distributions in Deep Learning
06/19/2019  Locally Accelerated Conditional Gradients
05/29/2023  Learning Two-Layer Neural Networks, One (Giant) Step at a Time
03/13/2020  Boosting Frank-Wolfe by Chasing Gradients
10/19/2020  Practical Frank-Wolfe algorithms
