Interpolating Between Gradient Descent and Exponentiated Gradient Using Reparameterized Gradient Descent

02/24/2020
by Ehsan Amid, et al.

Continuous-time mirror descent (CMD) can be seen as the limit case of the discrete-time MD update when the step-size is infinitesimally small. In this paper, we focus on the geometry of the primal and dual CMD updates and introduce a general framework for reparameterizing one CMD update as another. Specifically, the reparameterized update also corresponds to a CMD, but on the composite loss w.r.t. the new variables, and the original variables are obtained via the reparameterization map. We employ these results to introduce a new family of reparameterizations that interpolate between the two commonly used updates, namely the continuous-time gradient descent (GD) and unnormalized exponentiated gradient (EGU), while extending to many other well-known updates. In particular, we show that for the underdetermined linear regression problem, these updates generalize the known behavior of GD and EGU, and provably converge to the minimum L_{2-τ}-norm solution for τ∈[0,1]. Our new results also have implications for the regularized training of neural networks to induce sparsity.
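The construction can be illustrated with a small numerical sketch (not the authors' code). One reparameterization map consistent with the framework above is q_τ(u) = ((1 - τ/2) u)^{2/(2-τ)}, which satisfies q_τ'(u)² = q_τ(u)^τ, so gradient descent on u traces the tempered mirror-descent flow dw/dt = -w^τ ∇L(w) on w = q_τ(u): τ = 0 gives w = u (plain GD) and τ = 1 gives w = u²/4 (the known GD reparameterization of EGU). Whether this exact map matches the paper's choice is an assumption here, and the function names, matrix, learning rate, and initialization below are arbitrary illustrative choices.

```python
# Minimal sketch of reparameterized gradient descent on an underdetermined
# linear regression problem; q_tau below is an assumed illustrative map, not
# necessarily the paper's exact reparameterization.
import numpy as np

def q(u, tau):
    # Reparameterization map: tau=0 gives w = u (GD), tau=1 gives w = u^2/4 (EGU).
    return ((1.0 - tau / 2.0) * u) ** (2.0 / (2.0 - tau))

def reparam_gd(A, y, tau, lr=1e-3, steps=200_000, w0=1e-4):
    # Discretized GD on u for the composite loss 0.5 * ||A q(u) - y||^2.
    n = A.shape[1]
    # Start all weights at the same small positive value w0, i.e. u = q^{-1}(w0).
    u = np.full(n, w0 ** (1.0 - tau / 2.0) / (1.0 - tau / 2.0))
    for _ in range(steps):
        w = q(u, tau)
        grad_w = A.T @ (A @ w - y)                # gradient of the loss w.r.t. w
        u = u - lr * grad_w * w ** (tau / 2.0)    # chain rule: dq/du = q(u)^{tau/2}
    return q(u, tau)

rng = np.random.default_rng(0)
A = np.abs(rng.normal(size=(3, 10)))   # underdetermined system: 3 equations, 10 unknowns
w_star = np.zeros(10)
w_star[:2] = 1.0                       # sparse, nonnegative ground truth
y = A @ w_star

w_gd  = reparam_gd(A, y, tau=0.0)      # expected: close to the minimum L_2-norm solution
w_egu = reparam_gd(A, y, tau=1.0)      # expected: close to a sparse, minimum L_1-like solution
print("tau = 0:", np.round(w_gd, 3),  " ||w||_2 =", round(float(np.linalg.norm(w_gd)), 3))
print("tau = 1:", np.round(w_egu, 3), " ||w||_1 =", round(float(np.abs(w_egu).sum()), 3))
```

With a small initialization, the τ = 0 run should spread weight across many coordinates (approaching the minimum L_2-norm solution), while the τ = 1 run should concentrate on a few coordinates (approaching a minimum L_1-like solution), consistent with the interpolation described in the abstract.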
