Cyclic Block Coordinate Descent With Variance Reduction for Composite Nonconvex Optimization

12/09/2022
by Xufeng Cai et al.

Nonconvex optimization is central in solving many machine learning problems, in which block-wise structure is commonly encountered. In this work, we propose cyclic block coordinate methods for nonconvex optimization problems with non-asymptotic gradient norm guarantees. Our convergence analysis is based on a gradient Lipschitz condition with respect to a Mahalanobis norm, inspired by recent progress on cyclic block coordinate methods. In deterministic settings, our convergence guarantee matches the guarantee of (full-gradient) gradient descent, but with the gradient Lipschitz constant defined w.r.t. the Mahalanobis norm. In stochastic settings, we use recursive variance reduction to decrease the per-iteration cost and match the arithmetic operation complexity of current optimal stochastic full-gradient methods, with a unified analysis for both finite-sum and infinite-sum cases. We further prove the faster, linear convergence of our methods when a Polyak-Łojasiewicz (PŁ) condition holds for the objective function. To the best of our knowledge, our work is the first to provide variance-reduced convergence guarantees for a cyclic block coordinate method. Our experimental results demonstrate the efficacy of the proposed variance-reduced cyclic scheme in training deep neural nets.
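For reference, the two conditions named in the abstract in their standard forms (the paper's exact norm and constants may differ): gradient Lipschitzness w.r.t. a Mahalanobis norm induced by a positive definite matrix Λ, and the PŁ inequality with constant μ > 0,

$$\|\nabla f(x) - \nabla f(y)\|_{\Lambda^{-1}} \le \|x - y\|_{\Lambda}, \qquad \|x\|_{\Lambda} := \sqrt{x^\top \Lambda x},$$

$$\tfrac{1}{2}\,\|\nabla f(x)\|^2 \ge \mu\,\bigl(f(x) - f^\star\bigr).$$

To make the stochastic scheme concrete, below is a minimal NumPy sketch of cyclic block coordinate descent with a SARAH-style recursive variance-reduced estimator. It is an illustration under simplifying assumptions, not the paper's exact algorithm: the proximal step for the composite (nonsmooth) term is omitted, a uniform step size is used, and the names `cyclic_bcd_vr`, `grad_batch`, and all parameters are hypothetical.

```python
import numpy as np

def cyclic_bcd_vr(grad_batch, x0, n_samples, blocks, step=0.05,
                  epochs=50, batch=32, anchor_batch=256, rng=None):
    """Sketch: cyclic block coordinate descent with a SARAH-style
    recursive variance-reduced gradient estimator.

    grad_batch(x, idx) returns the average gradient over the sample
    indices idx, evaluated at x; `blocks` partitions the coordinates.
    """
    rng = np.random.default_rng() if rng is None else rng
    x = x0.astype(float).copy()
    for _ in range(epochs):
        # Anchor the estimator on a large minibatch at the epoch start
        # (the full gradient in the finite-sum case).
        idx = rng.choice(n_samples, size=anchor_batch, replace=False)
        v = grad_batch(x, idx)
        for b in blocks:                  # one cyclic pass over the blocks
            x_prev = x.copy()
            x[b] -= step * v[b]           # update the current block only
            idx = rng.choice(n_samples, size=batch, replace=False)
            # SARAH-style recursion: carry the estimator to the new point
            # using the same minibatch at both iterates.
            v += grad_batch(x, idx) - grad_batch(x_prev, idx)
    return x

# Toy sanity check on a least-squares problem (hypothetical setup):
rng = np.random.default_rng(0)
A = rng.standard_normal((1000, 20))
y = A @ np.ones(20)

def grad_batch(x, idx):
    return A[idx].T @ (A[idx] @ x - y[idx]) / len(idx)

blocks = np.array_split(np.arange(20), 4)
x_hat = cyclic_bcd_vr(grad_batch, np.zeros(20), 1000, blocks)
```

The key design point the sketch tries to convey is that the recursive estimator `v` is carried across block updates within a cyclic pass and only re-anchored periodically, which is what keeps the per-iteration cost at the minibatch level rather than the full-gradient level.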

Related research

Stochastic Variance Reduction for Nonconvex Optimization (03/19/2016)
We study nonconvex finite-sum problems and analyze stochastic variance r...

Block-Cyclic Stochastic Coordinate Descent for Deep Neural Networks (11/20/2017)
We present a stochastic first-order optimization algorithm, named BCSC, ...

Fast Cyclic Coordinate Dual Averaging with Extrapolation for Generalized Variational Inequalities (02/26/2021)
We propose the Cyclic cOordinate Dual avEraging with extRapolation (CODE...

Variance Reduced Coordinate Descent with Acceleration: New Method With a Surprising Application to Finite-Sum Problems (02/11/2020)
We propose an accelerated version of stochastic variance reduced coordin...

Accelerated Cyclic Coordinate Dual Averaging with Extrapolation for Composite Convex Optimization (03/28/2023)
Exploiting partial first-order information in a cyclic way is arguably t...

Semi-Cyclic Stochastic Gradient Descent (04/23/2019)
We consider convex SGD updates with a block-cyclic structure, i.e. where...

Coordinate-wise Armijo's condition (11/18/2019)
Let z=(x,y) be coordinates for the product space R^m_1×R^m_2. Let f:R^m_...
