
CANITA: Faster Rates for Distributed Convex Optimization with Communication Compression
Due to the high communication cost in distributed and federated learning...
A Short Note of PAGE: Optimal Convergence Rates for Nonconvex Optimization
In this note, we first recall the nonconvex problem setting and introduc...
ANITA: An Optimal Loopless Accelerated Variance-Reduced Gradient Method
We propose a novel accelerated variance-reduced gradient method called A...
ZeroSARAH: Efficient Nonconvex Finite-Sum Optimization with Zero Full Gradient Computation
We propose ZeroSARAH – a novel variant of the variance-reduced method SA...
MARINA: Faster Non-Convex Distributed Learning with Compression
We develop and analyze MARINA: a new communication efficient method for ...
PAGE: A Simple and Optimal Probabilistic Gradient Estimator for Nonconvex Optimization
In this paper, we propose a novel stochastic gradient estimator—ProbAbil...
A Unified Analysis of Stochastic Gradient Methods for Nonconvex Federated Optimization
In this paper, we study the performance of a large family of SGD variant...
Acceleration for Compressed Gradient Descent in Distributed and Federated Optimization
Due to the high communication cost in distributed and federated learning...
A unified variance-reduced accelerated gradient method for convex optimization
We propose a novel randomized incremental gradient algorithm, namely, VA...
Stabilized SVRG: Simple Variance Reduction for Nonconvex Optimization
Variance reduction techniques like SVRG provide simple and fast algorith...
SSRGD: Simple Stochastic Recursive Gradient Descent for Escaping Saddle Points
We analyze stochastic gradient algorithms for optimizing nonconvex probl...
Learning Two-layer Neural Networks with Symmetric Inputs
We give a new algorithm for learning a two-layer neural network under a ...
A Fast Polynomial-time Primal-Dual Projection Algorithm for Linear Programming
Traditionally, there are several polynomial algorithms for linear progra...
A Fast Anderson-Chebyshev Mixing Method for Nonlinear Optimization
Anderson mixing (or Anderson acceleration) is an efficient acceleration ...
An Anderson-Chebyshev Mixing Method for Nonlinear Optimization
Anderson mixing (or Anderson acceleration) is an efficient acceleration ...
Stochastic Gradient Hamiltonian Monte Carlo with Variance Reduction for Bayesian Inference
Gradient-based Monte Carlo sampling algorithms, like Langevin dynamics a...
Gradient Boosting With Piece-Wise Linear Regression Trees
Gradient boosting using decision trees as base learners, so called Gradi...
A Simple Proximal Stochastic Gradient Method for Nonsmooth Nonconvex Optimization
We analyze stochastic gradient algorithms for optimizing nonconvex, nons...
