
CANITA: Faster Rates for Distributed Convex Optimization with Communication Compression
Due to the high communication cost in distributed and federated learning...

A Short Note of PAGE: Optimal Convergence Rates for Nonconvex Optimization
In this note, we first recall the nonconvex problem setting and introduc...

ANITA: An Optimal Loopless Accelerated Variance-Reduced Gradient Method
We propose a novel accelerated variance-reduced gradient method called A...

ZeroSARAH: Efficient Nonconvex Finite-Sum Optimization with Zero Full Gradient Computation
We propose ZeroSARAH – a novel variant of the variance-reduced method SA...

MARINA: Faster Non-Convex Distributed Learning with Compression
We develop and analyze MARINA: a new communication efficient method for ...

PAGE: A Simple and Optimal Probabilistic Gradient Estimator for Nonconvex Optimization
In this paper, we propose a novel stochastic gradient estimator—ProbAbil...
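The PAGE estimator named in the abstract above admits a very short sketch. The following is an illustrative toy implementation on a made-up least-squares problem, not the paper's code; all names and parameter values (`page`, `eta`, `batch`, `p`) are chosen here for illustration. The idea: with probability p take a full gradient, otherwise reuse the previous estimate plus a cheap minibatch gradient difference.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy finite-sum least-squares: f(x) = (1/n) * sum_i 0.5 * (a_i^T x - b_i)^2
n, d = 50, 5
A = rng.standard_normal((n, d))
b = rng.standard_normal(n)

def grad(x, idx):
    """Average gradient of f over the component functions in idx."""
    return A[idx].T @ (A[idx] @ x - b[idx]) / len(idx)

def page(x0, steps=500, eta=0.05, batch=5, p=0.1):
    """PAGE-style loop: with probability p refresh with a full gradient,
    otherwise update the running estimate with a minibatch difference."""
    x = x0.copy()
    g = grad(x, np.arange(n))  # start from a full gradient
    for _ in range(steps):
        x_new = x - eta * g
        if rng.random() < p:
            g = grad(x_new, np.arange(n))            # full gradient
        else:
            idx = rng.integers(0, n, size=batch)     # minibatch
            g = g + grad(x_new, idx) - grad(x, idx)  # recursive update
        x = x_new
    return x

x_hat = page(np.zeros(d))
print(np.linalg.norm(grad(x_hat, np.arange(n))))
```

Note the single loop with a coin flip per step, in contrast to the nested epoch structure of SVRG-type methods; that loopless structure is part of what the abstract refers to as "simple".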

A Unified Analysis of Stochastic Gradient Methods for Nonconvex Federated Optimization
In this paper, we study the performance of a large family of SGD variant...

Acceleration for Compressed Gradient Descent in Distributed and Federated Optimization
Due to the high communication cost in distributed and federated learning...

A unified variance-reduced accelerated gradient method for convex optimization
We propose a novel randomized incremental gradient algorithm, namely, VA...

Stabilized SVRG: Simple Variance Reduction for Nonconvex Optimization
Variance reduction techniques like SVRG provide simple and fast algorith...
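For context on the SVRG baseline this entry builds on, here is a minimal textbook-style sketch of the classic SVRG loop on a toy problem (an illustration of the standard method, not the stabilized variant the paper proposes; problem data and step sizes are made up):

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy finite-sum least-squares: f(x) = (1/n) * sum_i 0.5 * (a_i^T x - b_i)^2
n, d = 40, 4
A = rng.standard_normal((n, d))
b = rng.standard_normal(n)

def grad_i(x, i):
    """Gradient of the i-th component function."""
    return A[i] * (A[i] @ x - b[i])

def full_grad(x):
    return A.T @ (A @ x - b) / n

def svrg(x0, epochs=50, m=40, eta=0.02):
    """Classic SVRG: take a full-gradient snapshot each epoch, then run
    m cheap inner steps with the variance-reduced estimator
    g = grad_i(x) - grad_i(snapshot) + full_grad(snapshot)."""
    x = x0.copy()
    for _ in range(epochs):
        snap, mu = x.copy(), full_grad(x)
        for _ in range(m):
            i = rng.integers(n)
            g = grad_i(x, i) - grad_i(snap, i) + mu
            x = x - eta * g
    return x

x_hat = svrg(np.zeros(d))
print(np.linalg.norm(full_grad(x_hat)))
```

The correction term `- grad_i(snap, i) + mu` is unbiased and shrinks as the iterate approaches the snapshot, which is what drives the variance reduction.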

SSRGD: Simple Stochastic Recursive Gradient Descent for Escaping Saddle Points
We analyze stochastic gradient algorithms for optimizing nonconvex probl...

Learning Two-layer Neural Networks with Symmetric Inputs
We give a new algorithm for learning a two-layer neural network under a ...

A Fast Polynomial-time Primal-Dual Projection Algorithm for Linear Programming
Traditionally, there are several polynomial algorithms for linear progra...

A Fast Anderson-Chebyshev Mixing Method for Nonlinear Optimization
Anderson mixing (or Anderson acceleration) is an efficient acceleration ...
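The Anderson mixing scheme the abstract refers to can be sketched compactly. The following is plain (type-II) Anderson mixing for a generic fixed-point map, not the Anderson-Chebyshev variant the paper develops; the function names and window size are made up for illustration. Each step combines the last few iterates with coefficients chosen by least squares so that the combined residual is as small as possible.

```python
import numpy as np

def anderson(g, x0, m=5, iters=30):
    """Plain Anderson mixing for a fixed-point map g (illustrative sketch).
    Keeps a window of the last m iterates and their residuals f = g(x) - x,
    and mixes them with least-squares coefficients summing to 1."""
    X, F = [x0], [g(x0) - x0]
    x = x0
    for _ in range(iters):
        k = len(F)
        if k == 1:
            x = g(x)  # no history yet: plain fixed-point step
        else:
            F_mat = np.stack(F, axis=1)         # residuals, shape (d, k)
            dF = F_mat[:, 1:] - F_mat[:, :-1]   # residual differences
            # minimize || F[-1] - dF @ gamma || in least squares
            gamma, *_ = np.linalg.lstsq(dF, F[-1], rcond=None)
            # recover mixing weights alpha with sum(alpha) == 1
            alpha = np.empty(k)
            alpha[0] = gamma[0]
            alpha[1:-1] = gamma[1:] - gamma[:-1]
            alpha[-1] = 1.0 - gamma[-1]
            # next iterate: weighted combination of the g(x_j) = x_j + f_j
            G = np.stack([xi + fi for xi, fi in zip(X, F)], axis=1)
            x = G @ alpha
        X.append(x)
        F.append(g(x) - x)
        X, F = X[-m:], F[-m:]  # keep only the last m entries
    return x

# Demo on a linear contraction g(x) = M x + c (spectral radius < 1)
M = np.array([[0.5, 0.2, 0.0],
              [0.0, 0.4, 0.1],
              [0.1, 0.0, 0.3]])
c = np.ones(3)
g = lambda x: M @ x + c
x_star = np.linalg.solve(np.eye(3) - M, c)  # exact fixed point
x_hat = anderson(g, np.zeros(3))
print(np.linalg.norm(x_hat - x_star))
```

On linear problems like this demo, Anderson mixing with a window larger than the dimension converges far faster than the plain fixed-point iteration, which is the acceleration effect the abstract alludes to.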

An Anderson-Chebyshev Mixing Method for Nonlinear Optimization
Anderson mixing (or Anderson acceleration) is an efficient acceleration ...

Stochastic Gradient Hamiltonian Monte Carlo with Variance Reduction for Bayesian Inference
Gradient-based Monte Carlo sampling algorithms, like Langevin dynamics a...

Gradient Boosting With Piece-Wise Linear Regression Trees
Gradient boosting using decision trees as base learners, so-called Gradi...

A Simple Proximal Stochastic Gradient Method for Nonsmooth Nonconvex Optimization
We analyze stochastic gradient algorithms for optimizing nonconvex, nons...
Zhize Li