
- Random Reshuffling with Variance Reduction: New Analysis and Better Rates
  Virtually all state-of-the-art methods for training supervised machine learning...
- ZeroSARAH: Efficient Nonconvex Finite-Sum Optimization with Zero Full Gradient Computation
  We propose ZeroSARAH – a novel variant of the variance-reduced method SARAH...
- Hyperparameter Transfer Learning with Adaptive Complexity
  Bayesian optimization (BO) is a sample efficient approach to automatically...
- AI-SARAH: Adaptive and Implicit Stochastic Recursive Gradient Methods
  We present an adaptive stochastic variance reduced method with an implicit...
- ADOM: Accelerated Decentralized Optimization Method for Time-Varying Networks
  We propose ADOM - an accelerated method for smooth and strongly convex decentralized...
- IntSGD: Floatless Compression of Stochastic Gradients
  We propose a family of lossy integer compressions for Stochastic Gradient...
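Since the IntSGD entry above centers on integer compression of stochastic gradients, here is a minimal sketch of one generic way to do unbiased stochastic rounding to integers after scaling. The scaling factor `alpha` and the rounding rule are illustrative assumptions, not the paper's actual scheme.

```python
import numpy as np

def int_round_unbiased(v, alpha, rng):
    """Stochastically round alpha * v to integers so that, in expectation,
    decoding recovers v (illustrative sketch only, not IntSGD's exact rule)."""
    x = alpha * v
    low = np.floor(x)
    up = rng.random(x.shape) < (x - low)   # round up w.p. the fractional part
    return (low + up).astype(np.int64)

def decode(ints, alpha):
    """Undo the scaling on the receiving side."""
    return ints.astype(np.float64) / alpha

rng = np.random.default_rng(0)
g = rng.normal(size=5)                      # a stochastic gradient
q = int_round_unbiased(g, alpha=1024.0, rng=rng)
print(q)                                    # small integers instead of floats
print(decode(q, alpha=1024.0))              # close to g, and unbiased in expectation
```

The point of such schemes is that integers are cheaper to communicate and sum than floats, while unbiasedness keeps the usual SGD analysis applicable.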
- MARINA: Faster Non-Convex Distributed Learning with Compression
  We develop and analyze MARINA: a new communication efficient method for ...
- Smoothness Matrices Beat Smoothness Constants: Better Communication Compression Techniques for Distributed Optimization
  Large scale distributed optimization has become the default tool for the...
- Distributed Second Order Methods with Fast Rates and Compressed Communication
  We develop several new communication-efficient second-order methods for ...
- Proximal and Federated Random Reshuffling
  Random Reshuffling (RR), also known as Stochastic Gradient Descent (SGD)...
- A Linearly Convergent Algorithm for Decentralized Optimization: Sending Less Bits for Free!
  Decentralized optimization methods enable on-device training of machine learning...
- Local SGD: Unified Theory and New Efficient Methods
  We present a unified framework for analyzing local SGD methods in the co...
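To make the local SGD entry above concrete, here is a minimal sketch of the generic local SGD template: each worker takes several local gradient steps and the models are averaged once per communication round. The toy least-squares data, the synchronization period H, and the use of full local gradients in place of stochastic ones are simplifying assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n_workers, d, H, lr, rounds = 4, 10, 5, 0.1, 20

# Each worker m holds its own least-squares objective f_m(x) = 0.5*||A_m x - b_m||^2 / n_m.
A = rng.normal(size=(n_workers, 30, d))
b = rng.normal(size=(n_workers, 30))

def local_grad(m, x):
    return A[m].T @ (A[m] @ x - b[m]) / A[m].shape[0]

x = np.zeros(d)                          # shared model
for _ in range(rounds):                  # communication rounds
    local = np.tile(x, (n_workers, 1))   # broadcast the current model to all workers
    for m in range(n_workers):
        for _ in range(H):               # H local steps between communications
            local[m] -= lr * local_grad(m, local[m])
    x = local.mean(axis=0)               # averaging = one round of communication
print(np.linalg.norm(sum(local_grad(m, x) for m in range(n_workers)) / n_workers))
```

The appeal is communication efficiency: only one vector exchange per H local gradient steps.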
- Optimal Client Sampling for Federated Learning
  It is well understood that client-master communication can be a primary ...
- Linearly Converging Error Compensated SGD
  In this paper, we propose a unified analysis of variants of distributed ...
- Optimal Gradient Compression for Distributed and Federated Learning
  Communicating information, like gradient vectors, between computing nodes...
- Lower Bounds and Optimal Algorithms for Personalized Federated Learning
  In this work, we consider the optimization formulation of personalized federated...
- Distributed Proximal Splitting Algorithms with Rates and Acceleration
  We analyze several generic proximal splitting algorithms well suited for...
- Variance-Reduced Methods for Machine Learning
  Stochastic optimization lies at the heart of machine learning, and its c...
- PAGE: A Simple and Optimal Probabilistic Gradient Estimator for Nonconvex Optimization
  In this paper, we propose a novel stochastic gradient estimator—ProbAbilistic...
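A hedged sketch of the probabilistic gradient estimator idea behind the PAGE entry above: with a small probability take a fresh full-batch gradient, otherwise reuse the previous estimator corrected by a minibatch gradient difference. The toy problem, step size, minibatch size, and switching probability below are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 200, 10
A = rng.normal(size=(n, d))
b = rng.normal(size=n)

def grad(i, x):                      # gradient of f_i(x) = 0.5*(a_i^T x - b_i)^2
    return A[i] * (A[i] @ x - b[i])

def full_grad(x):
    return A.T @ (A @ x - b) / n

x, lr, p, batch = np.zeros(d), 0.05, 0.1, 8
g = full_grad(x)                     # initialize the estimator with a full gradient
for _ in range(500):
    x_new = x - lr * g
    if rng.random() < p:             # with probability p: fresh full-batch gradient
        g = full_grad(x_new)
    else:                            # otherwise: cheap minibatch correction of the old estimator
        idx = rng.choice(n, size=batch, replace=False)
        g = g + sum(grad(i, x_new) - grad(i, x) for i in idx) / batch
    x = x_new

print("grad norm at start:", np.linalg.norm(full_grad(np.zeros(d))))
print("grad norm at end:  ", np.linalg.norm(full_grad(x)))   # should have decreased substantially
```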
- Optimal and Practical Algorithms for Smooth and Strongly Convex Decentralized Optimization
  We consider the task of decentralized minimization of the sum of smooth ...
- Unified Analysis of Stochastic Gradient Methods for Composite Convex and Smooth Optimization
  We present a unified theorem for the convergence analysis of stochastic ...
- A Better Alternative to Error Feedback for Communication-Efficient Distributed Learning
  Modern large-scale machine learning applications require stochastic optimization...
- Primal Dual Interpretation of the Proximal Stochastic Gradient Langevin Algorithm
  We consider the task of sampling with respect to a log concave probability...
- A Unified Analysis of Stochastic Gradient Methods for Nonconvex Federated Optimization
  In this paper, we study the performance of a large family of SGD variants...
- Random Reshuffling: Simple Analysis with Vast Improvements
  Random Reshuffling (RR) is an algorithm for minimizing finite-sum functions...
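The Random Reshuffling entry above refers to the classic without-replacement variant of SGD: sample a fresh permutation of the data once per epoch and take one gradient step per data point in that order. A minimal sketch on a toy least-squares problem (the data and step size are assumptions for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 100, 5
A = rng.normal(size=(n, d))
b = rng.normal(size=n)

def grad(i, x):                      # gradient of f_i(x) = 0.5*(a_i^T x - b_i)^2
    return A[i] * (A[i] @ x - b[i])

x, lr = np.zeros(d), 0.01
for epoch in range(50):
    perm = rng.permutation(n)        # reshuffle once per epoch
    for i in perm:                   # one full pass through the data, without replacement
        x -= lr * grad(i, x)

# With a constant step size, RR settles in a neighborhood of the minimizer
# whose size shrinks with the step size.
print(np.linalg.norm(A.T @ (A @ x - b) / n))
```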
- Adaptive Learning of the Optimal Mini-Batch Size of SGD
  Recent advances in the theoretical understanding of SGD (Qian et al., 201...
- On the Convergence Analysis of Asynchronous SGD for Solving Consistent Linear Systems
  In the realm of big data and machine learning, data-parallel, distributed...
- Dualize, Split, Randomize: Fast Nonsmooth Optimization Algorithms
  We introduce a new primal-dual algorithm for minimizing the sum of three...
- From Local SGD to Local Fixed Point Methods for Federated Learning
  Most algorithms for solving optimization problems or finding saddle points...
- On Biased Compression for Distributed Learning
  In the last few years, various communication compression techniques have...
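The biased-compression entry above concerns compressors whose output is not an unbiased estimate of their input; greedy Top-k sparsification is a standard example of such an operator. A minimal sketch, not tied to the specific compressors studied in the paper:

```python
import numpy as np

def top_k(v, k):
    """Keep the k largest-magnitude coordinates of v and zero out the rest.
    This compressor is biased: E[top_k(v)] != v in general."""
    out = np.zeros_like(v)
    idx = np.argsort(np.abs(v))[-k:]
    out[idx] = v[idx]
    return out

v = np.array([0.1, -3.0, 0.5, 2.0, -0.2])
print(top_k(v, k=2))   # only the two largest-magnitude entries survive
```

Biased operators like this are attractive because they can be far more aggressive than unbiased ones, but they require a different style of analysis (and, in practice, are often paired with error compensation).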
- Acceleration for Compressed Gradient Descent in Distributed and Federated Optimization
  Due to the high communication cost in distributed and federated learning...
- Fast Linear Convergence of Randomized BFGS
  Since the late 1950s when quasi-Newton methods first appeared, they have...
- Stochastic Subspace Cubic Newton Method
  In this paper, we propose a new randomized second-order optimization algorithm...
- Uncertainty Principle for Communication Compression in Distributed and Federated Learning and the Search for an Optimal Compressor
  In order to mitigate the high communication cost in distributed and federated...
- Adaptivity of Stochastic Gradient Methods for Nonconvex Optimization
  Adaptivity is an important yet under-studied property in modern optimization...
- Variance Reduced Coordinate Descent with Acceleration: New Method With a Surprising Application to Finite-Sum Problems
  We propose an accelerated version of stochastic variance reduced coordinate...
- Federated Learning of a Mixture of Global and Local Models
  We propose a new optimization formulation for training federated learning...
- Better Theory for SGD in the Nonconvex World
  Large-scale nonconvex optimization problems are ubiquitous in modern machine...
- Distributed Fixed Point Methods with Compressed Iterates
  We propose basic and natural assumptions under which iterative optimization...
- Stochastic Newton and Cubic Newton Methods with Simple Local Linear-Quadratic Rates
  We present two new remarkably simple stochastic second-order methods for...
- Better Communication Complexity for Local SGD
  We revisit the local Stochastic Gradient Descent (local SGD) method and ...
- Gradient Descent with Compressed Iterates
  We propose and analyze a new type of stochastic first order method: gradient...
- First Analysis of Local GD on Heterogeneous Data
  We provide the first convergence analysis of local gradient descent for ...
- Stochastic Convolutional Sparse Coding
  State-of-the-art methods for Convolutional Sparse Coding usually employ ...
- Stochastic Proximal Langevin Algorithm: Potential Splitting and Nonasymptotic Rates
  We propose a new algorithm---Stochastic Proximal Langevin Algorithm (SPLA)...
- Direct Nonlinear Acceleration
  Optimization acceleration techniques such as momentum play a key role in...
- Revisiting Stochastic Extragradient
  We consider a new extension of the extragradient method that is motivated...
- One Method to Rule Them All: Variance Reduction for Data, Parameters and Many New Methods
  We propose a remarkably general variance-reduced method suitable for solving...
- A Unified Theory of SGD: Variance Reduction, Sampling, Quantization and Coordinate Descent
  In this paper we introduce a unified analysis of a large family of variants...
- Natural Compression for Distributed Deep Learning
  Due to their hunger for big data, modern deep learning models are trained...
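The Natural Compression entry above is about cheap lossy compression of gradients for distributed training. As a hedged illustration of one such scheme, the sketch below performs unbiased randomized rounding of each coordinate to a signed power of two; the handling of signs, zeros, and mantissa bits here is a simplification for illustration, not necessarily the paper's exact definition.

```python
import numpy as np

def round_to_power_of_two(v, rng):
    """Unbiased randomized rounding of each entry to a signed power of two
    (illustrative sketch; zero entries are passed through unchanged)."""
    out = np.zeros_like(v)
    nz = v != 0
    mag = np.abs(v[nz])
    low = 2.0 ** np.floor(np.log2(mag))          # nearest power of two from below
    up = rng.random(mag.shape) < (mag - low) / low   # round up to 2*low w.p. (mag-low)/low
    out[nz] = np.sign(v[nz]) * np.where(up, 2.0 * low, low)
    return out

rng = np.random.default_rng(0)
g = np.array([0.3, -1.7, 5.0, 0.0])
avg = np.mean([round_to_power_of_two(g, rng) for _ in range(20000)], axis=0)
print(avg)   # close to [0.3, -1.7, 5.0, 0.0] on average, illustrating unbiasedness
```

Rounding to powers of two means each coordinate can be transmitted with only a sign and an exponent, which is the source of the bandwidth savings such schemes target.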