
Quasi-Global Momentum: Accelerating Decentralized Deep Learning on Heterogeneous Data
Decentralized training of deep learning models is a key element for enab...

Learning from History for Byzantine Robust Optimization
Byzantine robustness has received significant attention recently given i...

Mime: Mimicking Centralized Stochastic Algorithms in Federated Learning
Federated learning is a challenging optimization problem due to the hete...

PowerGossip: Practical Low-Rank Communication Compression in Decentralized Deep Learning
Lossy gradient compression has become a practical tool to overcome the c...

Byzantine-Robust Learning on Heterogeneous Datasets via Resampling
In Byzantine robust distributed optimization, a central server wants to ...

Secure Byzantine-Robust Machine Learning
Increasingly machine learning systems are being deployed to edge servers...

Why ADAM Beats SGD for Attention Models
While stochastic gradient descent (SGD) is still the de facto algorithm ...

SCAFFOLD: Stochastic Controlled Averaging for On-Device Federated Learning
Federated learning is a key scenario in modern large-scale machine learn...
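SCAFFOLD's key device is a pair of control variates (a server-side c and a per-client c_i) that corrects each local gradient step for client drift. A toy scalar simulation, assuming quadratic client objectives f_i(x) = (x - b_i)^2 / 2 and full participation; the names and the option-II control-variate update are a paraphrase of the paper, not its code:

```python
import numpy as np

def scaffold_round(x, c, c_locals, bs, lr=0.1, K=10):
    """One SCAFFOLD round over scalar clients with grad_i(y) = y - b_i."""
    dx, dc = [], []
    for i, b in enumerate(bs):
        y = x
        for _ in range(K):
            y -= lr * ((y - b) - c_locals[i] + c)     # drift-corrected local step
        c_new = c_locals[i] - c + (x - y) / (K * lr)  # option-II control variate
        dx.append(y - x)
        dc.append(c_new - c_locals[i])
        c_locals[i] = c_new
    return x + np.mean(dx), c + np.mean(dc)           # server aggregation

# Two heterogeneous clients; the global optimum is mean(bs) = 5.0.
bs = [0.0, 10.0]
x, c, c_locals = 0.0, 0.0, [0.0, 0.0]
for _ in range(100):
    x, c = scaffold_round(x, c, c_locals, bs)
assert abs(x - 5.0) < 1e-3
```

Without the `- c_locals[i] + c` correction this reduces to FedAvg, whose local steps drift toward each client's own optimum b_i.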

The Error-Feedback Framework: Better Rates for SGD with Delayed Gradients and Compressed Communication
We analyze (stochastic) gradient descent (SGD) with delayed updates on s...

Amplifying Rényi Differential Privacy via Shuffling
Differential privacy is a useful tool to build machine learning models w...

PowerSGD: Practical Low-Rank Gradient Compression for Distributed Optimization
We study gradient compression methods to alleviate the communication bot...
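The core idea in PowerSGD is to approximate the gradient (reshaped into a matrix) by a low-rank factorization computed with a single power-iteration step, warm-started from the previous step's factor, so workers exchange two thin factors instead of the full gradient. A minimal rank-1 NumPy sketch; the function and variable names are illustrative, not from the paper's code:

```python
import numpy as np

def rank1_compress(M, q):
    """One power-iteration step: approximate M by the rank-1 matrix outer(p, q_new).

    M is the gradient reshaped into a matrix; q is the right factor
    carried over from the previous step (warm start).
    """
    p = M @ q                   # left factor
    p /= np.linalg.norm(p)      # orthonormalize (for rank 1: just normalize)
    q_new = M.T @ p             # updated right factor
    return p, q_new             # workers all-reduce p and q_new, never M

# For an exactly rank-1 gradient, one step already reconstructs it:
M = np.outer([1.0, 2.0, 3.0], [4.0, 5.0, 6.0])
p, q_new = rank1_compress(M, np.ones(3))
assert np.allclose(np.outer(p, q_new), M)
```

The full method combines this with error feedback (the residual M - outer(p, q_new) is added into the next step's gradient), which compensates for the bias of the low-rank projection.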

Accelerating Gradient Boosting Machine
Gradient Boosting Machine (GBM) is an extremely powerful supervised lear...

Error Feedback Fixes SignSGD and other Gradient Compression Schemes
Sign-based algorithms (e.g. signSGD) have been proposed as a biased grad...
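The fix, in a nutshell: keep the residual that the biased compressor discards and add it back into the next step, so compression errors cancel over time instead of accumulating. A hedged NumPy sketch of one worker's update, using a scaled-sign compressor; the names are illustrative:

```python
import numpy as np

def ef_sign_step(grad, error, lr):
    """One error-feedback signSGD step (sketch): compress the
    error-corrected gradient and remember what was lost."""
    p = lr * grad + error                   # add back the past residual
    update = np.abs(p).mean() * np.sign(p)  # scaled sign compression
    return update, p - update               # (applied step, new residual)

# Minimizing f(x) = ||x||^2 with compressed steps still makes progress:
x = np.array([1.0, -2.0, 0.5])
err = np.zeros_like(x)
for _ in range(300):
    step, err = ef_sign_step(2 * x, err, lr=0.05)
    x -= step
assert np.linalg.norm(x) < 0.1
```

Running the same loop with plain signSGD (dropping the residual, `error` fixed at zero) can stall or cycle on such problems, which is the failure mode error feedback repairs.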

Efficient Greedy Coordinate Descent for Composite Problems
Coordinate descent with random coordinate selection is the current state...

Global linear convergence of Newton's method without strong-convexity or Lipschitz gradients
We show that Newton's method converges globally at a linear rate for obj...
Sai Praneeth Karimireddy