
Decentralized Composite Optimization with Compression
Decentralized optimization and communication compression have exhibited ...

Accelerating Gossip SGD with Periodic Global Averaging
Communication overhead hinders the scalability of large-scale distribute...

Removing Data Heterogeneity Influence Enhances Network Topology Dependence of Decentralized SGD
We consider decentralized stochastic optimization problems where a netwo...

On the Comparison between Cyclic Sampling and Random Reshuffling
When applying a stochastic/incremental algorithm, one must choose the or...

DecentLaM: Decentralized Momentum SGD for Large-batch Deep Training
The scale of deep learning nowadays calls for efficient distributed trai...

Differentiable Network Adaption with Elastic Search Space
In this paper we propose a novel network adaption method called Differen...

Incorporating Convolution Designs into Visual Transformers
Motivated by the success of Transformers in natural language processing ...

Learning N:M Fine-grained Structured Sparse Neural Networks From Scratch
Sparsity in Deep Neural Networks (DNNs) has been widely studied to compr...

Dynamic Graph: Learning Instance-aware Connectivity for Neural Networks
One practice of employing deep neural networks is to apply the same arch...

Learning Connectivity of Neural Networks from a Topological Perspective
Seeking effective neural networks is a critical and practical field in d...

ODE Analysis of Stochastic Gradient Methods with Optimism and Anchoring for Minimax Problems and GANs
Despite remarkable empirical success, the training dynamics of generativ...

A Linearly Convergent Proximal Gradient Algorithm for Decentralized Optimization
Decentralized optimization is a promising paradigm that finds various ap...

On the Performance of Exact Diffusion over Adaptive Networks
Various bias-correction methods such as EXTRA, DIGing, and exact diffusi...

Dynamic Average Diffusion with Randomized Coordinate Updates
This work derives and analyzes an online learning strategy for tracking ...

Multi-Agent Fully Decentralized Value Function Learning with Linear Convergence Rates
This work develops a fully decentralized multi-agent algorithm for polic...

Multi-Agent Fully Decentralized Off-Policy Learning with Linear Convergence Rates
In this paper we develop a fully decentralized algorithm for policy eval...

Learning Under Distributed Features
This work studies the problem of learning under both large data and larg...

Walkman: A Communication-Efficient Random-Walk Algorithm for Decentralized Optimization
This paper addresses consensus optimization problems in a multi-agent ne...

A Communication-Efficient Random-Walk Algorithm for Decentralized Optimization
This paper addresses consensus optimization problem in a multi-agent net...

Stochastic Learning under Random Reshuffling
In empirical risk optimization, it has been observed that stochastic gra...

Efficient Variance-Reduced Learning for Fully Decentralized On-Device Intelligence
This work develops a fully decentralized variance-reduced learning algor...

Convergence of Variance-Reduced Stochastic Learning under Random Reshuffling
Several useful variance-reduced stochastic gradient algorithms, such as ...

On the Influence of Momentum Acceleration on Online Learning
The article examines in some detail the convergence rate and mean-square...

Online Dual Coordinate Ascent Learning
The stochastic dual coordinate-ascent (SDCA) technique is a useful alte...
Kun Yuan