Two-Stage Coded Federated Edge Learning: A Dynamic Partial Gradient Coding Perspective

by Xinghan Wang, et al.

Federated edge learning (FEL) trains a global model from terminal nodes' local datasets, making full use of the terminal nodes' computing resources and enabling more extensive and efficient machine learning on those nodes while meeting requirements for protecting user information. FEL performance suffers from long delays or faulty decisions when the master must collect partial gradients from stragglers, which cannot return correct results within a deadline. Motivated by this, we propose a novel coded FEL scheme that mitigates stragglers in synchronous gradient computation using a two-stage dynamic scheme: we start a subset of the workers and run them for a set duration, and once the first stage completes, we start the remaining workers in the second stage. In particular, the computation latency and transmission latency are essential and must be analyzed quantitatively. We then propose a dynamic coding-coefficient scheme based on historical information, including worker completion times. To optimize FEL performance, a Lyapunov function is designed to maximize admission data while balancing fairness, and the two-stage dynamic coding scheme is designed to maximize the data arriving among workers. Experiments verify the derived properties and demonstrate that the proposed solution achieves better accuracy and resource utilization in the FEL system under practical network parameters and on benchmark datasets.
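
As a rough illustration of the two-stage start described in the abstract, the following toy timing model computes when the master has received enough results. It is only a sketch under stated assumptions, not the paper's scheme: the function name, the parameters `n_first`, `t_first`, and `needed`, and the idea that the master waits for a fixed number of results are all hypothetical, and coding coefficients and transmission latency are not modeled.

```python
def two_stage_finish_time(compute_times, n_first, t_first, needed):
    """Return the time at which `needed` worker results have arrived.

    Hypothetical toy model: workers 0..n_first-1 start at time 0, and the
    remaining workers start at time t_first (the second stage). Each worker
    finishes compute_times[i] after it starts.
    """
    if not 0 < needed <= len(compute_times):
        raise ValueError("needed must be between 1 and the number of workers")

    arrivals = []
    for i, compute_time in enumerate(compute_times):
        start = 0.0 if i < n_first else t_first  # stage 1 vs. stage 2 start
        arrivals.append(start + compute_time)

    arrivals.sort()
    # The master can proceed once the `needed`-th earliest result is in.
    return arrivals[needed - 1]


if __name__ == "__main__":
    # Four workers, two started immediately, two started at t = 2.0.
    print(two_stage_finish_time([1.0, 5.0, 2.0, 1.5], n_first=2, t_first=2.0, needed=3))
```

Varying `n_first` and `t_first` in such a model shows the trade-off the abstract points to: starting fewer workers early saves resources but can delay the time at which enough results arrive.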




Gradient Coding with Clustering and Multi-message Communication

Gradient descent (GD) methods are commonly employed in machine learning ...

Gradient Coding with Dynamic Clustering for Straggler Mitigation

In distributed synchronous gradient descent (GD) the main performance bo...

Gradient Coding with Dynamic Clustering for Straggler-Tolerant Distributed Learning

Distributed implementations are crucial in speeding up large scale machi...

Nested Gradient Codes for Straggler Mitigation in Distributed Machine Learning

We consider distributed learning in the presence of slow and unresponsiv...

Harmonic Coding: An Optimal Linear Code for Privacy-Preserving Gradient-Type Computation

We consider the problem of distributedly computing a general class of fu...

DSAG: A mixed synchronous-asynchronous iterative method for straggler-resilient learning

We consider straggler-resilient learning. In many previous works, e.g., ...

Optimization-based Block Coordinate Gradient Coding for Mitigating Partial Stragglers in Distributed Learning

Gradient coding schemes effectively mitigate full stragglers in distribu...
