Coded Elastic Computing

12/16/2018
by Yaoqing Yang, et al.

Cloud providers have recently introduced low-priority machines to reduce the cost of computations. Exploiting this opportunity for machine learning tasks is challenging because low-priority machines can elastically leave (through preemption) and join the computation at any time. In this paper, we design a new technique called coded elastic computing that enables distributed machine learning computations over elastic resources. The proposed technique allows machines to transparently leave the computation without sacrificing algorithm-level performance and, at the same time, flexibly reduces the workload at existing machines when new machines join the computation. Thanks to the redundancy provided by encoding, our approach achieves a computational cost similar to that of the original (uncoded) method when all machines are present; the cost increases gracefully when machines are preempted and decreases when machines join. We test the performance of the proposed technique on two mini-benchmark experiments, namely elastic matrix multiplication and linear regression. Our preliminary experimental results show improvements over several existing techniques.
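To make the idea in the abstract concrete, the following is a minimal sketch of an elastic, coded matrix-vector multiplication. It is an illustration under stated assumptions, not the paper's implementation: the Vandermonde generator matrix, the cyclic choice of machines, and the names `encode` and `elastic_matvec` are all hypothetical choices made for this example. The data matrix is split into k row-blocks, mixed into n coded blocks (one per machine), and each group of rows is computed by only k of the currently available machines, so per-machine work shrinks as more machines are present.

```python
import numpy as np

def encode(A, n, k):
    """Split A into k row-blocks and mix them into n coded blocks, one per machine.

    Assumes the number of rows of A is divisible by k.
    """
    blocks = np.stack(np.split(A, k, axis=0))      # shape (k, rows/k, cols)
    nodes = np.linspace(-1.0, 1.0, n)              # distinct evaluation points
    G = np.vander(nodes, k, increasing=True)       # (n, k); any k rows are invertible
    coded = np.einsum("ik,krc->irc", G, blocks)    # coded[i] = sum_j G[i, j] * blocks[j]
    return G, coded

def elastic_matvec(G, coded, x, available):
    """Compute A @ x using only the machines listed in `available`."""
    n, k = G.shape
    m = len(available)
    if m < k:
        raise ValueError("need at least k available machines to decode")
    rows = coded.shape[1]
    groups = np.array_split(np.arange(rows), m)    # m row-groups per coded block
    out = np.empty((k, rows))
    work = {i: 0 for i in available}               # rows processed by each machine
    for g, idx in enumerate(groups):
        # Cyclic selection (an assumption of this sketch): k machines per row-group.
        chosen = [available[(g + t) % m] for t in range(k)]
        partial = np.stack([coded[i][idx] @ x for i in chosen])
        # Invert the k-by-k mixing to recover the uncoded results for these rows.
        out[:, idx] = np.linalg.solve(G[chosen], partial)
        for i in chosen:
            work[i] += len(idx)
    return out.reshape(-1), work

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    A = rng.standard_normal((36, 5))
    x = rng.standard_normal(5)
    G, coded = encode(A, n=6, k=3)
    for available in ([0, 1, 2, 3, 4, 5], [0, 2, 3, 5], [1, 3, 4]):
        y, work = elastic_matvec(G, coded, x, available)
        print(len(available), "machines, rows per machine:", work,
              "correct:", np.allclose(y, A @ x))
```

In this toy setting, each machine stores a 12-row coded block. With all six machines present, each one processes 6 rows, which is the same as an uncoded split of the 36 rows across six machines; with four machines each processes 9 rows, and with three each processes all 12. The result stays exact in every case, mirroring the graceful cost increase the abstract describes.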

research
01/12/2020

Heterogeneous Computation Assignments in Coded Elastic Computing

We study the optimal design of a heterogeneous coded elastic computing (...
research
06/19/2022

Hierarchical coded elastic computing

Elasticity is offered by cloud service providers to exploit under-utiliz...
research
10/02/2019

Optimizing the Transition Waste in Coded Elastic Computing

Distributed computing, in which a resource-intensive task is divided int...
research
07/20/2021

A New Design Framework for Heterogeneous Uncoded Storage Elastic Computing

Elasticity is one important feature in modern cloud computing systems an...
research
07/18/2021

A Practical Algorithm Design and Evaluation for Heterogeneous Elastic Computing with Stragglers

Our extensive real measurements over Amazon EC2 show that the virtual in...
research
08/12/2020

Coded Elastic Computing on Machines with Heterogeneous Storage and Computation Speed

We study the optimal design of heterogeneous Coded Elastic Computing (CE...
research
06/03/2020

ToGCom: An Asymmetric Sybil Defense

Proof-of-work (PoW) is one of the most common techniques to defend again...
