
Autonomous Task Dropping Mechanism to Achieve Robustness in Heterogeneous Computing Systems

by Ali Mokhtari, et al.
University of Louisiana at Lafayette

Robustness of a distributed computing system is defined as the ability to maintain its performance in the presence of uncertain parameters. Uncertainty is a key problem in heterogeneous (and even homogeneous) distributed computing systems that perturbs system robustness. Notably, the performance of these systems is perturbed by uncertainty in both task execution time and arrival. Accordingly, our goal is to make the system robust against these uncertainties. Considering task execution time as a random variable, we use probabilistic analysis to develop an autonomous proactive task dropping mechanism to attain our robustness goal. Specifically, we provide a mathematical model that identifies the optimality of a task dropping decision, so that the system robustness is maximized. Then, we leverage the mathematical model to develop a task dropping heuristic that achieves the system robustness within a feasible time complexity. Although the proposed model is generic and can be applied to any distributed system, we concentrate on heterogeneous computing (HC) systems that have a higher degree of exposure to uncertainty than homogeneous systems. Experimental results demonstrate that the autonomous proactive dropping mechanism can improve the system robustness by up to 20
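The idea of treating execution time as a random variable and dropping tasks proactively can be illustrated with a minimal sketch. This is not the paper's mathematical model or heuristic, which the abstract does not specify. It assumes discrete execution-time distributions, a single machine queue, and a simple hypothetical rule: drop a task when its estimated probability of finishing by its deadline falls below a chosen threshold. The names `Task`, `convolve`, `success_probability`, `prune_queue`, and `threshold` are all illustrative.

```python
from dataclasses import dataclass


@dataclass
class Task:
    name: str
    deadline: float
    exec_pmf: dict  # hypothetical: maps execution time -> probability


def convolve(a, b):
    """Distribution of the sum of two independent discrete random variables."""
    out = {}
    for ta, pa in a.items():
        for tb, pb in b.items():
            out[ta + tb] = out.get(ta + tb, 0.0) + pa * pb
    return out


def success_probability(completion_pmf, deadline):
    """Probability that the completion time does not exceed the deadline."""
    return sum(p for t, p in completion_pmf.items() if t <= deadline)


def prune_queue(queue, now=0.0, threshold=0.5):
    """Walk the queue in order and drop tasks unlikely to meet their deadline.

    Assumption: a simple threshold rule, not an optimal dropping decision.
    Kept tasks are assumed to run to completion even if they end up late.
    """
    kept, dropped = [], []
    start_pmf = {now: 1.0}  # the machine is free at time `now`
    for task in queue:
        completion = convolve(start_pmf, task.exec_pmf)
        if success_probability(completion, task.deadline) < threshold:
            dropped.append(task.name)  # dropping frees time for later tasks
        else:
            kept.append(task.name)
            start_pmf = completion  # later tasks start when this one ends
    return kept, dropped
```

In this sketch, dropping a task leaves the start-time distribution unchanged for the tasks behind it, which is the mechanism by which pruning an unlikely task can raise the on-time probability of the rest of the queue.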


Robust Dynamic Resource Allocation via Probabilistic Task Pruning in Heterogeneous Computing Systems

In heterogeneous distributed computing (HC) systems, diversity can exist...

Improving Robustness of Heterogeneous Serverless Computing Systems Via Probabilistic Task Pruning

Cloud-based serverless computing is an increasingly popular computing pa...

Coded Distributed Computing: Performance Limits and Code Designs

We consider the problem of coded distributed computing where a large lin...

Coded Computing via Binary Linear Codes: Designs and Performance Limits

We consider the problem of coded distributed computing where a large lin...

Uncertainty-Aware Task Allocation for Distributed Autonomous Robots

This paper addresses task-allocation problems with uncertainty in situat...