Towards General Distributed Resource Selection

01/08/2018
by Ming Tai Ha, et al.

The advantages of distributing workloads and utilizing multiple distributed resources are now well established. The type and degree of heterogeneity of distributed resources are increasing, and thus determining how to distribute workloads becomes increasingly difficult, in particular with respect to the selection of suitable resources. We formulate and investigate the resource selection problem in a way that is agnostic of specific task and resource properties and that generalizes to a range of metrics. Specifically, we developed a model to describe the requirements of tasks and to estimate the cost of running a task on an arbitrary resource using baseline measurements from a reference machine. We integrated our cost model with the Condor matchmaking algorithm to enable resource selection. Experimental validation of our model shows that it provides execution time estimates with 157-171% error on XSEDE resources and 18-31% on OSG resources. We use the task execution cost model to select resources for a bag-of-tasks of up to 1024 GROMACS MD simulations across the target resources. Experiments show that using the model's estimates reduces the workload's time-to-completion by up to 85% compared to a random distribution of the workload across the same resources.
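As a rough illustration of the idea of estimating a task's cost on an arbitrary resource from a reference-machine baseline and then using those estimates to place a bag of tasks, here is a minimal Python sketch. It is not the authors' cost model and not the Condor matchmaking algorithm: the `Resource` class, its `speedup` field, the linear scaling in `estimate_seconds`, and the greedy placement rule are all assumptions made for illustration only.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Resource:
    name: str
    speedup: float  # hypothetical: speed relative to the reference machine


def estimate_seconds(ref_seconds: float, res: Resource) -> float:
    """Assumed cost model: scale the reference-machine runtime linearly."""
    return ref_seconds / res.speedup


def makespan_greedy(task_ref_seconds, resources) -> float:
    """Place each task on the resource with the earliest estimated finish."""
    busy_until = {r.name: 0.0 for r in resources}
    for ref in sorted(task_ref_seconds, reverse=True):
        best = min(
            resources,
            key=lambda r: busy_until[r.name] + estimate_seconds(ref, r),
        )
        busy_until[best.name] += estimate_seconds(ref, best)
    return max(busy_until.values())


def makespan_random(task_ref_seconds, resources, seed: int = 0) -> float:
    """Baseline: place each task on a uniformly random resource."""
    rng = random.Random(seed)
    busy_until = {r.name: 0.0 for r in resources}
    for ref in task_ref_seconds:
        res = rng.choice(resources)
        busy_until[res.name] += estimate_seconds(ref, res)
    return max(busy_until.values())


# Illustrative inputs only; these numbers are not taken from the paper.
resources = [Resource("slow", 1.0), Resource("mid", 2.0), Resource("fast", 4.0)]
tasks = [100.0] * 8  # reference-machine runtimes in seconds

greedy = makespan_greedy(tasks, resources)
baseline = makespan_random(tasks, resources)
print(f"greedy: {greedy:.1f} s, random: {baseline:.1f} s")
```

With these made-up inputs, the estimate-driven placement finishes no later than the random placement, which mirrors the kind of comparison the abstract reports without reproducing its numbers.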


