MXDAG: A Hybrid Abstraction for Cluster Applications

07/15/2021
by Weitao Wang, et al.

Distributed applications, such as database queries and distributed training, consist of both compute and network tasks. DAG-based abstractions primarily target compute tasks and have no explicit network-level scheduling. In contrast, the Coflow abstraction collectively schedules network flows among compute tasks but lacks an end-to-end view of the application DAG. Because these two types of tasks depend on and interact with each other, considering only one of them is sub-optimal. We argue that co-scheduling compute and network tasks can help applications approach globally optimal end-to-end performance. However, none of the existing abstractions provides the fine-grained information that co-scheduling requires. We propose MXDAG, an abstraction that treats both compute and network tasks explicitly. It captures the dependencies and interactions of both compute and network tasks, leading to improved application performance.
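To make the idea of treating both task types explicitly more concrete, the following is a minimal illustrative sketch and not the MXDAG interface from the paper. It assumes a hypothetical `Task` record with a `kind` of "compute" or "network", made-up durations, and unlimited resources, and it computes the end-to-end finish time of a small DAG in which network transfers are first-class nodes.

```python
from dataclasses import dataclass, field
from graphlib import TopologicalSorter  # standard library, Python 3.9+


@dataclass
class Task:
    """Hypothetical task record; names and fields are illustrative only."""
    name: str
    kind: str                  # "compute" or "network"
    duration: float            # assumed time units, not measured data
    deps: list = field(default_factory=list)


def finish_times(tasks):
    """Return the earliest finish time of every task, assuming no resource contention."""
    by_name = {t.name: t for t in tasks}
    # TopologicalSorter expects a mapping of node -> predecessors.
    order = TopologicalSorter({t.name: t.deps for t in tasks}).static_order()
    finish = {}
    for name in order:
        task = by_name[name]
        # A task starts once all of its dependencies have finished.
        start = max((finish[d] for d in task.deps), default=0.0)
        finish[name] = start + task.duration
    return finish


# Two compute tasks feed a third through two network transfers.
tasks = [
    Task("map_a", "compute", 3.0),
    Task("map_b", "compute", 1.0),
    Task("shuffle_a", "network", 2.0, ["map_a"]),
    Task("shuffle_b", "network", 4.0, ["map_b"]),
    Task("reduce", "compute", 2.0, ["shuffle_a", "shuffle_b"]),
]

finish = finish_times(tasks)
total = max(finish.values())
print(total)
```

In this toy graph both transfers lie on equally long paths to `reduce`, which only becomes visible when compute and network tasks appear in the same graph. A scheduler that sees only the flows, or only the compute tasks, would not have this end-to-end view.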


