Runtime vs Scheduler: Analyzing Dask's Overheads

10/21/2020
by Stanislav Böhm, et al.

Dask is a distributed task framework commonly used by data scientists to parallelize Python code on computing clusters with little programming effort. It uses a sophisticated work-stealing scheduler that has been hand-tuned to execute task graphs as efficiently as possible. But is scheduler optimization a worthwhile effort for Dask? Our paper shows on many real-world task graphs that even a completely random scheduler is surprisingly competitive with Dask's built-in scheduler, and that Dask's main bottleneck lies in its runtime overhead. We develop a drop-in replacement for the Dask central server, written in Rust, that is backwards compatible with existing Dask programs. Thanks to its efficient runtime, our server implementation scales to larger clusters than Dask and consistently outperforms it on a variety of task graphs, even though it uses a simpler scheduling algorithm.
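
To make the idea of a "completely random scheduler" concrete, here is a minimal Python sketch. It is an illustration only, not the paper's implementation and not Dask's API: the function name `random_schedule`, the dictionary-based graph format, and the example task names are all assumptions made for this sketch. It releases tasks in dependency order and assigns each one to a uniformly random worker, ignoring data locality and load.

```python
import random
from collections import deque


def random_schedule(graph, n_workers, seed=0):
    """Assign each task to a uniformly random worker, in dependency order.

    `graph` maps a task name to the list of tasks it depends on
    (a hypothetical format chosen for this sketch). Returns a list of
    (task, worker) pairs in an order that respects the dependencies.
    """
    rng = random.Random(seed)

    # Count unmet dependencies and build the reverse (consumer) edges.
    pending = {task: len(deps) for task, deps in graph.items()}
    consumers = {task: [] for task in graph}
    for task, deps in graph.items():
        for dep in deps:
            consumers[dep].append(task)

    # Tasks with no dependencies can run immediately.
    ready = deque(task for task, count in pending.items() if count == 0)
    assignment = []
    while ready:
        task = ready.popleft()
        # The entire scheduling decision: pick any worker at random.
        assignment.append((task, rng.randrange(n_workers)))
        # Release consumers whose dependencies are now all scheduled.
        for consumer in consumers[task]:
            pending[consumer] -= 1
            if pending[consumer] == 0:
                ready.append(consumer)
    return assignment


# Hypothetical four-task graph: two tasks fan out from "clean".
graph = {"load": [], "clean": ["load"], "stats": ["clean"], "plot": ["clean"]}
plan = random_schedule(graph, n_workers=4)
```

A scheduler like this makes each decision in constant time, so the cost of running a graph comes mostly from the runtime around it (bookkeeping and communication), which is the kind of overhead the abstract identifies as the bottleneck.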

research
09/07/2020

Asynchronous Runtime with Distributed Manager for Task-based Programming Models

Parallel task-based programming models, like OpenMP, allow application d...
research
08/30/2023

Specx: a C++ task-based runtime system for heterogeneous distributed architectures

Parallelization is needed everywhere, from laptops and mobile phones to ...
research
11/06/2020

Task-Graph Scheduling Extensions for Efficient Synchronization and Communication

Task graphs have been studied for decades as a foundation for scheduling...
research
08/18/2018

Compiler Enhanced Scheduling for OpenMP for Heterogeneous Multiprocessors

Scheduling in Asymmetric Multicore Processors (AMP), a special case of H...
research
09/22/2020

TaskTorrent: a Lightweight Distributed Task-Based Runtime System in C++

We present TaskTorrent, a lightweight distributed task-based runtime in ...
research
08/15/2019

Task Bench: A Parameterized Benchmark for Evaluating Parallel Runtime Performance

We present Task Bench, a parameterized benchmark designed to explore the...
