Log In Sign Up

Task-Graph Scheduling Extensions for Efficient Synchronization and Communication

by   Seonmyeong Bak, et al.

Task graphs have been studied for decades as a foundation for scheduling irregular parallel applications and incorporated in programming models such as OpenMP. While many high-performance parallel libraries are based on task graphs, they also have additional scheduling requirements, such as synchronization from inner levels of data parallelism and internal blocking communications. In this paper, we extend task-graph scheduling to support efficient synchronization and communication within tasks. Our scheduler avoids deadlock and oversubscription of worker threads, and refines victim selection to increase the overlap of sibling tasks. To the best of our knowledge, our approach is the first to combine gang-scheduling and work-stealing in a single runtime. Our approach has been evaluated on the SLATE highperformance linear algebra library. Relative to the LLVM OMP runtime, our runtime demonstrates performance improvements of up to 13.82 Cholesky, respectively, evaluated across different configurations.


page 2

page 9


Supporting OpenMP 5.0 Tasks in hpxMP – A study of an OpenMP implementation within Task Based Runtime Systems

OpenMP has been the de facto standard for single node parallelism for mo...

UniFuzz: Optimizing Distributed Fuzzing via Dynamic Centralized Task Scheduling

Fuzzing is one of the most efficient technology for vulnerability detect...

Accelerating Task-based Iterative Applications

Task-based programming models have risen in popularity as an alternative...

Increasing the Degree of Parallelism Using Speculative Execution in Task-based Runtime Systems

Task-based programming models have demonstrated their efficiency in the ...

Runtime vs Scheduler: Analyzing Dask's Overheads

Dask is a distributed task framework which is commonly used by data scie...

Mitigating inefficient task mappings with an Adaptive Resource-Moldable Scheduler (ARMS)

Efficient runtime task scheduling on complex memory hierarchy becomes in...

Scheduling of Real-Time Tasks with Multiple Critical Sections in Multiprocessor Systems

The performance of multiprocessor synchronization and locking protocols ...