Accelerating Task-based Iterative Applications

08/12/2022
by David Alvarez, et al.

Task-based programming models have risen in popularity as an alternative to traditional fork-join parallelism. They are better suited to applications with irregular parallelism, which can suffer from load imbalance. However, these models incur overheads from task creation, scheduling, and dependency management, which limit performance and scalability when tasks become too small. At the same time, many HPC applications implement iterative methods or multi-step simulations that create the same directed acyclic graph (DAG) of tasks on every iteration. By giving application programmers a way to state that a specific loop creates the same task pattern on each iteration, the runtime can build the task DAG once and transform it into a cyclic graph. This cyclic graph is then reused for successive iterations, minimizing task creation and dependency management overhead. This paper presents the taskiter, a new construct we propose for the OmpSs-2 and OpenMP programming models, which allows the use of directed cyclic task graphs (DCTG) to minimize runtime overheads. Moreover, we present a simple, locality-aware immediate successor heuristic that minimizes task scheduling overhead by bypassing the runtime task scheduler. We evaluate the implementation of the taskiter and the immediate successor heuristic on 8 iterative benchmarks. Using small task granularities, we obtain an average speedup of 3.7x over the reference OmpSs-2 implementation and average speedups of 5x and 7.46x over the LLVM and GCC OpenMP runtimes, respectively.
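
To make the graph-reuse idea concrete, the following is a minimal sequential sketch in Python. It is a toy model and not the OmpSs-2 runtime's implementation. The names `build_cyclic_graph` and `run`, the `(name, reads, writes)` task tuples, and the single-worker execution loop are all assumptions made for illustration. The sketch registers the loop body twice to discover which dependencies cross an iteration boundary, stores those as back edges, and then executes every iteration by resetting per-task counters instead of creating new tasks. It also models an immediate-successor shortcut: the first task a finishing task makes ready runs next without passing through the ready queue. As a simplification, every task is given an edge to its own next-iteration instance.

```python
from collections import deque

def build_cyclic_graph(tasks):
    """tasks: (name, reads, writes) tuples in program order for ONE iteration.
    Returns (intra, cross): successors inside an iteration and into the next."""
    intra = {name: set() for name, _, _ in tasks}
    # Assumed simplification: instance k of a task precedes its instance k+1.
    cross = {name: {name} for name, _, _ in tasks}
    last_writer, readers = {}, {}          # per datum: (pass number, task name)
    for pass_no in (0, 1):                 # register the loop body twice
        for name, reads, writes in tasks:
            preds = set()
            for d in reads:                # read-after-write
                if d in last_writer:
                    preds.add(last_writer[d])
            for d in writes:               # write-after-write, write-after-read
                if d in last_writer:
                    preds.add(last_writer[d])
                preds.update(readers.get(d, ()))
            for d in reads:
                readers.setdefault(d, set()).add((pass_no, name))
            for d in writes:
                last_writer[d] = (pass_no, name)
                readers[d] = set()
            if pass_no == 1:               # second pass shows the steady state
                for pred_pass, pred in preds:
                    (intra if pred_pass == 1 else cross)[pred].add(name)
    return intra, cross

def run(tasks, n_iters, body):
    """Run n_iters iterations over one reusable graph; body(name, k) is the work.
    Returns how many task launches skipped the ready queue."""
    if n_iters <= 0:
        return 0
    intra, cross = build_cyclic_graph(tasks)
    names = [name for name, _, _ in tasks]
    n_intra = {n: 0 for n in names}
    n_cross = {n: 0 for n in names}
    for p in names:
        for s in intra[p]:
            n_intra[s] += 1
        for s in cross[p]:
            n_cross[s] += 1
    iteration = {n: 0 for n in names}
    pending = dict(n_intra)                # iteration 0 has no earlier iteration
    ready = deque(n for n in names if pending[n] == 0)
    for n in ready:
        pending[n] = n_intra[n] + n_cross[n]   # re-arm for the next iteration
    bypassed, nxt = 0, None
    while nxt is not None or ready:
        if nxt is not None:                # immediate successor: skip the queue
            task, nxt = nxt, None
            bypassed += 1
        else:
            task = ready.popleft()
        k = iteration[task]
        body(task, k)
        iteration[task] = k + 1
        released = sorted(intra[task])
        if k + 1 < n_iters:                # back edges only feed a real iteration
            released += sorted(cross[task])
        for s in released:
            pending[s] -= 1
            if pending[s] == 0:
                pending[s] = n_intra[s] + n_cross[s]
                if nxt is None:
                    nxt = s
                else:
                    ready.append(s)
    return bypassed

# Two tasks that ping-pong between buffers "x" and "y" on every iteration.
demo_tasks = [("A", {"x"}, {"y"}), ("B", {"y"}, {"x"})]
demo_trace = []
demo_bypassed = run(demo_tasks, 3, lambda name, k: demo_trace.append((name, k)))
print(demo_trace, demo_bypassed)
```

The graph is built once per `run` call, and each later iteration costs only counter decrements and resets. This is the kind of saving the abstract attributes to reusing a cyclic graph, though a real runtime must also handle concurrency, locality, and scheduling policy, which this sketch leaves out.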


