Evaluating the Performance of Speculative DOACROSS Loop Parallelization with taskloop

02/10/2023
by Juan Salamanca, et al.
OpenMP provides programmers with directives to parallelize DOALL loops, such as parallel for and, more recently, taskloop for task-based parallelism. When a loop can be proven to be DOACROSS, programmers can instead parallelize it with parallel for and use the OpenMP ordered directive to mark the region of the loop that must execute sequentially. However, when neither property can be proven, programmers have to be conservative and assume that the loop is DOACROSS (strictly speaking, a may DOACROSS loop). Previous work proposed speculative support for taskloop (the tls clause), which made it possible to parallelize may DOACROSS loops by exploiting task-based parallelism and the fact that many of these loops are computationally intensive and turn out to be DOALL at runtime. This paper proposes Speculative Task Execution (STE), which adds speculative privatization to taskloop tls through two novel clauses: spec_private and spec_reduction. We also present a performance comparison between taskloop tls with speculative privatization and ordered, which reveals that, for certain loops, slowdowns under OpenMP DOACROSS can be transformed into speed-ups of up to 1.87x by applying speculative parallelization of tasks.
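To make the two standard cases in the abstract concrete, the following is a minimal sketch in C, not code from the paper. The function names scale_doall and running_sum_doacross are hypothetical, chosen only for illustration. The first loop is DOALL and uses taskloop; the second has a loop-carried dependence and uses parallel for with an ordered region. The proposed tls, spec_private, and spec_reduction clauses are deliberately not used, since they are research extensions and are assumed to be unavailable in standard OpenMP compilers.

```c
#include <assert.h>
#include <stddef.h>

/* DOALL case: every iteration touches a distinct element, so the
 * iterations can be split into tasks and run in any order. */
void scale_doall(double *a, size_t n, double k)
{
    assert(a != NULL || n == 0);
    #pragma omp parallel
    #pragma omp single
    #pragma omp taskloop
    for (size_t i = 0; i < n; i++)
        a[i] *= k;
}

/* DOACROSS case: iteration i needs the accumulator value produced by
 * iteration i-1. The squaring can overlap across threads, while the
 * ordered region runs sequentially in iteration order. */
long running_sum_doacross(const int *in, long *out, int n)
{
    long acc = 0; /* shared; only accessed inside the ordered region */
    #pragma omp parallel for ordered schedule(static, 1)
    for (int i = 0; i < n; i++) {
        long sq = (long)in[i] * in[i];   /* independent work */
        #pragma omp ordered
        {
            acc += sq;                   /* loop-carried dependence */
            out[i] = acc;
        }
    }
    return acc;
}
```

Built with an OpenMP flag (for example -fopenmp on GCC or Clang), both functions run in parallel; without it the pragmas are ignored and the loops run sequentially with the same results. When most of the loop body sits inside the ordered region, as it does here, the threads largely wait on one another, which is the kind of slowdown the abstract attributes to OpenMP DOACROSS.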

research
11/05/2017

HPX Smart Executors

The performance of many parallel applications depends on loop-level para...
research
05/07/2022

Can We Run in Parallel? Automating Loop Parallelization for TornadoVM

With the advent of multi-core systems, GPUs and FPGAs, loop parallelizat...
research
04/01/2019

Modular Synthesis of Divide-and-Conquer Parallelism for Nested Loops (Extended Version)

We propose a methodology for automatic generation of divide-and-conquer ...
research
05/18/2022

A Novel Loop Fission Technique Inspired by Implicit Computational Complexity

This work explores an unexpected application of Implicit Computational C...
research
06/03/2019

Exploiting nested task-parallelism in the ℋ-LU factorization

We address the parallelization of the LU factorization of hierarchical m...
research
05/15/2023

A Direct-Style Effect Notation for Sequential and Parallel Programs

Modeling sequential and parallel composition of effectful computations h...
research
11/29/2022

Maximal Atomic irRedundant Sets: a Usage-based Dataflow Partitioning Algorithm

Programs admitting a polyhedral representation can be transformed in man...
