Generalizing Hierarchical Parallelism
Since the days of OpenMP 1.0 computer hardware has become more complex, typically by specializing compute units for coarse- and fine-grained parallelism in incrementally deeper hierarchies of parallelism. Newer versions of OpenMP reacted by introducing new mechanisms for querying or controlling its individual levels, each time adding another concept such as places, teams, and progress groups. In this paper we propose going back to the roots of OpenMP in the form of nested parallelism for a simpler model and more flexible handling of arbitrary deep hardware hierarchies.
READ FULL TEXT