Analysis of Work-Stealing and Parallel Cache Complexity

11/09/2021
by Yan Gu, et al.

Parallelism has become extremely popular over the past decade, leading to many new parallel algorithms and software systems. The randomized work-stealing (RWS) scheduler plays a crucial role in this ecosystem. In this paper, we study two important topics related to the RWS scheduler. Our first contribution is a simplified, classroom-ready analysis of the RWS scheduler. The theoretical efficiency of the RWS scheduler has been analyzed in a variety of settings, but most of these analyses are quite complicated. We present a new analysis that we believe is easy to understand and especially useful in education: it avoids potential-function arguments, and it assumes a highly asynchronous setting, which is more realistic for today's parallel machines. Our second and main contribution is a set of new parallel cache-complexity bounds for algorithms using the RWS scheduler. Although the sequential I/O model has been well studied over the past decades, very few results have extended it to the parallel setting. The parallel cache bounds of many existing algorithms include a polynomial factor in the span, which causes significant overhead for high-span algorithms. Our new analysis decouples the span from the analysis of the parallel cache complexity, allowing us to show new parallel cache bounds for a list of classic algorithms. Our bounds are only a polylogarithmic factor off the lower bounds and significantly improve previous results.
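To make the scheduling model concrete, the sketch below simulates the core mechanics of randomized work stealing: each worker owns a double-ended queue, the owner pushes and pops tasks at the bottom (LIFO), and an idle worker steals from the top (FIFO) of a uniformly random victim's deque. This is a hypothetical, minimal simulation for intuition only; the function name `run_work_stealing` and its round-based execution loop are illustrative assumptions, not the paper's analysis or any real runtime's implementation.

```python
import collections
import random


def run_work_stealing(tasks, num_workers, seed=0):
    """Toy simulation of randomized work stealing (illustrative only).

    Each worker owns a deque. The owner pops from the bottom (LIFO);
    an idle worker picks a uniformly random victim and steals from the
    top (FIFO), which is the key source of RWS's provable efficiency.
    """
    rng = random.Random(seed)
    deques = [collections.deque() for _ in range(num_workers)]
    # All initial tasks start on worker 0's deque, as at a fork-join root.
    for t in tasks:
        deques[0].append(t)

    completed = []
    steals = 0
    # One "round" lets every worker take a single step.
    while any(deques):
        for w in range(num_workers):
            if deques[w]:
                # Owner works locally from the bottom of its own deque.
                completed.append(deques[w].pop())
            else:
                # Idle worker attempts one steal from a random victim's top.
                victim = rng.randrange(num_workers)
                if victim != w and deques[victim]:
                    deques[w].append(deques[victim].popleft())
                    steals += 1
    return completed, steals
```

For example, `run_work_stealing(list(range(20)), 4)` completes all 20 tasks while counting how many steal operations occurred; in the real asynchronous setting analyzed in the paper, workers do not proceed in lockstep rounds, which is exactly the complication the simplified analysis addresses.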


