Parsl: Pervasive Parallel Programming in Python

05/06/2019
by   Yadu Babuji, et al.
0

High-level programming languages such as Python are increasingly used to provide intuitive interfaces to libraries written in lower-level languages and for assembling applications from various components. This migration towards orchestration rather than implementation, coupled with the growing need for parallel computing (e.g., due to big data and the end of Moore's law), necessitates rethinking how parallelism is expressed in programs. Here, we present Parsl, a parallel scripting library that augments Python with simple, scalable, and flexible constructs for encoding parallelism. These constructs allow Parsl to construct a dynamic dependency graph of components that it can then execute efficiently on one or many processors. Parsl is designed for scalability, with an extensible set of executors tailored to different use cases, such as low-latency, high-throughput, or extreme-scale execution. We show, via experiments on the Blue Waters supercomputer, that Parsl executors can allow Python scripts to execute components with as little as 5 ms of overhead, scale to more than 250 000 workers across more than 8000 nodes, and process upward of 1200 tasks per second. Other Parsl features simplify the construction and execution of composite programs by supporting elastic provisioning and scaling of infrastructure, fault-tolerant execution, and integrated wide-area data management. We show that these capabilities satisfy the needs of many-task, interactive, online, and machine learning applications in fields such as biology, cosmology, and materials science.

READ FULL TEXT

page 1

page 2

page 3

page 4

research
05/03/2022

Extended Abstract: Productive Parallel Programming with Parsl

Parsl is a parallel programming library for Python that aims to make it ...
research
07/06/2021

Toward Interlanguage Parallel Scripting for Distributed-Memory Scientific Computing

Scripting languages such as Python and R have been widely adopted as too...
research
10/26/2018

AutoParallel: A Python module for automatic parallelization and distributed execution of affine loop nests

The last improvements in programming languages, programming models, and ...
research
05/15/2020

kiwiPy: Robust, high-volume, messaging for big-data and computational science workflows

In this work we present kiwiPy, a Python library designed to support rob...
research
05/07/2019

Programming Unikernels in the Large via Functor Driven Development

Compiling applications as unikernels allows them to be tailored to diver...
research
08/14/2019

Serverless Supercomputing: High Performance Function as a Service for Science

Growing data volumes and velocities are driving exciting new methods acr...
research
11/13/2018

Task Graph Transformations for Latency Tolerance

The Integrative Model for Parallelism (IMP) derives a task graph from a ...

Please sign up or login with your details

Forgot password? Click here to reset