Serverless Supercomputing: High Performance Function as a Service for Science

08/14/2019
by Ryan Chard, et al.

Growing data volumes and velocities are driving new methods across the sciences in which data analytics and machine learning are increasingly intertwined with research. These methods require new approaches to scientific computing in which computation is mobile, so that, for example, it can occur near data, be triggered by events (e.g., the arrival of new data), or be offloaded to specialized accelerators. They also require new design approaches in which monolithic applications are decomposed into smaller components that may in turn be executed separately and on the most efficient resources. To address these needs we propose funcX, a high-performance function-as-a-service (FaaS) platform that enables intuitive, flexible, efficient, and scalable remote function execution on existing infrastructure, including clouds, clusters, and supercomputers. It allows users to register and then execute Python functions without regard for the physical resource location, scheduler architecture, or virtualization technology on which the function is executed, an approach we refer to as "serverless supercomputing." We motivate the need for funcX in science, describe our prototype implementation, and demonstrate, via experiments on two supercomputers, that funcX can process millions of functions across more than 65,000 concurrent workers. We also outline five scientific scenarios in which funcX has been deployed and highlight the benefits of funcX in these scenarios.
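The register-then-execute pattern described in the abstract can be illustrated with a small, purely local sketch. This is not the funcX API: the class `ToyFunctionService`, its methods, and the endpoint names "laptop" and "cluster" are hypothetical, and thread pools stand in for remote resources. The sketch only shows the idea that a function is registered once, receives an identifier, and can then be invoked on any named endpoint without the caller knowing how that endpoint runs it.

```python
import uuid
from concurrent.futures import Future, ThreadPoolExecutor
from typing import Any, Callable, Dict


class ToyFunctionService:
    """Hypothetical in-process stand-in for a FaaS registry (not the funcX API)."""

    def __init__(self) -> None:
        self._functions: Dict[str, Callable[..., Any]] = {}
        self._endpoints: Dict[str, ThreadPoolExecutor] = {}

    def register_function(self, func: Callable[..., Any]) -> str:
        """Store the function and return an opaque identifier for later calls."""
        function_id = str(uuid.uuid4())
        self._functions[function_id] = func
        return function_id

    def register_endpoint(self, name: str, workers: int) -> None:
        """Assumption: a local thread pool plays the role of a remote resource."""
        self._endpoints[name] = ThreadPoolExecutor(max_workers=workers)

    def run(self, function_id: str, endpoint: str, *args: Any, **kwargs: Any) -> Future:
        """Dispatch a registered function to the named endpoint; return a future."""
        func = self._functions[function_id]
        return self._endpoints[endpoint].submit(func, *args, **kwargs)

    def shutdown(self) -> None:
        """Wait for outstanding work and release every endpoint's workers."""
        for pool in self._endpoints.values():
            pool.shutdown(wait=True)


def mean(values):
    """Example user function; it knows nothing about where it will run."""
    return sum(values) / len(values)


service = ToyFunctionService()
service.register_endpoint("laptop", workers=2)
service.register_endpoint("cluster", workers=8)

# Register once, then execute the same function on either endpoint.
function_id = service.register_function(mean)
futures = [service.run(function_id, ep, [1, 2, 3, 4]) for ep in ("laptop", "cluster")]
results = [f.result() for f in futures]
service.shutdown()
print(results)
```

The caller's code is identical for both endpoints; only the endpoint name changes. A real platform would additionally have to serialize the function, authenticate the user, and route the task to a remote scheduler, none of which this sketch attempts.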

research
05/07/2020

funcX: A Federated Function Serving Fabric for Science

Exploding data volumes and velocities, new computational methods and pla...
research
09/23/2022

funcX: Federated Function as a Service for Science

funcX is a distributed function as a service (FaaS) platform that enable...
research
06/25/2021

RFaaS: RDMA-Enabled FaaS Platform for Serverless High-Performance Computing

The rigid MPI programming model and batch scheduling dominate high-perfo...
research
08/20/2022

MLExchange: A web-based platform enabling exchangeable machine learning workflows

Machine learning (ML) algorithms are showing a growing trend in helping ...
research
05/06/2019

Parsl: Pervasive Parallel Programming in Python

High-level programming languages such as Python are increasingly used to...
research
09/05/2022

ScalSALE: Scalable SALE Benchmark Framework for Supercomputers

Supercomputers worldwide provide the necessary infrastructure for ground...
research
05/13/2021

Toward Real-time Analysis of Experimental Science Workloads on Geographically Distributed Supercomputers

Massive upgrades to science infrastructure are driving data velocities u...
