Towards Persistent Memory based Stateful Serverless Computing for Big Data Applications

09/04/2023
by Yuze Li, et al.

The Function-as-a-Service (FaaS) computing model has recently seen significant growth, especially for highly scalable, event-driven applications. Its ease of deployment and cost-efficient, fine-grained billing make FaaS highly attractive to big data applications. However, the stateless nature of serverless platforms poses major challenges for stateful, I/O-intensive workloads: these platforms lack native support for stateful execution, state sharing, and inter-function communication. In this paper, we explore the feasibility of performing stateful big data analytics on serverless platforms and of improving the I/O throughput of functions by using modern storage technologies such as Intel Optane DC Persistent Memory (PMEM). To this end, we propose Marvel, an end-to-end architecture built on top of the popular serverless platform Apache OpenWhisk and Apache Hadoop. Marvel makes two main contributions: (1) it enables stateful function execution on OpenWhisk by maintaining state information in an in-memory caching layer; and (2) it provides access to PMEM-backed HDFS storage for faster I/O performance. Our evaluation shows that Marvel reduces the overall execution time of big data applications by up to 86.6

