Stateful Entities: Object-oriented Cloud Applications as Distributed Dataflows

11/18/2021
by Wouter Zorgdrager, et al.

Programming stateful cloud applications remains a painful experience. Instead of focusing on the business logic, programmers spend most of their time dealing with distributed systems concerns, chief among them consistency, load balancing, failure management, recovery, and scalability. At the same time, we witness an unprecedented adoption of modern dataflow systems such as Apache Flink, Google Dataflow, and Timely Dataflow. These systems are now performant and fault-tolerant, and they offer excellent state management primitives. In this line of work, we investigate the opportunities and limits of compiling general-purpose programs into stateful dataflows. By following a set of easy-to-follow code conventions, programmers can author stateful entities, a programming abstraction embedded in Python. We present StateFlow, a compiler pipeline that analyzes the abstract syntax tree of a Python application and rewrites it into an intermediate representation based on stateful dataflow graphs. StateFlow then compiles that intermediate representation to one of several target execution systems: Apache Flink and Beam, AWS Lambda, Flink's Statefun, and Cloudburst. Through an experimental evaluation, we demonstrate that the code generated by StateFlow incurs minimal overhead. While developing and deploying our prototype, we observed important limitations of current dataflow systems in executing cloud applications at scale.
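
To make the first stage of such a pipeline concrete, here is a minimal sketch of how the abstract syntax tree of a Python class could be inspected to find the state and methods of a stateful entity. It uses only Python's standard `ast` module and is not StateFlow's actual implementation or API: the `Account` class, the `describe_entity` helper, and the rule that every `self.<name> = ...` assignment in `__init__` counts as entity state are all assumptions made for illustration.

```python
import ast
import textwrap

# Hypothetical entity class; its name and fields are illustrative only.
SOURCE = textwrap.dedent("""
    class Account:
        def __init__(self, owner: str):
            self.owner = owner
            self.balance = 0

        def deposit(self, amount: int) -> int:
            self.balance += amount
            return self.balance
""")


def describe_entity(source: str) -> dict:
    """Collect the state fields and method names of the first class in source."""
    tree = ast.parse(source)
    cls = next(node for node in ast.walk(tree) if isinstance(node, ast.ClassDef))
    fields, methods = [], []
    for item in cls.body:
        if not isinstance(item, ast.FunctionDef):
            continue
        if item.name == "__init__":
            # Assumed convention: every `self.<name> = ...` here is entity state.
            for node in ast.walk(item):
                if not isinstance(node, ast.Assign):
                    continue
                for target in node.targets:
                    if (isinstance(target, ast.Attribute)
                            and isinstance(target.value, ast.Name)
                            and target.value.id == "self"
                            and target.attr not in fields):
                        fields.append(target.attr)
        else:
            # Every other method is a candidate operation on the entity.
            methods.append(item.name)
    return {"entity": cls.name, "state": fields, "methods": methods}


summary = describe_entity(SOURCE)
print(summary)
```

For the `Account` class above, the summary names the entity, lists `owner` and `balance` as state, and lists `deposit` as its only method. A real compiler would go further, for example by splitting methods that call other entities and by emitting a dataflow graph, which this sketch does not attempt.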

