Streaming vs. Functions: A Cost Perspective on Cloud Event Processing

In cloud event processing, data generated at the edge is processed in real time by cloud resources. Both distributed stream processing (DSP) and Function-as-a-Service (FaaS) have been proposed to implement such event processing applications. FaaS emphasizes fast development and easy operation, while DSP emphasizes efficient handling of large data volumes. Despite their architectural differences, both can be used to model and implement loosely-coupled job graphs. In this paper, we consider the choice between FaaS and DSP from a cost perspective. We implement stateless and stateful workflows from the Theodolite benchmarking suite using cloud FaaS and DSP. In an extensive evaluation, we show how application type, cloud service provider, and runtime environment can influence the cost of application deployments, and we derive decision guidelines for cloud engineers.

1 Introduction

The increasing degree of data generation at the edge, e.g., by web clients or IoT devices, has led to a growing demand for live data and event processing in the cloud [1, 2, 3]. Today, the most popular paradigms for this are distributed stream processing (DSP) and Function-as-a-Service (FaaS) [4, 5]. In both paradigms, data processing applications are modeled as loosely-coupled graphs of data operations.

In DSP, this graph is a network of operators, deployed on a stream processing engine running on a distributed cluster of compute nodes. The stream processing engine partitions incoming data across nodes for horizontal scalability, hence, parallelizing the data processing workflow for the developer [1, 6]. Typical examples of stream processing engines include Apache Flink (https://flink.apache.org/) and Google Cloud Dataflow (https://cloud.google.com/dataflow/) [7, 8]. FaaS platforms, e.g., AWS Lambda (https://aws.amazon.com/lambda/) and Google Cloud Functions (https://cloud.google.com/functions/), allow developers to deploy small, stateless functions on managed infrastructure that are billed per invocation and run duration. These functions can also be chained to build larger applications, e.g., through synchronous invocations or asynchronously by sharing state in a database [9, 10]. The managed approach promises high elasticity and scalability for developers and allows cloud service providers to allocate their infrastructure more efficiently [11, 12].

Despite their architectural differences, both DSP and FaaS can be used to model the loosely-coupled job graphs that underlie cloud data processing [13, 5]. Beyond some qualitative concerns, different billing models introduce a cost dimension that should be taken into account when designing data processing applications and choosing between paradigms [14, 11]. In this paper, we quantify this cost dimension through cost benchmarking [15] to let application developers and cloud engineers make more informed decisions when designing event processing applications. We make the following contributions:

  • We present an application-centric benchmark with both stateful and stateless applications for cost-benchmarking of DSP and FaaS environments (Section 3).

  • In experiments, we analyze the impact of processing paradigm, type of application, execution environment, and choice of cloud provider on the cost of an event processing application deployment (Section 4).

  • We provide decision guidelines for application developers based on our quantitative data (Section 5).

  • We discuss the limitations of our work and derive avenues for future work (Section 6).

We make our implementation available as open source (https://github.com/pfandzelter/cloud-event-processing-costs) to enable other researchers and practitioners to conduct their own experiments.

2 Background

While the concept of cloud computing is well-established in both research and industry, paradigms for cloud applications are constantly evolving. In this section, we give an overview of distributed stream processing and Functions-as-a-Service, two of today’s most common cloud data processing paradigms [1, 5], and introduce the related terminology.

2.1 Distributed Stream Processing

Most distributed stream processing engines extend the well-known MapReduce pattern [16] with support for processing continuous data streams. In modern DSP engines, developers define dataflow graphs (called pipelines or jobs) of operators using a declarative programming model [7, 6]. Prominent examples of DSP engines are the open source projects Apache Flink [17], Apache Samza [18], and Apache Kafka Streams [19], or cloud services such as Google Cloud Dataflow [20]. Apache Beam (https://beam.apache.org/) is a framework providing a unified programming model [20] to define dataflow graphs, which can be executed by many stream processing engines.
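To illustrate this declarative model, the following is a minimal Apache Beam pipeline sketch in Java that reads events from a Kafka topic, applies a placeholder transformation, and writes the results back to Kafka. The broker address, topic names, and transformation are illustrative assumptions and not part of our benchmark implementation; the runner (e.g., Flink, Samza, or Dataflow) is selected via pipeline options.

```java
import org.apache.beam.sdk.Pipeline;
import org.apache.beam.sdk.io.kafka.KafkaIO;
import org.apache.beam.sdk.options.PipelineOptionsFactory;
import org.apache.beam.sdk.transforms.MapElements;
import org.apache.beam.sdk.transforms.Values;
import org.apache.beam.sdk.values.TypeDescriptors;
import org.apache.kafka.common.serialization.StringDeserializer;
import org.apache.kafka.common.serialization.StringSerializer;

public class MinimalPipeline {
  public static void main(String[] args) {
    // The runner (FlinkRunner, SamzaRunner, DataflowRunner, ...) is chosen via pipeline options.
    Pipeline p = Pipeline.create(PipelineOptionsFactory.fromArgs(args).create());

    p.apply("ReadFromKafka", KafkaIO.<String, String>read()
            .withBootstrapServers("kafka:9092")             // assumed broker address
            .withTopic("sensor-input")                      // assumed input topic
            .withKeyDeserializer(StringDeserializer.class)
            .withValueDeserializer(StringDeserializer.class)
            .withoutMetadata())
        .apply("DropKeys", Values.<String>create())
        .apply("Transform", MapElements.into(TypeDescriptors.strings())
            .via(event -> event.trim()))                    // placeholder transformation
        .apply("WriteToKafka", KafkaIO.<Void, String>write()
            .withBootstrapServers("kafka:9092")
            .withTopic("sensor-output")                     // assumed output topic
            .withValueSerializer(StringSerializer.class)
            .values());

    p.run().waitUntilFinish();
  }
}
```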

DSP engines are deployed as clusters of multiple instances (e.g., on different computing nodes). To enable horizontal scalability, data streams between operators are partitioned and operators are scheduled on multiple instances, where each operator instance processes only a portion of the data. The key idea is that state should only be maintained locally in an operator instance. Additionally, DSP engines often use periodic checkpointing and require durable, replayable data sources to ensure fault tolerance.

While stream processing engines have traditionally been operated as long-running clusters on virtual machines, they are now often deployed as standalone, cloud-native applications. In particular, containerization techniques and Kubernetes, the de-facto standard for container orchestration [21], are used to reduce the operational complexity of running DSP jobs at large scale; managed Kubernetes services are provided on all major cloud platforms. In addition to a (fixed) cluster management fee, users of such services are billed variable costs for the allocated VMs or, more recently, for the actual resource usage of containers.

2.2 Function-as-a-Service

In the FaaS programming model, developers deploy applications in the form of individual functions to a FaaS platform that handles event-driven code invocation and horizontal scaling. Function infrastructure is completely handled by the cloud service provider, i.e., “serverless”, and consumers pay per request based on the resources consumed by a function [12, 22]. Functions can be implemented in a number of programming languages and can be invoked by web requests, IoT sensor readings, database updates, and even other functions, so-called function chaining [23, 24].

A key element to horizontal scalability is that function instances logically exist only for the duration of a single invocation and do not support any state beyond that execution [10]. To support stateful applications, functions usually leverage serverless, pay-per-request cloud datastores such as Google Cloud Firestore (https://cloud.google.com/firestore/) or AWS DynamoDB (https://aws.amazon.com/dynamodb/) [25, 26].

In combination with lightweight virtualization techniques, such as containers or microVMs [27, 28], FaaS platforms can quickly spin up new and destroy old function instances, enabling rapid elasticity. The low management burden for the consumer and the wide range of possible applications are clear advantages for developers. For cloud service providers, the fine-grained execution of functions enables a more efficient allocation of their infrastructure [12].

3 Cost Benchmark

Both FaaS and DSP can be used to build cloud event processing applications. To quantify the cost dimension of the decision between the two paradigms when building such an application, we introduce a new application-centric cost benchmark that can be applied to any cloud processing paradigm. The proposed benchmark comprises an application implementing two example use-cases (which can easily be extended), a load generator that creates requests for the application, and its configuration. The system under test (SUT) in each benchmark is either a FaaS or a DSP platform. In this section, we present our proposed benchmark and the methodology for executing it.

3.1 Cloud Event Processing Use-Case

Our example application is derived from the Theodolite suite of stream processing scalability benchmarks [4]. Both use-cases are designed for Industrial Internet of Things (IIoT) event processing in the context of a smart factory, where sensors at the edge produce large amounts of data that require real-time event processing in the cloud [29]. We chose both a stateless and a stateful use-case in order to quantify the impact of state management on application cost.

Stateless Storage

Figure 1: In the stateless storage use-case (UC1), input data is transformed and then persisted in a storage backend.

The first use-case (UC1) persists incoming data in a cloud database. Such an operation is often required for archiving events and making them accessible to other applications. As shown in Fig. 1, incoming events are first transformed to match the data format required by the database API and then written to that database system. Since each data item is treated individually, no state is maintained within the application.
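For illustration, a stateless HTTP-triggered cloud function implementing this pattern could look like the following minimal sketch in Java, assuming the Functions Framework and the Firestore client library; the collection and field names are illustrative and do not reflect our actual benchmark implementation.

```java
import com.google.cloud.firestore.Firestore;
import com.google.cloud.firestore.FirestoreOptions;
import com.google.cloud.functions.HttpFunction;
import com.google.cloud.functions.HttpRequest;
import com.google.cloud.functions.HttpResponse;
import com.google.gson.Gson;
import java.util.HashMap;
import java.util.Map;

public class StorageFunction implements HttpFunction {
  // Reusing the client across invocations avoids re-initialization on warm starts.
  private static final Firestore DB = FirestoreOptions.getDefaultInstance().getService();
  private static final Gson GSON = new Gson();

  @Override
  public void service(HttpRequest request, HttpResponse response) throws Exception {
    // Parse the incoming JSON event; field names are illustrative assumptions.
    Map<String, Object> event = GSON.fromJson(request.getReader(), Map.class);

    // Transform the event into the format expected by the database API.
    Map<String, Object> record = new HashMap<>();
    record.put("sensorId", event.get("identifier"));
    record.put("value", event.get("valueInW"));
    record.put("timestamp", event.get("timestamp"));

    // Persist the transformed record and wait for the write to complete.
    DB.collection("records").document().set(record).get();

    response.setStatusCode(204);
  }
}
```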

Stateful Sliding Window Aggregation

Figure 2: In the stateful sliding window aggregation use-case (UC2), incoming data items are grouped in fixed-size time windows based on their key. Within a window, data is aggregated and then forwarded for further processing, e.g., to persist it.

As a second use-case, we chose a sliding window data aggregation (UC2). Such aggregations are used in many scenarios, e.g., to derive a smoothed trend. As illustrated in Fig. 2, incoming data is first windowed in a sliding window. Data within a window is then aggregated by computing summary statistics, yielding a moving aggregation. Results of this windowed aggregation can then be used in further data processing. In this use-case, the windowing of data requires application state.

3.2 Benchmark Methodology

Executing the benchmark for a platform entails three steps:

  1. An application containing the two use-cases is implemented for the chosen platform.

  2. For different rates of data ingress, i.e., workload levels, load is generated against the application.

  3. The total cost of running the application is measured for the duration of processing a constant data rate.

Application Implementation

Our benchmark first requires that the two use-cases are implemented for a chosen SUT, such as a DSP in a specific cloud environment or a FaaS offering. Although we present some implementations in Section 4, we cannot provide a generic, ready-to-use implementation for all possible SUTs, as implementation details are highly SUT-specific. The application implementation should also conform to any best practices for the chosen SUT to support a fair comparison [15].

The benchmark is not restricted to evaluating only the difference between DSP and FaaS, but can also be used to evaluate other scenarios: Users might, e.g., use the benchmark to compare two different stream processing engines, compare the same engine deployed on different cloud providers, or compare the same FaaS functions using different event triggers. We explore such options in our experiments in Section 4.

Load Generation

Load is generated through a dedicated load generator deployed within the same cloud datacenter as the SUT, using the Theodolite load generators. The load generators emulate a number of sensors that send data at a fixed interval in an open workload model [15], i.e., requests are non-blocking. By varying the number of sensors emulated by the load generator, we obtain cost estimates for varying request loads. To simplify our cost calculations, we set the fixed interval to one second so that, e.g., 500 emulated sensors lead to a load of 500 requests/s.
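In our experiments we rely on the Theodolite load generators; purely for illustration, a minimal open-workload generator following the same idea could be sketched as below, where the endpoint URL and payload fields are assumptions rather than the actual Theodolite message format.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class OpenWorkloadGenerator {
  public static void main(String[] args) {
    int sensors = args.length > 0 ? Integer.parseInt(args[0]) : 500;
    String endpoint = args.length > 1 ? args[1] : "http://sut.example.com/events"; // assumed SUT endpoint

    HttpClient client = HttpClient.newHttpClient();
    ScheduledExecutorService scheduler = Executors.newScheduledThreadPool(4);

    for (int i = 0; i < sensors; i++) {
      String sensorId = "sensor-" + i;
      // Each emulated sensor sends one event per second; sendAsync keeps the
      // workload open, i.e., requests do not block the schedule.
      scheduler.scheduleAtFixedRate(() -> {
        String body = String.format("{\"identifier\":\"%s\",\"timestamp\":%d,\"valueInW\":%.2f}",
            sensorId, System.currentTimeMillis(), Math.random() * 100);
        HttpRequest request = HttpRequest.newBuilder(URI.create(endpoint))
            .header("Content-Type", "application/json")
            .POST(HttpRequest.BodyPublishers.ofString(body))
            .build();
        client.sendAsync(request, HttpResponse.BodyHandlers.discarding());
      }, 0, 1, TimeUnit.SECONDS);
    }
  }
}
```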

A constant arrival rate does not necessarily reflect real-world data ingress patterns. However, the goal of our benchmark is not to measure the scalability or elasticity of a given platform but rather to explore the cost of operating an application at a given data rate, which may reflect an average rate over time.

Cost Measurement

The result of our benchmark is an hourly cost estimate for the implementation of a use-case under a given level of constant load. To yield such an estimate, we can leverage different kinds of information provided by cloud platforms or measurements. For experiments on FaaS platforms with pay-per-request pricing models, cost estimates can be derived by extrapolating from small-scale environments, as the cost can be expected to scale linearly with the number of requests for current cloud pricing models. Additionally, any costs for database reads and writes can be derived by tracking database access and calculating the resulting cost based on the per-request cost of the database system used. To achieve a cost estimate for a DSP deployment, we benchmark multiple infrastructure configurations until we find the least expensive deployment that can still handle the configured load without an increase in event consumer lag [30, 31].
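As an illustration of the FaaS extrapolation, the following sketch computes an hourly cost estimate from a measured request rate; the price constants are placeholders and have to be replaced with the current list prices of the provider and database under test.

```java
public class FaasCostEstimate {
  // Placeholder prices; substitute the provider's current list prices.
  static final double PRICE_PER_INVOCATION = 0.40 / 1_000_000;  // USD per invocation
  static final double PRICE_PER_GB_SECOND = 0.0000025;          // USD per GB-second of compute
  static final double PRICE_PER_DB_WRITE = 0.09 / 100_000;      // USD per database write

  static double hourlyCost(double requestsPerSecond, double memoryGb,
                           double avgDurationSeconds, int dbWritesPerRequest) {
    double requestsPerHour = requestsPerSecond * 3600;
    double invocationCost = requestsPerHour * PRICE_PER_INVOCATION;
    double computeCost = requestsPerHour * memoryGb * avgDurationSeconds * PRICE_PER_GB_SECOND;
    double databaseCost = requestsPerHour * dbWritesPerRequest * PRICE_PER_DB_WRITE;
    return invocationCost + computeCost + databaseCost;
  }

  public static void main(String[] args) {
    // Example: 500 req/s, 256 MB functions, 100 ms average duration, one write per request.
    System.out.printf("%.4f USD/h%n", hourlyCost(500, 0.25, 0.1, 1));
  }
}
```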

4 Experiments

In this section, we present an extensive evaluation of different cloud event processing deployments. After an initial comparison of DSP and FaaS (Section 4.1), we use our benchmark to explore the parameter space. Specifically, we evaluate the impact of the chosen event passing paradigm (Section 4.2), cloud service provider (Sections 4.3 and 4.4), FaaS runtime environment (Section 4.5), DSP engine choice (Section 4.6), a serverless DSP offering (Section 4.7), and a serverless Kubernetes service (Section 4.8).

4.1 Baseline: Cloud Stream Processing and Functions

As our baseline, we compare Google Cloud Functions and Apache Flink, running Apache Beam pipelines on Google Kubernetes Engine (GKE).

(a) UC1 as FaaS Implementation
(b) UC1 as Streaming Implementation
(c) UC2 as FaaS Implementation
(d) UC2 as Streaming Implementation
Figure 3: Implementations in our baseline benchmarks: To aggregate data across multiple events, the FaaS implementation is connected to a Firestore database to persist state. As Apache Beam running on top of Apache Flink cannot process HTTP requests directly, we add an HTTP bridge and Apache Kafka.

Implementation

In the stateless storage use-case (UC1), client events are sent over HTTP and stored in Google Cloud Firestore (see Fig. 3(a)). As necessary for Apache Flink, HTTP events are enqueued in Apache Kafka by a middleware prior to processing (see Fig. 3(b)). Cloud Functions, on the other hand, can directly expose an HTTP endpoint.

The stateful windowed aggregation application (UC2) also receives events over HTTP, but results are emitted to the output log of the respective platform. In a real application, a further stateless operation such as UC1 might be performed afterwards, yet our goal here is to study the stateful operator in isolation. For our implementation with Flink, we use the built-in window aggregation mechanisms with RocksDB as state backend (see Fig. 3(d)). To support stateful windowed aggregation on stateless functions, we store intermediate window state in a Google Cloud Firestore collection for each window (see Fig. 3(c)). Both implementations are configured to aggregate data over windows of 30 seconds, with a new window starting every 3 seconds. This results in 10 windows per emulated sensor that are maintained in parallel.
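A minimal sketch of the described windowing on the streaming side could look as follows in Apache Beam; the keyed double-valued element type and the mean aggregation are illustrative assumptions, while the window size and slide match the configuration described above.

```java
import org.apache.beam.sdk.transforms.Mean;
import org.apache.beam.sdk.transforms.windowing.SlidingWindows;
import org.apache.beam.sdk.transforms.windowing.Window;
import org.apache.beam.sdk.values.KV;
import org.apache.beam.sdk.values.PCollection;
import org.joda.time.Duration;

public class WindowedAggregation {
  // Groups keyed sensor readings into 30-second windows that start every 3 seconds
  // and computes a per-key mean within each window.
  public static PCollection<KV<String, Double>> aggregate(PCollection<KV<String, Double>> input) {
    return input
        .apply("SlidingWindow", Window.<KV<String, Double>>into(
            SlidingWindows.of(Duration.standardSeconds(30))
                .every(Duration.standardSeconds(3))))
        .apply("MeanPerKey", Mean.<String, Double>perKey());
  }
}
```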

As Apache Flink and its operators are implemented in Java, we also use the Java 11 runtime for our cloud functions to account for effects caused by programming language or runtime. We set the function memory to 256 MB, which is the smallest amount that can support a function execution without running into memory errors. This also limits our per-function compute resources to 0.1667 vCPU.

For our streaming implementation, we deploy Flink in a GKE cluster with different numbers of e2-standard-4 virtual machines. The overall deployment consists of one coordinating Flink jobmanager, varying numbers of Flink taskmanagers, a three-node Apache Kafka cluster, a component redirecting incoming HTTP requests to Kafka as well as some additional components for monitoring and cluster management. To ensure a reasonable degree of fault tolerance, Flink is configured with a 30-second checkpointing interval and each Kafka partition is replicated across three brokers.

All experiments are conducted in the europe-west-3 (Frankfurt) Google Cloud region, with the load generators deployed on e2-highcpu-4 virtual machines on Google Compute Engine in the same region.

Results

(a) UC1 Costs
(b) UC2 Costs
Figure 4: The cost benchmark results of our baseline comparison of Apache Flink and Google Cloud Functions show how application costs scale with request load. The overhead of operating a Kubernetes cluster for Apache Flink leads to higher costs compared to Cloud Functions at lower request loads. The request rate at which Cloud Functions become less economical than stream processing with Flink depends on the type of function: 200 req/s for UC1 and 5 req/s for UC2.

We show the results of our baseline evaluation in Fig. 4. For the application that we consider, costs scale linearly with request loads, yet at different rates. This is expected for functions, which are billed by request and where requests can be processed independently. In essence, FaaS is variable cost only. In stream processing, we instead observe a pattern of steps, which can be seen in Fig. 4(a) (and, more pronounced at a larger scale, in Fig. 4(b)). This is a result of a more coarse-grained allocation of resources, i.e., servers that need to be added to the cluster. Additionally, there is a minimum cost of running the cluster, which is the cost of a single server. Overall, this means that DSP costs here are a combination of fixed costs and variable costs that are added in batches. This leads to the intersection of function and cluster costs at a specific request level (200 req/s for UC1 and 5 req/s for UC2): At a request rate below this level, the fixed cost of running a single-server cluster is higher than paying per request for FaaS functions. Beyond this request rate, the overhead of operating full servers in a cluster is negligible compared to the premium of serverless functions.

(a) UC1
(b) UC2
Figure 5: Breaking down the cost per request of our cloud function implementations of the two applications in our benchmarks, we see that database access is the major cost factor. While this does not impact UC1, where both the FaaS and DSP implementations write to Cloud Firestore and thus incur identical database access costs, storing intermediate state in UC2 accounts for 83.0% of the total cost of operating the FaaS implementation. Neither the choice of cloud platform, programming language, nor endpoint changes this result significantly: AWS Lambda is 6.4% more expensive than our baseline as a result of increased DynamoDB access costs, while the choice of language runtime only changes function duration costs, which are marginal compared to Firestore access costs.

Interestingly, the break-even point is at a higher load rate for the stateless UC1 than for the stateful aggregation in UC2. For the cloud function implementation of UC2, the largest share of the cost per request is caused by writes (62.2%) and reads (20.8%) to Cloud Firestore, as shown in Fig. 5. This database access is required to store intermediate state: in our implementation, each window is stored as a database entry, leading to ten read and write requests for each function invocation. In the streaming implementation, on the other hand, no such database access is required since all state is maintained inside the Flink taskmanagers.

Takeaway for Platform Choice

Our baseline experiments show that FaaS is an economical choice over DSP for stateless applications with low to medium event arrival rates, in our case from 0 to 200 req/s. For stateful applications, where functions need to store intermediate state in a database, the cost of database access makes FaaS infeasible for anything but low-rate event processing.

4.2 Impact of Pub/Sub in FaaS and Streaming

While we use HTTP endpoints for sensors in our baseline evaluation, this does not necessarily reflect all IoT environments, where data distribution paradigms such as publish/subscribe are more common [32]. We thus further quantify the impact of endpoint choice on DSP and FaaS costs.

Implementation

We extend our baseline implementation with support for Google Cloud Pub/Sub (https://cloud.google.com/pubsub/). For our function implementation, this requires adding an event trigger and application logic for event parsing. In our Apache Flink setup, we replace the previous HTTP middleware and the Apache Kafka deployment with a direct connection to Google Cloud Pub/Sub, using the PubSubIO connectors provided by Apache Beam. Instead of sending JSON objects as done with our HTTP implementation, we send binary encoded Apache Avro (https://avro.apache.org/) records via Pub/Sub.
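On the streaming side, switching the endpoint essentially amounts to swapping the source connector. A minimal sketch, assuming an Avro-compatible record class and an illustrative subscription path (neither is taken from our benchmark implementation), could look as follows:

```java
import org.apache.beam.sdk.Pipeline;
import org.apache.beam.sdk.io.gcp.pubsub.PubsubIO;
import org.apache.beam.sdk.values.PCollection;

public class PubSubSource {
  // Illustrative Avro-compatible record type; in practice this would be
  // generated from the Avro schema used by the load generator.
  public static class SensorRecord {
    public String identifier;
    public long timestamp;
    public double valueInW;
  }

  // Reads binary Avro-encoded sensor records from a Pub/Sub subscription;
  // the subscription path is an assumption for illustration.
  public static PCollection<SensorRecord> read(Pipeline p) {
    return p.apply("ReadFromPubSub",
        PubsubIO.readAvros(SensorRecord.class)
            .fromSubscription("projects/my-project/subscriptions/sensor-events"));
  }
}
```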

Results

As shown in Fig. 5, using Cloud Pub/Sub has a noticeable effect on the execution duration of our FaaS implementations, especially in UC1, where processing costs increase by 154.6%. This effect is less pronounced for UC2, where duration increases by 8.6%. One possible explanation for this effect is an increased overhead caused by message parsing compared to HTTP, where request data is passed to our function directly as JSON rather than encoded. However, due to the relatively high costs of database access, this has only a small impact on total costs (12.9% increase for UC1 and 1.4% increase for UC2). At less than $0.04 per 1,000,000 messages, the cost per Cloud Pub/Sub message is two orders of magnitude smaller than costs incurred by message processing.

(a) UC1 Costs
(b) UC2 Costs
Figure 6: Costs increase approximately linearly for all evaluated streaming deployments. However, Google Cloud Dataflow has considerably lower costs than the other streaming engines.
(a) UC1
(b) UC2
Figure 7: Averaging the cost per request over all evaluated load profiles, we see that, similar to FaaS, writing to a database is the largest cost factor for UC1 on all deployments. For UC2, costs are similar independent of the cloud provider, endpoint, and streaming engine, but instance costs are considerably lower for Dataflow and higher for GKE Autopilot.

Figure 6 shows how costs increase with increasing load when using Cloud Pub/Sub in our Apache Flink implementation. Pub/Sub introduces an additional cost factor to the overall deployment. These costs increase at a steeper rate than the costs for the Kubernetes cluster: While the share of Pub/Sub costs in total costs is 1.5% for UC1 and 2.9% for UC2 at a load intensity of 100 req/s, it grows to 2.6% and 17.5%, respectively, at a load of 1,000 req/s. On the other hand, these additional costs are compensated by the slightly higher loads which Flink can process with Pub/Sub before requiring an additional virtual machine. Figure 7 shows that, averaged over all evaluated load profiles, costs for processing messages from Pub/Sub are similar to redirecting HTTP requests via Kafka.

Takeaway for Endpoint Choice

Our experiments show that there is no clear difference in costs when choosing Pub/Sub or HTTP, neither in DSP nor in FaaS. However, small savings are possible when using a transform method that simplifies processing. Hence, it does not seem to be reasonable to add a dedicated message transform layer just to save costs.

4.3 Different FaaS Platforms

In our baseline FaaS evaluation, we use Google Cloud Functions, yet other cloud providers offer their own serverless platforms that may have different runtime behavior and pricing, impacting the cost results of our experiments. In this experiment, we thus compare our Google Cloud Function implementation with an implementation on AWS Lambda.

Implementation

We implement our benchmark for AWS Lambda with an AWS DynamoDB serverless database. To ensure comparability, we use the Java 11 runtime and conduct our experiments in the eu-central-1 (Frankfurt) region. We again set the memory limit to 256 MB. Our load generator for this implementation runs in the same region on an m5.xlarge EC2 instance.

Results

As we expect the costs for function execution to scale linearly with the event arrival rate, we consider the average cost of an individual function execution, which we show in Fig. 5. The average cost per function execution is 6.4% higher on AWS Lambda than on Google Cloud Functions for both applications, caused mainly by the more expensive database access in DynamoDB compared to Cloud Firestore.

Takeaway for Cloud Provider Choice in FaaS

In our experiments, the choice of FaaS provider had only a limited impact on the total cost of execution. However, the cost difference can depend on the type of application, as applications using other cloud platform services may incur significant costs that vary between providers.

4.4 Different Kubernetes Engines

Similar to our evaluation of different FaaS Platforms, we also compare GKE and AWS Elastic Kubernetes Service (EKS).

Implementation

Deployment descriptions for Kubernetes are largely platform independent, allowing us to use almost the same deployment with EKS as with GKE. As in our evaluation of different FaaS platforms, we write incoming events in our UC1 implementation to an AWS DynamoDB serverless database. Both our EKS cluster and the load generator for this implementation use m5.xlarge EC2 instances running in the eu-central-1 (Frankfurt) region.

Results

As shown in Fig. 6(a), the costs for our UC1 deployment on EKS increase at a steeper rate than in the GKE deployment. Averaged over all evaluated load profiles, EKS has 24.3% higher costs than GKE, as shown in Fig. 7(a). Interestingly, EKS has higher costs although the EKS deployment requires significantly fewer Flink taskmanager instances: loads of up to 1,100 req/s can be processed by a single taskmanager, compared to 8 instances required in the GKE deployment. However, higher costs per VM instance and especially higher costs per database write outweigh this superior performance. As we do not see such a difference in resource usage for UC2, we conclude that either DynamoDB provides faster writes than Firestore or Beam’s DynamoDB writer is more resource efficient than the Firestore writer.

In our implementation of the stateful application, we use only native Apache Beam functionality. As shown in Fig. 6(b), costs in EKS increase at a similar rate as in GKE. Depending on the load intensity at which VMs have to be added to the cluster, either GKE or EKS is cheaper. Averaged over all evaluated load profiles, EKS has 8.8% higher costs than GKE (see Fig. 7(b)). This is in accordance with the slightly higher costs per VM instance on AWS.

Takeaway for Cloud Platform Choice in Stream Processing

Similar to our findings from evaluating different FaaS platforms, the choice of cloud infrastructure for running a DSP engine has a small but noteworthy impact on the total costs. The discrepancy results mainly from different costs for cloud resources, which even outweigh significant performance gaps.

4.5 Different Programming Languages in FaaS

In our baseline FaaS evaluation, we use the Java 11 runtime in order to account for effects of programming language or runtime performance when comparing to Apache Flink. Most modern FaaS platforms support a wider variety of runtimes, and the choice of language may have an indirect impact on execution cost when an implementation requires more resources or function executions take more time.

Implementation

To quantify the effect of runtime choice, we implement our benchmark in Node.js and Go. Node.js is one of the most popular choices for cloud functions, while Go is the only programming language supported by Google Cloud Functions that is compiled directly to machine code and may thus have the smallest performance overhead [33].

Results

As shown in Fig. 5, the choice of programming language has only a small effect on the cost of function execution, with overall costs changing by -1.9% and -7.5% (Go) and 0.4% and -1.9% (Node.js) for UC1 and UC2, respectively. Although the duration of a function execution changes by -22.7% and -50.8% for UC1 and UC2 with Go, the effect on costs is insignificant compared to costs for database access. Surprisingly, the Node.js implementation is as efficient as our Java implementation. This might be caused by a more mature and optimized execution environment in Google Cloud Functions, as Node.js is one of the most popular languages for FaaS functions.

Takeaway for Language Choice in FaaS

As the majority of costs for the execution of a function are incurred by database access and not function duration, the choice of programming language has no considerable effect on the cost of our application. For stateless applications without database access, and especially for more complex functions where the largest share of costs is incurred by execution duration rather than function invocation, comparing implementation runtimes may nevertheless be beneficial.

4.6 Different Streaming Engines

We use Apache Flink for our baseline evaluation, which is a DSP engine originating in academia and extensively studied in research. In this experiment, we compare this to Apache Samza, an open source DSP engine developed in industry at LinkedIn [18]. Samza is built around similar concepts as Flink and can also be used to run Apache Beam pipelines.

Implementation

Thanks to Apache Beam, we can use exactly the same implementation for Samza as we use for Flink. In contrast to Flink, Samza does not need a dedicated coordinator, but instead uses our existing Kafka/ZooKeeper deployments for coordination among instances.

Results

In the case of the stateless application, we found that Samza has a significantly higher resource demand than Flink, causing higher costs as shown in Fig. 6(a). As processing 300 req/s already requires 14 Samza instances, we extrapolated the costs for higher loads. We assume that this large discrepancy arises because we did not enable bundling, a Beam feature used in Beam’s FirestoreIO to write multiple records as a batch. Bundling is disabled by default and its usage is not documented for Samza.

With the stateful application, Samza performs similarly to Flink. However, as Samza scales in smaller steps, the rather small load profiles studied here result in slightly lower costs for Samza, as shown in Fig. 6(b).

Takeaway for Engine Choice

In general, different stream processing engines can be operated at similar costs. However, different feature sets and inappropriate configuration options might cause cost pitfalls, particularly when interacting with other cloud services.

4.7 Serverless vs. Serverful Stream Processing

In our baseline evaluation, we compare serverless FaaS implementations with streaming implementations running in Kubernetes. Major cloud vendors also provide managed streaming offerings, which run DSP pipelines on top of hosted stream processing engines. While requiring the same development skills as other DSP engines, serverless stream processing services can be considered a middle ground between self-operated DSP engines and FaaS in terms of operational complexity.

Implementation

To compare the costs of self-operating a DSP engine with a fully-managed one, we run our Apache Beam implementations on Google Cloud Dataflow with varying numbers of e2-standard-4 instances. Similar to the other engines, Dataflow should be used with a durable data source instead of ingesting data directly via HTTP. As we consider using a serverless DSP service along with a self-operated Kafka cluster to be less realistic for real-world systems, we focus on processing data from Google Cloud Pub/Sub and use the Flink experiments with Pub/Sub as our baseline.

Results

As shown in Fig. 6, Google Cloud Dataflow has significantly lower costs than our Apache Flink on Kubernetes deployment. Averaged over all evaluated load profiles (see Fig. 7), Dataflow incurs 85.6% of the costs of operating Flink for UC1 and only 41.2% for UC2. This is primarily due to the massively reduced costs for virtual machines: with Dataflow, fewer instances are required to process the same load, e.g., the stateful application can be run with a single VM at all tested load rates. We observed that costs for Dataflow could be reduced further when using smaller instances such as n1-standard-1. Additionally, there are no general management fees for Dataflow, while Google charges customers $0.10 per hour for managing a Kubernetes cluster; the impact of this fee on total costs decreases with increasing load. Since the largest cost driver in the stateless application is database writes, costs are reduced less than in the stateful application. An in-depth analysis of the resource efficiency advantages of Dataflow is beyond the scope of this work, but possible reasons are:

  • Dataflow might in general offer a better performance than other stream processing engines.

  • Apache Beam might be optimized for Google Cloud Dataflow and, as shown in previous research [34], Flink provides much better performance when running native Flink pipelines instead of using Beam.

  • Flink’s default configuration might not be optimal and additional tuning is required to reach comparable performance.

  • Resource utilization when running Flink in small Kubernetes clusters might not be optimal.

Takeaway for Platform Choice

Processing event streams with Google Cloud Dataflow had significantly lower costs in our experiments compared to our Flink deployment. Thus, serverless stream processing services can be a compelling alternative to running stream processing engines manually in Kubernetes, reducing both operational complexity and costs.

4.8 Serverless vs. Serverful Kubernetes

Recently, cloud providers have started offering serverless Kubernetes services, which charge users for the resource usage of containers instead of for the underlying VM instances. A prominent example of such a service is GKE Autopilot [35].

Implementation

As autoscaling of the Kubernetes cluster takes a considerable amount of time, running dedicated experiments with GKE Autopilot is impractical. However, we can obtain a reasonable cost approximation by using the results of our baseline evaluation, in which we determined the required number of Flink taskmanagers per load profile on a sufficiently dimensioned cluster. Total costs are then the costs for the taskmanagers combined with the constant costs for the other components, such as Kafka, the HTTP bridge, and monitoring.
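A minimal sketch of this approximation is shown below; the Autopilot prices and pod resource requests are illustrative assumptions and have to be replaced with the current list prices and the actual requests of the deployment.

```java
public class AutopilotCostEstimate {
  // Illustrative GKE Autopilot prices per hour; replace with current list prices.
  static final double PRICE_PER_VCPU_HOUR = 0.0445;
  static final double PRICE_PER_GB_HOUR = 0.0049;

  // Hourly cost of a single pod, derived from its resource requests.
  static double podCost(double vcpuRequest, double memoryGbRequest) {
    return vcpuRequest * PRICE_PER_VCPU_HOUR + memoryGbRequest * PRICE_PER_GB_HOUR;
  }

  // Total cost: taskmanagers required for the load profile plus the constant
  // cost of the remaining components (Kafka, HTTP bridge, monitoring).
  static double totalCost(int taskmanagers, double taskmanagerVcpu, double taskmanagerGb,
                          double fixedComponentsCost) {
    return taskmanagers * podCost(taskmanagerVcpu, taskmanagerGb) + fixedComponentsCost;
  }

  public static void main(String[] args) {
    // Example: 4 taskmanagers with 1 vCPU / 4 GB each, plus 0.50 USD/h for fixed components.
    System.out.printf("%.3f USD/h%n", totalCost(4, 1.0, 4.0, 0.50));
  }
}
```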

Results

Independent of the load profile and the use case, GKE Autopilot has higher costs compared to GKE’s default mode (see Fig. 6). The relative cost difference appears to decrease with higher loads. This can be explained by a minimal cost per container that is charged independent of the actual resource usage. Moreover, the cost difference is less pronounced in the stateless application, where costs are heavily influenced by database writes (see Fig. 7).

Takeaway for Kubernetes Service Choice

While serverless Kubernetes offerings reduce the management burden, they also have higher cloud service costs. Nevertheless, costs for running self-operated DSP engines in a serverless Kubernetes cluster are still lower than for FaaS at medium and high loads.

5 Decision Guidelines

In our experiments, we have quantitatively evaluated the choice between functions and stream processing for cloud event processing and have explored the impact of choosing cloud providers, endpoints, programming languages, and platforms. We see that the major influences on cost are the rate at which events arrive and the type of application. As shown in Fig. 8, FaaS is the economic choice for applications that manage little to no state and process events with low to medium arrival rates. DSP is better suited for operations that require state, such as windowed aggregation, and for applications that process more events, i.e., on the order of thousands of events per second.

Figure 8: From a cost perspective, FaaS is the best choice for applications that require less state and process fewer events.

Beyond these considerations, we could not observe any considerable impact of other deployment parameters on costs. The choice of a specific messaging paradigm, such as Pub/Sub or HTTP, should thus be based not on cost but on functional differences. Similarly, the choice of cloud service provider did not influence costs significantly and might be influenced more by specific services that a provider offers.

6 Limitations & Future Research Directions

In our benchmarks and guidelines, we consider solely the cost incurred by cloud resources for different deployments of our applications. In particular, we did not attempt to quantify the “human resource” costs of implementing, operating, and maintaining a specific target design. Beyond both cost types, there are other aspects that may influence the design of a cloud event processing application. We discuss these perspectives here and derive avenues along which our work could be extended in the future.

6.1 Non-Constant Workloads & Elasticity

In our benchmark experiments, we consider a constant event arrival rate, as our goal is to measure deployment costs at a specific load. In some domains, workloads may instead fluctuate, requiring elasticity from the processing application. This elasticity is handled differently in DSP and FaaS: As functions are stateless and can be scaled horizontally quickly, load peaks can be processed in real time. This will briefly increase costs for a FaaS deployment. In DSP, such peaks may be handled by queuing events and processing them once the load has decreased. This does not require any additional infrastructure and hence does not incur additional costs as long as sufficient queue capacity exists. Alternatively, the infrastructure can easily be expanded by adding more compute nodes to the cluster. Compared to FaaS platforms, such horizontal scaling is rather slow and will still require queuing. Depending on the billing scheme of the runtime platform as well as the scale-in strategy, short load spikes can also mean that the DSP cluster is overprovisioned (and thus unnecessarily expensive) for some time after the load spike, whereas FaaS providers pay the costs for keeping functions warm after a load spike.

6.2 Stateful Functions

Our experiments show that building a stateful processor with serverless functions leads to high costs incurred by the database access used to persist state. Recently, there have been some proposals to add mechanisms for stateful stream processing to function platforms, e.g., [25, 26, 36]. These approaches typically include a dedicated datastore directly in the FaaS platform, which could reduce access costs. However, public cloud vendors do not offer such services at this time, leaving engineers only the option of dedicated cloud datastores. As an alternative, engineers might use an open-source FaaS system, retrofit “sharding by key” functionality into its load balancer, and use local ephemeral storage for state. This would require significant engineering and infrastructure management effort, breaking the concept of “serverless” platforms.

6.3 Lock-In Effects

In addition to deployment costs, there are “hidden” costs to building cloud applications with managed services such as Kubernetes engines or FaaS platforms: Lock-in effects increase the effort required to move between cloud vendors. Such effects could also influence the decision between DSP and FaaS as paradigms for a cloud event processing application: We were able to move our Apache Flink benchmark implementation from Google Kubernetes Engine to AWS EKS with little effort (Section 4.4) as both platforms understand similar Kubernetes application descriptions. Porting our implementation from Google Cloud Functions to AWS Lambda (Section 4.3), however, required changing the highly platform-specific function implementation almost completely.

6.4 SLAs and SLOs

A further factor that is beyond the scope of our benchmark is the influence of different service level agreements (SLAs) and service level objectives (SLOs) on the true cost of an application deployment. For self-managed streaming applications in Kubernetes, only very basic SLAs, such as the availability of compute instances, are guaranteed by the cloud provider. Application-level SLOs such as maximum latency must be monitored and managed by the operator. As FaaS platforms are fully managed by the provider, they may provide further guarantees on availability.

6.5 Tuning for Cost-efficiency

Finding a cost-optimal configuration (e.g., machine type, cluster size, or stream processing engine settings) for a self-operated DSP deployment is a complex task, especially in comparison to FaaS. This is even more important when comparing managed stream processing services against self-operated ones and may also explain why we found Cloud Dataflow to be significantly less expensive than running Apache Flink. We cannot exclude that Apache Flink can be tuned for better performance to achieve similar or better cost efficiency than FaaS for low event rates or than Google Cloud Dataflow in general. However, such performance tuning comes at the cost of expert knowledge or extensive benchmarking.

7 Related Work

Although including a cost model in cloud benchmarking studies is considered good scientific practice [37], in existing benchmarking studies on FaaS [38, 39, 40, 24, 41] and DSP [42, 43, 4], cost evaluations can mainly be found for cloud functions, where the pay-per-execution pricing model has presented a significant paradigm shift.

LIBRA [44] is an approach to offload FaaS function invocations to self-managed function infrastructure to leverage economies of scale and decrease costs for FaaS applications. Conversely, SplitServe [45] offloads latency-sensitive Apache Spark jobs to a FaaS platform to manage unexpected spikes in demand. Chadha et al. [46] present a comprehensive evaluation of the impacts of runtime, region, and processor architecture choice on the performance and cost of compute-intensive functions on Google Cloud Functions. Similarly, Eivy [47] gives an overview and discussion of cloud FaaS pricing and Cordingly et al. [48] introduce SAAF, a cost and performance predictor for serverless functions. In the context of DSP, Truong et al. [49] present a resource provisioning strategy that optimizes costs for cloud data processing and Bedini et al. [50] show an approach to model the performance of the Apache Storm stream processing engine. To the best of our knowledge, existing work has not compared FaaS and DSP to implement the same application.

Copik et al. [39] evaluate how Infrastructure-as-a-Service (IaaS) costs relate to FaaS costs, finding that IaaS provides better performance at lower costs if high utilization can be reached. Similarly, Müller et al. [51] compare the costs of Query-as-a-Service systems with FaaS costs and show that cold data can be queried significantly more cheaply with FaaS.

Previous research comparing different stream processing systems focuses on self-operated, open source systems such as Apache Storm, Apache Flink, and Apache Spark and does not include cloud services for DSP [52, 53, 54, 4]. Akidau et al. [8] present a performance comparison of Apache Flink and Google Cloud Dataflow on GCP. These evaluations, however, do not focus on cloud infrastructure costs.

In previous work [5], we have considered the choice between functions, stream processing, and batch processing for IoT data and event processing in the fog from a qualitative perspective and derived a set of best practices. With a focus on cloud event processing in this paper, we have extended this with a quantitative evaluation focusing on the cost dimension.

8 Conclusion

In this paper, we have presented a cost perspective on cloud event processing. We have presented a novel application-centric cost benchmark with workflows from an IIoT context that include both a stateless and a stateful job graph. Further, we have used this benchmark to compare distributed stream processing and Functions-as-a-Service, today’s most popular cloud event processing paradigms, and have explored the parameter space to evaluate which factors influence the cost of operating event processing applications in the cloud. Based on these learnings, we have derived guidelines for designing such applications.

Acknowledgements

Partially funded by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) – 415899119. This material is based upon works supported by the Google Cloud Research Credits program with the awards GCP209186206 and GCP203304083.

References

  • [1] L. Thamsen, J. Beilharz, A. Polze, and O. Kao, “The methods of cloud computing,” Technische Universität Berlin, Tech. Rep., Feb. 2022.
  • [2] T. Pfandzelter, J. Hasenburg, and D. Bermbach, “From zero to fog: Efficient engineering of fog-based internet of things applications,” Software: Practice and Experience, vol. 51, no. 8, pp. 1798–1821, 2021.
  • [3] D. Bermbach, A. Chandra, C. Krintz, A. Gokhale, A. Slominski, L. Thamsen, E. Cavalcante, T. Guo, I. Brandic, and R. Wolski, “On the future of cloud engineering,” in Proceedings of the 9th IEEE International Conference on Cloud Engineering (IC2E 2021), Oct. 2021, pp. 264–275.
  • [4] S. Henning and W. Hasselbring, “Theodolite: Scalability benchmarking of distributed stream processing engines in microservice architectures,” Big Data Research, vol. 25, p. 100209, 2021.
  • [5] T. Pfandzelter and D. Bermbach, “IoT data processing in the fog: Functions, streams, or batch processing?” in Proceedings of the 1st Workshop on Efficient Data Movement in Fog Computing (DaMove 2019), Jun. 2019, pp. 201–206.
  • [6] A. Margara, G. Cugola, N. Felicioni, and S. Cilloni, “A model and survey of distributed data-intensive systems,” arXiv:2203.10836 [cs.DC], 2022.
  • [7] M. Fragkoulis, P. Carbone, V. Kalavri, and A. Katsifodimos, “A survey on the evolution of stream processing systems,” arXiv:2008.00842 [cs.DC], 2020.
  • [8] T. Akidau, E. Begoli, S. Chernyak, F. Hueske, K. Knight, K. Knowles, D. Mills, and D. Sotolongo, “Watermarks in stream processing systems: Semantics and comparative analysis of apache flink and google cloud dataflow,” Proceedings of the VLDB Endowment, vol. 14, no. 12, pp. 3135–3147, 2021.
  • [9] A. Mahgoub, L. Wang, K. Shankar, Y. Zhang, H. Tian, S. Mitra, Y. Peng, H. Wang, A. Klimovic, H. Yang, and Others, “SONIC: Application-aware data passing for chained serverless applications,” in Proceedings of the 2021 USENIX Annual Technical Conference (USENIX ATC ’21), Jul. 2021, pp. 285–301.
  • [10] M. Copik, A. Calotoiu, K. Taranov, and T. Hoefler, “FaasKeeper: a blueprint for serverless services,” arXiv:2203.14859 [cs.DC], 2022.
  • [11] S. Eismann, J. Scheuner, E. van Eyk, M. Schwinger, J. Grohmann, N. Herbst, C. Abad, and A. Iosup, “Serverless applications: Why, when, and how?” IEEE Software, vol. 38, no. 1, pp. 32–39, 2021.
  • [12] S. Hendrickson, S. Sturdevant, T. Harter, V. Venkataramani, A. C. Arpaci-Dusseau, and R. H. Arpaci-Dusseau, “Serverless computation with openLambda,” in Proceedings of the 8th USENIX Workshop on Hot Topics in Cloud Computing (HotCloud ’16), Jun. 2016.
  • [13] P. Castro, V. Ishakian, V. Muthusamy, and A. Slominski, “The rise of serverless computing,” Communications of the ACM, vol. 62, no. 12, pp. 44–54, 2019.
  • [14] E. van Eyk, A. Iosup, C. L. Abad, J. Grohmann, and S. Eismann, “A SPEC RG cloud group’s vision on the performance challenges of FaaS cloud architectures,” in Companion of the 2018 ACM/SPEC International Conference on Performance Engineering (ICPE ’18), Apr. 2018, pp. 21–24.
  • [15] D. Bermbach, E. Wittern, and S. Tai, Cloud Service Benchmarking: Measuring Quality of Cloud Services from a Client Perspective.   Springer, 2017.
  • [16] J. Dean and S. Ghemawat, “MapReduce: a flexible data processing tool,” Communications of the ACM, vol. 53, no. 1, pp. 72–77, 2010.
  • [17] P. Carbone, A. Katsifodimos, S. Ewen, V. Markl, S. Haridi, and K. Tzoumas, “Apache Flink: Stream and batch processing in a single engine,” Bulletin of the IEEE Computer Society Technical Committee on Data Engineering, vol. 36, no. 4, 2015.
  • [18] S. A. Noghabi, K. Paramasivam, Y. Pan, N. Ramesh, J. Bringhurst, I. Gupta, and R. H. Campbell, “Samza: Stateful scalable stream processing at LinkedIn,” Proceedings of the VLDB Endowment, vol. 10, no. 12, pp. 1634–1645, 2017.
  • [19] G. Wang, L. Chen, A. Dikshit, J. Gustafson, B. Chen, M. J. Sax, J. Roesler, S. Blee-Goldman, B. Cadonna, A. Mehta, V. Madan, and J. Rao, “Consistency and completeness: Rethinking distributed stream processing in Apache Kafka,” in Proceedings of the 2021 International Conference on Management of Data (SIGMOD/PODS ’21), Jun. 2021, pp. 2602–2613.
  • [20] T. Akidau, R. Bradshaw, C. Chambers, S. Chernyak, R. J. Fernández-Moctezuma, R. Lax, S. McVeety, D. Mills, F. Perry, E. Schmidt, and S. Whittle, “The dataflow model: A practical approach to balancing correctness, latency, and cost in massive-scale, unbounded, out-of-order data processing,” Proceedings of the VLDB Endowment, vol. 8, no. 12, pp. 1792–1803, 2015.
  • [21] Cloud Native Computing Foundation, “CNCF annual survey 2021,” https://www.cncf.io/reports/cncf-annual-survey-2021/, Feb. 2022, accessed: 2022-04-07.
  • [22] J. Scheuner and P. Leitner, “Function-as-a-Service performance evaluation: A multivocal literature review,” Journal of Systems and Software, vol. 170, p. 110708, 2020.
  • [23] Z. Jia and E. Witchel, “Nightcore: efficient and scalable serverless computing for latency-sensitive, interactive microservices,” in Proceedings of the 26th ACM International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS 2021), Apr. 2021, pp. 152–166.
  • [24] M. Grambow, T. Pfandzelter, L. Burchard, C. Schubert, M. Zhao, and D. Bermbach, “BeFaaS: An application-centric benchmarking framework for faas platforms,” in Proceedings of the 9th IEEE International Conference on Cloud Engineering (IC2E 2021), Oct. 2021, pp. 1–8.
  • [25] A. Akhter, M. Fragkoulis, and A. Katsifodimos, “Stateful functions as a service in action,” Proceedings of the VLDB Endowment, vol. 12, no. 12, pp. 1890–1893, 2019.
  • [26] V. Sreekanti, C. Wu, X. C. Lin, J. Schleier-Smith, J. E. Gonzalez, J. M. Hellerstein, and A. Tumanov, “Cloudburst: Stateful functions-as-a-service,” Proceedings of the VLDB Endowment, vol. 13, no. 12, pp. 2438–2452, 2020.
  • [27] T. Pfandzelter and D. Bermbach, “tinyFaaS: A lightweight faas platform for edge environments,” in Proceedings of the Second IEEE International Conference on Fog Computing (ICFC 2020), Apr. 2020, pp. 17–24.
  • [28] A. Agache, M. Brooker, A. Iordache, A. Liguori, R. Neugebauer, P. Piwonka, and D.-M. Popa, “Firecracker: Lightweight virtualization for serverless applications,” in Proceedings of the 17th USENIX Symposium on Networked Systems Design and Implementation (NSDI ’20), Feb. 2020, pp. 419–434.
  • [29] S. Henning, W. Hasselbring, H. Burmester, A. Möbius, and M. Wojcieszak, “Goals and measures for analyzing power consumption data in manufacturing enterprises,” Journal of Data, Information and Management, vol. 3, no. 1, pp. 65–82, 2021.
  • [30] S. Henning and W. Hasselbring, “How to measure scalability of distributed stream processing engines?” in Proceedings of the Companion of the ACM/SPEC International Conference on Performance Engineering (ICPE ’21), Apr. 2021, pp. 85–88.
  • [31] G. Brataas, N. Herbst, S. Ivanšek, and J. Polutnik, “Scalability analysis of cloud software services,” in Proceedings of the 2017 IEEE International Conference on Autonomic Computing (ICAC), Jul. 2017, pp. 285–292.
  • [32] J. Hasenburg, F. Stanek, F. Tschorsch, and D. Bermbach, “Managing latency and excess data dissemination in fog-based publish/subscribe systems,” in Proceedings of the Second IEEE International Conference on Fog Computing (ICFC 2020), Apr. 2020, pp. 9–16.
  • [33] R. Cordingly, H. Yu, V. Hoang, D. Perez, D. Foster, Z. Sadeghi, R. Hatchett, and W. J. Lloyd, “Implications of programming language selection for serverless data processing pipelines,” in Proceedings of the 2020 IEEE International Conference on Dependable, Autonomic and Secure Computing, International Conference on Pervasive Intelligence and Computing, International Conference on Cloud and Big Data Computing, International Conference on Cyber Science and Technology Congress (DASC/PiCom/CBDCom/CyberSciTech), Aug. 2020, pp. 704–711.
  • [34] G. Hesse, C. Matthies, K. Glass, J. Huegle, and M. Uflacker, “Quantitative impact evaluation of an abstraction layer for data stream processing systems,” in Proceedings of the 2019 IEEE 39th International Conference on Distributed Computing Systems (ICDCS), Jul. 2019, pp. 1381–1392.
  • [35] D. Bradstock, “Introducing GKE Autopilot: a revolution in managed Kubernetes,” https://cloud.google.com/blog/products/containers-kubernetes/introducing-gke-autopilot, Feb. 2021, accessed: 2022-04-19.
  • [36] M. de Heus, K. Psarakis, M. Fragkoulis, and A. Katsifodimos, “Distributed transactions on serverless stateful functions,” in Proceedings of the 15th ACM International Conference on Distributed and Event-Based Systems (DEBS ’21), Jun. 2021, pp. 31–42.
  • [37] A. V. Papadopoulos, L. Versluis, A. Bauer, N. Herbst, J. v. Kistowski, A. Ali-Eldin, C. L. Abad, J. N. Amaral, P. Tůma, and A. Iosup, “Methodological principles for reproducible performance evaluation in cloud computing,” IEEE Transactions on Software Engineering, vol. 47, no. 8, pp. 1528–1543, 2021.
  • [38] J. Kuhlenkamp, S. Werner, M. C. Borges, D. Ernst, and D. Wenzel, “Benchmarking elasticity of faas platforms as a foundation for objective-driven design of serverless applications,” in Proceedings of the 35th Annual ACM Symposium on Applied Computing (SAC ’20), Mar. 2020, pp. 1576–1585.
  • [39] M. Copik, G. Kwasniewski, M. Besta, M. Podstawski, and T. Hoefler, “SeBS: A serverless benchmark suite for function-as-a-service computing,” in Proceedings of the 22nd International Middleware Conference (Middleware ’21), Dec. 2021, pp. 64–78.
  • [40] K. L. Ngo, J. Mukherjee, Z. M. Jiang, and M. Litoiu, “Evaluating the scalability and elasticity of function as a service platform,” in Proceedings of the 2022 ACM/SPEC on International Conference on Performance Engineering (ICPE ’22), Apr. 2022, pp. 117–124.
  • [41] E. van Eyk, J. Scheuner, S. Eismann, C. L. Abad, and A. Iosup, “Beyond microbenchmarks: The SPEC-RG vision for a comprehensive serverless benchmark,” in Proceedings of the Companion of the ACM/SPEC International Conference on Performance Engineering (ICPE ’20), Apr. 2020, pp. 26–31.
  • [42] M. V. Bordin, D. Griebler, G. Mencagli, C. F. R. Geyer, and L. G. L. Fernandes, “DSPBench: A suite of benchmark applications for distributed data stream processing systems,” IEEE Access, vol. 8, pp. 222 900–222 917, 2020.
  • [43] G. van Dongen and D. Van den Poel, “Influencing factors in the scalability of distributed stream processing jobs,” IEEE Access, vol. 9, pp. 109 413–109 431, 2021.
  • [44] A. Raza, Z. Zhang, N. Akhtar, V. Isahagian, and I. Matta, “LIBRA: An economical hybrid approach for cloud applications with strict SLAs,” in Proceedings of the 2021 IEEE International Conference on Cloud Engineering (IC2E 2021), Oct. 2021, pp. 136–146.
  • [45] A. Jain, A. F. Baarzi, G. Kesidis, B. Urgaonkar, N. Alfares, and M. Kandemir, “SplitServe: Efficiently splitting apache spark jobs across FaaS and IaaS,” in Proceedings of the 21st International Middleware Conference (Middleware ’20), Dec. 2020, pp. 236–250.
  • [46] M. Chadha, A. Jindal, and M. Gerndt, “Architecture-specific performance optimization of compute-intensive faas functions,” in Proceedings of the 2021 IEEE 14th International Conference on Cloud Computing (CLOUD 2021), Sep. 2021, pp. 478–483.
  • [47] A. Eivy, “Be wary of the economics of “serverless” cloud computing,” IEEE Cloud Computing, vol. 4, no. 2, pp. 6–12, 2017.
  • [48] R. Cordingly, W. Shu, and W. J. Lloyd, “Predicting performance and cost of serverless computing functions with SAAF,” in Proceedings of the 2020 IEEE International Conference on Dependable, Autonomic and Secure Computing, International Conference on Pervasive Intelligence and Computing, International Conference on Cloud and Big Data Computing, International Conference on Cyber Science and Technology Congress (DASC/PiCom/CBDCom/CyberSciTech), Aug. 2020, pp. 640–649.
  • [49] T. M. Truong, A. Harwood, R. O. Sinnott, and S. Chen, “Cost-efficient stream processing on the cloud,” in Proceedings of the 2019 IEEE 12th International Conference on Cloud Computing (CLOUD 2019), Jul. 2019, pp. 209–213.
  • [50] I. Bedini, S. Sakr, B. Theeten, A. Sala, and P. Cogan, “Modeling performance of a parallel streaming engine: Bridging theory and costs,” in Proceedings of the 4th ACM/SPEC International Conference on Performance Engineering (ICPE ’13), Apr. 2013, pp. 173–184.
  • [51] I. Müller, R. Marroquín, and G. Alonso, “Lambada: Interactive data analytics on cold data using serverless cloud infrastructure,” in Proceedings of the 2020 ACM SIGMOD International Conference on Management of Data (SIGMOD ’20), Jun. 2020, pp. 115–130.
  • [52] J. Karimov, T. Rabl, A. Katsifodimos, R. Samarev, H. Heiskanen, and V. Markl, “Benchmarking distributed stream data processing systems,” in Proceedings of the 2018 IEEE 34th International Conference on Data Engineering (ICDE), Apr. 2018, pp. 1507–1518.
  • [53] G. van Dongen and D. Van den Poel, “Evaluation of stream processing frameworks,” IEEE Transactions on Parallel and Distributed Systems, vol. 31, no. 8, pp. 1845–1858, 2020.
  • [54] G. Hesse, C. Matthies, M. Perscheid, M. Uflacker, and H. Plattner, “ESPBench: The enterprise stream processing benchmark,” in Proceedings of the ACM/SPEC International Conference on Performance Engineering (ICPE ’21), Apr. 2021, pp. 201–212.