Formalizing Event-Driven Behavior of Serverless Applications

12/08/2019 ∙ by Matthew Obetz, et al. ∙ Rensselaer Polytechnic Institute 0

We present new operational semantics for serverless computing that model the event-driven relationships between serverless functions, as well as their interaction with platform services such as databases and object stores. These semantics precisely encapsulate how control transfers between functions, both directly and through reads and writes to platform services. We use these semantics to define the notion of the service call graph for serverless applications that captures program flows through functions and services. Finally, we construct service call graphs for twelve serverless JavaScript applications, using a prototype of our call graph construction algorithm, and we evaluate their accuracy.

1 Introduction

Serverless computing is a programming model where code executes on-demand in a shared network of pre-configured computing resources [26]. By pooling resources and managing the execution platform, serverless platform providers are able to offer highly elastic scaling by balancing workloads across multiple physical servers [32]. This distribution across physical servers is made possible through the use of containers that bundle serverless code with a virtual execution environment. Serverless providers can dynamically scale the number of instances of a serverless application by creating multiple containers. To facilitate this approach, serverless application code is organized into stateless serverless functions that execute in response to events. As a consequence, serverless computing heavily favors microservice architectures; serverless functions pass messages and subscribe to notifications from other platform services to complete tasks cooperatively [34]. The advantages of this platform have led to the rapid adoption of commercial serverless platforms by developers in industry [12].

However, the serverless model also presents new challenges. In particular, previous research has cited a lack of tooling for development and debugging of serverless applications [36]. Without access to these tools, developers may struggle to trace executions, measure performance, and verify the security of programs they write. While significant progress has been made toward answering these questions for traditional programs using program analysis, this analysis has not yet been significantly extended to work in the serverless domain. Recent surveys of the state of serverless computing have suggested that static analysis can help address these challenges [22].

Existing abstractions for serverless computing emphasize unique features of the environment where serverless functions are executed [20, 15]. However, these abstractions do not consider effects of transmitting data to other services and functions. Data transmitted in this fashion is commonly replicated to new executions of serverless functions that spawn in response to a change in state on their associated service. Without operational semantics that capture this behavior, program analysis cannot construct a precise call graph and cannot soundly reason about dataflow between parts of a serverless application. The lack of semantics to describe these event triggers also serves as a barrier to more advanced reasoning about data privacy, application correctness, and resource usage.

To address this gap, we propose new operational semantics for event-driven serverless computation. These semantics describe how writes and reads to platform services create inter-function control transfer in serverless applications. Our semantics formalize the most common platform services including object stores, databases, notifications, queues and stateless services. We then define a new approach to call graph construction for serverless applications that uses these semantics to augment call graphs with information about relationships between serverless functions and platform services. We introduce the notion of the service call graph, which extends the classical call graph to include new nodes. These new nodes represent the platform services written to or read by application code to produce control flow that spans multiple disconnected parts of a program. By tracing control flow through reads and writes to services, individual serverless functions become a single unified application with additional context describing what data may flow to later functions in a call chain.

We make the following contributions:

  • We formulate new operational semantics for the execution of serverless programs. These semantics precisely model interactions with platform services, including event triggers that cause additional functions to execute.

  • We extend the traditional notion of a call graph with new types of nodes and edges that represent event-driven behavior on serverless platforms. These new nodes and edges capture the inter-function control and state transfer represented in our operational semantics.

  • We design and implement an algorithm for constructing call graphs of serverless programs. We evaluate the accuracy of our approach by presenting metrics on the call graphs produced by a prototype implementation of our algorithm against serverless programs collected from GitHub. We focus on applications written in JavaScript for the AWS Lambda platform [9]; we choose JavaScript as it is the most common language for AWS Lambda programs.

1.0.1 Related Work.

The semantics we define for the lifecycle of a single serverless function are closely related to those used in a recent formalization of serverless computing [20]. That work focused on modeling low-level behavior of serverless systems. Such models are useful for capturing behavior such as program non-determinism that can arise from reading state from previous executions of serverless functions. Our semantics start from this model to describe initiating requests, language-agnostic computation steps, and generated responses. However, the semantics defined in [20] do not capture inter-function communication and program flows that span multiple serverless functions. Specifically, these semantics limit data persistence to a locking transactional key-value store. Our semantics introduce several new state domains that model the behavior of these services. More importantly, the previous semantics also lack a conceptualization of serverless events, which initiate execution of a serverless function when state is manipulated on a data storage service. We model these interactions by extending the semantics with a new collection of event semantics that capture state transfer between serverless components.

Dynamic analysis has also been previously explored as a tool for reasoning about the behavior of serverless applications. Specifically, this line of research has developed systems to visualize program structure [24, 25], track the flow of sensitive information [1], and measure resource costs [35]. These systems instrument new tools that modify or extend serverless platforms with runtime logging and label checking that enable them to make partial judgments about security given their partial view of current system state. By contrast, we are interested in formalizing serverless behavior so that such analyses may be performed statically without requiring application deployment or execution.

The service call graph shares some features of message flow graphs for distributed event-based systems that communicate through publish-subscribe middleware [16]. The publish-subscribe model is related to serverless notification systems; however, retrieval of data from databases and object stores cannot be succinctly captured in publish-subscribe semantics. Our work considers not only notification-based communication, but also messages that pass through other channels available to serverless applications.

In preliminary work [30], we introduced the notion of a service call graph for serverless applications. In this paper, we formalize the call graph definition in terms of our new operational semantics. Further, we design and implement a call graph construction algorithm and present experimental results on twelve real-world serverless applications.

1.0.2 Outline.

The remainder of this paper is organized into the following sections. Section 2 defines the serverless computing model, then Section 3 maps this model onto a set of operational semantics that formalize serverless computation. Section 4 presents our serverless call graph construction. We evaluate the accuracy of call graph construction in Section 5, and conclude in Section 6.

2 The Serverless Model

Serverless computing is a new programming model that allows developers to execute modules of code in a distributed setting without specifying the physical servers where this code will run. The most common implementation of serverless computing is the Function-as-a-Service (FaaS) model. In this model, an application is decomposed into a collection of serverless functions.

Most platforms provision resources for serverless functions by deploying virtual containers to execute function code [14]. These containers provide a minimal operating system, including a language interpreter specified by the function runtime. Since these containers can be quickly created and destroyed on-demand, serverless platforms are able to reuse the same physical hardware to service multiple functions, even from different users. Additionally, many serverless platforms are optimized to allow the container of a frequently-called function to remain provisioned. A warm start occurs when a function is invoked from a reused container. Warm starts eliminate the latency associated with waiting for new containers to initialize during a cold start, but create a risk of stale memory allocations from previous invocations of the function affecting program behavior.
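The stale-state risk of warm starts can be illustrated with a minimal sketch. It assumes a runtime in which module-level variables survive for as long as a container is reused; the handler, the cache, and the event shape are hypothetical and are not taken from any of the analyzed applications.

```javascript
// Module-level state lives as long as the container does.
let invocationCount = 0;   // survives across warm starts
const cache = new Map();   // may hold stale entries from earlier requests

// Hypothetical handler: caches the first value seen for each key.
function handler(event) {
  invocationCount += 1;
  const cold = invocationCount === 1;
  if (!cache.has(event.key)) {
    cache.set(event.key, event.value); // first write wins for this container
  }
  return { cold, value: cache.get(event.key) };
}

// Simulate two invocations served by the same (reused) container.
const first = handler({ key: "user", value: "alice" });
const second = handler({ key: "user", value: "bob" });
console.log(first);  // { cold: true, value: 'alice' }
console.log(second); // { cold: false, value: 'alice' }  (stale value)
```

The second invocation observes a value written by the first, which is the kind of cross-invocation effect that a model of initial state has to account for.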

Serverless functions interact with one another either via direct invocation or via services that expose interfaces for data storage and messaging. Below, we present the set of common services provided by serverless platforms, followed by a description of the methods by which larger serverless applications can be constructed from functions and services.

2.1 Platform Services

Services and functions interact in two main ways: 1) functions write data to services through the use of platform-provided libraries, and 2) functions receive data from services, either as part of an explicit read using those same libraries, or as inputs assigned to the parameters of a function when it is initially invoked. We define five broad categories of services.

Object stores.

Object stores persist unstructured data in buckets. Each item uploaded to a bucket is identified by a unique key. This key can be used to retrieve the item for reading. Object stores are commonly used as a replacement for a filesystem in serverless applications. Examples of object store services include Amazon Simple Storage Service [6], Google Cloud Storage [19], and Azure Blob Storage [28].

Databases.

Databases store semi-structured data in one or more tables. Unlike object stores, databases provide advanced APIs for retrieving data based on queries. This category includes both relational databases such as Amazon Aurora [2] and Azure Cosmos DB [29], as well as column-store NoSQL databases such as DynamoDB [3].

Notifications.

Notification services expose a collection of named topics, which may be organized into a hierarchy for granular filtering. When a serverless function publishes data under a topic, all functions that subscribe to that topic or a parent topic are invoked and receive a copy of the data. Commercially available notification services include Amazon Simple Notification Service [7], Google Cloud Pub/Sub [18], and Firebase Cloud Messaging [17].
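A rough sketch of hierarchical topic delivery follows. It assumes topics are written as slash-separated paths and that subscriptions are held in an in-memory table; both are illustrative choices rather than the behavior of any particular notification service.

```javascript
// Hypothetical subscription table: topic path -> subscribed function names.
const subscriptions = new Map([
  ["orders", ["auditOrders"]],
  ["orders/created", ["sendReceipt"]],
  ["orders/cancelled", ["refund"]],
]);

// Return the topic itself followed by each of its ancestors.
function topicAndParents(topic) {
  const parts = topic.split("/");
  const result = [];
  for (let i = parts.length; i > 0; i--) {
    result.push(parts.slice(0, i).join("/"));
  }
  return result;
}

// Every subscriber of the topic or of a parent topic is invoked
// and receives its own copy of the published data.
function publish(topic, data) {
  const invocations = [];
  for (const t of topicAndParents(topic)) {
    for (const fn of subscriptions.get(t) ?? []) {
      invocations.push({ fn, data: JSON.parse(JSON.stringify(data)) });
    }
  }
  return invocations;
}

const invoked = publish("orders/created", { id: 7 });
console.log(invoked.map((i) => i.fn)); // [ 'sendReceipt', 'auditOrders' ]
```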

Queues.

Queues allow for intermediate storage of data that requires further processing. Serverless functions can be configured to execute when a queue receives new items. Queues either invoke serverless functions immediately when data is added to the queue, or batch several queued items in an array to be processed by a single invocation of a serverless function. Example serverless queues include the Amazon Simple Queue Service [8] and Amazon Kinesis Streams [4].

Stateless Services.

Stateless services perform data processing on-demand for serverless applications and store the results in another service. Often, stateless services implement common but computationally expensive tasks, such as image identification or speech parsing through services such as Amazon Rekognition [5].

2.2 Serverless Function Composition

There are three main ways that functions are composed into larger applications in serverless platforms.

Direct Invocation.

Direct invocation is the simplest method of invoking successor serverless functions. A serverless function may directly invoke another serverless function by passing the identifier of the successor function into a library call that interacts with the serverless platform.

Composition Frameworks.

Most serverless platforms also implement frameworks for directly composing services and functions. These frameworks, such as Amazon StepFunctions [10] and OpenWhisk Composer [11], provide a declarative syntax for composing serverless functions and supported services. In addition to declaring the relationship between functions and services, these function composition frameworks also include higher level abstractions for program flows such as conditional branching based on the value of data returned by an earlier stage of the composition.

Figure 1: Comparison of service call graph generated by our analysis for the galleria serverless application [13], and pipeline diagram provided in the repository’s user documentation. In the call graph at left, we see that GET and POST API gateway events in the top left of the call graph trigger the app-dev-uploader serverless function. This function then writes to the ORIGINALS S3 bucket, which in turn triggers the app-dev-rotate serverless function. This function reads from its triggering bucket then writes to a ROTATED bucket. The process repeats for two more image processing functions before the final image is uploaded to THUMBS.

Event Programming.

Serverless platforms provide a robust interface for specifying events that trigger the execution of a function. Most platform services allow developers to configure event triggers that activate when a service undergoes a state transition, e.g., when an object is created in an object store bucket. Upon being triggered, a serverless function will be provided with a copy of the data that triggered it, or an identifier to retrieve it from the associated service. In addition to events triggered by services, platforms also provide special gateways that handle interaction outside the platform, such as through HTTP requests. We present a real-world example of function composition on the right-hand side of Figure 1. The sequence of functions performs image processing on uploaded files to generate consistently formatted thumbnails.

3 Semantics for Serverless Computation

We introduce operational semantics for the execution of serverless applications. The goals of these serverless semantics are to: 1) precisely model the semantics of communication between serverless functions and platform services, and 2) capture program flows that are introduced as a result of this communication.

Figure 2: In-process semantics model the sequence of steps in an individual serverless function. The figure defines the domains of defined functions, internal state, initial state, values, request IDs, instance IDs, executing serverless functions, received requests, generated responses, and computational steps, and gives the rules RECEIVE (with the side condition that x is fresh), START, COMPUTE, RESPOND, and DIE. A full serverless application is modeled as a collection of received requests, executing functions, and generated responses. Functions and requests are added to this collection as they become active, and are removed from it as they terminate or are responded to.

3.1 In-Process Semantics

In-process semantics for single serverless functions are defined in Figure 2. These semantics capture the sequence of steps in an individual serverless function. When an external gateway service initiates a request for the execution of the serverless program, the platform applies the RECEIVE rule, which adds a new request. The request contains a serverless function f and a data value v that will be passed to the function. Most commonly, RECEIVE represents a request made to a public web endpoint integrated with the serverless application. When an unhandled request exists, the platform applies the START rule, which initializes f with an initial state init(f,v) and starts its execution. We note that init(f,v) captures the initial state at both cold and warm starts. COMPUTE models the execution steps in a serverless function f. As in [20], COMPUTE is a language-agnostic representation of transitions on the function's internal state. COMPUTE absorbs interactions with platform services; Section 3.2 details the rules for these interactions. A serverless function may issue a response, in which case the platform applies the RESPOND rule. This rule removes the unhandled request from the system and replaces it with a response that carries a value provided by the responding serverless function. Responses represent data sent back to the external service that initiated the request; they are terminal states and are not used for further computation within the platform. Finally, functions may terminate through the application of the DIE rule. The system reaches a stable state when all requests have been responded to and no serverless functions are still executing.
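The lifecycle described above can be sketched as a small state machine. This is an illustrative model, not the formal rules of Figure 2: the field names, the use of the request ID to look up the executing instance, and the example function are all assumptions made for brevity.

```javascript
// System state: pending requests, executing instances, emitted responses.
function makeSystem() {
  return { requests: [], executing: [], responses: [], nextId: 0 };
}

// RECEIVE: an external gateway adds a request for function f with value v.
function receive(sys, f, v) {
  const id = sys.nextId++; // fresh request ID
  sys.requests.push({ id, f, v });
  return id;
}

// START: begin an execution for an unhandled request with state init(f, v).
function start(sys, id, init) {
  const req = sys.requests.find((r) => r.id === id);
  sys.executing.push({ id, f: req.f, state: init(req.f, req.v) });
}

// COMPUTE: one language-agnostic step on the instance's internal state.
function compute(sys, id, step) {
  const inst = sys.executing.find((e) => e.id === id);
  inst.state = step(inst.state);
}

// RESPOND: replace the pending request with a response.
// The instance is not removed, so it may keep executing.
function respond(sys, id, value) {
  sys.requests = sys.requests.filter((r) => r.id !== id);
  sys.responses.push({ id, value });
}

// DIE: the instance terminates.
function die(sys, id) {
  sys.executing = sys.executing.filter((e) => e.id !== id);
}

const sys = makeSystem();
const rid = receive(sys, "greet", "world");
start(sys, rid, (f, v) => ({ input: v }));
compute(sys, rid, (s) => ({ ...s, output: "hello " + s.input }));
respond(sys, rid, sys.executing[0].state.output);
die(sys, rid);
console.log(sys.responses); // [ { id: 0, value: 'hello world' } ]
```

After the final step the system is stable in the sense used above: no request is pending and no function is executing.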

3.2 Event Semantics

We extend the in-process semantics with event semantics that capture the interaction of functions with platform services and direct invocation. We develop semantics for each service: object stores, databases, notifications, queues, and stateless services. These semantics detail how serverless functions interface with that specific service during execution. In this section we detail the semantics of object stores. We include the semantics for the remaining services in Appendix 0.A; these semantics follow the general structure of the object store semantics, but each details behavior specific to the service it models.

The semantic rules can be broadly grouped into rules that write the state of a service (UPLOAD and REMOVE for object stores; INSERT, UPDATE, and DELETE for databases; and ENQUEUE for queues), and rules that read data from a service into the state of an executing serverless function (READ for object stores, SELECT for databases, and DEQUEUE for queues).

functions:
  processor:
    handler: index.process
    events:
      - s3:
          bucket: photos
          event: s3:ObjectCreated:*
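Figure 3: An example event configuration. The serverless function processor is triggered when an object is added to the photos bucket. In the semantics, this configuration is represented as an event that pairs a triggering condition (object creation on the photos bucket) with the function processor.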

Our semantics introduce a domain of events that captures function invocations due to service state transitions. An event consists of two parts: a triggering condition and an associated serverless function. Triggering conditions are generally defined by a unique service identifier and an operation (e.g., an upload to an object store). Program configurations unambiguously reference their associated services and the associated serverless functions. We reduce configurations to a set of events during static analysis. We present an example configuration in Figure 3. An event is triggered when a serverless function performs a step that fires the event condition. For instance, an upload to an object store will activate all events tied to uploads to that store. To capture the effect of these triggering events, our semantics introduce a function that accepts a triggering condition and returns the set of functions for which there is an event with that condition, i.e., the set of functions that will execute when a function runs the given operation on the given service. We note that some types of triggering conditions defined in our semantics are officially supported by serverless platforms but rarely occur in practice, such as the trigger associated with a REMOVE from an object store.
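As a hedged illustration of reducing a configuration to a set of events, the sketch below works on an already-parsed configuration object shaped like the example in Figure 3. The extra functions, the mapping from platform event names to rule names, and the helper names extractEvents and triggered are assumptions introduced for this example.

```javascript
// Hypothetical parsed form of a configuration like the one in Figure 3.
const config = {
  functions: {
    processor: { handler: "index.process",
      events: [{ s3: { bucket: "photos", event: "s3:ObjectCreated:*" } }] },
    archiver: { handler: "index.archive",
      events: [{ s3: { bucket: "photos", event: "s3:ObjectCreated:*" } }] },
    cleaner: { handler: "index.clean",
      events: [{ s3: { bucket: "photos", event: "s3:ObjectRemoved:*" } }] },
  },
};

// Assumed mapping from a platform event name to a rule of the semantics.
function operationOf(eventName) {
  if (eventName.startsWith("s3:ObjectCreated")) return "UPLOAD";
  if (eventName.startsWith("s3:ObjectRemoved")) return "REMOVE";
  return null;
}

// Reduce the configuration to events: (service, operation) paired with a function.
function extractEvents(cfg) {
  const events = [];
  for (const [fn, def] of Object.entries(cfg.functions)) {
    for (const ev of def.events ?? []) {
      if (!ev.s3) continue; // this sketch only handles object store events
      const operation = operationOf(ev.s3.event);
      if (operation) events.push({ service: ev.s3.bucket, operation, fn });
    }
  }
  return events;
}

// The set of functions that start when `operation` runs on `service`.
function triggered(events, service, operation) {
  return events
    .filter((e) => e.service === service && e.operation === operation)
    .map((e) => e.fn);
}

const eventSet = extractEvents(config);
console.log(triggered(eventSet, "photos", "UPLOAD")); // [ 'processor', 'archiver' ]
console.log(triggered(eventSet, "photos", "REMOVE")); // [ 'cleaner' ]
```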

Our semantics distinguishes between functions triggered by external requests and functions triggered by events on services. The platform applies RECEIVE followed by START on functions triggered by external requests. It immediately applies START on functions triggered by “internal” events on services. Our semantics allows that any function that is part of the serverless application may issue a response to the external request. RECEIVE and RESPOND define the “boundary” of the serverless application, although functions may continue to execute and modify services after a RESPOND.

Figure 4: Object store event semantics. The figure defines the domains of object stores, object store names, defined events, store operations, event conditions, and triggered functions, and gives the rules UPLOAD, REMOVE, and READ.

We define semantics for object stores in Figure 4. Each object store has a unique identifier in the set of object stores defined for the application. Object stores provide a filesystem-like interface for writing and reading data. In the semantics, this interaction is encoded by allowing serverless functions to write or overwrite some value in a named bucket by applying the UPLOAD rule. When a file is uploaded, all events triggered by state transition on the receiving bucket initialize their respective function(s). Serverless functions can also delete data contained in a bucket through application of the REMOVE rule. When a function retrieves a data value from a bucket, the READ rule accesses the associated data and assigns it to a variable inside the function’s local state.
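The three object store rules can be approximated with an in-memory model. This is a sketch under simplifying assumptions: buckets are maps, the event table is hard-coded, and starting a triggered function is represented only by recording it in a list.

```javascript
// Buckets: bucket name -> Map of key -> value.
const stores = { photos: new Map() };

// Assumed event table: which function starts on which (bucket, operation).
const storeEvents = [
  { bucket: "photos", operation: "UPLOAD", fn: "processor" },
];

// Executions started by triggers (stands in for applying START).
const started = [];

function fire(bucket, operation, payload) {
  for (const e of storeEvents) {
    if (e.bucket === bucket && e.operation === operation) {
      started.push({ fn: e.fn, payload });
    }
  }
}

// UPLOAD: write or overwrite a value, then start every triggered function.
function upload(bucket, key, value) {
  stores[bucket].set(key, value);
  fire(bucket, "UPLOAD", { bucket, key });
}

// REMOVE: delete a value, then start every function triggered by removal.
function remove(bucket, key) {
  stores[bucket].delete(key);
  fire(bucket, "REMOVE", { bucket, key });
}

// READ: copy the stored value into a variable of the function's local state.
function read(bucket, key, localState, variable) {
  return { ...localState, [variable]: stores[bucket].get(key) };
}

upload("photos", "cat.png", "raw-bytes");
const local = read("photos", "cat.png", {}, "img");
remove("photos", "cat.png");
console.log(started); // [ { fn: 'processor', payload: { bucket: 'photos', key: 'cat.png' } } ]
console.log(local);   // { img: 'raw-bytes' }
```

Only the upload starts a function here because the assumed event table has no entry for REMOVE on the photos bucket.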

Our event semantics are synchronous in the sense that a request to a service and the execution of the request by the service happen in “one step”. This facilitates static reasoning. In practice, a request is decoupled from the execution; we conjecture that the synchronous semantics are sufficient as programs implicitly synchronize events on services: a read in one function is triggered by a write in another. Further, for reads and writes within the same function, standard libraries typically provide only synchronous methods for interacting with platform services. We will formalize sufficiency conditions on programs in future work.

3.3 Platform Behavior Encoded in Semantics

Our semantics are sufficiently expressive to capture features of serverless platforms that impact system state in unintuitive ways. The non-finality of RESPOND and the effects caused by function retries are two examples of this behavior, which we discuss below.

exports.shortenUrl = function(event, context, callback) {
  let url = event.body;
  let slug = crypto.randomBytes(8).toString(…).replace(…);
  // RESPOND: the callback issues the response, but execution continues.
  callback(null, {shortUrl: context.domainName + slug});
  // INSERT: the database write happens after the response.
  dynamodb.put({
    TableName: "ShortUrls",
    Item: {slug: slug, long_url: url}
  });
}
Figure 5: Example of execution continuing after response. RESPOND is applied when the callback passed in to the serverless function is invoked, but a database is written to after this response. Code adapted from the url-shortener project [31].

3.3.1 Non-finality of RESPOND.

Unlike return statements in normal functions, responses from a serverless function do not return from the function. Consider the example in Figure 5. This serverless function accepts a URL string and generates a random short slug for that URL. It immediately responds with the generated shortened URL, then afterward writes the association between the slug and the original URL to a database. Our semantics model the execution of this serverless function by the following sequence of transitions (INSERT writes to the database service and has semantics similar to UPLOAD in Figure 4):

RECEIVE → START → COMPUTE → RESPOND → INSERT → DIE

The application of the INSERT rule affects the final state of the system by introducing a new value into the database. This insertion occurs even though the serverless function has already generated a response in an earlier step.

3.3.2 Failures and Retried Executions.

A serverless function may fail during execution for two reasons: 1) the function code has entered an error state as the result of an uncaught exception, or 2) the container runtime has killed the function, either because execution has timed out, or because the language interpreter has failed with an error. When a function fails, the platform can retry the function by starting a new execution with a clone of the data from the original request [9].

Our semantics capture the effects of failures and retried executions that may impact system state. In particular, serverless functions that are not idempotent may emit messages to platform services that are repeated in retried executions, affecting final system state. In our semantics, these retries are modeled as an application of the DIE rule, followed by a subsequent application of START to handle a still-unsatisfied request. Consider a serverless function that uses the UPDATE rule to increment a view count. It is retried due to a spontaneous failure in the data center where the function is executing. This series of events is modeled under our semantics as:

RECEIVE → START → UPDATE → DIE → START → UPDATE → RESPOND → DIE

We observe that following these state transitions, the view count in the database has been incremented twice, despite only a single request being made to the serverless function. Such faults are representative of data inconsistencies that exist in real serverless applications that violate the idempotency recommended by serverless providers [9].
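The double increment can be reproduced with a toy retry loop. The sketch assumes a platform that restarts a failed function until the request is answered; real platforms bound the number of retries, and the handler, table, and failure flag here are hypothetical.

```javascript
// A database table modeled as a plain object.
const table = { views: 0 };

// Non-idempotent handler: the UPDATE happens before a possible failure.
function countView(db, shouldFail) {
  db.views += 1; // UPDATE
  if (shouldFail) throw new Error("container killed"); // DIE without RESPOND
  return {}; // RESPOND
}

// Simplified retry loop: restart the function until the request is answered.
function invokeWithRetry(db, failures) {
  let attempts = 0;
  for (;;) {
    attempts += 1;
    try {
      countView(db, attempts <= failures);
      return attempts;
    } catch (err) {
      // START is applied again for the still-unanswered request.
    }
  }
}

const attemptsUsed = invokeWithRetry(table, 1);
console.log(attemptsUsed, table.views); // 2 2
```

A single request leaves the counter at 2 because the first execution's write is not rolled back when that execution dies.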

3.4 Platform Supported Function Composition

Function composition frameworks allow developers to statically declare pathways for messages through a serverless application. When one of these pathways is defined, the return value of a serverless function implicitly becomes a message passed to the serverless function or service following it in the composition. This occurs without the explicit invocation of a library method to cause state transfer on a platform service that is used for other serverless events. Despite this difference, our semantics are expressive enough to capture such behavior using the same set of state transitions as other serverless events.

Consider a StepFunction composition that defines a chain of two serverless functions. (Appendix 0.A provides an example StepFunction declaration.) Our semantics model the execution as follows:

RECEIVE → START → COMPUTE → INVOKE → DIE → COMPUTE → RESPOND → DIE

This execution illustrates an important difference between standalone serverless functions and those defined as part of a composition chain. The platform starts the StepFunction chain by issuing a request by RECEIVE. Only the final serverless function in the chain RESPONDs to the request. The “return” of all other functions in the chain is encoded as an event rule that activates the next function in the chain. Since compositions are static, the target of each stage of the composition is known. To preserve the connection to the originating StepFunction request that started the first function, the second function inherits the request identifier from the first when it is invoked. Thus, the lifecycle of the first function in the composition chain is RECEIVE, START, COMPUTE, DIE; the lifecycle of the final one is COMPUTE, RESPOND, DIE.

4 Service Call Graphs

Our semantics enable construction of a service call graph that explicitly models interaction between services and serverless functions. The service call graph extends the classical call graph by adding nodes that represent platform services and edges that represent reads from services, writes to services, and transfer of control to functions triggered by state transition on services. We simplify our graphs by treating an entire intra-function call graph as a single node as demonstrated in Figure 6 to more clearly capture the interaction between functions and platform services.

Figure 6: Example simplification of a serverless function node. When processorLambda is invoked, the platform executes its handler. The handler may be split into several local helper functions, which we represent as a single node, shown at right.

Construction of the service call graph proceeds in two phases: configuration analysis and code analysis. Configuration analysis processes configuration files and identifies the serverless functions and services for the given application. Each serverless function and each service (object store, database, queue, or notification topic) becomes a node in the service call graph. In addition to identifying the set of functions and services, configuration analysis also identifies the set of events and triggering conditions (recall Section 3 for their definitions); each event gives rise to an edge from the service named in its triggering condition to the function it triggers.

Code analysis processes each serverless function. It constructs the standard interprocedural control flow graph (ICFG) of the function (we note that here “interprocedural” refers to the intra-function call graph of the serverless function). The analysis tracks the set of service identifiers that flow to call sites in the ICFG corresponding to rules of the event semantics (such as UPLOAD, ENQUEUE, INSERT, or NOTIFY). At each such call site, the analysis adds an edge from the current serverless function to each service that may reach the call site corresponding to the event rule. For instance, consider the execution of a serverless function captured in the semantics as:

COMPUTE → COMPUTE → INSERT

In this function, the identifier someTable is assigned to the variable id as a step in the execution. When the function later applies INSERT, all possible values of id flow to the service parameter of INSERT. Since id can only have the value someTable at this point, the analysis adds an edge in the service call graph from the function to the service someTable. Since the configuration contains a triggering event whose condition is an INSERT on someTable, this INSERT also triggers the execution of the function associated with that event.
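The two phases can be combined in a short sketch. It takes the outputs of configuration analysis and code analysis as plain arrays, which is an assumption: the prototype derives them from configuration files and from the ICFG. The function names writer and reactor are hypothetical, and the direction chosen for read edges (service to function) is also an assumption of this example.

```javascript
// Build a service call graph from analysis results.
//   functions, services: node names
//   events:    trigger events from configuration analysis
//   callSites: per call site, the service identifiers that may reach it
function buildServiceCallGraph(functions, services, events, callSites) {
  const nodes = new Set([...functions, ...services]);
  const edges = [];
  // Each event adds an edge from the triggering service to its function.
  for (const e of events) {
    edges.push({ from: e.service, to: e.fn, kind: "trigger" });
  }
  // Each call site adds one edge per service identifier that may reach it.
  for (const site of callSites) {
    for (const service of site.mayReach) {
      edges.push(site.rule === "READ"
        ? { from: service, to: site.fn, kind: "read" }
        : { from: site.fn, to: service, kind: "write" });
    }
  }
  return { nodes, edges };
}

const scg = buildServiceCallGraph(
  ["writer", "reactor"],
  ["someTable"],
  [{ service: "someTable", operation: "INSERT", fn: "reactor" }],
  [{ fn: "writer", rule: "INSERT", mayReach: ["someTable"] }]
);
for (const e of scg.edges) console.log(`${e.from} -> ${e.to} (${e.kind})`);
// someTable -> reactor (trigger)
// writer -> someTable (write)
```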

There are a variety of applications of our semantics and static service call graphs [22]. One application is container prewarming; the platform can use the call graph to prepare containers for functions scheduled for execution and reduce the penalty of cold starts. In our experiments, warm starts reduced running time for galleria in Figure 1 by approximately 25% (Appendix 0.A). Other applications include resource usage prediction, information flow analysis, and static debugging.
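One way a platform might use a service call graph for prewarming is to compute the functions reachable from the function that has just started. The sketch below is an assumption about how such a query could look, using a hypothetical adjacency-list graph loosely modeled on the image pipeline of Figure 1.

```javascript
// Hypothetical service call graph as adjacency lists (node -> successors).
const graph = {
  uploader: ["ORIGINALS"],
  ORIGINALS: ["rotate"],
  rotate: ["ROTATED"],
  ROTATED: ["resize"],
  resize: ["THUMBS"],
  THUMBS: [],
};
const functionNodes = new Set(["uploader", "rotate", "resize"]);

// Breadth-first search: which functions may run downstream of startFn?
function functionsToPrewarm(g, fns, startFn) {
  const seen = new Set([startFn]);
  const queue = [startFn];
  const result = [];
  while (queue.length > 0) {
    const node = queue.shift();
    for (const next of g[node] ?? []) {
      if (seen.has(next)) continue;
      seen.add(next);
      queue.push(next);
      if (fns.has(next)) result.push(next); // only functions need containers
    }
  }
  return result;
}

console.log(functionsToPrewarm(graph, functionNodes, "uploader")); // [ 'rotate', 'resize' ]
```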

5 Call Graph Implementation and Evaluation

We implement service call graph construction as an extension of the Type Analysis for JavaScript framework [21]. Specifically, we employ a branch of TAJS that supports reasoning about asynchronous behavior [33]. Our analysis consists of 1187 lines of Java code that interface with the TAJS intermediate representation of JavaScript to build the service call graph, and 245 lines of JavaScript code that summarize the effects of third party libraries, including the AWS SDK. We constructed summaries of library functions to overcome limitations in TAJS that prevented us from performing standard whole-program analysis. We intend to release our implementation publicly.

We generate call graphs for applications collected from GitHub. We searched GitHub for repositories that included serverless configuration files that defined more than one serverless function, sorted by repository popularity. We analyze the top twelve applications that fit these criteria. To evaluate the accuracy of our generated call graphs, we compare the output of our analysis against call graphs drawn by manual inspection of programs.
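The comparison against manually drawn call graphs can be phrased as a set difference over edges. The following sketch assumes edges are represented as from/to pairs and treats a generated graph as sound when it misses no ground-truth edge; the edge lists are made-up examples, not data from the evaluation.

```javascript
// Canonical string form of an edge, used for set membership.
function edgeKey(e) {
  return `${e.from}->${e.to}`;
}

// Count ground-truth edges absent from the generated graph.
function compareToGroundTruth(generated, groundTruth) {
  const have = new Set(generated.map(edgeKey));
  const missed = groundTruth.filter((e) => !have.has(edgeKey(e)));
  return { sound: missed.length === 0, missedEdges: missed.length };
}

// Hypothetical edge lists for illustration.
const truthEdges = [
  { from: "fnA", to: "bucket" },
  { from: "bucket", to: "fnB" },
  { from: "fnB", to: "fnA" }, // e.g. a call made through an external web API
];
const generatedEdges = truthEdges.slice(0, 2);

console.log(compareToGroundTruth(generatedEdges, truthEdges)); // { sound: false, missedEdges: 1 }
console.log(compareToGroundTruth(truthEdges, truthEdges));     // { sound: true, missedEdges: 0 }
```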

Application          Lines of Code   # Functions   Sound?   Missed Edges
hello-retail         2288            14            Y        0
citizen-dispatch     865             3             N        6
galleria             641             5             Y        0
rating-service       412             2             Y        0
LEX                  323             2             Y        0
lending-app          258             4             Y        0
url-shortener        172             3             Y        0
zen-beer             155             4             Y        0
greeting-app         99              2             Y        0
lane-breach          98              2             N        1
wombat               88              2             Y        0
serverless-chaining  28              2             Y        0
Table 1: Service Call Graph results.

Table 1 presents the analysis results. For 10 of the 12 applications, our analysis produced a service call graph identical to the ground truth. One such comparison is shown in Figure 1. For two applications, our analysis missed edges. In the case of lane-breach, the missed edge corresponded to a web request made directly to another function through the external web API. We note that it is not possible, in general, to determine whether a web address belongs to the application under analysis or a third-party web site. Fortunately, this behavior represents a discouraged pattern [12]; the program could be made more efficient by rewriting the code to use a direct invocation, which would be captured through our INVOKE rule.
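The comparison against ground truth can be illustrated with a small sketch. It assumes both graphs are represented as sets of (source, target) edges; the function and service names below are hypothetical and do not come from the evaluated applications.

```python
def compare_call_graphs(computed, ground_truth):
    """Compare an analysis result with a manually drawn call graph.

    Both arguments are sets of (source, target) edges. The result is
    treated as sound when no ground-truth edge is missing, matching the
    "Sound?" and "Missed Edges" columns of Table 1.
    """
    missed = ground_truth - computed
    extra = computed - ground_truth
    return {"sound": not missed, "missed": missed, "extra": extra}


# Hypothetical two-function application with one edge the analysis cannot see.
ground_truth = {("detect", "alerts"), ("alerts", "notify"), ("detect", "report")}
computed = {("detect", "alerts"), ("alerts", "notify")}
report = compare_call_graphs(computed, ground_truth)
```

Here `report["missed"]` contains the single edge that was drawn by hand but not found by the analysis, so the result is flagged as unsound.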

In the case of citizen-dispatch, the analysis missed edges from serverless functions to a set of database tables that corresponded to database queries made by third-party library calls. This program violated our assumption that third-party libraries do not interact with services. Though constant service identifiers flow to the library calls, it is difficult to statically infer which tables will be accessed by a particular call due to the nature of the query inference engine. Future versions of our tool could safely over-approximate this behavior by assuming that any library call has the potential to query all tables. If we could perform standard whole-program analysis, interactions with the database through the library would have been soundly detected. (Whole-program analysis is trivially supported in tools for languages such as Java, but is not supported by TAJS due to the difficulty of analyzing JavaScript.)
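The over-approximation suggested above could look roughly like this. It is a sketch under the assumption that the analysis knows the full set of declared tables; `library_call_edges` and the table names are illustrative, not part of the tool.

```python
def library_call_edges(function, all_tables, resolved_tables=None):
    """Edges contributed by one third-party library call.

    If the analysis resolved the tables the call touches, use them.
    Otherwise fall back to every declared table, which is sound but
    may add edges that never occur at run time.
    """
    targets = all_tables if resolved_tables is None else resolved_tables
    return {(function, table) for table in targets}


# Hypothetical tables declared in the application configuration.
tables = {"users", "incidents"}
unresolved = library_call_edges("dispatch", tables)
resolved = library_call_edges("dispatch", tables, {"incidents"})
```

The trade-off is the usual one for static analysis: the fallback never misses an edge, at the cost of precision.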

6 Conclusion

We have introduced new operational semantics for serverless computing. We have demonstrated how these semantics can be used to produce a new type of call graph that incorporates services and event-dependent program flows. Finally, we have presented a prototype of our call graph construction algorithm and showed its efficacy on real-world serverless programs. In future work, we will use these semantics to construct analyses and tools for improving performance and security of serverless applications.

References

Appendix 0.A Semantics of Serverless Computation

Figure 7: Database event semantics. Definitions: database service, database table name, database operation (insertion, deletion, update), effect of query, defined events, event condition, database query, triggered functions. Rules: INSERT, UPDATE, DELETE, SELECT.

Databases.

The database semantics in Figure 7 are similar to those of object stores. As with object stores, each table of a database has a uniquely identifying table name drawn from the global domain of database tables defined for a serverless application. Serverless functions may trigger other functions by adding data to a database using the INSERT or UPDATE rules. They may also remove existing data using the DELETE rule and access data with the SELECT rule. However, unlike object stores, databases allow for complex queries that may operate on several values in a single step. To encapsulate the effect of database queries, we define a query-effect function that accepts a database value and produces some resulting value. When a serverless function performs a step that acts on a database, the step receives as input a query function that is used to compute the state transfer on the database and select returned rows. This abstraction allows the effects of database querying to be reasoned about without the need to define semantics for the relational algebra operations supported by serverless databases.
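To make the query-function abstraction concrete, here is a small executable sketch. It is not a transcription of the rules in Figure 7: the `Database` class, the `(new_rows, result)` return convention for queries, and the `onInsert` function name are assumptions made for illustration.

```python
class Database:
    """Toy model of a database service with event-triggered functions."""

    def __init__(self, tables, events):
        self.tables = {name: [] for name in tables}
        # events: list of ((table, operation), function_name) pairs,
        # standing in for the triggering events in the configuration.
        self.events = events

    def apply(self, operation, table, query):
        """Apply one database step.

        query takes the current rows and returns (new_rows, result):
        the state transfer on the table and the rows handed back to
        the caller. Returns the result and the functions to trigger.
        """
        new_rows, result = query(self.tables[table])
        self.tables[table] = new_rows
        triggered = [fn for cond, fn in self.events if cond == (table, operation)]
        return result, triggered


db = Database(["someTable"], [(("someTable", "INSERT"), "onInsert")])
_, triggered = db.apply("INSERT", "someTable", lambda rows: (rows + [{"id": 1}], None))
rows, select_triggered = db.apply(
    "SELECT", "someTable", lambda rows: (rows, [r for r in rows if r["id"] == 1])
)
```

Because the query is an opaque function, the model never needs to know whether it performs a join, a filter, or a bulk update; only the resulting table state and returned rows matter.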

Serverless Queues.

The queue semantics defined in Figure 8 are distinct from those of other platform services in that data cannot be read from a queue into a currently executing serverless function. Instead, each individual queue in the global domain of queues defined for a serverless application acts as a buffer for data that will be processed by new invocations of serverless functions. Serverless functions may append data to a queue by applying the ENQUEUE rule. When a serverless platform detects that a queue meets service-specific conditions, it pops data from that queue using the DEQUEUE rule and passes it as a parameter into a new instance of each serverless function that is triggered by that queue.
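The buffering behavior described above can be sketched as a small class. The triggering condition is service-specific in the semantics; the default used here (the queue is non-empty) and the names `QueueService`, `ready`, `resize`, and `audit` are assumptions for the example.

```python
from collections import deque


class QueueService:
    """Toy model of a serverless queue: functions can only append to it."""

    def __init__(self, subscribers, ready=lambda items: len(items) > 0):
        self.items = deque()
        self.subscribers = subscribers  # functions triggered by this queue
        self.ready = ready              # assumed triggering condition

    def enqueue(self, value):
        """ENQUEUE: a running function appends data to the buffer."""
        self.items.append(value)

    def dequeue(self):
        """DEQUEUE: performed by the platform, never by a running function.

        Returns one (function, value) pair per new invocation, or an
        empty list when the triggering condition does not hold.
        """
        if not self.ready(self.items):
            return []
        value = self.items.popleft()
        return [(fn, value) for fn in self.subscribers]


queue = QueueService(["resize", "audit"])
queue.enqueue("photo.jpg")
first = queue.dequeue()
second = queue.dequeue()
```

Note that there is deliberately no method for a function to read from the queue, which mirrors the restriction stated in the text.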

Figure 8: Queue event semantics. Definitions: queue service, queue name, defined events, event condition, queue triggering condition, triggered functions. Rules: ENQUEUE, DEQUEUE.

Stateless Services.

Our semantics also support stateless services through rules defined in Figure 9. We encode interactions with stateless services through the SERVICE rule of our event semantics. In this rule, an invocation of a stateless service is provided with data as well as an event condition. The event condition serves as the identifier for the service where the stateless service should externally store the result of its computation on that data. This write by the stateless service will cause functions with events triggered by writes to the target service to execute as normal. Additionally, we encode the behavior of stateless notification services through the NOTIFY rule. When a serverless function publishes data to some topic in the global set of defined topics, all functions that subscribe to the topic are triggered. In addition to communication between functions and services, functions can also directly invoke other functions as a step of the function body. We represent this behavior through the INVOKE rule.
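A rough executable reading of the NOTIFY and SERVICE behavior described above is given below. It is a sketch only: the dictionaries standing in for topic subscriptions, the target store, and write triggers, as well as every name in the usage example, are assumptions rather than part of the formal rules.

```python
def notify(subscriptions, topic, data):
    """NOTIFY: every function subscribed to the topic is triggered with the data."""
    return [(fn, data) for fn in subscriptions.get(topic, [])]


def service(effect, data, target, store, write_triggers):
    """SERVICE: a stateless service computes on data and writes the result
    to the target service; functions triggered by writes to that target
    then run as they would for any other write."""
    result = effect(data)
    store.setdefault(target, []).append(result)
    return [(fn, result) for fn in write_triggers.get(target, [])]


# Hypothetical topic with two subscribers.
subscriptions = {"orders": ["audit", "billing"]}
notified = notify(subscriptions, "orders", {"n": 1})

# Hypothetical stateless service that upper-cases text and stores it in "results".
store = {}
invoked = service(str.upper, "hi", "results", store, {"results": ["consumer"]})
```

In both cases the triggered functions are derived purely from configuration-like data, which is what lets a static analysis add the corresponding call graph edges.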

Figure 9: Stateless service event semantics. Definitions: notification topic, defined events, event condition, effect of external service, triggered functions. Rules: NOTIFY, SERVICE, INVOKE.

receive:
  Type: Task
  Resource: arn:aws:lambda:us-east-1:XXX:function:receive
  Next: parallel
parallel:
  Type: Parallel
  ResultPath: $.results.parallel
  Branches:
    - StartAt: log
      States:
        log:
          Type: Task
          Resource: arn:aws:lambda:us-east-1:XXX:function:log
          End: true
    - StartAt: auth
      States:
        auth:
          Type: Task
          Resource: arn:aws:lambda:us-east-1:XXX:function:auth
          End: true
          ResultPath: $.results.authorize
Figure 10: Example of parallelism in a StepFunction. In this composition, the receive serverless function will INVOKE both log and auth.

Platform Supported Function Composition: Composition Parallelism and Conditionals.

Platform supported composition allows multiple functions to be executed in response to a single event. We provide an example of such a configuration in Figure 10. In our semantics, such parallelism is encoded by applying the necessary rules repeatedly, once for each starting point in the parallel section of the composition. For instance, the parallel portion of the execution of Figure 10 would be encoded as:

INVOKE(, )
INVOKE(, )
DIE()
DIE()
RESPOND(, )

Our semantics assume that the function that spawns the parallel arms acts as a barrier. In the above execution, recv joins log and auth and then RESPONDs to the StepFunction request.
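The barrier behavior can be illustrated by generating the rule sequence for a parallel section. This is a sketch: the tuple encoding of rule applications and the placeholder value `"v"` are assumptions, and the arguments shown are not the exact ones used in the formal rules.

```python
def parallel_trace(parent, branches, value):
    """Rule applications for one parallel section.

    The parent INVOKEs every branch, waits until each branch has
    finished (DIE), and only then RESPONDs, so it acts as a barrier.
    """
    trace = [("INVOKE", fn, value) for fn in branches]
    trace += [("DIE", fn) for fn in branches]
    trace.append(("RESPOND", parent))
    return trace


# The composition of Figure 10: recv spawns log and auth in parallel.
trace = parallel_trace("recv", ["log", "auth"], "v")
```

The listed order is one valid interleaving; since the branches are independent, the two DIE steps could occur in either order as long as both precede RESPOND.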

Figure 11: Example of platform supported function composition using AWS StepFunctions. In this example, a web request triggers the RecordDB serverless function. When RecordDB completes, it INVOKES the serverless function RecordAC. Code is modified from the slack-signup-serverless project [37].
AuthOrNot:
  Type: Choice
  Choices:
    - Variable: "$.results.receive.doAuth"
      NumericEquals: 1
      Next: authorize
  Default: fail
authorize:
  Type: Task
  Resource: arn:aws:lambda:us-east-1:XXX:function:auth
  End: true
  ResultPath: $.results.authorize
fail:
  Type: Task
  Resource: arn:aws:lambda:us-east-1:XXX:function:fail
  End: true
  ResultPath: $.results.respond
Figure 12: Example of branching behavior in StepFunctions. In this example, the AuthOrNot step evaluates the value of a doAuth field against the numeric literals 1 and 0. In the case of 1 it executes the authorize function, otherwise it executes the fail function.

Composition frameworks also allow users to declare branching behavior in a function composition. Branching behavior is achieved by declaring simple conditional expressions that assess a value received as input. We provide an example of branching behavior in Figure 12. Unlike conditionals written as part of a normal function body, branching behavior defined in a platform function composition framework executes outside of the application, on resources owned and managed by the platform. To encode these conditionals in the event semantics, we create a virtual function whose COMPUTE step evaluates the conditional, then performs the event rule for the branch whose condition is met. For example, if the value of doAuth in an execution of the composition from Figure 12 were 1, the state transitions observed would be:

START()
COMPUTE
INVOKE(, v)

Since COMPUTE steps represent abstract local operations on a serverless function, they are sufficiently general to capture the evaluation of conditional logic. If the value of doAuth were 0, the series of transitions observed would be the same, though the function initialized by the final application of INVOKE would be the fail function.
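The virtual-function encoding of a Choice state can be sketched as below. The flat `doAuth` key replaces the JSONPath expression from Figure 12 for brevity, and the name `choice`, the tuple encoding of transitions, and the helper itself are assumptions made for illustration.

```python
def choice_trace(choices, default, state_input):
    """Transitions of a virtual function that encodes a Choice state.

    choices: list of (variable, expected_value, next_function).
    The COMPUTE step stands for evaluating the conditions; the final
    INVOKE targets the first matching branch or the default.
    """
    target = default
    for variable, expected, next_function in choices:
        if state_input.get(variable) == expected:
            target = next_function
            break
    return [("START", "choice"), ("COMPUTE",), ("INVOKE", target, state_input)]


# Simplified version of the AuthOrNot state from Figure 12.
choices = [("doAuth", 1, "authorize")]
allowed = choice_trace(choices, "fail", {"doAuth": 1})
denied = choice_trace(choices, "fail", {"doAuth": 0})
```

Both traces have the same shape and differ only in the function named by the last transition, which is the point made in the text.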

Appendix 0.B Container Prewarming

In addition to their standard applications in program analysis, such as information flow analysis and dead code detection, service call graphs have applications specific to serverless platforms, for example, container prewarming. Container prewarming is a strategy for mitigating delays associated with new container deployment during cold starts. Cold starts for JavaScript have previously been measured to incur as much as 644 ms of delay on AWS Lambda, and 9822 ms of delay on Azure [27]. In prewarming, containers are initialized before they are needed and are kept warm by sending mock requests that trigger invocations at regular intervals. This strategy can be effective when the workload is predictable, where the correct number of containers can be kept warm. When workloads are intermittent or bursty, it may not be possible to predict the number of containers that are needed. Thus, this prewarming approach can lead to wasted function invocations on unused containers or cold start latency penalties when too few containers are provisioned [23].

We propose an event-triggered prewarming approach that leverages our service call graphs. For each entry point to the application, e.g., each possible web request, the call graph identifies the chain of functions that may be triggered by that request. In the example in Figure 1, a web request triggers the uploader function, which leads to a sequence of function executions, rotator, resizer, and compressor, each one triggering the next. In our scheme, as soon as the uploader function is invoked, we send mock requests for the remaining three functions. If no containers are available for these functions, the requests will start the initialization of new containers, thus reducing any cold start penalties.
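Selecting which functions to prewarm amounts to a reachability query on the service call graph. The sketch below assumes the graph is an adjacency mapping over both functions and services; the intermediate service names (`store1` and so on) are placeholders, since the actual services of galleria are shown only in Figure 1.

```python
from collections import deque


def functions_to_prewarm(call_graph, entry, functions):
    """Functions reachable from the entry function, in breadth-first order.

    call_graph maps each node (function or service) to its successors.
    Only nodes listed in `functions` are returned, since services do not
    run in containers and need no mock request.
    """
    seen = {entry}
    order = []
    frontier = deque([entry])
    while frontier:
        node = frontier.popleft()
        for successor in call_graph.get(node, []):
            if successor not in seen:
                seen.add(successor)
                frontier.append(successor)
                if successor in functions:
                    order.append(successor)
    return order


# Hypothetical encoding of the galleria chain; the store names are placeholders.
call_graph = {
    "uploader": ["store1"],
    "store1": ["rotator"],
    "rotator": ["store2"],
    "store2": ["resizer"],
    "resizer": ["store3"],
    "store3": ["compressor"],
}
functions = {"uploader", "rotator", "resizer", "compressor"}
targets = functions_to_prewarm(call_graph, "uploader", functions)
```

A platform hook would then send one mock request per entry in `targets` when `uploader` starts; how such requests are issued is platform-specific and outside this sketch.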

                 Mean                        Median
Function Name    Cold    Warm    Penalty     Cold    Warm    Penalty
uploader         2.562   2.122   0.440       2.562   2.125   0.437
rotater          1.731   1.065   0.665       1.608   1.005   0.603
resizer          1.425   1.329   0.095       1.515   1.079   0.436
compressor       2.173   1.294   0.879       2.021   1.095   0.926
Table 2: Startup Times for Serverless Functions in Galleria (in seconds)

To demonstrate the effectiveness of event-triggered prewarming, we measure cold start and warm start times for the galleria application in Figure 1 using CloudWatch logging. We run galleria on AWS Lambda using 1536 MB containers for each serverless function. Container start times were calculated by measuring time elapsed from completion of the previous serverless function in the function chain to the start of the next function. For uploader, start time is the time elapsed since the web request was issued. Cold starts were triggered by leaving the application dormant for 50 minutes prior to the request. Each cold start was followed by five warm starts, each a single request spaced two minutes apart. 90 measurements were collected in total, 15 cold starts and 75 warm starts. The measurements are shown in Table 2.

In our experiments, the cold start penalty represents up to 45% of total function execution time. Given a prewarming scheme that begins warming all functions in a chain upon a cold start of the first function, we calculate that end-to-end median cold start time is reduced from 7.706 seconds to 5.741 seconds, improving performance by roughly 25%.
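The end-to-end figures above follow directly from the median columns of Table 2. The short calculation below reproduces them, assuming that with prewarming only the first function (uploader) still pays its cold start while the remaining functions start warm.

```python
# Median (cold, warm) start times in seconds, copied from Table 2.
median = {
    "uploader": (2.562, 2.125),
    "rotater": (1.608, 1.005),
    "resizer": (1.515, 1.079),
    "compressor": (2.021, 1.095),
}

# Without prewarming, every function in the chain starts cold.
all_cold = sum(cold for cold, _ in median.values())

# With event-triggered prewarming, only uploader starts cold.
prewarmed = median["uploader"][0] + sum(
    warm for name, (_, warm) in median.items() if name != "uploader"
)

reduction = (all_cold - prewarmed) / all_cold
print(f"{all_cold:.3f} s -> {prewarmed:.3f} s ({reduction:.1%} reduction)")
```

The sums come to 7.706 s and 5.741 s, a saving of 1.965 s, or about 25.5% of the all-cold total.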