In this paper, we explore Record and Replay (RnR) of multi-process applications whose processes communicate via shared memory. RnR mechanisms aim to support parallel program debugging as follows. The programmer runs the program and potentially observes incorrect behavior. The programmer then re-runs the program while watching the program state more closely, attempting to discover where a bug may have occurred. However, even when a parallel program is re-executed with the same input, different executions may proceed differently, due to non-determinism introduced by uncertainty in the delays incurred in performing various operations. Thus, the observed bug may not re-occur during the re-run, making it quite difficult to discover the cause of the original problem. RnR aims to solve this problem by creating a record during the original execution and using it during replay to guarantee that the re-run produces the same outcomes as the original execution. In other words, while the original execution may be non-deterministic, replay using the record eliminates the non-determinism as desired.
There can be many sources of non-determinism in parallel programs, for example user inputs, readings from sensors, and random coin flips. However, in this paper we focus specifically on the non-determinism allowed by shared memory consistency models in the read-write memory model. For a given program, the shared memory consistency model defines a space of executions allowed when the program is run. By creating a record during an execution and enforcing it in the replay, this space is further restricted, reducing the inherent non-determinism. The goal is to record enough of the original execution to reproduce the same outcomes in the replay.
The work in this paper is motivated by the trade-off between the consistency model for shared memory and the amount of information that must be recorded to facilitate a replay. A stronger consistency model imposes more constraints on the execution, resulting in a smaller space of allowed executions. Intuitively, a stronger consistency model should require a smaller record to resolve the non-determinism during replay. In Section 5.3 we present an example execution to illustrate that this intuition is indeed correct. In prior work, Netzer  identified the minimum record necessary for RnR under the sequential consistency model . The computer architecture research community has also investigated RnR systems under various consistency models, for example , , and . See also a survey by Chen et al. . However, to the best of our knowledge, only Netzer’s work  has addressed identification of the minimum record for RnR under the read-write memory model.
This paper builds on Netzer’s work to address the minimum record for correct replay under causal consistency. Whether a certain record is necessary and sufficient for replay depends on several factors, discussed next. Lee et al.  have also discussed a classification of RnR strategies, and Chen et al.  provide a taxonomy of deterministic replay schemes.
How faithful should the replay be to the original execution? To understand the different scenarios that are plausible, let us consider an implementation of shared memory. Suppose that each process maintains a local replica of the shared variables. When a process writes to a shared variable, the new value is propagated to other processes via update messages. The new value is eventually written at each replica, while ensuring that the consistency model is obeyed. Figure 1(a) illustrates an execution of two processes that implement sequential consistency. In this case, in the original execution, is updated to equal 1 due to the write operation by process , and then is updated to 2 due to the write operation by process . Subsequently, process reads as 2 with the read operation . Figures 1(b) and (c) show two possible replays of the execution in Figure 1(a). Observe that, while the read returns the same value in both replays, the order in which the variables are updated is different in the replay in Figure 1(b) than the original execution. On the other hand, the replay in Figure 1(c) performs the updates in an identical order as in the original execution.
Depending on whether we must reproduce the replay as in Figure 1(b), or allow a replay as in Figure 1(c), the minimum record necessary will be different. As one may expect, the record required for replay in Figure 1(b) is smaller, since the replay is not as faithful as that in Figure 1(c). Netzer’s minimum record  for sequential consistency allows the replay in Figure 1(b), which ensures that all the reads and writes to the same variable occur in the same order during replay as in the original execution. However, the updates to different variables may not necessarily occur in the same order during replay as in the original execution.
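To make the fidelity distinction concrete, the following illustrative Python sketch (our own example, with hypothetical variables `x` and `y`, not taken from Figure 1) shows two executions that apply updates to different variables in different orders yet leave every replica, and hence every subsequent read, identical:

```python
# Illustrative sketch (not from the paper): two replays of the same
# writes, applying updates to *different* variables in different orders.
# A read returns the current value of the reader's local replica.

def apply_updates(update_order):
    """Apply a sequence of (variable, value) updates to a fresh replica."""
    replica = {"x": 0, "y": 0}  # hypothetical variables, default value 0
    history = []                # the order in which the replica was updated
    for var, val in update_order:
        replica[var] = val
        history.append((var, val))
    return replica, history

# Original execution: x updated first, then y.
original, h1 = apply_updates([("x", 1), ("y", 2)])
# A less faithful replay: y updated before x.
replay, h2 = apply_updates([("y", 2), ("x", 1)])

# Both executions leave the replica in the same state, so a read issued
# after both updates returns the same value -- even though the update
# orders differ.
assert original == replay
assert h1 != h2
```

This is exactly the sense in which a replay can be correct for the reads while not reproducing the per-process update order.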
At a minimum, the read operations in the replay must return the same values as the corresponding read operations in the original execution. This ensures that the program state for each process, and hence the output, is the same in the replay as in the original execution: the next step performed by a process depends only on its current program state and the values it reads from shared memory, so the same branches are taken in both executions, and the replay is indistinguishable from the original execution to the high-level user. We discuss the exact formal model for this work in Section 4.
At what level of abstraction is the RnR system implemented? The abstraction level where the RnR system operates influences what can and needs to be recorded. For instance, if the shared memory is implemented via message passing, then, for the purpose of RnR, we may treat this as a message-passing system and record messages rather than shared memory operations. In this case, the RnR system can be viewed as residing below the shared memory implementation.
Alternatively, the RnR system may operate at the library level, where the low-level details, including interactions with the shared memory, are abstracted via the provided libraries. The RnR system is then only allowed to record interactions with the APIs of the given libraries. We refer the reader to  Section 4.3 for a more detailed explanation of different abstraction levels.
In this paper, our focus is on RnR for the shared memory. In our model, the RnR system resides on top of the shared memory layer, so that the inner workings of the shared memory are abstracted while the interactions with the shared memory, via the read and write operations on shared variables, are exposed. In this case, we assume that the RnR module may observe, at each process, the reads of that process and the writes of all the processes.
Offline versus online recording. In the offline setting, the RnR module is provided with a completed execution in its entirety, and can use this information to obtain a record that suffices for a correct replay. In the online setting, each process has its own RnR module that observes the execution incrementally, and must decide incrementally what information must be recorded. The online record can be useful when, for example, the replay proceeds in tandem with the original execution for redundancy purposes. Netzer’s result  applies to both the offline and online setting for sequential consistency. In this paper, we consider both offline and online settings in the context of causal consistency.
A summary of our contributions is presented in Table 1. In this work, we present the optimal record for a version of causal consistency which we call strong causal consistency. It is formally defined in Section 3 and is satisfied by many practical implementations of causal consistency. We consider both the RnR model for replay as in Figure 1(b) and the one as in Figure 1(c); these are defined formally in Section 4. Sequential consistency was considered by Netzer . We consider the first RnR model in Section 5. In Sections 5.1 and 5.2 we present the optimal records for strong causal consistency in the offline and online scenarios, respectively. The question of the optimal record for causal consistency is still open; we discuss it in Section 5.3. We consider the second RnR model in Section 6, with the optimal record for the offline case of strong causal consistency given in Section 6.1 and the one for causal consistency discussed in Section 6.2. We finish the paper with a discussion in Section 7, along with some open problems.
|Model|Setting|Replay as in Figure 1(c) (resolves entire views identically)|Replay as in Figure 1(b) (resolves data races identically)|
|Sequential consistency|Offline|Similar to Netzer |Netzer |
|Sequential consistency|Online|Similar to Netzer |Netzer |
|Strong causal consistency|Offline|This work|This work|
|Strong causal consistency|Online|This work|Future work|
A relation on a set is a set of tuples such that . We use the notation if . We denote if either or . An irreflexive, antisymmetric, and transitive relation is called a partial order. A partial order on a set is a total order if for any , either or . A partial order can be represented by a directed acyclic graph which is closed under transitivity. For two relations and on a set , we say that respects if . We use the notation to restrict the relation on set to a subset . denotes the (unique) transitive reduction of the partial order and denotes . We use to denote the union, with the transitive closure, of relations and , and to denote the disjoint union of and . For example, consider two partial orders and on the set , given by and . Then, while . Observe that union and disjoint union of two partial orders may not be a partial order, as the previous example shows.
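As a concrete aid, these order-theoretic notions can be sketched in a few lines of Python (a sketch of ours, representing a relation as a set of ordered pairs; none of the names below come from the paper):

```python
# A relation is a set of ordered pairs. A partial order is an
# irreflexive, antisymmetric, and transitive relation.

def transitive_closure(rel):
    """Smallest transitive relation containing rel (naive fixpoint)."""
    closure = set(rel)
    changed = True
    while changed:
        changed = False
        for (a, b) in list(closure):
            for (c, d) in list(closure):
                if b == c and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    return closure

def is_partial_order(rel):
    irreflexive = all(a != b for (a, b) in rel)
    antisymmetric = all((b, a) not in rel for (a, b) in rel if a != b)
    transitive = transitive_closure(rel) == set(rel)
    return irreflexive and antisymmetric and transitive

def transitive_reduction(order):
    """Unique for a finite partial order: drop edges implied by a 2-step path."""
    elems = {x for pair in order for x in pair}
    return {(a, b) for (a, b) in order
            if not any((a, c) in order and (c, b) in order for c in elems)}

# A total order on three operations of one hypothetical process.
p1 = transitive_closure({("a1", "a2"), ("a2", "a3")})
assert is_partial_order(p1)
assert transitive_reduction(p1) == {("a1", "a2"), ("a2", "a3")}

# The (transitively closed) union of two partial orders need not be a
# partial order: it can contain a cycle.
u = transitive_closure({("a", "b")} | {("b", "a")})
assert not is_partial_order(u)
```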
We borrow some notation from Steinke and Nutt  for shared memory formalism. The shared memory consists of a set of variables and supports two operations, read and write. We use for writes, for reads, and when the operation can be either read or write. We use a subscript for the process identifier or leave it blank if it is unspecified. If the variable and the corresponding value read/written is relevant, we specify it in parentheses. For example, denotes a write of value to variable performed by process and denotes an operation performed by process that can be either a read or a write to variable . Formally, an operation is a -tuple where is for read and for write, is the unique identifier of the process that performed the operation, is the (shared) variable on which the operation was performed, and is the unique identifier of the operation. This notation allows for wild-card entries, e.g. is the set of all writes executed by process . Observe that we do not specify the values in the notation. We assume that each write operation writes a unique value.¹ The values read by read operations may vary between executions, but each read operation reads a value written by some write.

¹ Since the unique write values have a one-to-one correspondence with the unique identifiers of the respective write operations, formally specifying the write values is redundant.
All operations in are totally ordered. We denote this total order by . The disjoint union of these is the program order given by . This is the order on operations implied by the program text. In figures representing total orders, we draw operations from left to right as they appear in the total order. For example, Figure 2(a) draws the program order for two processes, and . The two total orders, and , corresponding to processes and respectively, are drawn from left to right.
We model the distributed system as a network of processes that communicate with each other via reads and writes to the shared memory. Each process comes with a program that specifies the operations to be executed and the order in which they should be executed. Formally, a shared memory system is a set of processes , a set of operations , a program order on , a set of shared variables , and a shared memory . An execution is the result of processes running their programs on a shared memory system where each read operation returns a value written by some write operation.
Definition 2.1 (Writes-to).
Given an execution, a write operation writes-to a read operation , denoted , if and are on the same variable and returns the value written by .
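Because each write writes a unique value, the writes-to relation can be recovered mechanically from a trace. A minimal sketch of ours, with an operation encoding of our own devising, assuming each read appears in the trace after the write whose value it returns:

```python
# Illustrative sketch: recover the writes-to relation from a trace.
# Each operation is a tuple (kind, proc, var, op_id, value); every
# write carries a unique value, and every read carries the value it
# returned (all names here are ours, not the paper's).

def writes_to(trace):
    """Map each read's op_id to the op_id of the write it reads from."""
    writers = {}  # (var, value) -> op_id of the write that produced it
    rel = {}
    for kind, proc, var, op_id, value in trace:
        if kind == "w":
            writers[(var, value)] = op_id
        else:  # kind == "r": must return a value written to var
            rel[op_id] = writers[(var, value)]
    return rel

trace = [
    ("w", 1, "x", "w1", 10),
    ("w", 2, "x", "w2", 20),
    ("r", 1, "x", "r1", 20),  # r1 returns the value written by w2
]
assert writes_to(trace) == {"r1": "w2"}
```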
We reason about executions as a collection of read and write operations on shared variables. We do not distinguish any operation as special, e.g. synchronization operation, but view all operations to the shared memory uniformly. This is the same as Netzer’s model .
Assumptions about Programs
In general, programs are dynamic: the next operation to be executed depends on the current program state. Our model requires reproducing the execution faithfully; at the very least, all read operations must return the same values. Since we consider deterministic programs that read the same values from the shared memory via the corresponding read operations, we claim, without proof, that the program at each process will execute the same operations in the same order in both the original execution and the replay. A similar result is shown in  for a different setting. So we assume that the program order is fixed.
One standard practice for writing concurrent programs is to ensure that they are properly synchronized so that they are data race free . This guarantees sequential semantics for such programs under most concurrent languages and multiprocessors. We do not make any such assumptions since:
(1) we do not distinguish any operation as special, e.g. synchronization operations,
(2) one of the aims of this work is to replay programs for debugging purposes, so assuming that the programmer has written the program correctly is dangerous, and
(3) the guarantee of sequential semantics for data race free programs is for a different consistency model (cache consistency) and does not hold for causal consistency.
3 Shared Memory Consistency
For an execution, a view on a set of operations is a total order on such that each read returns the last value written to the corresponding variable in . For a view , the data-race order is given by . Reasoning about allowed executions under a shared memory consistency model relies on the existence of some collection of views that satisfy certain properties, depending on the consistency model. We say that explains the execution under the consistency model. For example, causal consistency  requires the existence of per-process views that satisfy causality, which is the union (with the transitive closure) of the writes-to relation and the program order. Formally, we use the definition by Steinke and Nutt .
Definition 3.1 (Write-read-write Order [Steinke and Nutt ]).
Given an execution with a writes-to relation , two writes, and , are ordered by write-read-write order, , if there exists a read operation such that .
Definition 3.2 (Causal Consistency [Steinke and Nutt ]).
An execution is causally consistent if there exists a set of views such that, for every process ,
is a view on the set of operations , and
A shared memory is causally consistent if every execution run on is causally consistent.
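The view condition underlying these definitions can be checked mechanically. A minimal sketch of ours (the operation encoding is our own; for simplicity, every read is assumed to be preceded in the order by some write to its variable):

```python
# Illustrative check: a total order over operations is a *view* if each
# read returns the last value previously written to its variable in
# that order. Operations are encoded as (kind, var, value) tuples.

def is_view(total_order):
    last = {}  # variable -> last written value seen so far
    for kind, var, value in total_order:
        if kind == "w":
            last[var] = value
        elif last.get(var) != value:
            return False  # the read does not return the latest write
    return True

assert is_view([("w", "x", 1), ("w", "x", 2), ("r", "x", 2)])
assert not is_view([("w", "x", 1), ("w", "x", 2), ("r", "x", 1)])
```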
Note that, by definition, each view already respects the writes-to relation restricted to since, by definition of a view, each read returns the last value written to the corresponding variable in . Note also that read operations are only observed by the processes that perform them, while write operations are observed by every process. We work with a version of causal consistency which we call strong causal consistency. This model is motivated by an implementation of causal consistency via lazy replication . Ladin et al.  use vector timestamps to ensure that a write operation from process is only committed locally when all write operations in ’s history, as summarized by ’s vector timestamp, have been observed. Many practical systems use vector timestamps to determine the order of operations and detect conflicts in systems with weak consistency guarantees (e.g. Dynamo , COPS , and Bayou ), although these systems have conflict resolution schemes which make their actual consistency guarantees stronger than strong causal consistency (see also Section 7). Formally, we define strong causal consistency as follows.
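The commit rule can be sketched with vector timestamps as follows. This is our simplification, not Ladin et al.'s actual protocol: a replica commits a remote write only once every write in the sender's history, as summarized by the write's vector timestamp, has been committed locally.

```python
# Sketch (our simplification) of the vector-timestamp commit rule used
# in lazy replication: a write from process p with timestamp ts is
# committed locally only once all writes in p's history -- summarized
# by ts -- have been committed here. Earlier writes wait in a pending set.

class Replica:
    def __init__(self, n):
        self.clock = [0] * n   # committed writes per process
        self.pending = []      # writes waiting on their history

    def ready(self, sender, ts):
        # The sender's previous writes, and everything the sender had
        # observed from others, must already be committed locally.
        return (ts[sender] == self.clock[sender] + 1 and
                all(ts[q] <= self.clock[q]
                    for q in range(len(ts)) if q != sender))

    def receive(self, sender, ts):
        """Buffer the write; commit everything whose history is satisfied."""
        self.pending.append((sender, ts))
        committed, progress = [], True
        while progress:
            progress = False
            for w in list(self.pending):
                if self.ready(*w):
                    self.pending.remove(w)
                    self.clock[w[0]] = w[1][w[0]]
                    committed.append(w)
                    progress = True
        return committed

r = Replica(2)
# A write by process 1 that depends on an unseen write by process 0
# stays pending until that write arrives.
assert r.receive(1, [1, 1]) == []
assert r.receive(0, [1, 0]) == [(0, [1, 0]), (1, [1, 1])]
```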
Definition 3.3 (Strong Causal Order).
Given a set of views , two writes, and , are ordered by strong causal order, , if .
This is stronger than the write-read-write order since two writes and are ordered by if and only if has been read by process before it performs . However, has to be merely observed by process for the two operations to be ordered by strong causal order. Intuitively, this corresponds to causality when each write operation observed is immediately read.
Definition 3.4 (Strong Causal Consistency).
An execution is strongly causal consistent if there exists a set of views such that, for every process ,
is a view on the set of operations , and
A shared memory is strongly causal consistent if every execution run on is strongly causal consistent.
Observe that strong causal consistency does not violate the write-read-write order and thus it is at least as strong as causal consistency. In fact, it is strictly stronger than causal consistency. Figure 2 shows a causally consistent execution of a two-process program. The read and write values are given in Figure 2(a). Figure 2(b) gives a set of views that explains this execution under causal consistency. The values of read and write operations have been omitted, with the dotted edges giving the writes-to relation. Some obvious edges have also been omitted to avoid clutter. We reason that no set of views can explain the execution under strong causal consistency. Observe that ordering implies an edge that must be respected by . Therefore, any set of views that explains the execution under strong causal consistency must have either or . We show that none of these is possible.
For the first case, note that . Therefore cannot be placed after in . Now if is placed after in , then does not return the last value written to in . This violates the definition of a view.
For the second case, we have that . Therefore cannot be placed after in . Now if is placed after in , then does not return the last value written to in . Again, this violates the definition of a view.
Compiler and Hardware Optimizations
In real-world systems, many optimizations are applied to the provided program, both by the compiler at compile time and by the hardware at runtime. The shared memory consistency model ensures that the guarantees it provides are maintained despite these optimizations. For example, consider a uniprocessor and a shared memory consistency model that guarantees a view consistent with the program order implied by the written program. The compiler and hardware optimizations may result in operations being executed out of order, in apparent violation of the program order constraints. However, the resulting execution can still be explained by the existence of a view (or views) where the operations are executed exactly as specified by the program order. Using view-based definitions of shared memory consistency models allows us to abstract these implementation details. Therefore, we allow all optimizations to be applied to the given program as long as the relevant shared memory consistency guarantees are satisfied.
4 RnR Model
For replaying executions, we assume that the per-process views are provided to the RnR system. The RnR system uses the views to determine the record. In case of online recording, the views are provided to the RnR system incrementally, as and when new operations occur that affect the views. Now let us illustrate how this requirement may be implemented in practice. Consider a shared memory implementation wherein each process has a copy of the shared variable and the shared memory is implemented via message passing. Then the shared memory adds a write operation to process ’s view when the local copy of the corresponding variable is updated at process . Similarly a read by process is added to process ’s view when the local copy is read.
The RnR system will record some edges from each view (i.e. on each process) and the replay execution is only allowed views that enforce these records. Note that we do not place any restriction on how the record is enforced. We assume that any set of views can explain the replay as long as it extends the record and is consistent under the shared memory consistency model. Formally, we define two RnR models with different fidelities. Under the first model, the RnR system is allowed to record any edge from each view and we require that the replay reproduces the per-process views exactly as in the original execution. Under the second model, the RnR system is only allowed to record data races from each view and we only require that the data races are resolved identically in the replay.
RnR Model 1: Given a set of views , is a record of if each . An execution is a replay of if there exists a set of views that explain the execution under the consistency model and each respects . We say that certifies the replay to be valid for . A record of a set of views is good if, for any replay of , under the same consistency model, any set of views that certifies the replay to be valid for must have for all (i.e. only certifies the replay to be valid for ).
RnR Model 2: Given a set of views , is a record of if each . An execution is a replay of if there exists a set of views that explain the execution under the consistency model and each respects . We say that certifies the replay to be valid for . A record of a set of views is good if, for any replay of , under the same consistency model, any set of views that certifies the replay to be valid for must have for all .
The second replay model is the same as the one considered by Netzer . Observe that for each record there exists at least one replay, namely the original execution. Note that RnR Model 1 forces all writes to appear in the same order in a process’s view as they did in the original execution, which is different from Netzer’s model in . This may seem expensive, since reordering writes to different variables can enable performance optimizations while still returning the same values for reads and allowing the program state in the replay to progress the same as in the original execution. RnR Model 2 allows writes to different variables to be executed in a different order, which is the same as Netzer’s model in . But for RnR Model 1 we require that each process’s point of view with respect to the order of events be indistinguishable between the original execution and the replay.
In contrast to the discussion at the end of Section 3, the optimizations for the replay execution may be more restrictive than those for the original execution. Exactly what optimizations are allowed in the replay execution versus the original execution depends on the shared memory consistency model as well as the replay system implementation. In this work, we do not discuss replay systems, their implementations, or how they may enforce the provided record. So we do not discuss the optimizations during the replay.
5 Optimal Records for RnR Model 1
5.1 Offline Record for Strong Causal Consistency
In this section we consider the offline record for strong causal consistency. In this case the entire set of per-process views is made available to the RnR system. The RnR system determines the record that must be saved. If the RnR system recorded the entire view for every process , this would be sufficient to reproduce the original execution exactly. However, this is wasteful, since recording the transitive reduction of each view would achieve the same result.
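For instance, for a view that is a total order on n operations, the transitive reduction keeps only the n-1 consecutive pairs; a quick sketch of ours:

```python
# Illustrative arithmetic (our example): the transitive reduction of a
# total order on n operations keeps only the n-1 consecutive pairs,
# while the full order has n*(n-1)/2 edges.

def total_order_edges(ops):
    """All ordered pairs implied by the total order (the full relation)."""
    return {(ops[i], ops[j]) for i in range(len(ops))
            for j in range(i + 1, len(ops))}

def reduction_of_total_order(ops):
    """Consecutive pairs only: the unique transitive reduction."""
    return {(ops[i], ops[i + 1]) for i in range(len(ops) - 1)}

view = ["w1", "w2", "r1", "w3"]  # a hypothetical per-process view
full = total_order_edges(view)
red = reduction_of_total_order(view)
assert len(full) == 6 and len(red) == 3
assert red <= full  # the reduction is contained in the full order
```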
We first give intuition on which edges from each do not need to be recorded, before formalizing it in Theorem 5.3. Fix a process . Since is fixed and independent of executions, the RnR system does not have to record these edges in , as they are guaranteed by the consistency model. Now consider two write operations and , for , such that . If process correctly orders the two operations in the replay, then this edge will be guaranteed by the consistency model, due to strong causal order, and process does not need to record it. Such edges are captured by the following definition.
Given a set of views , the relation , for a process , is defined as follows. Two writes, and , are ordered , if and .
Observe that the subscript distinguishes the relation from (Definition 3.3), which is a partial order for strongly causal executions. We now present an example to illustrate another set of edges that do not need to be recorded, although they are not directly guaranteed by the consistency model. Consider the following execution on three processes and a set of views that explains it under strong causal consistency (Figure 3). Process performs the write , process performs , and process does not perform any operations. Now process orders , process orders , and process orders . It can be easily verified that this set of views satisfies Definition 3.4 of strong causal consistency, where both and are empty. Now note that if process records , process does not need to record its order of the two operations. The reason is that any possible set of views that certifies a replay to be valid for will have order . So if process orders , this will create an edge . Since respects , process will order . This conflicts with the recorded edge . Thus, such a set of views cannot certify a replay execution to be valid for . The set of such edges is captured by the following relation.
Given a set of views , the relation , for a process , is defined as follows. Two writes, and such that , are ordered if and there exists a process such that .
Informally, in any set of views that explain a replay of , setting will create an edge which will conflict with . The following theorem states that for every process it suffices to record all edges in , except those in , , or .
Consider a set of views that explain a strongly causal consistent execution. For each process , let . Then, is a good record of .
The formal proof of the theorem is given in Appendix A. We first show that the strong causal order and the ’s are preserved in the replay (Lemma A.1). The proof then proceeds by arguing that, for every process , each path in is reproduced correctly in the replay. We refer the reader to Appendix A for the details. The following theorem states that, for every process , each edge in is necessary for a good record under strong causal consistency.
Consider a set of views that explain a strongly causal consistent execution. For any good record of , for any process and any two operations , if , then .
The formal proof of the theorem is presented in Appendix A. We show that if any two operations are such that, for some process , but is not recorded, then we can swap the two operations during the replay without violating consistency or replay constraints. This violates the definition of a good record. Theorems 5.3 and 5.4 show that the record such that is both sufficient and necessary for a correct replay under strong causal consistency.
5.2 Online Record for Strong Causal Consistency
We now look at the optimal record in an online setting. Consider the following implementation of shared memory. Each process keeps a copy of every shared variable in . Processes exchange messages to propagate their writes to shared variables. Based on the received messages, each process updates the current value of its copy of the shared variables. At any point in the execution, a read on variable at process returns the current value of stored at . We abstract this perspective of shared memory as follows. Each process has a fixed set of read and write operations that it executes in their local order by communicating with the shared memory. Executing an operation may take arbitrarily long, and a process may wait arbitrarily long before executing its next operation, but each process executes only one operation at a time. Via the shared memory, a process observes its own operations and write operations from other processes one at a time. The order in which these operations are observed gives rise to the view . More formally, the execution proceeds in time steps. At each time step in the execution, a unique² process observes an operation from and adds it to its view .

² Uniqueness of the process makes the model simpler. If more than one process observes an operation at a given time step, we can separate this into multiple time steps ordered by the process identifiers.
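The time-step model can be sketched as follows (our own rendering; operation names are hypothetical, and the `on_observe` hook stands in for the per-process RnR module's recording decision):

```python
# Sketch of the online setting (our model of it): an execution is a
# sequence of time steps; at each step exactly one process observes one
# operation and appends it to its view.

def run_online(steps, n_procs, on_observe=None):
    """steps: list of (proc, operation) pairs. Returns per-process views."""
    views = [[] for _ in range(n_procs)]
    for proc, op in steps:
        views[proc].append(op)           # the view grows incrementally
        if on_observe:
            on_observe(proc, op, views)  # hook where the RnR module would
                                         # decide what to record, online
    return views

steps = [(0, "w1"), (1, "w2"), (1, "w1"), (0, "w2")]
views = run_online(steps, 2)
# Both processes observe both writes, but in different orders.
assert views[0] == ["w1", "w2"]
assert views[1] == ["w2", "w1"]
```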
The online record algorithm proceeds as follows. Suppose process wants to record . Then, process must record at the time when it observes . In the online setting, process has limited information about views of other processes at any given time in the execution. How much does process know? We assume that, at most, process has access to the history of other processes brought with the observed operation. More precisely, at any time in the execution, if process is aware that , for some process , then process must have already observed such that . As discussed in Sections 1 and 4, the recording proceeds without information about the internal workings of the shared memory. However, we assume that the RnR system is aware of the shared memory guarantees. More precisely, for strong causal consistency, we assume that any process can check if and also if . For a given execution , we say that a record is an online record of if can be recorded in this manner.
Recall from Theorems 5.3 and 5.4 that for any process , is both sufficient and necessary in the offline setting. Therefore, if the recording unit can detect, for an edge , if it is one of , , or , then the optimal record in the online setting would match exactly the one in the offline scenario. However, it turns out that the membership of in cannot be checked by the recording unit online. This is formalized in Theorems 5.5 and 5.6 which state that for each process , is both sufficient and necessary in the online setting. The formal proofs are presented in Appendix A.
Consider a set of views that explain a strongly causal consistent execution. For each process , let . Then, is a good online record of .
Consider a set of views that explain a strongly causal consistent execution. For any good online record of , for any process and any two operations , if , then .
5.3 Causal Consistency
Causal consistency (Definition 3.2) imposes fewer restrictions than strong causal consistency on the views that can explain an execution. As discussed in Section 1, we expect a smaller record for strong causal consistency than for causal consistency. Indeed, consider a simple execution on two processes and two operations, where process performs and process performs . Consider the set of views given in Figure 4 that explains this execution under both causal and strong causal consistency. Under strong causal consistency, only process has to record . However, since causal consistency imposes no restrictions in this particular example, a good record for causal consistency requires process to record as well.
The question of the optimal record for causal consistency remains open. We give a simple counterexample showing that the natural strategy following the scheme for strong causal consistency does not work. More concretely, consider a set of views that explain a causally consistent execution. For each process , let . We give a simple four-process example showing that is not a good record of . The program for this example is given in Figures 5 and 6. Figure 5 gives the writes-to relation in bold edges for the original execution of the program, as well as a set of views that explains the execution. The red edges represent the recorded edges, as specified above. Figure 6 gives one possible replay where the reads return the default values for the variables (so that the writes-to relation is empty), as well as a set of views that certifies the replay to be valid for the given record.
Observe that . There are two edges and in the original execution while , the write-read-write order for the replay, is empty. Note that, in this example, not only do the views differ, but the reads return the wrong values in the replay as well.
The example replay execution is causally consistent, but it has the strange property that processes do not commit their writes locally before informing other processes. For example, consider and . We have but . Both process and observed the other process’s write before they saw their own: one of these processes distributed its write to the other, then observed the other process’s write, then committed its own write. This does not violate causality because neither process had read the other process’s write (note, however, that it does violate strong causality). Consider the setting where each process keeps a copy of each variable and the shared memory is implemented via message passing. Then either process or process sends messages for its write before writing the local copy of the corresponding variable. Such an execution would not be possible if each process always wrote to its local copy of the variable first and then sent the relevant messages to other processes.
6 Optimal Records for RnR Model 2
6.1 Offline Record for Strong Causal Consistency
In this section we consider the offline record for strong causal consistency. In this case, as in Section 5.1, the entire set of per-process views is made available to the RnR system, which then determines the record that must be saved. We define the strong write order inductively below; it will be important for the optimal record in this RnR model.
Definition 6.1 (Strong Write Order).
Given a set of views , two writes, and , are ordered
We say that and are ordered by strong write order, , if for some . Furthermore, if , then for every process , we say that