Separation and Equivalence results for the Crash-stop and Crash-recovery Shared Memory Models

12/07/2020 ∙ by Ohad Ben-Baruch, et al. ∙ University of Southern California ∙ Ben-Gurion University of the Negev

Linearizability, the traditional correctness condition for concurrent data structures, is considered insufficient for the non-volatile shared memory model where processes recover following a crash. For this crash-recovery shared memory model, strict-linearizability is considered appropriate since, unlike linearizability, it ensures that operations that crash take effect prior to the crash or not at all. This work formalizes and answers the question of whether an implementation of a data type derived for the crash-stop shared memory model is also strict-linearizable in the crash-recovery model. It presents a rigorous study proving that helping mechanisms, typically employed by non-blocking implementations, are the algorithmic abstraction that delineates linearizability from strict-linearizability. Our first contribution formalizes the crash-recovery model and shows how explicit process crashes and recovery introduce further dimensionalities over the standard crash-stop shared memory model. We make the following technical contributions: (i) we prove that strict-linearizability is independent of any known help definition; (ii) we then present a natural definition of help-freedom to prove that any obstruction-free, linearizable and help-free implementation of a total object type is also strict-linearizable; (iii) finally, we prove that for a large class of object types, a non-blocking strict-linearizable implementation cannot have helping. Viewed holistically, this work provides the first precise characterization of the intricacies in applying a concurrent implementation designed for the crash-stop model to the crash-recovery model, and vice-versa.

1 Introduction

Concurrent data structures for the standard volatile shared memory model typically adopt linearizability as the traditional safety property [Her91]. However, in the non-volatile shared memory model where processes recover following a crash, linearizability is considered insufficient since it allows object operations that crash to take effect anytime in the future. In the crash-recovery model [golab15], linearizability is strengthened to force crashed operations to take effect before the crash or not take effect at all, so-called strict-linearizability [AF03]. While there exists a well-studied body of linearizable data structure implementations in the crash-stop model [HS08-book], concurrent implementations in the crash-recovery model are comparatively nascent. Consequently, it is natural to ask: under what conditions is a linearizable implementation in the crash-stop model also strict-linearizable in the crash-recovery model?

Non-blocking implementations in the crash-stop model employ helping: i.e., apart from completing their own operation, processes perform additional work to help linearize concurrent operations and make progress. This helping mechanism enables an operation invoked by a process $p_i$ to be linearized by an event performed by another process $p_j$, but possibly after the crash of $p_i$. However, strict-linearizability stipulates that the operation invoked by $p_i$ be linearized before the crash event. Intuitively, this suggests that linearizable implementations that are help-free must be strict-linearizable (also conjectured in [golab15]). This work formalizes and answers this precise question: whether a help-free implementation of a data type derived for the crash-stop model can be used as it is in the crash-recovery model.

Precisely answering this question necessitates the formalization of the crash-recovery shared memory model. Explicit process crashes introduce further dimensionalities to the set of executions admissible in the crash-recovery model over the well formalized crash-stop shared memory [AW98]. Processes may crash on an individual basis, i.e., an event in the execution corresponds to the crash of a single process (we refer to this as the individual crash-recovery model). An event may also correspond to the simultaneous crash of several processes, up to all $n$ processes participating in the concurrent implementation (when all $n$ processes crash, it is the full-system crash-recovery model). Following a crash event in this model, the local state of the process is reset to its initial state; when it recovers, it restarts an operation either with its original process identifier (the old identifiers crash-recovery model) or with a new process identifier (the new identifiers crash-recovery model). Our contributions establish equivalence and separation results for crash-stop and the identified crash-recovery models, thus providing a precise characterization of the intricacies in applying a concurrent implementation designed for the crash-stop model to the crash-recovery model, and vice-versa.

1.1 Contributions

First, we define the crash-recovery model and its characteristics. We show that there exist sequential implementations of object types in the crash-stop model that, used as is, are inconsistent with their sequential specifications in the old identifiers crash-recovery model.

We then consider how data structures use helping in the crash-stop model by adopting the definitions of linearization-helping [help15] and universal-helping [DBLP:jattiya-help]. When considering an execution with two concurrent operations, the linearization of these operations dictates which operation takes effect first. The definition of linearization-helping considers a specific event $e$, in which it is decided which operation is linearized first. In an implementation that does not have linearization-helping, $e$ is an event by the process whose operation is decided to be the one that comes first. Universal-helping requires that the progress of some processes eventually ensures that all pending invocations are linearized, thus forcing a process to ensure concurrent operations of other processes are eventually linearized.

The first technical contribution of this paper is proving that the following pairs of conditions are incomparable. That is, an implementation can satisfy exactly one condition of the pair, both, or neither.

  • linearization-helping vs. universal-helping

  • strict-linearizability vs. linearization-helping

  • strict-linearizability vs. universal-helping

The second technical contribution is to show that under certain restrictions there is a correlation between some of the above pairs.

  • Restricting the definition of linearization-helping to be prefix-respecting, we prove that linearization-help freedom implies strict-linearizability. More specifically, any obstruction-free implementation of a total object type that is linearizable and has no linearization-helping in the crash-stop model is also strict-linearizable in the new identifiers individual crash-recovery model (Lemma §LABEL:lm:prefix).

  • We prove that any non-blocking implementation of an order-dependent type that is strict-linearizable in the crash-recovery model has no universal-helping in the crash-stop model (Lemma §LABEL:lm:oo-eq).

Roadmap. The contributions in this paper are structured as follows: §2 introduces the crash-stop shared memory model and other preliminaries. §3 presents our characterization of the dimensionalities of the crash-recovery shared memory model. §LABEL:sec:help recalls universal-helping, linearization-helping and valency-helping, and presents new results on implementations satisfying these definitions. §LABEL:sec:SL-LH discusses the correlation between strict-linearizable implementations and linearization-helping. §LABEL:sec:HelpFree-SL proves that help-freedom does not imply strict-linearizability in general, but that under a natural definition of help-freedom it does. §LABEL:sec:SL-UH proves that strict-linearizability and universal-helping are independent. However, for a large class of objects, strict-linearizability implies universal-help freedom. Finally, §LABEL:sec:SL-VH discusses the relation between strict-linearizability and valency-helping. The paper concludes with a short discussion in §LABEL:sec:disc.

1.2 Related work

Strict-linearizability was proposed by Aguilera et al. [AF03], who show that it precludes wait-free implementations of multi-reader single-writer registers from single-reader single-writer registers. [golab15] showed that this is in fact possible with linearizability, thus yielding a separation between the crash-stop and crash-recovery models. That helping mechanisms, typically employed by non-blocking implementations, are the algorithmic abstraction that may delineate linearizability from strict-linearizability was also conjectured in [golab15]. This is the first work to conclusively answer this question by providing the first precise characterization of the intricacies in applying a shared memory concurrent implementation designed for the crash-stop (and resp. crash-recovery) model to the crash-recovery (and resp. crash-stop) model.

Censor-Hillel et al. [help15] formalized linearization-helping and showed that without it, certain objects called exact-order types lack wait-free linearizable implementations (assuming only read, write, compare-and-swap, fetch-and-add primitives) in the standard crash-stop shared memory model. Universal-helping and valency-helping were defined by Attiya et al. [DBLP:jattiya-help]. Informally, it was shown in [DBLP:jattiya-help] that a non-blocking $n$-process linearizable implementation of a queue or a stack with universal-helping can be used to solve $n$-process consensus. This result was also extended to strong-linearizability [golab-strong], which requires that once an operation is linearized, its linearization order cannot be changed in the future. The definition of strong-linearizability does bear resemblance to the helping definitions proposed in [help15, DBLP:jattiya-help]; however, it is defined as a restriction of linearizability and is incomparable to helping. Indeed, [help15] makes the observation that strong-linearizability is incomparable with linearization-helping. The results in this paper study the implications of the universal, linearization and valency helping definitions for strict-linearizability in the crash-recovery model, which have not been studied carefully thus far.

2 Crash-stop Model

This section presents the preliminaries of the standard volatile shared memory model in which processes stop participating following a crash.

Processes and shared memory. We consider an asynchronous shared memory system in which a set of $n$ processes communicate by applying operations on shared objects. Each process has a unique identifier and an initial state. An object is an instance of an abstract data type, which specifies a set of operations that provide the only means to manipulate the object. An abstract data type $\tau$ defines a set of operations, a set of responses, a set of states, an initial state and a transition relation that determines, for each state and each operation, the set of possible resulting states and produced responses [AFHHT07]. We consider only deterministic types: when an operation $op$ is applied on an object of type $\tau$ in state $q$, there is exactly one state $q'$ to which the object can move and exactly one matching response $r$. An object type is total if any operation of the object type applied by any process (we assume that the identifier of the process does not matter) is well-defined for every object state. An example of an object that is not total can be obtained by restricting known objects. For example, consider a stack where a process is allowed to perform a given operation (say, pop) only after completing another (say, push). In such an object, a process is not allowed to invoke the restricted operation as its first operation. However, to the best of our knowledge, most object types are total.

An implementation of an object type $\tau$ (sometimes we just say object) provides a specific data-representation of $\tau$ by applying primitives on a set of shared base objects, each of which is assigned an initial value, and a set of algorithms, one for each process. We assume that the primitives applied on base objects are deterministic. A primitive is a generic read-modify-write (rmw) procedure applied to a base object [G05, Her91]. It is characterized by a pair of functions $\langle g, h \rangle$: given the current state of the base object, $g$ is an update function that computes its state after the primitive is applied, while $h$ is a response function that specifies the outcome of the primitive returned to the process. Let $e$ be an event issued by some process that applies the rmw primitive $\langle g, h \rangle$ to a base object $b$ after an execution $\pi$. Let $v$ be the value of $b$ after $\pi$. Now, $e$ atomically performs the following: it updates the value of $b$ to $g(v)$ and returns the response $h(v)$.
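As an illustration of the rmw abstraction described above, the following Python sketch models a primitive as a pair of an update function and a response function applied to a base object. The names `Primitive`, `BaseObject`, `g`, `h` and the four example primitives are assumptions introduced here for illustration; atomicity is only conceptual in this single-threaded sketch.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass(frozen=True)
class Primitive:
    """A generic rmw primitive: update function g and response function h."""
    g: Callable[[Any], Any]  # computes the new state of the base object
    h: Callable[[Any], Any]  # computes the response returned to the process


class BaseObject:
    def __init__(self, initial):
        self.value = initial

    def apply(self, prim):
        # Conceptually one atomic step: read v, install g(v), return h(v).
        v = self.value
        self.value = prim.g(v)
        return prim.h(v)


def read():
    return Primitive(g=lambda v: v, h=lambda v: v)


def write(x):
    return Primitive(g=lambda v: x, h=lambda v: "ok")


def compare_and_swap(expected, new):
    return Primitive(
        g=lambda v: new if v == expected else v,
        h=lambda v: v == expected,
    )


def fetch_and_add(delta):
    return Primitive(g=lambda v: v + delta, h=lambda v: v)
```

Under this encoding, read, write, compare-and-swap and fetch-and-add differ only in the two functions supplied, which is what makes the rmw formulation generic.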

Executions and configurations. An event of a process $p_k$ in the crash-stop model (sometimes we say admissible step of $p_k$) is an invocation or response of an operation performed by $p_k$ or a rmw primitive applied by $p_k$ to a base object along with its response. A configuration specifies the value of each base object and the state of each process. The initial configuration is the configuration in which all base objects have their initial values and all processes are in their initial states.

An execution fragment is a (finite or infinite) sequence of events. An execution of an implementation $I$ is an execution fragment where, starting from the initial configuration, each event is issued according to $I$ and each response of a rmw event matches the state of the base object resulting from all preceding events. An execution $\pi \cdot \rho$, denoting the concatenation of $\pi$ and $\rho$, is an extension of $\pi$ and we say that $\rho$ extends $\pi$. Let $\pi$ be an execution fragment. For every process identifier $k$, $\pi|k$ denotes the subsequence of $\pi$ restricted to events of process $p_k$. If $\pi|k$ is non-empty, we say that $p_k$ participates in $\pi$, else we say $\pi$ is $p_k$-free. An operation $op_1$ precedes another operation $op_2$ in an execution $\pi$, denoted $op_1 \rightarrow_{\pi} op_2$, if the response of $op_1$ occurs before the invocation of $op_2$ in $\pi$. Two operations are concurrent if neither precedes the other. An execution is sequential if it has no concurrent operations. Two executions $\pi$ and $\pi'$ are indistinguishable to a set $P$ of processes if, for each process $p_k \in P$, $\pi|k = \pi'|k$. An operation is complete in $\pi$ if it returns a matching response in $\pi$. Otherwise we say that it is incomplete or pending in $\pi$. We say that an execution $\pi$ is complete if every invoked operation is complete in $\pi$.
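The notions in this paragraph (projection onto a process, precedence, concurrency, sequential executions, indistinguishability) can be made concrete with a small Python sketch. The `Event` encoding and all function names are hypothetical; only invocation and response events are modeled, and each operation is assumed to carry a unique label.

```python
from typing import NamedTuple


class Event(NamedTuple):
    pid: int   # identifier of the process issuing the event
    kind: str  # "inv" (invocation) or "res" (response)
    op: str    # hypothetical label, unique per operation


def project(execution, pid):
    """Subsequence of the execution restricted to events of process pid."""
    return [e for e in execution if e.pid == pid]


def participates(execution, pid):
    return len(project(execution, pid)) > 0


def _index(execution, op, kind):
    for i, e in enumerate(execution):
        if e.op == op and e.kind == kind:
            return i
    return None


def precedes(execution, op1, op2):
    """op1 precedes op2 if op1's response occurs before op2's invocation."""
    res1 = _index(execution, op1, "res")
    inv2 = _index(execution, op2, "inv")
    return res1 is not None and inv2 is not None and res1 < inv2


def concurrent(execution, op1, op2):
    return not precedes(execution, op1, op2) and not precedes(execution, op2, op1)


def is_complete(execution, op):
    return _index(execution, op, "res") is not None


def is_sequential(execution):
    ops = []
    for e in execution:
        if e.op not in ops:
            ops.append(e.op)
    return all(
        not concurrent(execution, a, b)
        for i, a in enumerate(ops)
        for b in ops[i + 1:]
    )


def indistinguishable(exec1, exec2, pids):
    """Each process in pids sees the same sequence of its own events."""
    return all(project(exec1, p) == project(exec2, p) for p in pids)
```

For instance, reordering two adjacent events of different processes yields a different execution that is nevertheless indistinguishable to both processes, since each projection is unchanged.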

Well-formed executions. In the crash-stop model, we assume that executions are well-formed: no process invokes a new operation before the previous operation returns. Specifically, we assume that for every process $p_k$, the subsequence $\pi|k$ of events of $p_k$ in an execution $\pi$ begins with the invocation of an operation, is sequential, and has no event between a matching response event and the subsequent invocation.

Safety property: Linearizability. A history $H$ of an execution $\pi$ is the subsequence of $\pi$ consisting of all invocations and responses of operations. Histories $H$ and $H'$ are equivalent if for every process $p_k$, $H|k = H'|k$. A complete history $H$ is linearizable with respect to an object type $\tau$ if there exists a sequential history $S$ equivalent to $H$ such that (1) $\rightarrow_{H} \subseteq \rightarrow_{S}$, i.e., $S$ preserves the precedence order of operations in $H$, and (2) $S$ is consistent with the sequential specification of type $\tau$. A history $H$ is linearizable if it can be completed (by adding matching responses to a subset of incomplete operations in $H$ and removing the rest) to a linearizable history [HW90, AW04].
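For very small histories, the definition of linearizability can be checked by brute force. The sketch below does so for a read/write register, under several assumptions made here: an operation is summarized by the positions of its invocation and response in the history (`res=None` for a pending operation), the register is initially 0, and a pending read completed by the checker may return any value. The helper names are hypothetical and the search is exponential, so this is only an illustration of the definition.

```python
from itertools import combinations, permutations


def op(inv, call, arg=None, ret=None, res=None):
    """One operation: invocation/response positions, call name, argument, result."""
    return {"inv": inv, "res": res, "call": call, "arg": arg, "ret": ret}


def register_spec(sequence, initial=0):
    """Sequential specification of a read/write register."""
    state = initial
    for o in sequence:
        if o["call"] == "write":
            state = o["arg"]
        elif o["res"] is not None and o["ret"] != state:
            return False  # a completed read must return the latest written value
    return True


def respects_real_time(sequence):
    # If an operation responded before another was invoked in the history,
    # it must also appear earlier in the sequential order.
    for i, later in enumerate(sequence):
        for earlier in sequence[:i]:
            if later["res"] is not None and later["res"] < earlier["inv"]:
                return False
    return True


def is_linearizable(ops, spec=register_spec):
    complete = [o for o in ops if o["res"] is not None]
    pending = [o for o in ops if o["res"] is None]
    # A completion keeps a subset of the pending operations and removes the rest.
    for k in range(len(pending) + 1):
        for kept in combinations(pending, k):
            for seq in permutations(complete + list(kept)):
                if respects_real_time(seq) and spec(seq):
                    return True
    return False
```

A read that is concurrent with a write may return either the old or the new value, whereas a read invoked after the write has responded must return the new value.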

3 Characterization of the Crash-recovery Model

Processes and non-volatile shared memory. We extend the crash-stop model defined in §2 by allowing any process to fail by crashing; following a crash, the process does not take any steps until the invocation of a new operation. Following a crash, the state of the shared objects remains the same as before the crash; however, the local state of the crashed process is set to its initial state.

Executions and configurations. An event of a process in the crash-recovery model is any step admissible in the crash-stop model as well as a special crash step $\bot_P$, where $P$ is a set of process identifiers. The step $\bot_P$ performs the following actions: (i) for each $k \in P$, the local state of $p_k$ is set to its initial state; (ii) the execution $\pi \cdot \bot_P \cdot \rho$, where $\rho$ is $P$-free (it contains no events of processes in $P$), is indistinguishable to every process from the execution $\pi \cdot \rho$. In other words, processes are not aware of crash events.

Process crash model. We say that an execution $\pi$ is admissible in the individual crash-recovery model if for any crash event $\bot_P$ in $\pi$ (the crash of the set $P$ of processes), $|P| = 1$. If instead $|P| = n$, where $n$ is the total number of processes, we refer to it as the system-wide crash-recovery model. We say that an implementation $I$ is admissible in the individual crash-recovery model (resp. system-wide crash-recovery model) if every execution of $I$ is admissible in the individual crash-recovery model (resp. system-wide crash-recovery model).
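The effect of a crash step and the two crash models can be sketched as follows. This is a minimal illustration under assumptions made here: the `Configuration` class, the tuple encoding of events, and the function names are all hypothetical, and a crash step is encoded as `("crash", frozenset_of_pids)`.

```python
from copy import deepcopy


class Configuration:
    """Value of each base object plus the local state of each process."""

    def __init__(self, shared, initial_local, pids):
        self.shared = dict(shared)
        self.initial_local = initial_local
        self.local = {p: deepcopy(initial_local) for p in pids}

    def crash(self, crashed):
        # Crash step for the set `crashed`: local states are reset,
        # while the shared (non-volatile) base objects are left untouched.
        for p in crashed:
            self.local[p] = deepcopy(self.initial_local)


def crash_sets(execution):
    # Events are tuples; a crash step is encoded as ("crash", frozenset_of_pids).
    return [e[1] for e in execution if e[0] == "crash"]


def admissible_individual(execution):
    """Every crash step crashes exactly one process."""
    return all(len(P) == 1 for P in crash_sets(execution))


def admissible_system_wide(execution, n):
    """Every crash step crashes all n processes."""
    return all(len(P) == n for P in crash_sets(execution))
```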

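Safety property: Strict-Linearizability. A history $H$ is strict-linearizable with respect to an object type $\tau$ if there exists a sequential history $S$ equivalent to $H^c$, a strict completion of $H$, such that (1) $\rightarrow_{H^c} \subseteq \rightarrow_{S}$, i.e., $S$ preserves the precedence order of operations in $H^c$, and (2) $S$ is consistent with the sequential specification of $\tau$.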

A strict completion of a history $H$ is obtained from $H$ by inserting matching responses for a subset of pending operations after the operation’s invocation and before the next crash step (if any), and finally removing any remaining pending operations and crash steps.
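The difference between linearizability and strict-linearizability can be seen on a three-operation register history. The brute-force sketch below rests on assumptions made here: operations are summarized by invocation and response positions, `crash` is the position of the crash step that interrupted a pending operation (`None` if there is none), and the register is initially 0. When a pending operation is kept, its response is placed immediately before the crash; this is the latest placement a strict completion allows and therefore imposes the fewest precedence constraints. All names are hypothetical.

```python
import math
from itertools import combinations, permutations


def op(inv, call, arg=None, ret=None, res=None, crash=None):
    return {"inv": inv, "res": res, "crash": crash,
            "call": call, "arg": arg, "ret": ret}


def deadline(o):
    """Position by which the operation must have responded in the completion."""
    if o["res"] is not None:
        return o["res"]
    if o["crash"] is not None:
        return o["crash"] - 0.5  # response inserted just before the crash step
    return math.inf              # pending with no crash: response appended at the end


def respects_order(sequence):
    for i, later in enumerate(sequence):
        for earlier in sequence[:i]:
            if deadline(later) < earlier["inv"]:
                return False
    return True


def register_spec(sequence, initial=0):
    state = initial
    for o in sequence:
        if o["call"] == "write":
            state = o["arg"]
        elif o["res"] is not None and o["ret"] != state:
            return False
    return True


def is_strict_linearizable(ops, spec=register_spec):
    complete = [o for o in ops if o["res"] is not None]
    pending = [o for o in ops if o["res"] is None]
    # Keep a subset of pending operations (completed before their crash),
    # remove the rest, then search for a consistent sequential order.
    for k in range(len(pending) + 1):
        for kept in combinations(pending, k):
            for seq in permutations(complete + list(kept)):
                if respects_order(seq) and spec(seq):
                    return True
    return False


def example(crash):
    # A write(1) is invoked and never responds; two later reads return 0, then 1.
    return [
        op(0, "write", arg=1, crash=crash),
        op(2, "read", ret=0, res=3),
        op(4, "read", ret=1, res=5),
    ]
```

With `crash=None` the pending write may take effect between the two reads, so the history is accepted; with `crash=1` the write must take effect before the crash or not at all, and neither option explains both reads.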

Liveness. An object implementation is obstruction-free if for any execution $\pi$ and any pending operation $op$ by process $p_k$, $op$ returns a matching response in $\pi \cdot \rho$ or $p_k$ crashes, where $\rho$ is the complete solo-run execution fragment of $op$ by $p_k$ ($\rho$ only contains steps of $p_k$ executing $op$). An object implementation is non-blocking if in every execution, at least one of the correct processes completes its operation in a finite number of steps or it crashes. An object implementation is wait-free if in every execution, every correct process completes its operation within a finite number of its own steps or crashes. Obviously, liveness in the crash-stop model is identical to the above without the option of process crashing.

Old identifiers crash-recovery model. Consider an execution $\pi$ and a process that crashes in $\pi$. We say that an execution $\pi$ is admissible in the old identifiers crash-recovery model if for any process $p_k$ and any crash event $\bot_P$ in $\pi$ such that $k \in P$, $p_k$ takes its first step in $\pi$ after the crash by invoking a new operation.

New identifiers crash-recovery model. We say that an execution $\pi$ is admissible in the new identifiers crash-recovery model if for any process $p_k$ and any crash event $\bot_P$ in $\pi$ such that $k \in P$, process $p_k$ no longer takes steps following $\bot_P$ in $\pi$. Note that even in this model, there are at most $n$ active processes in an execution, i.e., processes that have not crashed.
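The two identifier models differ only in what a crashed identifier may do afterwards, which the following sketch checks on an execution. The event encoding is an assumption made here: non-crash events are tuples `(kind, pid)` with kind in `"inv"`, `"res"`, `"step"`, and a crash step is `("crash", frozenset_of_pids)`; the function names are hypothetical.

```python
def admissible_new_ids(execution):
    """New identifiers: a crashed identifier never takes another step."""
    crashed = set()
    for e in execution:
        if e[0] == "crash":
            crashed |= e[1]
        elif e[1] in crashed:
            return False
    return True


def admissible_old_ids(execution):
    """Old identifiers: the first step after a crash is a new invocation."""
    must_invoke = set()
    for e in execution:
        if e[0] == "crash":
            must_invoke |= e[1]
        else:
            kind, pid = e
            if pid in must_invoke:
                if kind != "inv":
                    return False
                must_invoke.discard(pid)
    return True
```

An execution in which crashed identifiers never reappear satisfies both checks, while one in which a crashed identifier resumes with a fresh invocation satisfies only the old identifiers check.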

As we next prove, given an implementation in the crash-stop model, using it as is in the old identifiers crash-recovery model may result in a sequential execution in which a process returns an invalid response. Therefore, it is not trivial to transform an implementation from the crash-stop model to the old identifiers crash-recovery model. On the other hand, any execution in the new identifiers crash-recovery model with crash events is indistinguishable to all non-crashed processes from an execution in the crash-stop model in which every crashed process simply halts, and vice-versa. Thus, by abuse of notation, we can consider the same execution in both models in the context of deriving proofs for a given implementation.

For this reason, all results in this work concern the new identifiers crash-recovery model, and we do not state the model explicitly. We note that all impossibility results in this paper also hold for the old identifiers crash-recovery model. This stems from the fact that an execution in the new identifiers crash-recovery model can be seen as an execution in the old identifiers crash-recovery model when $n$, the total number of processes in the system, is larger than the number of processes taking steps in the execution.

Lemma 1.

There exists a sequential implementation $I$ of an object type $\tau$ in the crash-stop model providing sequential liveness, such that $I$ is not consistent with the sequential specification of $\tau$ in the old identifiers system-wide crash-recovery model.

To prove the claim we present an implementation $I$ of type $\tau$, and construct an execution in the old identifiers system-wide crash-recovery model, such that a process invokes and completes in a crash-free manner an operation $op$; however, $op$ returns a response that is not consistent with the sequential specification of $\tau$ (sequential liveness in the lemma statement simply requires that a process running sequentially will complete its operation with a matching response). Notice that the lemma holds for the more restricted system-wide crash model, hence it holds for the individual crash model as well. Moreover, the lemma holds even for the weak sequential liveness progress condition, where no concurrency is allowed.

Private variables: SWSR register initially 0