1 Introduction
The question of universality is central in every area of computing: what can be done with a given tool in a given context and, more importantly, what cannot be done with it? In sequential computing, universality is embodied by the Turing machine, which can compute everything that is computable. In the context of distributed systems, we have known since 1985 and the famous FLP impossibility result [7] that the consensus problem has no deterministic solution in an asynchronous distributed system where even one process can fail by crashing. This impossibility is not due to the computing power of the individual processes, but rather to the difficulty of coordination between the different processes that compose the distributed system. Coordination and agreement problems are thus at the heart of computability in distributed systems.
A distributed system can be abstracted as a set of processes concurrently accessing a set of concurrent objects. The implementation of these objects is based on read/write registers and hardware instructions. Finding correct and efficient implementations of usual objects (queues, stacks, etc.) is far from trivial [9, 14, 16] when the system is failure prone. Intuitively, a “good” implementation of a concurrent object has to satisfy two kinds of properties: a consistency condition and a progress condition. The consistency condition specifies the safety property, that is, the meaningfulness of the returned results, and the progress condition specifies the liveness guarantees.
Linearizability [10] is a consistency criterion. It ensures that all the operations in a distributed history appear as if they were executed sequentially: each operation appears at a single point in time, between its start event and its end event. Such a consistency criterion gives processes the illusion of accessing a physical concurrent object. However, linearizable implementations are often costly, when they are not impossible.
Definition 1 (Linearizability).
An execution is linearizable if all operations return the same value as if they occurred instantaneously at some point in time, called the linearization point, between their invocation and their response, possibly after removing some non-terminated operations.
The use of locks in the implementation may cause blocking in a system where processes can crash. Prohibiting the use of locks leads to several progress conditions, among which wait-freedom and lock-freedom. While wait-freedom guarantees that every operation terminates after a finite time, lock-freedom guarantees that, if the computation runs for long enough, at least one process makes progress (this may lead some processes to starve). Wait-freedom is thus stronger than lock-freedom: while lock-freedom is a system-wide condition, wait-freedom is a per-process condition.
Definition 2 (Wait-freedom).
An execution is wait-free if no operation takes an infinite number of steps in this execution.
Maurice Herlihy proved that consensus is universal in classical distributed systems composed of a fixed set of processes. Namely, any object having a sequential specification has a wait-free implementation using only read/write registers (memory locations) and some number of consensus objects.
To prove the universality of consensus, Herlihy introduced the notion of universal construction (a small guided tour of universal constructions can be found in [15]). It is a generic algorithm that, given a sequential specification of any object whose operations are total (meaning that any operation on the object can be called and the call returns regardless of the state of the object), provides a concurrent implementation of this object. Since then, many universal constructions have been proposed for several objects, assuming the availability of special hardware instructions that provide the same computing power as consensus, like compare&swap (CAS), Load-Link/Store-Conditional (LL/SC), etc.
Over the last decade, first with peer-to-peer systems, and then with multi-core machines and the multi-threading model, the assumption of a closed system with a fixed number of processes, in which every process knows the identifiers of all processes, became too restrictive. This motivated the infinite arrival model. In this model, any number of processes can crash (or leave, which is handled in the same way as in the classical model), but any number of processes, finite or not, can also join the network. When a process joins such a system, it is not known to the already running processes, so no fixed number of processes can be used as a parameter in the implementations. Three kinds of arrival models can be distinguished: the bounded arrival model, in which at most $n$ processes may participate; the unbounded arrival model, in which a finite number of processes participate in each execution but the number of participants is unknown to them; and the infinite arrival model, in which new processes may keep arriving during the whole execution. Let us note that, at any time, the number of processes that have already joined the system is finite, but it can grow without bound. The classical system is a special case of the bounded arrival model in which all the processes arrive at once. In this article, we focus on the infinite arrival model.
The aim of this paper is to extend the universality of consensus to the infinite arrival model. Solutions to the consensus problem have already been investigated for the infinite arrival model [2, 5, 13], and consensus has been used as a base for reasoning about computability in this model. The question is thus “is it possible to build a universal wait-free linearizable construction based on consensus objects and read/write atomic registers?” This is not trivial for two reasons. First, although lock-free universal constructions still work in the infinite arrival model because they ensure a global progress condition, this is no longer the case for wait-free universal constructions. Second, wait-free implementations rely on a so-called helping mechanism, which has recently been formalized in [4]. This mechanism requires any process, before terminating its operation, to help processes having pending operations, in order to reach wait-freedom. One of the difficulties in the infinite arrival model is that helping is not obvious. Indeed, helping requires at least that a process needing help is able to announce its existence to other processes willing to help it. Due to the infinite number of potential participating processes over time, it is not reasonable to assume that each process can write in a dedicated register, and to require helping processes to read them all. When only consensus objects and read/write registers are accessible to a process, a newly arriving process must compete with a potentially infinite number of other arriving processes on either a consensus object or the same memory location, and it may fail on all its attempts.
This paper has two main contributions.
- Similarly to [6], which first proposes a Collect object that is then used as a building block for a universal construction, we first propose a construction that implements a weak log object. This log is used as an attendance list in which each arriving process registers. We propose two implementations of the weak log, using respectively consensus objects and the special hardware instruction compare&swap.
- We propose a universal construction based on consensus objects and the weak log object in the infinite arrival model where, moreover, processes are anonymous. This proves that consensus is universal even in this model.
Organization of the paper
The remainder of this paper is organized as follows. We first present the infinite arrival model in Section 2. Then, Section 3 introduces a new abstraction, called the weak log. In Section 4, we show how the weak log can be used together with consensus to implement a wait-free universal construction. In Section 5, we give an implementation of the weak log using consensus objects. In Section 6, we discuss a simpler algorithm that uses compare&swap instead of consensus to implement a weak log. Finally, Section 7 concludes the paper.
2 System Model
The infinite arrival model is composed of an infinite set of processes. Processes communicate through an infinite memory composed of unbounded registers (memory addresses of an infinite memory are unbounded, so this assumption is necessary to store references). Processes can only access atomic memory locations through references or thanks to a memory allocation mechanism that creates a finite number of memory locations each time it is invoked. In other words, it is not possible to create an infinite array in this model. Memory is composed of three kinds of registers: immutable registers, read/write registers and consensus objects, defined below.
- An immutable register is initialized with a fixed value and cannot be modified later. It provides a read operation that returns the internal value of the register.
- A read/write register provides two operations: a read operation that returns the internal value of the register, and a write operation that replaces the current value of the register by the written value.
- A consensus object provides two operations: a read operation similar to the read operation on an immutable register, and a propose operation that atomically checks whether the object still holds its initial value, sets the value of the object to the proposed value if it is the case, and finally returns the value of the object. In other words, the first proposed value is written on the object, which becomes immutable from this point on. (This definition is close to the sticky bit and is not exactly the same as commonly accepted definitions of consensus. However, it is easy to implement it using a regular consensus task and one read/write register that holds the initial value until it is written when the consensus is decided. We use this definition for the sake of clarity of the proposed algorithms.)
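The three kinds of registers can be sketched in Python as follows. This is only an illustration under stated assumptions: the class and method names, and the use of `None` as the initial value of a consensus object, are choices made here and not the paper's notation. A lock stands in for the atomicity that the model attributes to base objects; it says nothing about how such objects are built in hardware.

```python
import threading

BOTTOM = None  # assumed marker for the initial, undecided value


class ImmutableRegister:
    """Holds a value fixed at creation time; it can only be read."""

    def __init__(self, value):
        self._value = value

    def read(self):
        return self._value


class ReadWriteRegister:
    """Atomic read/write register (the lock only simulates atomicity)."""

    def __init__(self, value=None):
        self._value = value
        self._lock = threading.Lock()

    def read(self):
        with self._lock:
            return self._value

    def write(self, value):
        with self._lock:
            self._value = value


class ConsensusObject:
    """Sticky object: the first proposed value is kept forever."""

    def __init__(self):
        self._value = BOTTOM
        self._lock = threading.Lock()

    def read(self):
        with self._lock:
            return self._value

    def propose(self, value):
        # Atomically: install `value` only if nothing was decided yet,
        # then return the decided value (which may be someone else's).
        with self._lock:
            if self._value is BOTTOM:
                self._value = value
            return self._value
```

With this sketch, a second `propose` on the same object returns the value installed by the first one, which is the behaviour the universal construction relies on. Proposed values must differ from `BOTTOM`.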
An execution in the infinite arrival model is a (finite or infinite) sequence of steps. In each step, one process either executes a local transition of its algorithm, invokes an operation available on a register, or returns a value. A process that has already returned a value cannot execute a step afterwards. Processes are asynchronous, in the sense that there is no constraint on which process takes each step of an execution: a process may take an unbounded but finite number of consecutive steps, or wait an unbounded but finite number of other processes’ steps between two of its own steps. Moreover, it is possible that a process stops taking steps at some point in the execution even if it has not returned yet, which is similar to a crash in the classical model.
We say that a process arrives in an execution at the time of its first step during this execution. Note that, although the number of processes in the system is infinite, the number of processes that have arrived in the system at any step is finite.
Processes are anonymous, in the sense that they do not know their unique identifier: this hypothesis models the fact that it would be unreasonably costly to base an algorithm on unbounded identifiers, especially when the attribution of identifiers is not controlled by the arrival order of processes in the system. Moreover, by assuming anonymous processes, the result becomes more general.
3 The Weak Log Abstraction
As explained previously, the main difficulty in building a wait-free universal construction in the infinite arrival model resides in replacing helping mechanisms already in place in algorithms for the finite arrival model. In this section, we introduce the weak log, an abstraction that allows each process to announce its own value.
In the infinite arrival model, long-lived linearizable objects can be expressed as distributed tasks, as a process that invokes several operations on a long-lived object can be modelled as several processes each issuing a unique operation [3]. We now define the weak log abstraction as a distributed task.
In an instance of the weak log, each process proposes a value through an append operation, which returns the sequence of all the values previously appended. The weak log is wait-free but not linearizable, in the sense that there might be no inclusion between the sequences returned by different processes. Instead, it is specified that the value proposed by a correct process will eventually appear in all subsequent sequences. Moreover, the order of the values inside sequences is consistent over the different sequences returned by all processes.
Definition 3 (weak log).
All processes propose distinct values by invoking the append operation, which returns a finite sequence such that:
- Validity
All values in a sequence have been appended by some process.
- Suffixing
If a process terminates its invocation, then its value is at the end of its returned sequence.
- Total order
If two processes terminate their invocations, then all pairs of values that both returned sequences contain appear in the same order in both sequences.
- Eventual visibility
If some process terminates its invocation, then, eventually, all processes terminating their invocation will return a sequence containing its value. In other words, the number of returned sequences that do not contain this value is finite.
- Wait-freedom
No process takes an infinite number of steps in an execution.
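The three safety properties of the definition can be checked on a finite set of returned sequences. The following Python sketch is a hypothetical helper (the function name and its input format are assumptions made for illustration). Eventual visibility and wait-freedom are liveness properties and cannot be decided from a finite set of results, so they are deliberately left out.

```python
from itertools import combinations


def check_weak_log(returned, appended=None):
    """Return the name of the first violated safety property, or None.

    `returned` maps each value whose append terminated to the sequence
    returned by that append. `appended` is the set of all values proposed
    so far (defaults to the keys of `returned`).
    """
    if appended is None:
        appended = set(returned)
    for value, seq in returned.items():
        # Validity: every value of a sequence was appended by some process.
        if not set(seq) <= set(appended):
            return "validity"
        # Suffixing: the caller's own value closes its returned sequence.
        if not seq or seq[-1] != value:
            return "suffixing"
    # Total order: values common to two sequences appear in the same order.
    for s, t in combinations(returned.values(), 2):
        common = set(s) & set(t)
        if [x for x in s if x in common] != [x for x in t if x in common]:
            return "total order"
    return None
```

For instance, the sequences `["a"]`, `["a", "b"]` and `["c"]` returned for `a`, `b` and `c` are accepted even though the third one contains neither `a` nor `b`: the weak log does not require inclusion between returned sequences.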
4 A Universal Construction
Algorithm 1 presents a universal construction using a weak log and consensus objects. This algorithm is similar to the one presented in [9], except that the array of single-writer/multiple-reader registers used by processes to announce their operations is replaced by a weak log. The shared object to implement is represented by an initial state initialState and a set of operations that change the state of the object and return a value.
Processes share two variables:
- announce is a weak log in which processes append their own invocation;
- operations is a consensus object at the tail of a linked list of operations. The list is a succession of nodes of the form ⟨invoc, cons⟩, where invoc is the invocation of a process and cons is a consensus object, referencing another node after the consensus has been won by some process.
When a process invokes an operation invoc on the object, it first appends invoc to announce and obtains a list of invocations in return. Then, it attempts to insert the invocations of this list at the end of the list operations until all of them have been inserted. While traversing the list operations, it maintains a state of the implemented object, initialized to initialState, on which all invocations are applied in their order of appearance in the list.
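The behaviour described above can be sketched in Python. This is a reading of the prose, not the paper's Algorithm 1: the names `UniversalObject`, `invoke`, `apply` and `to_help` are assumptions, and `LockedLog` is a lock-based stand-in for the weak log that is not wait-free and only serves to make the sketch runnable. Invocations are assumed to be distinct, and states are assumed to be immutable values.

```python
import threading


class ConsensusObject:
    """Sticky consensus object; the lock only simulates atomicity."""

    def __init__(self):
        self._value = None
        self._lock = threading.Lock()

    def propose(self, value):
        with self._lock:
            if self._value is None:
                self._value = value
            return self._value


class LockedLog:
    """Stand-in for the weak log (NOT wait-free, illustration only)."""

    def __init__(self):
        self._values = []
        self._lock = threading.Lock()

    def append(self, value):
        with self._lock:
            self._values.append(value)
            return list(self._values)  # ends with the caller's own value


class Node:
    def __init__(self, invoc):
        self.invoc = invoc
        self.cons = ConsensusObject()  # decides the next node of the list


class UniversalObject:
    def __init__(self, initial_state, apply):
        self.initial_state = initial_state
        self.apply = apply  # apply(state, invoc) -> (new_state, result)
        self.announce = LockedLog()
        self.operations = ConsensusObject()

    def invoke(self, invoc):
        to_help = self.announce.append(invoc)
        state, cons, result = self.initial_state, self.operations, None
        while to_help:
            # Propose the oldest announced invocation not yet seen in the list.
            node = cons.propose(Node(to_help[0]))
            state, res = self.apply(state, node.invoc)
            if node.invoc == invoc:
                result = res
            if node.invoc in to_help:
                to_help.remove(node.invoc)
            cons = node.cons
        return result
```

Because every process traverses the list from its beginning and removes each decided invocation from `to_help`, it never proposes an invocation that already sits in the list, which is the uniqueness argument used in the linearizability proof. A counter can be obtained with `UniversalObject(0, lambda s, inv: (s + 1, s + 1))` and invocations such as `(0, "inc")`, where the first component only makes invocations distinct.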
Lemma 1 (Linearizability).
All executions admissible by Algorithm 1 are linearizable.
Consider an execution admissible by Algorithm 1.
Let us first remark that, for any operation invoc invoked by a process, at most one memory location cons holds a node containing invoc. Indeed, suppose this is not the case, and let us consider the first two such memory locations, cons and cons′. Both nodes were proposed in Algorithm 1 by some processes $p$ and $p'$. As operations are totally ordered in a list, process $p'$ accessed cons before accessing cons′. After accessing cons, $p'$ removed invoc from its list of invocations to insert, which contradicts the fact that $p'$ proposed invoc on cons′.
Let us define the linearization point of any operation invoc for which there is a memory location cons holding a node that contains invoc as the first step in which some process invoked the propose operation of cons.
We now prove that any operation invoc done by a terminating process $p$ has a linearization point, between its invocation and termination point. By the validity property of announce, and as all invoc values are different, no process proposes invoc before $p$ arrived in the system. By the suffixing property of announce, at the beginning of the loop of $p$, invoc belongs to the list that $p$ obtained from announce. When $p$ terminates, this list is empty. Therefore, invoc was removed from the list in some iteration of the loop, and the memory location cons accessed in this iteration holds a node containing invoc at the end of the iteration.
Lemma 2 (Wait-freedom).
All executions admissible by Algorithm 1 are wait-free.
Suppose there is an execution admissible by Algorithm 1 that is not wait-free. It means that some process $p$ takes an infinite number of steps in this execution. By the wait-freedom property of announce, $p$ enters the while loop after a finite number of steps, and each iteration of the loop terminates. Therefore, $p$ executes an infinite number of loop iterations. Consider the list of invocations that $p$ obtained from announce. As this list is finite and $p$ proposes one of its elements at each iteration, there exists an invocation invoc that $p$ proposes an infinite number of times.
Let $(v_k)_{k \in \mathbb{N}}$ be the infinite sequence of the invocations decided in the consensus objects on which $p$ proposes invoc. For each $k$, let $q_k$ be the process whose step installed $v_k$ in the consensus object, and let $r_k$ be the process that invoked $v_k$.
As processes always propose the first invocation of their list that was not inserted in the list operations yet, there is an infinite number of indices $k$ such that either
- invoc is not part of the list obtained by $q_k$, or
- invoc is part of the list obtained by $q_k$, but appears after $v_k$ in this list.
By the eventual visibility property, the first case only concerns a finite number of indices $k$, so the second case concerns an infinite number of them.
For each of them, by the suffixing property, the process $r_k$ that invoked $v_k$ obtained $v_k$ as the last value of its own list. By the total order property, it is impossible that $q_k$ obtains $v_k$ before invoc and $r_k$ obtains invoc before $v_k$. Therefore, the list obtained by $r_k$ does not contain invoc. However, this contradicts the eventual visibility property, which prevents an infinite number of processes from ignoring invoc.
This contradiction shows that the assumption of a non-wait-free execution is absurd. ∎
5 From Consensus to the Weak Log
The main difficulty in the implementation of a weak log lies in the allocation of one memory location per process, where it can safely announce its invoked operation. As it is impossible to allocate an infinite array at once, it is necessary to build a data structure in which processes allocate their own piece of memory and make it reachable to other processes by winning a consensus. The list of Algorithm 1 follows a similar pattern, but it poses a challenge: as an infinite number of processes access the same sequence of consensus objects, one process may lose all its attempts to insert its own node, breaking wait-freedom.
Algorithm 2 solves this issue by using a novel feature, which we call passive helping: when a process wins a consensus, it creates a side list to host values of processes concurrently competing on the same consensus object. As only a finite number of processes have arrived in the system when the consensus is won, a finite number of processes will try to insert their value in the side list, which ensures termination. Figure 1 presents an execution of Algorithm 2.
Processes executing Algorithm 2 share two variables: first and last defined as follows.
- first is a consensus object that accepts pairs to be proposed. Each pair is made of a consensus object that stores values of the same type as first, and of a node of the side list. A node is itself a pair made of a value appended by some process and of a consensus object containing values of the same type as node. In other words, first references the beginning of a list of lists of appended values.
- last is a read/write register referencing a consensus object of the same type as first, and initialized to the address of first. In the absence of concurrency, last references the end of the list starting with first.
When a process invokes the append operation with a value, it first reads last, then proposes a list containing its value as the successor of the list it read, and writes the value returned by the consensus in last. If the process loses the consensus, it inserts its value in the side list of the successor. After that, it traverses the list of lists and concatenates the values it reads to build the sequence it returns.
Note that the consensus and the subsequent write in last are not done atomically. This means that a very old value can be written in last, in which case its value could move backward. The central property of the algorithm, proved by Lemma 3, is that last eventually moves forward, allowing very slow processes to find some place in a side list.
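The following Python sketch is one possible reading of this description, reconstructed from the prose only; it should not be taken as the paper's Algorithm 2. The names `WeakLog`, `ListCell`, `SideNode` and the choice of writing the winner's `next` consensus object in `last` are assumptions. Appended values are assumed to be distinct, and locks only simulate the atomicity of the base objects.

```python
import threading


class ConsensusObject:
    def __init__(self):
        self._value = None
        self._lock = threading.Lock()

    def read(self):
        with self._lock:
            return self._value

    def propose(self, value):
        with self._lock:
            if self._value is None:
                self._value = value
            return self._value


class SideNode:
    def __init__(self, value):
        self.value = value
        self.next = ConsensusObject()  # next node of the same side list


class ListCell:
    def __init__(self, value):
        self.node = SideNode(value)    # head of the side list
        self.next = ConsensusObject()  # next cell of the list of lists


class WeakLog:
    def __init__(self):
        self.first = ConsensusObject()
        self.last = self.first         # plays the role of the register last

    def append(self, value):
        cons = self.last               # read last
        mine = ListCell(value)
        won = cons.propose(mine)
        self.last = won.next           # write, not atomic with the consensus
        if won is not mine:
            # Passive helping: join the side list opened by the winner.
            node, my_node = won.node, SideNode(value)
            while node.next.propose(my_node) is not my_node:
                node = node.next.read()
        # Collect: traverse the list of lists up to the caller's own value.
        seq, cell = [], self.first.read()
        while cell is not None:
            node = cell.node
            while node is not None:
                seq.append(node.value)
                if node.value == value:
                    return seq
                node = node.next.read()
            cell = cell.next.read()
        raise AssertionError("own value should always be reachable")
```

In a sequential run, appending `"a"` then `"b"` returns `["a"]` and then `["a", "b"]`. Under concurrency, a side list may still grow after a process has traversed it, which is why two returned sequences need not be included in one another while still agreeing on the order of their common values.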
We first prove by induction on the list of lists of the weak log that, for all $i$, the number of write operations of $\mathit{list}_i$ in last is finite.
Initially, first is never written in last, because only values decided by a consensus are written, and first is never proposed.
We now prove that, if the number of writes of $\mathit{list}_i$ in last is finite, then the number of writes of $\mathit{list}_{i+1}$ in last is finite.
We prove the contrapositive proposition: if the number of writes of $\mathit{list}_{i+1}$ in last is infinite, then the number of writes of $\mathit{list}_i$ in last is infinite as well.
In order to write $\mathit{list}_{i+1}$ in last, a process needs to read $\mathit{list}_i$ in last. As $\mathit{list}_{i+1}$ is written an infinite number of times, $\mathit{list}_i$ is read an infinite number of times and, necessarily, $\mathit{list}_i$ is written an infinite number of times as well.
Lemma 4 (Validity).
All values in the sequence have been appended by some process.
Lemma 5 (Suffixing).
If a process terminates its invocation, then its value is at the end of its returned sequence.
Definition 4 formalizes the order in which values are ordered in the weak log. Intuitively, this order is the concatenation of all the side lists. In Algorithm 2, the list of lists is traversed in this precedence order, which ensures the consistency of the order of all returned sequences (Total order property of the weak log).
Definition 4 (Precedence).
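A value $v$ precedes a value $v'$ in the weak log if:
- $v$ is in a list $l$ and $v'$ is in a list $l'$ such that there exists a sequence of lists $l_1, \dots, l_k$ with $l_1 = l$, $l_k = l'$ and, for all $i < k$, $l_{i+1}$ is the successor of $l_i$ in the list of lists; or
- $v$ and $v'$ are in the same list and there exists a sequence of nodes $n_1, \dots, n_k$ with $n_1$ containing $v$, $n_k$ containing $v'$ and, for all $i < k$, $n_{i+1}$ is the successor of $n_i$ in the side list.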
We extend the notion of precedence to lists and nodes using the same definition.
Lemma 6 (Total order).
If two processes terminate their invocations, then all pairs of values that both of their returned sequences contain appear in the same order.
Let us remark that both processes append values to their returned sequence following the precedence order defined by Definition 4. Therefore, any two values contained in both returned sequences have been appended at the end of both sequences in the same order, which proves the lemma. ∎
Lemma 7 (Eventual visibility).
If some process terminates its invocation, then the number of returned sequences that do not contain its value is finite.
Let $p$ be a process that terminates its invocation with value $v$, and let us denote by $l$ and $n$ the list and the node that $p$ references when it terminates.
Let us suppose (by contradiction) that there is an infinite number of processes whose returned sequence does not contain $v$. An infinite number of them started their operation after $p$ returned. For each such process $q$, let $l_q$ and $n_q$ be the list and the node that $q$ references when it terminates its execution. As the collect loop respects the precedence order of Definition 4, for an infinite number of processes $q$, $n_q$ precedes $n$. As there is only a finite number of lists preceding $l$ (because $p$ terminates), an infinite number of processes have the same value of $l_q$. All of them read the same value of last and wrote in last afterwards. This contradicts Lemma 3. ∎
Lemma 8 (Wait-freedom).
No process takes an infinite number of steps in an execution.
Let us suppose that there exists an execution in which some process $p$ takes an infinite number of steps while trying to append a value $v$. This means that at least one of the two loops of Algorithm 2 (the insertion in a side list and the collect traversal) iterates an infinite number of times:
- If the insertion loop iterates an infinite number of times, it means that $p$ loses the consensus on an infinite number of nodes. This implies that an infinite number of values are appended in the same side list, which means that an infinite number of processes read the same value of last and then wrote in last, which contradicts Lemma 3.
- If the collect loop iterates an infinite number of times, this means that $p$ never reads $v$, and as there is a finite number of lists that precede the list in which $v$ has been appended, one of these lists contains an infinite number of nodes. All these nodes were created by processes reading the same value of last, which contradicts Lemma 3.
Both cases lead to a contradiction, so the algorithm is wait-free. ∎
6 From Compare&Swap to the Weak Log
In multi-threaded environments, consensus is usually replaced by special operations like compare&swap. In this section, we replace read/write registers and consensus objects by CAS registers. A CAS register provides two operations: a read operation working as the read operation of the immutable register, and a compare&swap operation that takes an expected value and a new value, atomically checks whether the current value of the register is the expected value and, if it is the case, changes it to the new value and returns true. Otherwise, compare&swap returns false without changing the value of the register. On the one hand, like in the finite arrival model, consensus objects and CAS registers are computationally equivalent: a consensus object can be easily simulated by implementing the proposal of a value as a compare&swap that replaces the initial value of the register by the proposed value, and conversely, compare&swap can be implemented using consensus as proved in the previous section. On the other hand, compare&swap is more flexible as it allows writing several times in the same register.
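The simulation of a consensus object by a CAS register mentioned above can be illustrated by the following Python sketch. The class names, the method name `compare_and_swap` and the use of `None` as initial value are assumptions made for the example; values are compared by identity and a lock simulates the atomicity of the hardware instruction.

```python
import threading


class CASRegister:
    def __init__(self, value=None):
        self._value = value
        self._lock = threading.Lock()

    def read(self):
        with self._lock:
            return self._value

    def compare_and_swap(self, expected, new):
        # Atomically replace `expected` by `new`; report whether it happened.
        with self._lock:
            if self._value is expected:
                self._value = new
                return True
            return False


class CASConsensus:
    """Consensus object simulated on top of one CAS register."""

    def __init__(self):
        self._reg = CASRegister(None)

    def read(self):
        return self._reg.read()

    def propose(self, value):
        # Only the first compare&swap from the initial value succeeds;
        # every caller then returns the value that was installed.
        self._reg.compare_and_swap(None, value)
        return self._reg.read()
```

Only one `compare_and_swap(None, ...)` can ever succeed on a given register, so all callers of `propose` read the same decided value afterwards. Proposed values must not be `None` in this sketch.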
Algorithm 3 presents a simplified implementation of the weak log based on CAS registers instead of consensus objects. There are two main differences between Algorithm 3 and Algorithm 2. First, values are stored in a list instead of a list of lists. Second, the list is managed as a stack, and not as a queue. Figure 2 illustrates an execution of the algorithm.
Processes share a unique CAS register last, that stores either the initial value, empty, or nodes of the form ⟨head, tail⟩, where head is a value appended by a process and tail has the same type as last.
When a process $p$ invokes the append operation with a value $v$, it first attempts to add $v$ at the top of the stack, by winning a compare&swap on last. If this attempt fails and a value $v'$ was inserted concurrently by a process $p'$, $p$ repeatedly tries to insert $v$ after $v'$. Similarly to Algorithm 2, $p$ eventually succeeds because it only competes in this task with the finite set of processes that read last before $p'$ won its compare&swap. After $p$ has inserted $v$ in the list, it traverses the list to its end to build its return sequence. Traversal is done in a last-in-first-out manner, so the values read are added at the beginning of the returned sequence.
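The stack-based variant can be sketched as follows. This is a reconstruction from the prose above and not the paper's Algorithm 3: the class names, the retry strategy on the `tail` register and the use of `None` for `empty` are assumptions. Appended values are assumed to be distinct, and a lock simulates the atomicity of compare&swap.

```python
import threading
from collections import deque


class CASRegister:
    def __init__(self, value=None):
        self._value = value
        self._lock = threading.Lock()

    def read(self):
        with self._lock:
            return self._value

    def compare_and_swap(self, expected, new):
        with self._lock:
            if self._value is expected:
                self._value = new
                return True
            return False


class Node:
    def __init__(self, head, below):
        self.head = head                # value appended by a process
        self.tail = CASRegister(below)  # rest of the stack (None = empty)


class CASWeakLog:
    def __init__(self):
        self.last = CASRegister(None)

    def append(self, value):
        top = self.last.read()
        mine = Node(value, top)
        if not self.last.compare_and_swap(top, mine):
            # Another value reached the top concurrently: insert just below it.
            winner = self.last.read()
            while True:
                below = winner.tail.read()
                mine = Node(value, below)
                if winner.tail.compare_and_swap(below, mine):
                    break
        # Traverse from the caller's node down to the bottom of the stack,
        # prepending values so that older values come first.
        seq, node = deque(), mine
        while node is not None:
            seq.appendleft(node.head)
            node = node.tail.read()
        return list(seq)
```

Since the traversal starts at the caller's own node and prepends what it reads, the caller's value ends the returned sequence. Nodes are only ever inserted into the chain, never moved, so two traversals agree on the order of the values they both read.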
7 Conclusion
Consensus is a central problem in distributed computing, because it allows wait-free linearizable implementations of all objects with a sequential specification, in systems composed of asynchronous processes that may crash. In this paper, we asked whether this result still holds in the infinite arrival model, in which a potentially infinite number of processes can arrive and leave during an execution. We answered this question positively by giving a wait-free and linearizable universal construction using only consensus objects and read/write registers.
Our proposed construction is based on two lists containing all the operations invoked on the object. An interesting question is whether it is possible to reduce the memory cost by eventually removing the operations from both lists. Our approach does not allow such optimizations, as it relies on a “passive helping” mechanism, in which processes winning a consensus instance allocate memory locations to host the values of potential processes that lose all the consensus instances they are involved in.
This work was partially supported by the French ANR project 16-CE25-0005 O’Browser devoted to the study of decentralized applications on Web browsers.
References
[1] (2011) From bounded to unbounded concurrency objects and back. In Proceedings of the 30th annual ACM SIGACT-SIGOPS symposium on Principles of distributed computing, pp. 119–128.
[2] Wait-free consensus with infinite arrivals. In Proceedings of the thirty-fourth annual ACM symposium on Theory of computing, pp. 524–533.
[3] (2017-05) Long-Lived Tasks. In NETYS 2017 - 5th International Conference on NETworked sYStems, Vol. 10299, Marrakech, Morocco, pp. 439–454.
[4] (2015) Help!. In Proceedings of the 2015 ACM Symposium on Principles of Distributed Computing, pp. 241–250.
[5] (2005) Active disk paxos with infinitely many processes. Distributed Computing 18 (1), pp. 73–84.
[6] (2014) Highly-efficient wait-free synchronization. Theory Comput. Syst. 55 (3), pp. 475–520.
[7] (1985) Impossibility of distributed consensus with one faulty process. J. ACM 32 (2), pp. 374–382.
[8] (2013) Power and limits of distributed computing shared memory models. Theor. Comput. Sci. 509, pp. 3–24.
[9] (2008) The art of multiprocessor programming. Morgan Kaufmann.
[10] (1990) Linearizability: A correctness condition for concurrent objects. ACM Trans. Program. Lang. Syst. 12 (3), pp. 463–492.
[11] (1988) Impossibility and universality results for wait-free synchronization. In Proceedings of the Seventh Annual ACM Symposium on Principles of Distributed Computing, Toronto, Ontario, Canada, August 15-17, 1988, pp. 276–290.
[12] (1991) Wait-free synchronization. ACM Transactions on Programming Languages and Systems (TOPLAS) 13 (1), pp. 124–149.
[13] (2003) Resilient consensus for infinitely many processes. In International Symposium on Distributed Computing, pp. 1–15.
[14] (2013) Concurrent programming - algorithms, principles, and foundations. Springer.
[15] (2017) Distributed universal constructions: a guided tour. Bulletin of the EATCS 121.
[16] (2018) Distributed computing pearls. Synthesis Lectures on Distributed Computing Theory, Morgan & Claypool Publishers.