Peregrine 2.0: Explaining Correctness of Population Protocols through Stage Graphs

07/15/2020 ∙ by Javier Esparza, et al. ∙ Technische Universität München

We present a new version of Peregrine, the tool for the analysis and parameterized verification of population protocols introduced in [Blondin et al., CAV'2018]. Population protocols are a model of computation, intensely studied by the distributed computing community, in which mobile anonymous agents interact stochastically to perform a task. Peregrine 2.0 features a novel verification engine based on the construction of stage graphs. Stage graphs are proof certificates, introduced in [Blondin et al., CAV'2020], that are typically succinct and can be independently checked. Moreover, unlike the techniques of Peregrine 1.0, the stage graph methodology can verify protocols whose executions never terminate, a class including recent fast majority protocols. Peregrine 2.0 also features a novel proof visualization component that allows the user to interactively explore the stage graph generated for a given protocol.




1 Introduction

We present Peregrine 2.0, a tool for analysis and parameterized verification of population protocols. Population protocols are a model of computation, intensely studied by the distributed computing community, in which an arbitrary number of indistinguishable agents interact stochastically in order to decide a given property of their initial configuration. For example, agents could initially be in one of two possible states, “yes” and “no”, and their task could consist of deciding whether the initial configuration has a majority of “yes” agents or not.

Verifying correctness and/or efficiency of a protocol is a very hard problem, because the semantics of a protocol is an infinite collection of finite-state Markov chains, one for each possible initial configuration. Peregrine 1.0 [5] was the first tool for the automatic verification of population protocols. It relies on theory developed in [6], and is implemented on top of the Z3 SMT-solver.

Peregrine 1.0 could only verify protocols whose agents eventually never change their state (and not only their answer). This constraint has become increasingly restrictive, because it is not satisfied by many efficient and succinct protocols recently developed for different tasks [1, 4, 2]. Further, Peregrine 1.0 was unable to provide correctness certificates and the user had to trust the tool. Finally, Peregrine 1.0 did not provide any support for computing parameterized bounds on the expected number of interactions needed to reach a stable consensus, i.e., bounds like “ interactions, where is the number of agents”.

Peregrine 2.0 addresses these three issues. It features a novel verification engine based on theory developed in [7, 3], which, given a protocol and a task description, attempts to construct a stage graph. Stage graphs are proof certificates that can be checked by independent means, and not only prove the protocol correct, but also provide a bound on its expected time-to-consensus. Stages represent milestones reached by the protocol on the way to consensus. Stage graphs are usually small, and help designers to understand why a protocol works. The second main novel feature of Peregrine 2.0 is a visualization component that offers a graphical and explorable representation of the stage graph.

The paper is organized as follows. Section 2 introduces population protocols and sketches the correctness proof of a running example. Section 3 describes the stage graph generated for the example by Peregrine 2.0, and shows that it closely matches the human proof. Section 4 describes the visualization component.

2 Population protocols

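A population protocol consists of a set of states $Q$ with a subset $I \subseteq Q$ of initial states, a set of transitions $T \subseteq Q^2 \times Q^2$, and an output function $O \colon Q \to \{0, 1\}$ assigning to each state a boolean output. Intuitively, a transition $q_1, q_2 \mapsto q_3, q_4$ means that two agents in states $q_1, q_2$ can interact and simultaneously move to states $q_3, q_4$. A configuration is a mapping $C \colon Q \to \mathbb{N}$, where $C(q)$ represents the number of agents in a state $q$. An initial configuration is a mapping $C \colon I \to \mathbb{N}$. A configuration has consensus $b \in \{0, 1\}$ if all agents are in states with output $b$. We write configurations using a set-like notation, e.g. $\lbag y, n, n \rbag$ or $\lbag y, 2 \cdot n \rbag$ is the configuration $C$ where $C(y) = 1$, $C(n) = 2$, and $C(q) = 0$ for $q \notin \{y, n\}$.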

Running example: Majority Voting.

The goal of this protocol is to conduct a vote by majority in a distributed way. The states are $\{Y, N, y, n\}$. Initially, all agents are in state $Y$ or $N$, according to how they vote. The goal of the protocol is that the agents determine whether at least 50% of them vote “yes”.

The output function is $O(Y) = O(y) = 1$ and $O(N) = O(n) = 0$. When two agents interact, they change their state according to the following transitions:

$a \colon Y, N \mapsto y, n$
$b \colon Y, n \mapsto Y, y$
$c \colon N, y \mapsto N, n$
$d \colon y, n \mapsto y, y$

Intuitively, agents are either active ($Y$, $N$) or passive ($y$, $n$). By transition $a$, when active agents with opposite opinions meet, they become passive. Transitions $b$ and $c$ let active agents change the opinion of passive agents. Transition $d$ handles the case of a tie.
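The definitions above translate directly into a small data structure. The following Python sketch is illustrative only: the rule names `a` to `d`, the transition table, and the helpers `enabled`, `fire`, and `consensus` are assumptions chosen to match the description of active and passive agents, not Peregrine's input format or API.

```python
from collections import Counter

# Assumed Majority Voting rules: name -> (pre-states, post-states).
TRANSITIONS = {
    "a": (("Y", "N"), ("y", "n")),  # opposite active agents become passive
    "b": (("Y", "n"), ("Y", "y")),  # active "yes" converts a passive "no"
    "c": (("N", "y"), ("N", "n")),  # active "no" converts a passive "yes"
    "d": (("y", "n"), ("y", "y")),  # tie-breaking rule among passive agents
}
OUTPUT = {"Y": 1, "y": 1, "N": 0, "n": 0}


def enabled(config, pre):
    """A transition is enabled if the configuration holds all agents it consumes."""
    need = Counter(pre)
    return all(config[q] >= k for q, k in need.items())


def fire(config, name):
    """Return the successor configuration after firing the named transition."""
    pre, post = TRANSITIONS[name]
    if not enabled(config, pre):
        raise ValueError(f"transition {name} is not enabled")
    succ = Counter(config)
    succ.subtract(pre)   # remove the two interacting agents
    succ.update(post)    # add them back in their new states
    return +succ         # unary plus drops states with zero agents


def consensus(config):
    """Return b if all agents have output b, otherwise None."""
    outputs = {OUTPUT[q] for q, k in config.items() if k > 0}
    return outputs.pop() if len(outputs) == 1 else None
```

Configurations are multisets, so `Counter` is a natural fit: `fire(Counter(Y=1, N=1), "a")` yields the configuration with one agent in `y` and one in `n`, which has no consensus.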

Computations in population protocols.

Computations use a stochastic model: starting from an initial configuration $C_0$, two agents are repeatedly picked, uniformly at random, and the corresponding transition is applied. This gives rise to an infinite sequence of configurations, called a run. A run stabilizes to consensus $b$ if from some point on all configurations have consensus $b$. Intuitively, in a run that stabilizes to $b$ the agents eventually agree on the answer $b$. Given a population protocol $\mathcal{P}$ and a predicate $\varphi$ that maps every configuration to a value in $\{0, 1\}$, we say that $\mathcal{P}$ computes $\varphi$ if for every initial configuration $C$, a run starting at $C$ stabilizes to consensus $\varphi(C)$ with probability 1. The correctness problem consists of deciding, given $\mathcal{P}$ and $\varphi$, whether $\mathcal{P}$ computes $\varphi$. Intuitively, a correct protocol almost surely converges to the consensus specified by the predicate. Majority Voting is correct and computes the predicate that assigns $1$ to the configurations where initially at least 50% of the agents are in state $Y$, i.e. we have $\varphi(C) = 1$ if and only if $C(Y) \ge C(N)$.
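The stochastic model can be mimicked with a few lines of Python. This is a minimal sketch under stated assumptions: the rule table is the assumed Majority Voting table, the function names are hypothetical, and the simulation stops once no rule is enabled (from then on the configuration can no longer change) or after `max_steps` interactions.

```python
import random

# Assumed Majority Voting rules: (state, state) -> (state, state).
RULES = {
    ("Y", "N"): ("y", "n"),
    ("Y", "n"): ("Y", "y"),
    ("N", "y"): ("N", "n"),
    ("y", "n"): ("y", "y"),
}
OUTPUT = {"Y": 1, "y": 1, "N": 0, "n": 0}


def step(agents, rng):
    """Pick two distinct agents uniformly at random and apply a matching rule."""
    i, j = rng.sample(range(len(agents)), 2)
    pair = (agents[i], agents[j])
    if pair in RULES:
        agents[i], agents[j] = RULES[pair]
    elif pair[::-1] in RULES:
        agents[j], agents[i] = RULES[pair[::-1]]
    # otherwise the interaction has no effect


def is_terminal(agents):
    """No rule is enabled (valid here because every rule needs two distinct states)."""
    present = set(agents)
    return not any(p in present and q in present for p, q in RULES)


def run(yes, no, seed=0, max_steps=1_000_000):
    """Simulate from `yes` agents in Y and `no` agents in N.

    Returns (consensus or None, number of interactions simulated).
    """
    rng = random.Random(seed)
    agents = ["Y"] * yes + ["N"] * no
    steps = 0
    while not is_terminal(agents) and steps < max_steps:
        step(agents, rng)
        steps += 1
    outputs = {OUTPUT[s] for s in agents}
    return (outputs.pop() if len(outputs) == 1 else None), steps
```

A single simulated run is only evidence, not a proof: correctness requires the property for every initial configuration and with probability 1, which is what the verification engine establishes.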

Majority Voting is correct.

To intuitively understand why the protocol is correct, it is useful to split a run into phases. The first phase starts in the initial configuration, and ends when two agents interact using transition $a$ for the last time. Observe that this moment arrives with probability 1 because passive agents can never become active again. Further, at the end of the first phase either all active agents are in state $Y$, or they are all in state $N$. The second phase ends when the agents reach a consensus for the first time, that is, the first time that either all agents are in states $Y, y$, or all are in states $N, n$. To see that the second phase ends with probability 1, consider three cases. If initially there is a majority of “yes”, then at the end of the first phase no agent is in state $N$, and at least one is in state $Y$. This agent eventually moves all passive agents in state $n$ to state $y$ using transition $b$, reaching a “yes” consensus. The case with an initial majority of “no” is symmetric. If initially there is a tie, then at the end of the first phase all agents are passive, and transition $d$ eventually moves all agents in state $n$ to $y$, again resulting in a “yes” consensus. The third phase is the rest of the run. We observe that once the agents reach a consensus no transition is enabled, and so the agents remain in this consensus, proving that the protocol is correct.
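For a fixed small population the argument can be cross-checked by brute force. For a finite Markov chain, runs stabilize to consensus $b$ with probability 1 exactly when, from every reachable configuration, some configuration is reachable from which only configurations with consensus $b$ can be reached. The sketch below checks this for the assumed Majority Voting rules; configurations are tuples of counts in the order (Y, N, y, n), and all function names are hypothetical. It covers only the populations enumerated, so it is not a parameterized proof.

```python
from functools import lru_cache

STATES = ("Y", "N", "y", "n")
IDX = {q: i for i, q in enumerate(STATES)}
OUTPUT = (1, 0, 1, 0)  # outputs of Y, N, y, n
# Assumed Majority Voting rules as (pre-states, post-states).
RULES = [
    (("Y", "N"), ("y", "n")),
    (("Y", "n"), ("Y", "y")),
    (("N", "y"), ("N", "n")),
    (("y", "n"), ("y", "y")),
]


def successors(c):
    """All configurations reachable from c in one interaction."""
    out = set()
    for pre, post in RULES:
        # Pre-states are distinct in every rule, so one agent of each suffices.
        if all(c[IDX[q]] >= 1 for q in pre):
            d = list(c)
            for q in pre:
                d[IDX[q]] -= 1
            for q in post:
                d[IDX[q]] += 1
            out.add(tuple(d))
    return out


@lru_cache(maxsize=None)
def reachable(c0):
    """Set of configurations reachable from c0 (including c0)."""
    seen, stack = {c0}, [c0]
    while stack:
        for s in successors(stack.pop()):
            if s not in seen:
                seen.add(s)
                stack.append(s)
    return frozenset(seen)


def consensus(c):
    outputs = {OUTPUT[i] for i, k in enumerate(c) if k > 0}
    return outputs.pop() if len(outputs) == 1 else None


@lru_cache(maxsize=None)
def is_stable(c, b):
    """Every configuration reachable from c has consensus b."""
    return all(consensus(r) == b for r in reachable(c))


def computes_majority(yes, no):
    """Check almost-sure stabilization to the majority answer for one input."""
    c0 = (yes, no, 0, 0)
    b = 1 if yes >= no else 0
    return all(any(is_stable(t, b) for t in reachable(c)) for c in reachable(c0))
```

Calling `computes_majority(y, n)` for all small non-empty populations exercises the three cases of the proof (majority of “yes”, majority of “no”, and tie).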

3 Protocol verification with Peregrine 2.0

Peregrine 2.0 allows the user to specify and edit population protocols. (Our running example is listed in the distribution as Majority Voting.) After choosing a protocol, the user can simulate it and gather statistics, as in Peregrine 1.0 [5]. The main feature of Peregrine 2.0 is its new verification engine based on stage graphs, which closely matches the “phase-reasoning” of the previous section.

Figure 1: Stage graphs for Majority Voting protocol with constraints, certificates and speeds (table columns: Stage, Constraint, Certificate, Speed). The expression and denote abstractions of the reachability relation, which are a bit long and therefore omitted for clarity.

Stage graphs.

A stage graph is a directed acyclic graph whose nodes, called stages, are possibly infinite sets of configurations, finitely described by a Presburger formula. Stages are inductive, i.e. closed under reachability. There is an edge $S \to S'$ to a child stage $S'$ if $S' \subset S$, and no other stage $S''$ satisfies $S' \subset S'' \subset S$. Peregrine 2.0 represents stage graphs as Venn diagrams like the ones on the left of Figure 1. Stages containing no other stages are called terminal, and otherwise non-terminal. Intuitively, a phase starts when a run enters a stage, and ends when it reaches one of its children.

Each non-terminal stage $S$ comes equipped with a certificate. Intuitively, a certificate proves that runs starting at any configuration of $S$ will almost surely reach one of its children and, since stages are inductive, get trapped there forever. Loosely speaking, certificates take the form of ranking functions bounding the distance of a configuration to the children of $S$, and are also finitely represented by Presburger formulas. Given a configuration $C$ and a certificate $f$, runs starting at $C$ reach a configuration $C'$ satisfying $f(C') < f(C)$ with probability 1.

To verify that a protocol computes a predicate $\varphi$ we need two stage graphs, one for each output. The roots of the first stage graph contain all initial configurations $C$ with $\varphi(C) = 0$ and the terminal stages contain only configurations with consensus 0. The second handles the case when $\varphi(C) = 1$.
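Inductiveness, the key property of a stage, can be illustrated by a bounded brute-force check: enumerate all configurations up to a given size that satisfy a constraint and confirm that every one-step successor satisfies it too. The constraints used below (such as "no agent is in state N") are hypothetical examples for the assumed Majority Voting rules, and the enumeration is only a sanity check for small populations, not the symbolic reasoning a verification engine needs for all population sizes.

```python
from itertools import product

STATES = ("Y", "N", "y", "n")
IDX = {q: i for i, q in enumerate(STATES)}
# Assumed Majority Voting rules as (pre-states, post-states).
RULES = [
    (("Y", "N"), ("y", "n")),
    (("Y", "n"), ("Y", "y")),
    (("N", "y"), ("N", "n")),
    (("y", "n"), ("y", "y")),
]


def successors(c):
    """One-step successors of a configuration given as counts (Y, N, y, n)."""
    out = set()
    for pre, post in RULES:
        if all(c[IDX[q]] >= 1 for q in pre):  # pre-states are distinct
            d = list(c)
            for q in pre:
                d[IDX[q]] -= 1
            for q in post:
                d[IDX[q]] += 1
            out.add(tuple(d))
    return out


def configurations(max_agents):
    """All non-empty configurations with at most max_agents agents."""
    for c in product(range(max_agents + 1), repeat=len(STATES)):
        if 1 <= sum(c) <= max_agents:
            yield c


def is_inductive(constraint, max_agents=6):
    """Bounded check that the set described by `constraint` is closed under steps."""
    return all(
        constraint(s)
        for c in configurations(max_agents)
        if constraint(c)
        for s in successors(c)
    )


# Hypothetical stage constraints, written as predicates over (Y, N, y, n).
no_active_no = lambda c: c[IDX["N"]] == 0
no_passive_no = lambda c: c[IDX["n"]] == 0
all_yes = lambda c: c[IDX["N"]] == 0 and c[IDX["n"]] == 0
```

Here `is_inductive(no_active_no)` holds because no rule creates an agent in state `N`, whereas `is_inductive(no_passive_no)` fails: the first rule turns an active pair into a passive pair containing `n`.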

Stage graphs for Majority Voting.

For the Majority Voting protocol Peregrine 2.0 generates the two stage graphs of Figure 1 in a completely automatic way. By clicking on a stage, say , the information shown in Figure 2 is displayed. The constraint describes the set of configurations of the stage (Figure 1 shows the constraints for all stages). In particular, all the configurations of satisfy , that is, all agents initially in state have already become passive. The certificate indicates that a run starting at a configuration eventually reaches or a configuration such that . Peregrine 2.0 also displays a list of dead transitions that can never occur again from any configuration of , and a list of eventually dead transitions, which will become dead whenever a child stage, in this case , is reached.

Figure 2: Details of stage in Figure 1 at configuration . The terms are the number of agents in state .
Figure 3: Partially constructed Markov chain after a simulation of the Majority Voting protocol inside the protocol’s stage graphs, with = selected.

While they are automatically generated, these stage graphs closely map the intuition above. The three stages of each graph naturally correspond to the three phases of the protocol: and correspond to the first phase (we reduce or ), and to the second phase ( or is zero, and we reduce or ), and and to the third phase (all agents are in consensus).


Because agents interact randomly, the length of the phase associated to a stage is a random variable (more precisely, a variable for each number of agents). The expected value of this variable is called the speed of the stage. A stage has speed $O(f(n))$ if for every $n$ the expected length of the phase for configurations with $n$ agents is at most $c \cdot f(n)$ for some constant $c$. Peregrine 2.0 computes an upper bound for the speed of a stage using the techniques of [7]. The last column of Figure 1 gives the upper bounds on the speed of all stages. Currently, Peregrine 2.0 can prove one of the bounds , , for some and . Observe that for stage of Majority Voting the tool returns . Majority Voting is indeed very inefficient; much faster protocols exist.

4 Visualizing runs in the stage graph

To further understand the protocol, Peregrine 2.0 allows the user to simulate a run and monitor its progress through the stage graph. The simulation is started at a chosen initial configuration or a precomputed example configuration of a stage. The current configuration is explicitly shown and also highlighted as a yellow circle in the stage graph. To choose the next pair of interacting agents, the user can click on them. The resulting interaction is visualized, and the successor configuration is automatically placed in the correct stage, connected to the previous configuration. After multiple steps, this partially constructs the underlying Markov chain of the system as shown in Figure 3. One can also navigate the current run by clicking on displayed configurations or using the PREV and NEXT buttons.

Figure 4: Counterexample automatically found by Peregrine when verifying Majority Voting (broken), shown in the stage graphs as a run from = to = . The graph with root is only a partial stage graph, because stage contains configurations that do not have the correct consensus.

Beyond choosing pairs of agents one by one, the user can simulate a full run of the protocol by clicking on PLAY. The acceleration slider allows the user to speed up this simulation. However, if the overall speed of the protocol is very slow, a random run might not make progress in a reasonable time frame. An example of this is the Majority Voting protocol for populations with a small majority for , where the expected number of interactions to go from to is . Thus, even for relatively small configurations like a random run is infeasible. To make progress in these cases, one can click on PROGRESS. This automatically chooses a transition that reduces the value of the certificate. Intuitively, reducing the certificate’s value guides the run towards a child stage and thus, the run from to needs at most steps. To visualize the progress, the value of the stage’s certificate for the current configuration is displayed in the stage details as in Figure 2 and next to the PROGRESS button.
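The idea behind PROGRESS can be sketched as a greedy walk that always takes a step lowering a ranking function. In the Python sketch below, the certificate (the number of agents in state `y`) and the starting configuration are hypothetical choices for the assumed Majority Voting rules, and `progress` and `guided_run` are illustrative names, not the tool's implementation.

```python
STATES = ("Y", "N", "y", "n")
IDX = {q: i for i, q in enumerate(STATES)}
# Assumed Majority Voting rules as (pre-states, post-states).
RULES = [
    (("Y", "N"), ("y", "n")),
    (("Y", "n"), ("Y", "y")),
    (("N", "y"), ("N", "n")),
    (("y", "n"), ("y", "y")),
]


def successors(c):
    """One-step successors of a configuration given as counts (Y, N, y, n)."""
    out = set()
    for pre, post in RULES:
        if all(c[IDX[q]] >= 1 for q in pre):  # pre-states are distinct
            d = list(c)
            for q in pre:
                d[IDX[q]] -= 1
            for q in post:
                d[IDX[q]] += 1
            out.add(tuple(d))
    return out


def progress(c, certificate):
    """Return a successor with strictly smaller certificate value, or None."""
    better = [s for s in successors(c) if certificate(s) < certificate(c)]
    return min(better, key=certificate) if better else None


def guided_run(c, certificate):
    """Follow certificate-decreasing steps until none is available.

    Terminates whenever the certificate takes non-negative integer values.
    """
    path = [c]
    while True:
        nxt = progress(path[-1], certificate)
        if nxt is None:
            return path
        path.append(nxt)


# Hypothetical certificate: number of passive "yes" agents.
passive_yes = lambda c: c[IDX["y"]]
```

Starting from one active `N`, five passive `y`, and two passive `n`, `guided_run((0, 1, 5, 2), passive_yes)` takes exactly five steps, one per passive `y` agent, whereas a random run may repeatedly take the tie-breaking rule that raises the certificate again.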

Finding counterexamples.

The speed of stage with certificate is so low because of transition that increases the value of the certificate and may be chosen with high probability. Removing the transition makes the protocol faster (this variant is listed in the distribution as “Majority Voting (broken)”). However, then Peregrine cannot verify the protocol anymore, and it even finds a counterexample: a run that does not stabilize to the correct consensus. Figure 4 shows the counterexample ending in the configuration from the initial configuration , i.e. a configuration with a tie. In this case, the configuration should stabilize to 1, but no transition is applicable at , which does not have consensus 1. This clearly shows why we need the transition . Note however that the left part with root stage in Figure 4 is a valid stage graph, so the modified protocol works correctly in the negative case. This helps locate the cause of the problem.
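A counterexample of this kind can be reproduced with a breadth-first search over configurations. The sketch below assumes the Majority Voting rules and models the broken variant by dropping the rule $y, n \mapsto y, y$; it only looks for deadlocked configurations with the wrong consensus, which is one simple kind of counterexample and not the tool's search procedure. All names are hypothetical.

```python
from collections import deque

STATES = ("Y", "N", "y", "n")
IDX = {q: i for i, q in enumerate(STATES)}
OUTPUT = (1, 0, 1, 0)  # outputs of Y, N, y, n
# Assumed Majority Voting rules; the last one handles ties.
FULL = [
    (("Y", "N"), ("y", "n")),
    (("Y", "n"), ("Y", "y")),
    (("N", "y"), ("N", "n")),
    (("y", "n"), ("y", "y")),
]
BROKEN = FULL[:-1]  # variant without the tie-breaking rule


def successors(c, rules):
    out = set()
    for pre, post in rules:
        if all(c[IDX[q]] >= 1 for q in pre):  # pre-states are distinct
            d = list(c)
            for q in pre:
                d[IDX[q]] -= 1
            for q in post:
                d[IDX[q]] += 1
            out.add(tuple(d))
    return out


def consensus(c):
    outputs = {OUTPUT[i] for i, k in enumerate(c) if k > 0}
    return outputs.pop() if len(outputs) == 1 else None


def find_counterexample(yes, no, rules):
    """Shortest run to a deadlocked configuration with the wrong consensus, or None."""
    c0 = (yes, no, 0, 0)
    expected = 1 if yes >= no else 0
    parent = {c0: None}
    queue = deque([c0])
    while queue:
        c = queue.popleft()
        succ = successors(c, rules)
        if not succ and consensus(c) != expected:
            path = []
            while c is not None:  # walk back to the initial configuration
                path.append(c)
                c = parent[c]
            return path[::-1]
        for s in succ:
            if s not in parent:
                parent[s] = c
                queue.append(s)
    return None
```

For the tie with one `Y` and one `N`, `find_counterexample(1, 1, BROKEN)` returns the two-step run ending in one `y` and one `n`, where nothing is enabled and there is no consensus; with the full rule set, or with a majority of “no”, the search finds nothing.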