 # MSO+nabla is undecidable

This paper is about an extension of monadic second-order logic over infinite trees, which adds a quantifier that says "the set of branches π which satisfy a formula ϕ(π) has probability one". This logic was introduced by Michalewski and Mio; we call it MSO+nabla following Shelah and Lehmann. The logic MSO+nabla subsumes many qualitative probabilistic formalisms, including qualitative probabilistic CTL, probabilistic LTL, and parity tree automata with probabilistic acceptance conditions. We consider the decision problem: decide whether a sentence of MSO+nabla is true in the infinite binary tree. For sentences from the weak variant of this logic (set quantifiers range only over finite sets) the problem was known to be decidable, but the question for the full logic remained open. In this paper we show that the problem for the full logic MSO+nabla is undecidable.

## Authors


## I Introduction

Probability and logics that reason about it have been present in verification since the very beginning. An early example [20, 21] is the following question: given an ltl formula and a Markov chain, decide if almost all (in the sense of measure) runs of the system satisfy the formula. Another early example is: given a formula of probabilistic ctl, decide if there is some Markov chain where the formula is true; the complexity of this problem has since been settled. The same question for the more general logic ctl* is answered in [14, Theorem 1 and 2, and Section 15]. Other variants of these logics have been considered in [13, 2]. More recently, there has been an effort on synthesizing controllers for probabilistic systems that ensure some ω-regular condition surely and another one almost surely, see [4, Theorem 15].

Is there a master theorem, which unifies all decidability results about probabilistic logics? An inspiration for such a master theorem would be Rabin’s famous theorem about decidability of monadic second-order logic over infinite trees. Rabin’s theorem immediately gives most decidability results (if not the optimal complexities) about temporal logics, including satisfiability questions for (non-probabilistic) logics like ltl, ctl and the modal μ-calculus. Maybe there is a probabilistic extension of Rabin’s theorem, which does the same for probabilistic logics?

Quite surprisingly, the question about a probabilistic version of Rabin’s theorem has only been asked recently, by Michalewski and Mio. It is rather easy to see that any decidable version of mso must be qualitative rather than quantitative (i.e. probabilities can be compared to 0 and 1, but not to other numbers), since otherwise one could express problems like “does a given probabilistic automaton accept some word with probability at least a given threshold”, which are known to be undecidable. Even when probabilities are qualitative, one has to be careful to avoid undecidability. For example, the following problem is undecidable [1, Theorem 7.2]: given a Büchi automaton, decide if there is some ω-word that is accepted with a non-zero probability (assuming that runs of the automaton are chosen at random, flipping a coin for each transition). This immediately implies [16, Theorem 1] undecidability for a natural probabilistic extension of mso, which has a quantifier of the form “there is a non-zero probability of picking a set of positions that satisfies ϕ”, both for infinite words and infinite trees.

Michalewski and Mio propose a different probabilistic extension of mso, which does not admit any straightforward reductions from known undecidable problems, like the ones for probabilistic Büchi automata mentioned above. Their idea, which only makes sense for trees and not words, is to extend mso over the infinite binary tree by a quantifier which says that a property of branches is true almost surely, assuming the coin-flipping measure on infinite branches in the complete binary tree. The logic proposed by Michalewski and Mio is obtained from Rabin’s mso by adding the probabilistic quantifier for branches. We write mso+∇ for this logic (Michalewski and Mio use a different symbol for the quantifier; in this paper we denote it by ∇, following the notation used by Shelah and Lehmann). The logic mso+∇ directly expresses qualitative problems like: model checking Markov chains for ltl objectives, their generalisations such as 2 player games with ω-regular objectives, or emptiness for various automata models with probability, including qualitative tree languages. These results naturally lead to the question [16, Problem 1]: is the logic mso+∇ decidable?

A positive result about mso+∇ was proved in [5, 7]: the weak fragment of mso+∇ is decidable. In the weak fragment, the set quantifiers ∃X and ∀X of mso range only over finite sets. (Actually, the papers prove decidability for a stronger logic, where set quantifiers range over “thin” sets, which are a common generalisation of finite sets and infinite branches.) The proof uses automata: for every formula of the weak fragment there is an equivalent automaton of a suitable kind [5, Theorem 8], and emptiness for these automata is decidable [7, Theorem 3]. Combining these results, one obtains decidable satisfiability for the weak fragment of mso+∇. (For weak logics the satisfiability problem “is a given formula true in some infinite labelled binary tree” is in general more difficult than the model checking problem “is a given formula true in the unlabelled binary tree”. For general mso, this difference disappears, as set quantification can be used to guess labellings.) The weak fragment of mso+∇ is still powerful enough to subsume problems like satisfiability for qualitative probabilistic ctl. Nevertheless, the decidability of the full logic mso+∇ remained open.

This paper proves that the full logic mso+∇ is undecidable, i.e. it is undecidable if a sentence of the logic is true in the complete binary tree, thus answering [16, Problem 1]. Independently and in parallel, another proof of this result has been given by showing that the emptiness problem of qualitative universal parity tree automata is undecidable.

Because the logic seems to be very close to the decidability frontier, our undecidability proof requires a lot of care to encode Turing machines using the very limited and asymptotic means available in mso+∇. Informally speaking, the difficulty is that any pair of branches bound using the ∇ quantifier have at most a finite joint prefix, and the logic is designed so that it is invariant under finite perturbations. To overcome this obstacle, our proof strategy uses “global” properties instead of local ones.

The main technical result in the proof is that mso+∇ can express the following property about disjoint intervals. Define an interval to be a finite path in the complete binary tree, i.e. a set of nodes which connects some tree node with one of its descendants. For a family $D$ of pairwise disjoint intervals, consider the following property:

• Almost surely a branch $\pi$ satisfies: there is some $n \in \mathbb{N}$ such that, with finitely many exceptions, if an interval from $D$ intersects $\pi$, then it has size $n$.

In Lemma III.1, we show that the property above can be expressed in mso+∇. From this, undecidability of the logic can be established using standard methods, by describing runs of counter machines. Note that the above property is asymptotic in two ways: (a) it talks only about almost all branches, and (b) it allows finitely many exceptions. The fact that we can only express such asymptotic behaviour is a testament to the difficulty of isolating counting behaviour in the logic mso+∇. The proof of Lemma III.1 occupies most of this paper, and builds on the ideas developed in the undecidability proofs from [8, 6], which deal with the logic mso+u, another quantitative extension of mso, where the quantitative part talks not about probability, but about boundedness.

## II Notation

Denote the set $\{0,1\}$ by $2$. The set of all nodes in the full binary tree is $2^*$, that is the set of finite words over the alphabet $2$. If $x \in 2^*$ then by $|x|$ we denote the length of $x$. Let $\le$ be the usual descendant relation on $2^*$. The set of (infinite) branches of the tree is denoted $2^\omega$. We will identify a branch $\pi \in 2^\omega$ with the corresponding set of nodes, i.e. the set of finite prefixes of $\pi$. In particular, given a branch $\pi$ and a node $x$, we write $x \in \pi$ if $x$ is a node in the branch $\pi$. The coin-tossing measure $P$ on $2^\omega$ (with the $\sigma$-algebra generated by the cylinders) is the unique complete probability measure that satisfies

$$P[x \cdot 2^\omega] = 2^{-|x|},$$

for all $x \in 2^*$. Often we will be interested in the conditional probability, defined as follows:

$$P[R \mid x] \stackrel{\text{def}}{=} 2^{|x|} \cdot P[R \cap x \cdot 2^\omega] \in [0,1], \tag{1}$$

where $x \in 2^*$ and $R \subseteq 2^\omega$ is a $P$-measurable set. If we think that the random choice of a branch is done iteratively, by choosing its successive directions, the value in (1) is the probability that the further choices will generate a branch in $R$, assuming that we’ve already reached $x$ during that process.
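As an illustration of the conditional probability in (1), the following minimal Python sketch computes the measure of a finite union of cylinders and its conditional probability at a node. It rests on assumptions that are not part of the paper: nodes are encoded as strings over `0` and `1`, the set $R$ is restricted to finite unions of cylinders, and all function names are hypothetical.

```python
from fractions import Fraction


def cylinder_prob(x: str) -> Fraction:
    """Measure of the cylinder x·2^ω, i.e. 2^(-|x|)."""
    return Fraction(1, 2 ** len(x))


def union_prob(nodes) -> Fraction:
    """Measure of the union of the cylinders generated by `nodes`."""
    nodes = set(nodes)
    # Keep only prefix-minimal nodes, so the remaining cylinders are disjoint.
    minimal = [x for x in nodes
               if not any(y != x and x.startswith(y) for y in nodes)]
    return sum((cylinder_prob(x) for x in minimal), Fraction(0))


def conditional_prob(nodes, x: str) -> Fraction:
    """P[R | x] = 2^|x| * P[R ∩ x·2^ω], for R the union of cylinders of `nodes`."""
    if any(x.startswith(y) for y in nodes):
        return Fraction(1)  # the whole cylinder of x already lies inside R
    below = [y for y in nodes if y.startswith(x)]
    return 2 ** len(x) * union_prob(below)


R = ["00", "011"]  # R = 00·2^ω ∪ 011·2^ω
```

With this encoding, `conditional_prob(R, "0")` rescales the measure of $R$ inside the subtree rooted at the node `0`, which matches the reading of (1) as "the probability of ending in $R$ given that $x$ has been reached".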

The logic mso+∇ is mso on the binary tree, extended with a probabilistic branch quantifier $\nabla\pi$, that binds a branch $\pi$ and such that $\nabla\pi.\ \phi(\pi)$ is true if and only if there exists a measurable set $R \subseteq 2^\omega$, such that $P[R] = 1$ and for all $\pi \in R$, $\phi(\pi)$ is true. Intuitively, it means that $\phi(\pi)$ holds for a randomly chosen branch $\pi$.

Let $x \le y$ be two nodes. We will use the following notation for intervals:

$$[x,y] = \{u : x \le u \le y\}.$$

Define the source and target functions of intervals respectively as , and . We extend these two functions to sets of intervals in the obvious way. For an interval , define , and extend it to sets of intervals as . The length of an interval (denoted ) is the cardinality of its , i.e. .
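The interval notation can be made concrete with a small Python sketch. It is an illustration only and assumes things the text does not fix: nodes are strings over `0` and `1`, the order $\le$ is the prefix order, and the length of an interval is taken to be its number of nodes. The function names are hypothetical.

```python
def interval(x: str, y: str) -> list:
    """All nodes u with x <= u <= y in the prefix order, listed from x down to y."""
    if not y.startswith(x):
        raise ValueError("x must be an ancestor of (or equal to) y")
    return [y[:i] for i in range(len(x), len(y) + 1)]


def source(iv: list) -> str:
    """Topmost node of the interval."""
    return iv[0]


def target(iv: list) -> str:
    """Bottommost node of the interval."""
    return iv[-1]


def length(iv: list) -> int:
    """Number of nodes of the interval (an assumed convention)."""
    return len(iv)


iv = interval("01", "0110")
```

Here `iv` is the path `01`, `011`, `0110`, with source `01` and target `0110`.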

Consider a set of pairwise disjoint intervals (for the sake of brevity, in the rest of the paper, we simply say “a set of intervals” instead of “a set of pairwise disjoint intervals”). For all , let (respectively ) be the set of sources (resp. targets) of intervals in for which the number of /̄ancestors in is exactly  (resp. ). We call the set the th level of . Notice that the sets are pairwise disjoint; each pair of distinct elements of such a set is /̄incomparable; and if then for some we have and . Moreover, and .

Given two sets of nodes and , and a branch we write in (i.e. infinitely often) if there are infinitely many nodes in . For the dual property we write in (i.e. finitely often). We say in , if for each descendant in we have . Dually, in means that there is a descendant in such that .

In all the above notions we can omit the branch and write e.g.  as a set of branches: . To simplify the notation we will write logical connectives between such properties of branches, i.e.  is the set of branches in which both sets and appear infinitely often. The same applies to other logical connectives.

## III Bounded intervals

Let be a set of (pairwise disjoint) intervals and  a branch. For all we denote by  the length of the interval in starting at . When appears infinitely often in , we say that is defined in and write in . In this case, (for all ) defines a sequence of natural numbers, which we denote by . Again, stands for the set of branches where is defined; is the set of branches where the sequence is bounded (i.e. ); and is the set of branches where the sequence is unbounded (i.e. ).

In other words, we associate to each source of an interval an integer: the length of the interval that begins at that node. Then, the branches that contain infinitely many sources of intervals define infinite sequences of integers.

Such a sequence is eventually constant if there exists a number such that all except finitely many elements of the sequence are equal to . The main technical contribution of this paper is the following lemma.

###### Lemma III.1.

One can express in mso+∇ that $D$ is a set of intervals such that

$$P[D\ \text{def} \wedge D \text{ is eventually constant}] = 1.$$

This section presents the first step towards the above lemma: it shows how to express (up to a set of branches of probability $0$) properties of boundedness of sequences $D(\pi)$ in the logic mso+∇.

Formally speaking, a set of (pairwise disjoint) intervals is a set of sets of nodes. We cannot represent it as such in second/̄order logic. However, as we work only with sets of pairwise disjoint intervals, one can encode in mso using two sets of nodes: the set of sources and the set of targets . To every source we can easily associate its target and vice versa to every target we can easily associate its source, see Appendix -B. It means that properties like are also expressible in mso.

We will now make two observations that reveal that there is a connection between the lengths of intervals and whether targets of intervals appear infinitely often. This connection is the core idea that makes possible expressing more complicated properties of sequences of numbers in our formalism later on.

First, we observe that if we have a set of intervals that are all of equal length, then in almost every branch if sources of intervals appear infinitely often, then so do the targets.

###### Lemma III.2.

Let $b \in \mathbb{N}$ and $D$ be a set of intervals whose lengths are exactly $b$. Then we have:

$$P[\sigma(D)\ \text{io} \iff \tau(D)\ \text{io}] = 1.$$

###### Proof.

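Since every node in $\tau(D)$ is a descendant of a node in $\sigma(D)$ (the source of the respective interval), the implication $\tau(D)\ \text{io} \Rightarrow \sigma(D)\ \text{io}$ holds on every branch. To prove the converse, assume towards a contradiction that there is a non-zero probability that $\sigma(D)\ \text{io} \wedge \tau(D)\ \text{fo}$. As $\tau(D)$ fo in $\pi$ implies that from some point on there must be no member of $\tau(D)$ in $\pi$, we obtain that $[\sigma(D)\ \text{io} \wedge \tau(D)\ \text{fo}]$ equals

$$\bigcup_{x \in 2^*} [\sigma(D)\ \text{io} \wedge \text{Globally}_x(\neg\tau(D))] \cap x \cdot 2^\omega.$$

By $\sigma$-additivity of the measure, the fact that the above set has positive probability implies that there exists some $x_0 \in 2^*$ such that

$$P[\sigma(D)\ \text{io} \wedge \text{Globally}_{x_0}(\neg\tau(D)) \mid x_0] > 0. \tag{2}$$

Notice that for each source $x \in \sigma(D)$ we have $P[\text{Finally}_x(\tau(D)) \mid x] \ge 2^{-b}$, because the interval whose source is $x$ has length exactly $b$ and its target belongs to $\tau(D)$. In other words, when going down the tree from $x_0$, whenever we visit some node $x \in \sigma(D)$, $x_0 \le x$, the relative probability that we further avoid $\tau(D)$ is at most $1 - 2^{-b}$. This means that

$$P[\sigma(D)\ \text{io} \wedge \text{Globally}_{x_0}(\neg\tau(D)) \mid x_0] \le \lim_{n \to \infty} (1 - 2^{-b})^n = 0,$$

contradicting (2). A more direct (but abstract) proof of this lemma can be given by using Lévy's zero-one law, specified to the context of branches of infinite trees instead of martingales. ∎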
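The limit at the end of the proof can be checked numerically. The sketch below is an illustration under stated assumptions, not part of the paper's argument: it evaluates the bound $(1 - 2^{-b})^n$ exactly with rationals and counts how many visits to sources are needed before the bound drops below a chosen tolerance. The names `avoid_bound` and `visits_needed` are hypothetical.

```python
from fractions import Fraction


def avoid_bound(b: int, n: int) -> Fraction:
    """Upper bound (1 - 2^(-b))^n on avoiding all targets after n visited sources."""
    return (1 - Fraction(1, 2 ** b)) ** n


def visits_needed(b: int, eps: Fraction) -> int:
    """Smallest n with (1 - 2^(-b))^n < eps; assumes b >= 0 and eps > 0."""
    n, bound = 0, Fraction(1)
    step = 1 - Fraction(1, 2 ** b)
    while bound >= eps:
        bound *= step
        n += 1
    return n
```

For every fixed $b$ the loop terminates, because the factor $1 - 2^{-b}$ is strictly below $1$; this is exactly why the fixed interval length matters in the lemma.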

Next we turn our attention to the dual case: we observe that if a set of intervals is such that as we go down the tree we meet longer and longer intervals, then on almost every branch the targets appear only finitely often. In other words, if the intervals are getting longer and longer, there is less and less chance of meeting any of the targets.

###### Lemma III.3.

Let be a set of intervals such that for all , . Then we have:

 P[τ(D) fo]=1.
###### Proof.

Note that it is sufficient to prove that there exists such that for all :

$$P[\text{Finally}_y(\tau(D)) \mid y] \le 1 - \varepsilon. \tag{3}$$

Indeed, the above inequality implies that to satisfy a branch needs to infinitely often satisfy a property relatively to the current node . The probability of such an event is at most .

To prove (3) consider in the th level of for some and a number . Let denote the set . Then, to reach  when going down the tree from , one needs to first visit a node and then reach from . This means that

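$$P[\text{Finally}_y(\tau_k(D)) \mid y] \overset{(1)}{=} \sum_{x \in S} 2^{|y|-|x|} \cdot P[\text{Finally}_x(\tau_k(D)) \mid x] \overset{(2)}{\le} \sum_{x \in S} 2^{|y|-|x|} \cdot 2^{-k-1} \overset{(3)}{\le} 2^{-k-1},$$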

where: the first equality follows from the fact that the elements of are pairwise /̄incomparable; the second inequality follows from the fact that if then by lemma’s assumption and thus the relative probability of reaching from is at most ; and the third inequality follows again from the fact that is a /̄antichain contained in and thus .

The above inequality implies that

$$P[\text{Finally}_y(\tau(D)) \mid y] \le \sum_{k > k_0} 2^{-k-1} = 2^{-k_0-1} \le \frac{1}{2}.$$

Therefore, taking $\varepsilon = \frac{1}{2}$ is enough to guarantee (3). ∎
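The geometric tail used in the last step can be verified with exact arithmetic. The following sketch is only a sanity check: it computes partial sums of $\sum_{k > k_0} 2^{-k-1}$ and compares them with the closed form $2^{-k_0-1} - 2^{-K-1}$ for a finite cutoff $K$. The function name and the cutoff parameter are hypothetical.

```python
from fractions import Fraction


def tail_partial_sum(k0: int, K: int) -> Fraction:
    """Sum of 2^(-k-1) over k0 < k <= K, computed exactly."""
    return sum((Fraction(1, 2 ** (k + 1)) for k in range(k0 + 1, K + 1)),
               Fraction(0))


def closed_form(k0: int, K: int) -> Fraction:
    """Closed form 2^(-k0-1) - 2^(-K-1) of the same partial sum."""
    return Fraction(1, 2 ** (k0 + 1)) - Fraction(1, 2 ** (K + 1))
```

As $K$ grows the partial sums increase towards $2^{-k_0-1}$, which is at most $\frac{1}{2}$ for every $k_0 \ge 0$.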

###### Definition III.4 (Record breakers).

Let $D$ be a set of intervals. The record breakers of $D$ is the set of intervals $E \subseteq D$ that contains an interval $[x,y] \in D$ if and only if $D(x)$ is larger than $D(x')$ for every $x' \in \sigma(D)$, $x' < x$.

Notice that if are the record breakers of then for all we have . In the following, we will write for the set of branches such that the sequence contains a bounded subsequence. Similarly for and or instead of . If a sequence is finite then assume that are taken as the last element of the sequence—this means that the values are finite in that case. Notice that the set of branches is equal to .
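Restricted to a single branch, the record breakers can be computed by one pass over the sequence of interval lengths. The sketch below is an illustration under that simplifying assumption (a finite prefix of one branch, given as a list of lengths in the order in which the sources appear); the function name is hypothetical.

```python
def record_breakers(lengths):
    """Positions whose length is strictly larger than every earlier length."""
    result = []
    best = None  # largest length seen so far, None before the first interval
    for i, value in enumerate(lengths):
        if best is None or value > best:
            result.append(i)
            best = value
    return result


positions = record_breakers([2, 1, 3, 3, 5, 4])
```

The lengths picked out this way are strictly increasing, so the $k$th record breaker on a branch has length at least $k$; this is the kind of growth that Lemma III.3 relies on.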

### III-A Boundedness

Most of the statements in the forthcoming sections take the form of an equivalence between a semantic property of sets of intervals and a condition that is easily definable in mso+∇. For the sake of completeness, Appendix -B argues how one can actually express all these conditions in mso+∇.

###### Lemma III.5.

Let be a set of intervals. Then the following statements are equivalent:

• ,

• there exists such that and for all we have

 P[σ(D) io ⟺τ(D) io]=1.
###### Proof.

We begin with the forward implication. The first statement implies that there exists some constant such that . We let be the set of intervals in that have length exactly . Then follows from the assumption and the second statement comes from Lemma III.2.

For the converse implication, let be as in the second statement and assume towards a contradiction that:

 P[¬(C def)∨(limC=∞)]=1.

Then from the definition of we have:

 P[C′ def∧(limC′=∞)]>0.

If are the record breakers of , then from the above we have . The second statement implies that but this contradicts Lemma III.3. ∎

###### Lemma III.6.

Let be a set of intervals. Then the following statements are equivalent:

• ,

• there exists such that

 P[C def⟺D def] =1 and P[D def⟹(limD=∞)]=1.
###### Proof.

For the forward implication, take to be the record breakers of . The converse implication is immediate. ∎

###### Definition III.7.

We say that a set of intervals $C$ is unbounded if it satisfies the conditions from the lemma above, i.e. almost surely whenever $C$ is defined it is unbounded.

Given a set of intervals $C$ and a set of nodes $X$, we say that $X$ is a characteristic of $C$ if

$$P[X\ \text{io} \iff C\ \text{ubnd}] = 1. \tag{4}$$

A characteristic of a set of intervals allows us to represent explicitly in mso+∇ (up to a set of branches of probability $0$) the set of branches where a given set of intervals is unbounded. Thus, the following lemma is used multiple times when arguing about definability, see Remark .3.

###### Lemma III.8.

Let be a set of intervals and a set of nodes. Then the following statements are equivalent:

• is a characteristic of ,

• there exists that is unbounded such that and for each that is unbounded we have .

###### Proof.

Let be the record breakers of . Clearly  is unbounded and (as sets of branches):

 [E def] = [C ubnd]. (5)

We start with the forward implication of the lemma. Assume that is a characteristic of  and take . Such is unbounded and satisfies (5) what implies that , see (4). Moreover, if is unbounded then because .

Now consider the converse implication of the lemma. Let be as in the second statement and consider . Then we know that . Together with (5) it proves that . On the other hand, the assumptions imply that and as is unbounded also . This concludes the proof of (4) because . ∎

Finally, we note that every set of intervals always has a characteristic. It suffices to take where are the record breakers.

### III-B Asymptotic equivalence

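We finish this section by introducing asymptotic equivalence: a relation between infinite sequences of numbers (called number sequences). If $f$ is a number sequence and $X \subseteq \mathbb{N}$ then by $f{\restriction}X$ we denote the subsequence of $f$ taking only positions from $X$, i.e. $f{\restriction}X = (f(n))_{n \in X}$; notice that if $X$ is finite then $f{\restriction}X$ is a finite sequence of numbers.

###### Definition III.9 (Asymptotic equivalence).

Given two number sequences $f$ and $g$, we say that $f$ is asymptotically equivalent to $g$, denoted $f \sim g$, if $f$ and $g$ are bounded on the same sets of positions, i.e. for all $X \subseteq \mathbb{N}$, either both $f{\restriction}X$ and $g{\restriction}X$ are bounded or both are unbounded.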

Consider mso on infinite words for a moment. Suppose that we encode two number sequences with sets of intervals $C_1$, $C_2$. A priori it is not possible to express in the logic that the two sequences are asymptotically equivalent (even if we are allowed to speak about boundedness), unless we impose some restriction, such that there is some mso definable function that given the $n$th interval of $C_1$ outputs the position of the $n$th interval of $C_2$. The simplest way of having this is to require that the intervals in $C_1$ and $C_2$ are alternating.

If $C_1$, $C_2$ are arranged in such a way, the functions mapping an interval to its counterpart in the other set are mso definable (the first neighbour to the left, or right respectively) and hence we are able to quantify over subsequences, which enables us to express asymptotic equivalence in our formalism.

For trees we have the following definitions.

We call two sets of intervals $C_1$, $C_2$ isolated if there is no node that belongs both to an interval of $C_1$ and to an interval of $C_2$.

###### Definition III.10 (Precedes).

Let , be isolated sets of intervals. We say that precedes  if for all there exists such that and there is no node strictly between and that belongs to .

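The fact that $C_1$ precedes $C_2$ induces a function $\mathrm{Pre} \colon \sigma(C_2) \to \sigma(C_1)$ that maps $x' \mapsto x$ as in the definition above. Additionally, for a set of intervals $C \subseteq C_1$, we define:

$$\mathrm{Suc}(C) \stackrel{\text{def}}{=} \{[x',y'] \in C_2 : \mathrm{Pre}(x') \in \sigma(C)\} \subseteq C_2,$$

and dually, for $C \subseteq C_2$ we put

$$\mathrm{Pre}(C) \stackrel{\text{def}}{=} \{[x,y] \in C_1 : \exists x' \in \sigma(C).\ \mathrm{Pre}(x') = x\}.$$

For the sake of readability we will use the functions $\mathrm{Pre}$ and $\mathrm{Suc}$ without additional parameters, assuming that the sets $C_1$ and $C_2$ are known from the context.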

In a branch $\pi$, it might be the case that between consecutive intervals in $C_2$, there are many sources of intervals from $C_1$, so the encoding of the two sequences is not alternating, hence the following definition.

###### Definition III.11 (Preceding subsequence).

Let $C_1$, $C_2$ be isolated sets of intervals such that $C_1$ precedes $C_2$. Assume that $\pi$ is a branch where $C_2$ is defined, i.e. $\sigma(C_2)$ io in $\pi$. By $C_1^{\mathrm{Pre}}(\pi)$ we denote the subsequence of $C_1(\pi)$ that we get by applying $C_1$ only to the nodes $x \in \sigma(C_1) \cap \pi$ for which there exists $x' \in \sigma(C_2) \cap \pi$ such that $\mathrm{Pre}(x') = x$.

Notice that in the above definition we require $x'$ to belong to $\pi$; a priori we might have $\mathrm{Pre}(x') = x$ for some $x'$ outside $\pi$ but for no such node in $\pi$ (in that case $x$ is not taken into $C_1^{\mathrm{Pre}}(\pi)$). Observe additionally that if $C_1$ precedes $C_2$ and $C_2$ is defined in a branch $\pi$ then $C_1^{\mathrm{Pre}}(\pi)$ is a number sequence (i.e. it is infinite). However, we are not claiming that $C_1^{\mathrm{Pre}}$ is a set of intervals.

Typically, on a branch where $C_2$ is defined we have: a few intervals of $C_1$ then one interval in $C_2$ and so on. The sequence $C_1^{\mathrm{Pre}}(\pi)$ is taking into account only the intervals that immediately precede those of $C_2$.

###### Remark III.12.

Consider $C_1$, $C_2$ two isolated sets of intervals such that $C_1$ precedes $C_2$. Let $\pi$ be a branch on which $C_2$ is defined. In that case the two sequences $C_1^{\mathrm{Pre}}(\pi)$ and $C_2(\pi)$ are both defined. Let $x'_k$ be the $k$th source of an interval in $C_2$ on $\pi$. Then, by the definitions of the respective sequences:

$$C_2(\pi)(k) = C_2(x'_k), \qquad C_1^{\mathrm{Pre}}(\pi)(k) = \mathrm{Pre}(C_2)(\mathrm{Pre}(x'_k)).$$

This means that the two number sequences are in a sense synchronised and the function $\mathrm{Pre}$ maps between the corresponding sources.

In other words, number sequence encodings $C_1^{\mathrm{Pre}}(\pi)$ and $C_2(\pi)$ are alternating as in the case of infinite words, which facilitates quantifying over their subsequences.

We prove now that we can express when the two sequences of numbers mentioned above, $C_1^{\mathrm{Pre}}$ and $C_2$, are asymptotically equivalent.

###### Lemma III.13.

Let $C_1$, $C_2$ be isolated sets of intervals, such that $C_1$ precedes $C_2$. Then the following statements are equivalent:

• ,

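• either

$$\exists C \subseteq \mathrm{Pre}(C_2).\ \ P[\mathrm{Suc}(C)\ \text{def} \wedge C\ \text{bnd} \wedge \mathrm{Suc}(C)\ \text{ubnd}] > 0, \tag{6}$$

or

$$\exists C \subseteq C_2.\ \ P[C\ \text{def} \wedge C\ \text{bnd} \wedge \mathrm{Pre}(C)\ \text{ubnd}] > 0. \tag{7}$$
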
###### Proof.

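For the forward implication, assume that there exists a set of branches that has a non-zero probability, such that for each branch $\pi$ in it we have $C_2$ def in $\pi$ and there exists a set of positions $X \subseteq \mathbb{N}$ on which the sequence $C_1^{\mathrm{Pre}}(\pi)$ is bounded but $C_2(\pi)$ is not (the dual case is analogous, see below). By $\sigma$-additivity of the measure, this implies that there exists $b \in \mathbb{N}$ such that:

$$P[C_2\ \text{def} \wedge \exists X \subseteq \mathbb{N}.\ (C_1^{\mathrm{Pre}}{\restriction}X \equiv b) \wedge (\limsup C_2{\restriction}X = \infty)] > 0. \tag{8}$$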

Take as the set intervals that have length equal to . Take any branch in the set from (8) and let be a witness. Clearly, must be infinite and therefore in and . On the other hand, because contains as a subsequence the lengths of intervals in  that are measured in , see Remark III.12. It means in particular that is defined in . Therefore, Condition (6) holds for and such , what means that the probability there is positive.

In the dual case, when for each there is  such that sequence is unbounded but is bounded, we know that there exists such that:

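$$P[C_2\ \text{def} \wedge \exists X \subseteq \mathbb{N}.\ (\limsup C_1^{\mathrm{Pre}}{\restriction}X = \infty) \wedge (C_2{\restriction}X \equiv b)] > 0. \tag{9}$$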

In that case we take as the set of intervals of length equal to . For each branch in the set from (9) and its witness we have: in ; ; and — notice that the sequence contains the sequence as, possibly strict, subsequence. However, as the latter is unbounded, also the former must be unbounded. Therefore, Condition (7) holds.

For the converse implication, first assume that (6) is true and fix . Take any branch in the set measured in (6). Since in , by the definition of we have in . We will show that .

Let be the set of numbers such that . Then is unbounded by the assumption. On the other hand, and by the definition of we know that is a subsequence of and is therefore bounded. This concludes the proof that .

Finally, consider the last case that (7) holds and fix witnessing that. Take a branch from the set measured in (7). The fact that in implies directly that in . Take as the set of numbers such that . Then is bounded. However, contains as a subsequence and therefore is unbounded. Therefore, . ∎

## IV Eventually constant intervals

We have shown that we can express properties of boundedness of intervals. In this section we will prove that, by making two sets of intervals interact with one another in a certain way, we can express the fact that one set of intervals is not only bounded, but also eventually constant. To this end, we will follow ideas from the undecidability proofs for mso+u.

### IV-A Vector sequences and asymptotic mixes

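A vector sequence is an infinite sequence of non-empty vectors of natural numbers. We say that a number sequence $f$ is an extraction of a vector sequence $\mathbf{f}$ (denoted $f \in \mathbf{f}$) if for each $n \in \mathbb{N}$ the number $f(n)$ is a component of $\mathbf{f}(n)$ (written simply $f(n) \in \mathbf{f}(n)$).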

###### Definition IV.1 (Asymptotic mix).

Given two vector sequences $\mathbf{f}$, $\mathbf{g}$, we say that $\mathbf{f}$ is an asymptotic mix of $\mathbf{g}$ if for all $f \in \mathbf{f}$ there exists $g \in \mathbf{g}$ such that $f$ is asymptotically equivalent to $g$ (written $f \sim g$).

A vector sequence has dimension $d$ if every vector in it has dimension $d$. Notice that each vector of a vector sequence must be non-empty and therefore $d \ge 1$ always. The following lemma (that we state without a proof) makes a crucial connection between the dimension and asymptotic mixes, the latter being a property of boundedness of the components of vector sequences.

###### Lemma IV.2 ( Lemma 2.1).

Let , . There exists a vector sequence of dimension which is not an asymptotic mix of any vector sequence of dimension (nor any smaller dimension).

We will use this idea in the next section to prove that we can express in the logic the fact that a set of intervals is eventually constant. Prior to this, we will gather a couple of lemmas concerning asymptotic mixes that will be useful.

For a vector sequence denote by (respectively ) the number sequences that pick the minimal (respectively maximal) component of every vector. For a number sequence and we write if for all we have .

###### Definition IV.3 (Separation).

Let , be two vector sequences and . We say that separates from  if one of the following holds:

• and is unbounded,

• and is unbounded.

In the next lemma we prove that separability characterises when a vector sequence is not an asymptotic mix of another sequence.

###### Remark IV.4.

The reason why we give this equivalent definition of asymptotic mixes is that it will allow us in the sequel to partition certain sets of branches into countably many subsets (one for each bound $b$), for the purpose of then using the $\sigma$-additivity of the measure, thereby allowing us to pull out one existential quantifier.

###### Lemma IV.5.

Let $\mathbf{f}$, $\mathbf{g}$ be two vector sequences. Then $\mathbf{f}$ is not an asymptotic mix of $\mathbf{g}$ if and only if there exists $b \in \mathbb{N}$ that separates $\mathbf{f}$ from $\mathbf{g}$.

###### Proof.

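We start with the forward implication. Given a number sequence $f$ we define the best response for $f$ as

$$g_f(n) = \operatorname{argmin}_{x \in \mathbf{g}(n)} |f(n) - x|.$$

So $g_f$ is the choice of components in $\mathbf{g}$ that minimize the distance to $f$.

Since $\mathbf{f}$ is not an asymptotic mix of $\mathbf{g}$, there exists $f \in \mathbf{f}$ such that for all $g \in \mathbf{g}$, $f$ is not asymptotically equivalent to $g$; in particular $f$ is not asymptotically equivalent to $g_f$. This means that there exists $X \subseteq \mathbb{N}$ such that one of the following holds:

• $f{\restriction}X$ is bounded and $g_f{\restriction}X$ is unbounded,

• $g_f{\restriction}X$ is bounded and $f{\restriction}X$ is unbounded.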

By the definition of , in the first case is unbounded while is clearly bounded (by some ). In the second case we have for some while is unbounded. From here it follows that there exists that separates from .

For the backward implication, assume that separates from . In the first case of Definition IV.3 it suffices to construct by picking a component smaller than if it exists, and an arbitrary component otherwise. In the second case, we pick the maximal component. ∎
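The best response used in the forward implication can be sketched on finite prefixes. The code below is an illustration under assumptions not made in the paper: sequences are truncated to finite lists, vectors are non-empty tuples of integers, and ties are broken in favour of the first component. The function name is hypothetical.

```python
def best_response(f, g):
    """For each position n, pick the component of g[n] closest to f[n]."""
    return [min(vec, key=lambda x: abs(fn - x)) for fn, vec in zip(f, g)]


f = [1, 5, 10]                    # prefix of a number sequence
g = [(0, 4), (2, 9), (7, 20)]     # prefix of a vector sequence of dimension 2
response = best_response(f, g)
```

If even this closest choice fails to be asymptotically equivalent to $f$, then no other extraction of the vector sequence can succeed, which is why the proof only needs to inspect the best response.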

### IV-B Wrappings

Our aim now is to enable encoding of vector sequences as pairs of sets of intervals. Recall the definition of from page II.

###### Definition IV.6 (Wrappings).

Let , be sets of intervals. We say that wraps if and for each interval we have .

Let , be sets of intervals such that wraps and take . Then such that and . All the s are sources of some intervals in . Define:

$$\vec{D}(C,x) = (C(x_1), C(x_2), \ldots, C(x_{D(x)})).$$

Extend this definition to branches in such a way that if is defined in then is a vector sequence: if then equals .

In this way we can encode vector sequences using two sets of intervals , . The lengths of intervals in the outer layer are the dimensions of the vectors, while the lengths of the intervals in are the components. We illustrate this in the following picture: In this partial tree the set is an interval in , and are intervals in , . We have , , , , and . The vector that is encoded in is .

Using facts stated in the previous section about vector sequences, we will show how to express that the dimensions of a vector sequence (i.e. the lengths of intervals in $D$) are eventually constant, see Lemma III.1. First we give a few preparatory lemmas.

###### Definition IV.7 (Tail-precedes).

Let , be isolated sets of intervals. We say that tail/̄precedes if for all there exists such that and there is no node strictly between and that belongs to .

Note that tail-preceding is a stronger property than preceding given in Definition III.10, therefore if $C_1$ tail-precedes $C_2$, and $C_2$ is defined in some branch $\pi$ then the sequences $C_1^{\mathrm{Pre}}(\pi)$ and $C_2(\pi)$ are well-defined.

###### Lemma IV.8.

Let , such that . Then there exist and such that between any two nodes in there exists a node that belongs to and moreover .

###### Proof.

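We construct for all $n \ge 1$ sets $X_n \subseteq X$, $Y_n \subseteq Y$ and put $X' = \bigcup_{n \ge 1} X_n$, $Y' = \bigcup_{n \ge 1} Y_n$. For any node $y$ we say that $x$ is an $X$-successor of $y$ if $y < x$, $x \in X$ and there is no node strictly between $y$ and $x$ that is in $X$. Similarly we define $Y$-successors.

Let $Y_0 = \{\varepsilon\}$ where $\varepsilon$ is the root node and define for all $n \ge 1$:

$$X_n \stackrel{\text{def}}{=} \bigcup_{y \in Y_{n-1}} \{x \in X : x \text{ is an } X\text{-successor of } y\}, \qquad Y_n \stackrel{\text{def}}{=} \bigcup_{x \in X_n} \{y \in Y : y \text{ is a } Y\text{-successor of } x\}.$$

We can easily observe that for $X'$, $Y'$ constructed this way we have that between every two nodes in $X'$ there is always a node in $Y'$ (in fact, also symmetrically, the nodes in $Y'$ are separated by nodes in $X'$).

Let $\pi$ be a branch where both $X$ and $Y$ appear infinitely often. Then the first non-root node in this branch that belongs to $X$ belongs to $X_1$, after which the first node that belongs to $Y$ belongs to $Y_1$, and so on. Consequently both $X'$ and $Y'$ also appear infinitely often in $\pi$. Therefore, the sets of branches $[X\ \text{io} \wedge Y\ \text{io}]$ and $[X'\ \text{io} \wedge Y'\ \text{io}]$ coincide. ∎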
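The alternating construction from the proof can be run on finite sets of nodes. The sketch below is an illustration under assumptions that go beyond the text: nodes are strings over `0` and `1` with the prefix order, the sets $X$ and $Y$ are finite, and only a fixed number of rounds is computed. The names `successors` and `alternate` are hypothetical.

```python
def is_strictly_above(y: str, x: str) -> bool:
    """True if y is a proper ancestor of x in the prefix order."""
    return len(y) < len(x) and x.startswith(y)


def successors(y: str, S) -> set:
    """Nodes of S strictly below y with no node of S strictly in between."""
    below = [x for x in S if is_strictly_above(y, x)]
    return {x for x in below
            if not any(is_strictly_above(z, x) for z in below)}


def alternate(X, Y, rounds: int):
    """Compute the unions of X_1..X_rounds and Y_1..Y_rounds, starting from Y_0 = {root}."""
    X_prime, Y_prime = set(), set()
    frontier = {""}  # Y_0 contains only the root, encoded as the empty word
    for _ in range(rounds):
        X_n = set().union(*(successors(y, X) for y in frontier))
        Y_n = set().union(*(successors(x, Y) for x in X_n))
        X_prime |= X_n
        Y_prime |= Y_n
        frontier = Y_n
    return X_prime, Y_prime


X = {"0", "00", "0000"}
Y = {"000", "00000"}
X_prime, Y_prime = alternate(X, Y, rounds=3)
```

In this example the node `00` is dropped from $X$ because no node of $Y$ separates it from `0`, while the remaining nodes alternate along the branch: `0`, `000`, `0000`, `00000`.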