Decision problems for semantic equivalences have been a frequent topic in computer science. For pushdown automata (PDA) language equivalence was quickly shown undecidable, while the decidability in the case of deterministic PDA (DPDA) is a famous result by Sénizergues . A finer equivalence, called bisimulation equivalence or bisimilarity, has emerged as another fundamental behavioural equivalence ; for deterministic systems it essentially coincides with language equivalence. By  we can exemplify the first decidability results for infinite-state systems (a subclass of PDA, in fact), and refer to  for a survey of results in a relevant area.
One of the most involved results in this area shows the decidability of bisimilarity of equational graphs with finite out-degree, which are equivalent to PDA with alternative-free $\varepsilon$-steps (if an $\varepsilon$-step is enabled, then it has no alternative); Sénizergues  has thus generalized his decidability result for DPDA.
We recall that the complexity of the DPDA problem remains far from clear: the problem is known to be PTIME-hard and to be in TOWER (i.e., in the first complexity class beyond elementary in the terminology of ); the upper bound was shown by Stirling  (and formulated more explicitly in ). For PDA the bisimulation equivalence problem is known to be nonelementary  (in fact, TOWER-hard), even for real-time PDA, i.e. PDA with no $\varepsilon$-steps. For the above-mentioned PDA with alternative-free $\varepsilon$-steps the problem is not even primitive recursive; its Ackermann-hardness was shown in .
The decidability proofs, both for DPDA and PDA, are involved and hard to understand. This paper aims to contribute to a clarification of the more general decidability proof, showing an algorithm deciding bisimilarity of PDA with alternative-free $\varepsilon$-steps.
The proof is shown in the framework of labelled transition systems generated by first-order grammars (FO-grammars), which seems to be a particularly convenient formalism. Here the states (or configurations) are first-order terms over a specified finite set of function symbols (or “nonterminals”); the transitions are induced by a first-order grammar, which is a finite set of labelled rules that allow rewriting the roots of terms. This framework is equivalent to the framework of ; cf., e.g.,  for early references, or  for a concrete transformation of PDA to FO-grammars. The proof here is in principle based on the high-level ideas from the proof in  but with various simplifications and new modifications. The presented proof has resulted from a thorough reworking of the conference paper , aiming to get an algorithm that might be amenable to a complexity analysis.
Proof overview. We give a flavour of the process that is formally realized in the paper. It is standard to characterize bisimulation equivalence (also called bisimilarity) in terms of a turn-based game between Attacker and Defender, say. If two PDA-configurations, modelled by first-order terms $E, F$ in our framework, are non-bisimilar, then Attacker can force his win within $k$ rounds of the game, for some number $k \in \mathbb{N}_+$; in this case the number $k-1$, for the least such $k$, can be viewed as the equivalence-level $\mathrm{EqLv}(E,F)$ of the terms $E, F$: we write $E \sim_{k-1} F$ and $E \not\sim_{k} F$. If $E, F$ are bisimilar, i.e. $E \sim F$, then Defender has a winning strategy and we put $\mathrm{EqLv}(E,F) = \omega$. A natural idea is to search for a computable function $f$ attaching a number $f(\mathcal{G}, E, F) \in \mathbb{N}$ to terms $E, F$ and a grammar $\mathcal{G}$ so that it is guaranteed that $\mathrm{EqLv}(E,F) \leq f(\mathcal{G}, E, F)$ or $\mathrm{EqLv}(E,F) = \omega$; this immediately yields an algorithm that computes $\mathrm{EqLv}(E,F)$ (concluding that $\mathrm{EqLv}(E,F) = \omega$ when finding that $\mathrm{EqLv}(E,F) > f(\mathcal{G}, E, F)$).
We will show such a computable function by analysing optimal plays from ; such an optimal play gives rise to a sequence , , , of pairs of terms where for , and (hence ). This sequence is then suitably modified to yield a certain sequence
such that and for all ; here we use simple congruence properties (if arises from by replacing a subterm with such that , then ). Doing this modification carefully, adhering to a sort of “balancing policy” (inspired by one crucial ingredient in [1, 5], used also in ) we derive that if is “large” w.r.t. , then the sequence (1) contains a “long” subsequence
called an -sequence, where the variables in all “tops” , are from the set , is the common “tail” substitution (maybe with “large” terms ), and the size-growth of the tops is bounded: for . The numbers are elementary in the size of the grammar . Then another fact is used (whose analogues in different frameworks could be traced back to [1, 5] and other related works): if , then there is and a term reachable from or within moves (i.e. root-rewriting steps) such that . This entails that for the tops in (2) can be replaced with , where is the regular term , without changing the equivalence-level; hence . Though might be an infinite regular term, its natural graph presentation is not larger than the presentation of . Moreover, does not occur in , and thus the term ceases to play any role in the pairs ().
By continuing this reasoning inductively (“removing” one in each of at most phases), we note that the length of -sequences (2) is bounded by a (maybe large) constant determined by the grammar . By a careful analysis we then show that such a constant is, in fact, computable when a grammar is given.
Further remarks on related research. Further work is needed to fully understand the bisimulation problems on PDA and their subclasses, also regarding their computational complexity. E.g., even the case of BPA processes, generated by real-time PDA with a single control-state, is not quite clear. Here the bisimilarity problem is EXPTIME-hard  and in 2-EXPTIME  (proven explicitly in ); for the subclass of normed BPA the problem is polynomial  (see  for the best published upper bound). Another issue is the precise decidability border. This was also studied in ; allowing $\varepsilon$-steps to have alternatives (though they are restricted to be stack-popping) leads to undecidability of bisimilarity. This aspect has also been refined for branching bisimilarity . For second-order PDA the undecidability is established without $\varepsilon$-steps . We can refer to the survey papers [22, 23] for the work on higher-order PDA, and in particular mention that the decidability of equivalence of deterministic higher-order PDA remains open; some progress in this direction was made by Stirling in .
2 Basic Notions and Facts
In this section we define basic notions and observe their simple properties. Some standard definitions are restricted when we do not need full generality.
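By $\mathbb{N}$ and $\mathbb{N}_+$ we denote the sets of nonnegative integers and of positive integers, respectively. By $[i,j]$, for $i,j \in \mathbb{N}$, we denote the set $\{i, i+1, \dots, j\}$. For a set $\mathcal{A}$, by $\mathcal{A}^*$ we denote the set of finite sequences of elements of $\mathcal{A}$, which are also called words (over $\mathcal{A}$). By $|w|$ we denote the length of $w \in \mathcal{A}^*$, and by $\varepsilon$ the empty sequence; hence $|\varepsilon| = 0$. We put $\mathcal{A}^+ = \mathcal{A}^* \smallsetminus \{\varepsilon\}$.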
Labelled transition systems.
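A labelled transition system, an LTS for short, is a tuple $\mathcal{L} = (\mathcal{S}, \Sigma, (\xrightarrow{a})_{a \in \Sigma})$ where $\mathcal{S}$ is a finite or countable set of states, $\Sigma$ is a finite or countable set of actions and ${\xrightarrow{a}} \subseteq \mathcal{S} \times \mathcal{S}$ is a set of $a$-transitions (for each $a \in \Sigma$). We say that $\mathcal{L}$ is a deterministic LTS if for each pair $s \in \mathcal{S}$, $a \in \Sigma$ there is at most one $s'$ such that $s \xrightarrow{a} s'$ (which stands for $(s,s') \in {\xrightarrow{a}}$). By $s \xrightarrow{w} s'$, where $w = a_1 a_2 \dots a_n \in \Sigma^*$, we denote that there is a path $s = s_0 \xrightarrow{a_1} s_1 \xrightarrow{a_2} \cdots \xrightarrow{a_n} s_n = s'$; the length of such a path is $n$, which is zero for the (trivial) path $s \xrightarrow{\varepsilon} s$. If $s \xrightarrow{w} s'$, then $s'$ is reachable from $s$. By $s \xrightarrow{w}$ we denote that $w$ is enabled in $s$, or $w$ is performable from $s$, i.e., $s \xrightarrow{w} s'$ for some $s'$. If $\mathcal{L}$ is deterministic, then the expressions $s \xrightarrow{w} s'$ and $s \xrightarrow{w}$ also denote a unique path.
Given $\mathcal{L} = (\mathcal{S}, \Sigma, (\xrightarrow{a})_{a \in \Sigma})$, a set $\mathcal{B} \subseteq \mathcal{S} \times \mathcal{S}$ covers $(s,t) \in \mathcal{S} \times \mathcal{S}$ if for any $s \xrightarrow{a} s'$ there is $t \xrightarrow{a} t'$ such that $(s',t') \in \mathcal{B}$, and for any $t \xrightarrow{a} t'$ there is $s \xrightarrow{a} s'$ such that $(s',t') \in \mathcal{B}$. For $\mathcal{B}, \mathcal{B}' \subseteq \mathcal{S} \times \mathcal{S}$ we say that $\mathcal{B}'$ covers $\mathcal{B}$ if $\mathcal{B}'$ covers each $(s,t) \in \mathcal{B}$. A set $\mathcal{B} \subseteq \mathcal{S} \times \mathcal{S}$ is a bisimulation if $\mathcal{B}$ covers $\mathcal{B}$. States $s,t \in \mathcal{S}$ are bisimilar, written $s \sim t$, if there is a bisimulation $\mathcal{B}$ containing $(s,t)$. A standard fact is that ${\sim} \subseteq \mathcal{S} \times \mathcal{S}$ is an equivalence relation, and it is the largest bisimulation, namely the union of all bisimulations.
We also put ${\sim_0} = \mathcal{S} \times \mathcal{S}$, and define ${\sim_{k+1}} \subseteq \mathcal{S} \times \mathcal{S}$ (for $k \in \mathbb{N}$) as the set of pairs covered by $\sim_k$. It is obvious that $\sim_k$ are equivalence relations, and that ${\sim_0} \supseteq {\sim_1} \supseteq {\sim_2} \supseteq \cdots \supseteq {\sim}$. For the (first limit) ordinal $\omega$ we put $s \sim_\omega t$ if $s \sim_k t$ for all $k \in \mathbb{N}$; hence ${\sim_\omega} = \bigcap_{k \in \mathbb{N}} {\sim_k}$. We will only consider image-finite LTSs, where the set $\{s' \mid s \xrightarrow{a} s'\}$ is finite for each pair $s \in \mathcal{S}$, $a \in \Sigma$. In this case $\bigcap_{k \in \mathbb{N}} {\sim_k}$ is a bisimulation (for each $s \xrightarrow{a} s'$ and $s \sim_\omega t$, in the finite set $\{t' \mid t \xrightarrow{a} t'\}$ there must be one $t'$ such that $s' \sim_k t'$ for infinitely many $k$, which entails $s' \sim_\omega t'$), and thus ${\sim} = \bigcap_{k \in \mathbb{N}} {\sim_k} = {\sim_\omega}$.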
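To each pair of states $s,t$ we attach their equivalence level (eq-level):
$$\mathrm{EqLv}(s,t) = \max \{\, k \in \mathbb{N} \cup \{\omega\} \mid s \sim_k t \,\}.$$
Hence $\mathrm{EqLv}(s,t) = 0$ iff $\{a \in \Sigma \mid s \xrightarrow{a}\} \neq \{a \in \Sigma \mid t \xrightarrow{a}\}$ (i.e., $s$ and $t$ enable different sets of actions). The next proposition captures a few additional simple facts; we should add that we handle $\omega$ as an infinite amount, stipulating $\omega > n$ and $\omega + n = \omega - n = \omega$ for all $n \in \mathbb{N}$.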
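Proposition 2.
1. If $\mathrm{EqLv}(s,s') > \mathrm{EqLv}(s,t)$, then $\mathrm{EqLv}(s',t) = \mathrm{EqLv}(s,t)$.
2. If $\mathrm{EqLv}(s,t) < \omega$, then there is either a transition $s \xrightarrow{a} s'$ such that for all transitions $t \xrightarrow{a} t'$ we have $\mathrm{EqLv}(s',t') < \mathrm{EqLv}(s,t)$, or a transition $t \xrightarrow{a} t'$ such that for all transitions $s \xrightarrow{a} s'$ we have $\mathrm{EqLv}(s',t') < \mathrm{EqLv}(s,t)$.
3. If $s \xrightarrow{a} s'$ and $\mathrm{EqLv}(s,t) \geq 1$, then $\mathrm{EqLv}(s',t') \geq \mathrm{EqLv}(s,t) - 1$ for some $t'$ such that $t \xrightarrow{a} t'$.
Proof. 1. If $s \sim_k t$, $s \not\sim_{k+1} t$, and $s \sim_{k+1} s'$, then $s' \sim_k t$ and $s' \not\sim_{k+1} t$.
The points 2 and 3 trivially follow from the definition of $\sim_{k+1}$ (for $k \in \mathbb{N}$). ∎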
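For a finite LTS the relations $\sim_0 \supseteq \sim_1 \supseteq \cdots$ can be computed directly from the definition, which may help to see how eq-levels arise. The following is only a minimal sketch under our own assumptions: the dictionary encoding of the LTS, the function names, and the seven-state example are hypothetical and are not taken from the paper; the value `None` stands for $\omega$.

```python
from itertools import product

def successors(lts, s, a):
    """Set of a-successors of s; lts maps (state, action) to a set of states."""
    return lts.get((s, a), set())

def covers(lts, actions, rel, s, t):
    """True if the relation rel (a set of pairs) covers the pair (s, t)."""
    for a in actions:
        s_succ, t_succ = successors(lts, s, a), successors(lts, t, a)
        # every move of s must be matched by a move of t, and vice versa
        if not all(any((x, y) in rel for y in t_succ) for x in s_succ):
            return False
        if not all(any((x, y) in rel for x in s_succ) for y in t_succ):
            return False
    return True

def eq_levels(states, actions, lts):
    """Map each pair of states to its eq-level (None stands for omega)."""
    rel = set(product(states, repeat=2))   # ~_0 is the full relation
    level, k = {}, 0
    while True:
        # ~_{k+1} is the set of pairs covered by ~_k (it is a subset of ~_k)
        nxt = {(s, t) for (s, t) in rel if covers(lts, actions, rel, s, t)}
        for pair in rel - nxt:
            level[pair] = k                # in ~_k but not in ~_{k+1}
        if nxt == rel:                     # ~_k is a bisimulation: stop
            for pair in rel:
                level[pair] = None
            return level
        rel, k = nxt, k + 1

# Hypothetical example: s -a-> s1 -b-> s2, while t -a-> t1 -b-> t3 and t -a-> t2.
STATES = ["s", "s1", "s2", "t", "t1", "t2", "t3"]
LTS = {("s", "a"): {"s1"}, ("s1", "b"): {"s2"},
       ("t", "a"): {"t1", "t2"}, ("t1", "b"): {"t3"}}
LEVELS = eq_levels(STATES, ["a", "b"], LTS)
print(LEVELS[("s", "t")])
```

In this example `s` and `t` enable the same actions but `t` can move to the dead state `t2`, so the pair has eq-level 1. The loop terminates only because the state set is finite; for the infinite LTSs studied below no such direct computation is available.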
First-order terms, regular terms, finite graph presentations.
We will consider LTSs in which the states are first-order regular terms.
The terms are built from variables taken from a fixed countable set
$$\mathrm{Var} = \{x_1, x_2, x_3, \dots\}$$
and from function symbols, also called (ranked) nonterminals, from some specified finite set ; each has . We reserve symbols to range over nonterminals, and to range over terms. An example of a finite term is , where the arities of nonterminals are , respectively. Its syntactic tree is depicted on the left of Fig.1.
We identify terms with their syntactic trees. Thus a term over is (viewed as) a rooted, ordered, finite or infinite tree where each node has a label from ; if the label of a node is , then the node has no successors, and if the label is , then it has (immediate) successor-nodes where . A subtree of a term is also called a subterm of . We make no difference between isomorphic (sub)trees, and thus a subterm can have more (maybe infinitely many) occurrences in . Each subterm-occurrence has its (nesting) depth in , which is its (naturally defined) distance from the root of . E.g., is a depth-2 subterm of ; is a subterm with a depth-1 and a depth-2 occurrences.
We also use the standard notation for terms: we write or with the obvious meaning; in the latter case , , and are the ordered depth- occurrences of subterms of , which are also called the root-successors in .
A term is finite if the respective tree is finite. A (possibly infinite) term is regular if it has only finitely many subterms (though the subterms may be infinite and may have infinitely many occurrences). We note that any regular term has at least one graph presentation, i.e. a finite directed graph with a designated root, where each node has a label from ; if the label of a node is , then the node has no outgoing arcs, if the label is , then it has ordered outgoing arcs where . We can see an example of such a graph presenting a term on the right in Fig. 1. The standard tree-unfolding of the graph is the respective term, which is infinite if there are cycles in the graph. There is a bijection between the nodes in the least graph presentation of and (the roots of) the subterms of .
Sizes, heights, and variables of terms.
By we denote the set of all regular terms over (and ); we do not consider non-regular terms. By a “term” we mean a general regular term unless the context makes clear that the term is finite.
By we mean the number of nodes in the least graph presentation of . E.g., in Fig.1 ( has six subterms) and . By we mean the number of nodes in the least graph presentation in which a distinguished node corresponds to the (root of the) term , for each . (Since can share some subterms, can be smaller than .) We usually write instead of . E.g., in Fig. 1.
For a finite term we define as the maximal depth of a subterm; e.g., in Fig.1.
We put occurs in and occurs in or . E.g., in Fig.1.
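The size of a regular term, defined above as the number of nodes in its least graph presentation, can be computed from any graph presentation by merging nodes that unfold to the same tree. The sketch below uses a simple partition refinement for this; the encoding of graphs as dictionaries, the function name, and the small example graphs are our own assumptions and do not reproduce Fig. 1.

```python
def least_presentation_size(graph, roots):
    """Number of nodes in the least graph presentation of the terms at roots.

    graph maps a node id to (label, list of successor node ids); a label is
    a variable or a nonterminal, and the successor list is ordered.
    """
    # Keep only the nodes reachable from the designated roots.
    reachable, stack = set(), list(roots)
    while stack:
        n = stack.pop()
        if n not in reachable:
            reachable.add(n)
            stack.extend(graph[n][1])
    # Start with one class per label and refine until the partition is stable;
    # in the stable partition two nodes share a class iff they present the same term.
    cls = {n: graph[n][0] for n in reachable}
    count = len(set(cls.values()))
    while True:
        ids, new_cls = {}, {}
        for n in reachable:
            label, succs = graph[n]
            signature = (label, tuple(cls[m] for m in succs))
            new_cls[n] = ids.setdefault(signature, len(ids))
        if len(ids) == count:          # no class was split: stable
            return count
        cls, count = new_cls, len(ids)

# Hypothetical examples: A(B, B) with B drawn twice, and two presentations
# of the same infinite regular term T = A(T, x1).
FINITE = {0: ("A", [1, 2]), 1: ("B", []), 2: ("B", [])}
LOOPED = {0: ("A", [0, 1]), 1: ("x1", [])}
UNROLLED = {0: ("A", [2, 1]), 2: ("A", [0, 3]), 1: ("x1", []), 3: ("x1", [])}
print(least_presentation_size(FINITE, [0]),
      least_presentation_size(LOOPED, [0]),
      least_presentation_size(UNROLLED, [0]))
```

Passing several roots gives the size of a set of terms, with shared subterms counted once.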
Substitutions, associative composition, iterated substitutions.
A substitution is a mapping whose support
is finite; we reserve the symbol for substitutions. By applying a substitution to a term we get the term that arises from by replacing each occurrence of with ; given graph presentations, in the graph of we just redirect each arc leading to a node labelled with towards the root of (which includes the special “root-designating arc” when ). Hence implies . The natural composition of substitutions, where is defined by , can be easily verified to be associative. We thus write instead of or . For we define inductively: is the empty-support substitution, and .
By , where for , we denote the substitution such that for all and for all . We will use just for the special case , where is clearly well-defined; a graph presentation of the term arises from a graph presentation of by redirecting each arc leading to (if any exists) towards the root; we have if , or if . In Fig.1, for we have and .
By we denote the substitution arising from by removing from its support (if it is there): hence and for all .
We note a trivial fact (for later use):
If , then for the term we have , and thus for any . We also have .
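To illustrate substitutions and their composition on finite terms, the following sketch represents a variable by a string and a non-variable term by a tuple whose first component is the nonterminal. This encoding, the helper names, and the sample terms are assumptions made only for the illustration; regular (possibly infinite) terms and the limit substitutions discussed above are not covered.

```python
def apply(term, sigma):
    """Apply the substitution sigma (a dict from variables to terms) to a finite term."""
    if isinstance(term, str):              # a variable: replace it if in the support
        return sigma.get(term, term)
    head, *args = term                     # a term A(E_1, ..., E_m)
    return (head, *(apply(arg, sigma) for arg in args))

def compose(sigma1, sigma2):
    """The composition that applies sigma1 first and sigma2 afterwards."""
    variables = set(sigma1) | set(sigma2)
    return {x: apply(apply(x, sigma1), sigma2) for x in variables}

# Hypothetical sample data.
E = ("A", "x1", ("B", "x2"))
S1 = {"x1": ("C", "x2")}
S2 = {"x2": ("D",)}
S3 = {"x3": "x1"}

# Applying the composition agrees with applying the substitutions one by one.
print(apply(E, compose(S1, S2)) == apply(apply(E, S1), S2))
```

On such data one can also check associativity, i.e. that `compose(compose(S1, S2), S3)` and `compose(S1, compose(S2, S3))` act identically on terms.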
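A first-order grammar, or just a grammar for short, is a tuple $\mathcal{G} = (\mathcal{N}, \Sigma, \mathcal{R})$ where $\mathcal{N}$ is a finite nonempty set of ranked nonterminals, viewed as function symbols with arities, $\Sigma$ is a finite nonempty set of actions (or “letters”), and $\mathcal{R}$ is a finite nonempty set of rules of the form
$$A(x_1, x_2, \dots, x_m) \xrightarrow{a} E \qquad (3)$$
where $A \in \mathcal{N}$, $\mathit{arity}(A) = m$, $a \in \Sigma$, and $E$ is a finite term over $\mathcal{N}$ in which each occurring variable is from the set $\{x_1, x_2, \dots, x_m\}$; we can have $E = x_i$ for some $i \in [1,m]$.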
LTSs generated by rules, and by actions, of grammars.
Given , by we denote the (rule-based) LTS where each rule of the form induces transitions for all substitutions . The transition induced by with is .
Using terms from Fig.1 as examples, if a rule is , then we have (since can be written as where ); the action only plays a role in the LTS defined below (where we have ). For a rule we deduce ; we note that the third root-successor in thus “disappears” since .
By definition, the LTS is deterministic (for each and there is at most one such that ). We note that variables are dead (have no outgoing transitions). We also note that implies (each variable occurring in also occurs in ) but not in general.
Remark. Since the rhs (right-hand sides) in the rules (3) are finite, all terms reachable from a finite term are finite. The “finite-rhs version” with general regular terms in LTSs has been chosen for technical convenience. This is not crucial, since the equivalence problem for the “regular-rhs version” can be easily reduced to the problem for our finite-rhs version.
The deterministic rule-based LTS is helpful technically, but we are primarily interested in the (image-finite nondeterministic) action-based LTS where each rule induces the transitions for all substitutions . (Hence the rules and in the above examples induce and .)
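The difference between the rule-based and the action-based LTS can be seen on a small example with finite terms. The sketch below is an illustration under our own assumptions: terms are encoded as nested tuples with variables as strings, a rule is a triple (nonterminal, action, right-hand side), and the three rules are hypothetical and unrelated to the rules used with Fig. 1.

```python
def apply(term, sigma):
    """Apply a substitution (dict from variables to terms) to a finite term."""
    if isinstance(term, str):
        return sigma.get(term, term)
    head, *args = term
    return (head, *(apply(arg, sigma) for arg in args))

def rule_successors(term, rules):
    """Transitions of the rule-based LTS from term: a list of (rule index, successor)."""
    if isinstance(term, str):
        return []                          # variables are dead
    head, *args = term
    result = []
    for index, (lhs, _action, rhs) in enumerate(rules):
        if lhs == head:
            # root rewriting: x_i is replaced by the i-th root-successor
            sigma = {f"x{i + 1}": arg for i, arg in enumerate(args)}
            result.append((index, apply(rhs, sigma)))
    return result

def action_successors(term, rules):
    """Transitions of the action-based LTS from term: a set of (action, successor)."""
    return {(rules[index][1], succ) for index, succ in rule_successors(term, rules)}

# Hypothetical grammar: A has arity 2, B has arity 1.
RULES = [("A", "a", ("B", "x2")),          # A(x1, x2) -a-> B(x2)
         ("A", "a", "x1"),                 # A(x1, x2) -a-> x1
         ("B", "b", "x1")]                 # B(x1)     -b-> x1
TERM = ("A", ("B", "x5"), "x7")
print(rule_successors(TERM, RULES))
print(action_successors(TERM, RULES))
```

Each rule yields at most one successor, so the rule-based view is deterministic, while the two rules labelled `a` give two `a`-successors in the action-based view.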
Fig.2 sketches a path in some LTS where we have, e.g., and for some actions (which would replace in the LTS ). In the rectangle just a part of a regular-term presentation is sketched. Hence the initial root-node might be accessible from later roots due to its possible undepicted ingoing arcs. On the other hand, the root-node after the steps is not accessible (and can be omitted) in the presentation of the final term.
Eq-levels of pairs of terms.
Given a grammar , by we refer to the equivalence level of (regular) terms in , with the following adjustment: though variables are handled as dead also in , we stipulate if (while ); this would be achieved automatically if we enriched with transitions where is a special action added to each variable . This adjustment gives us the point in the next proposition on compositionality.
We put if for all , and define
For all , and the following conditions hold:
If , then . Hence .
In particular, .
If , then . Hence .
In particular, .
It suffices to prove the claims for , since . We use an induction on , noting that for the claims are trivial.
Assuming and , we show that : We cannot have for some (since then by our definition). Hence either for some , in which case , or and . In the latter case every transition () is, in fact, () where (), and there must be a corresponding transition () such that (by Proposition 2(3)); by the induction hypothesis , which shows that (since is covered by ).
This gives us the point . For the point we note that implies , which is even more straightforward to verify. ∎
If , then there are , , and , , such that , or , , and .
We assume and use an induction on . If , then necessarily for some (since , would imply as well); the claim is thus trivial (if , i.e. , then and , which entails that ).
For we must have , . There must be a transition (or ) such that for all (for all ) we have (by Proposition 2(2)). On the other hand, for each (and each ) there is () such that (by Proposition 2(3)); since and , the transitions , can be written , , respectively, where , . Hence there is a pair of transitions , such that and . We apply the induction hypothesis and deduce that there are , , and , , such that , or , , and , which entails (since ). Since , or , , we are done. ∎
Bounded growth of sizes and heights.
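We fix a grammar $\mathcal{G} = (\mathcal{N}, \Sigma, \mathcal{R})$, and note a few simple facts for the later analysis; we also introduce the constants $\mathrm{SInc}$ (size increase) and $\mathrm{HInc}$ (height increase) related to $\mathcal{G}$. We recall that the rhs-terms $E$ in the rules (3) are finite, and we put
$$\mathrm{HInc} = \max \{\, \mathrm{Height}(E) - 1 \mid E \text{ is the rhs of a rule in } \mathcal{R} \,\}. \qquad (4)$$
We add that in this paper we stipulate $\max \emptyset = 0$.
By $\mathrm{NtSize}(E)$ we mean the number of nonterminal nodes in the least graph presentation of $E$ (hence the number of non-variable subterms of $E$). We put
$$\mathrm{SInc} = \max \{\, \mathrm{NtSize}(E) \mid E \text{ is the rhs of a rule in } \mathcal{R} \,\}. \qquad (5)$$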
The next proposition shows (generous) upper bounds on the size and height increase caused by (sets of) transition sequences. (It is helpful to recall Fig. 2, assuming that the rectangle contains a presentation of .)
If , then .
If where is a finite term, then .
If , , , , where for all , then .
The points and are immediate. A “blind” use of in the point would yield . But since the terms can share subterms of , we get the stronger bound . ∎
Shortest sink words.
If in (hence ), then we call an -sink word. We note that such can be written where ; hence “sinks” along a branch of to , or when . This suggests a standard dynamic programming approach to find and fix some shortest -sink words for all elements of the set for which such words exist. We can clearly (generously) bound the lengths of by where (i.e., is the rhs of a rule in ). We put
The above discussion entails that is a (quickly) computable number, whose value is at most exponential in the size of the given grammar .
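The dynamic programming approach mentioned above can be sketched as a simple fixpoint iteration that computes only the lengths of shortest sink words. The encoding (terms as nested tuples, variables as strings `x1`, `x2`, and so on, rules as triples) and the example grammar are our own assumptions; the sketch does not record the sink words themselves.

```python
import math

def sink_lengths(rules, arity):
    """Lengths of shortest (A, i)-sink words; pairs with no sink word are absent.

    rules is a list of (nonterminal, action, rhs); arity maps nonterminals to arities.
    """
    best = {}                              # (A, i) -> shortest length found so far

    def dist(term, var):
        """Least number of steps sinking from term to var, w.r.t. the current table."""
        if isinstance(term, str):
            return 0 if term == var else math.inf
        head, *args = term
        # first sink from the root to some position j, then continue inside args[j]
        return min((best.get((head, j + 1), math.inf) + dist(arg, var)
                    for j, arg in enumerate(args)), default=math.inf)

    changed = True
    while changed:                         # repeat until no entry improves
        changed = False
        for lhs, _action, rhs in rules:
            for i in range(1, arity[lhs] + 1):
                candidate = 1 + dist(rhs, f"x{i}")   # one step for the rule itself
                if candidate < best.get((lhs, i), math.inf):
                    best[(lhs, i)] = candidate
                    changed = True
    return best

# Hypothetical grammar: A(x1, x2) -a-> B(x1), A(x1, x2) -b-> x2, B(x1) -a-> x1.
RULES = [("A", "a", ("B", "x1")), ("A", "b", "x2"), ("B", "a", "x1")]
ARITY = {"A": 2, "B": 1}
SINKS = sink_lengths(RULES, ARITY)
print(SINKS)
```

Every stored value is the length of an actual sink word and values only decrease, so the iteration terminates; at the fixpoint each entry equals the shortest length, by induction on that length.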
Remark. For any grammar we can construct a “normalized” grammar in which exists for each , while the LTSs and are isomorphic. (We can refer to  for more details.) We do not need such normalization in this paper.
Convention. When having a fixed grammar , we also put
but we will often write even if might not be maximal. This is harmless since such could always be replaced with if we wanted to be pedantic. (In fact, the grammar could also be normalized so that the arities of nonterminals are the same  but this is a superfluous technical issue here.)
3 Decidability of Bisimulation Equivalence of First-Order Grammars
We use the notion of “small” numbers determined by a grammar ; by saying that a number is small we mean that it is a computable number (for a given grammar ) that is elementary in the size of .
We first note a fact that is obvious (by induction on ):
There is an algorithm that, given a grammar , terms , and , decides if in the LTS . Hence the next theorem adds the decidability of (i.e., of for ).
For any grammar there is a small number and a computable (not necessarily small) number such that for all we have:
|if then .||(8)|
It is decidable, given , , , if in .
Theorem 3 is proven in the rest of this section. We start with some useful notions.
We fix a grammar . By an eqlevel-decreasing sequence we mean a sequence of pairs of terms (where ) such that . The length of such a sequence is obviously at most