# Rewriting Higher-Order Stack Trees

Higher-order pushdown systems and ground tree rewriting systems can be seen as extensions of suffix word rewriting systems. Both classes generate infinite graphs with interesting logical properties. Indeed, the model-checking problem for monadic second order logic (respectively first order logic with a reachability predicate) is decidable on such graphs. We unify both models by introducing the notion of stack trees, trees whose nodes are labelled by higher-order stacks, and define the corresponding class of higher-order ground tree rewriting systems. We show that these graphs retain the decidability properties of ground tree rewriting graphs while generalising the pushdown hierarchy of graphs.

## 1 Introduction

Since Rabin’s proof of the decidability of monadic second order logic (MSO) over the full infinite binary tree [14], there has been an effort to characterise increasingly general classes of structures with decidable MSO theories. This can be achieved, for instance, by taking families of graph transformations which preserve the decidability of MSO, such as unfolding or MSO-interpretation, and applying them to graphs with known decidable MSO theories, such as finite graphs or the graph .

This approach was followed in [8], where it is shown that the prefix (or suffix) rewriting graphs of recognisable word rewriting systems, which coincide (up to graph isomorphism) with the transition graphs of pushdown automata (contracting ε-transitions), can be obtained from using inverse regular substitutions, a simple class of MSO-compatible transformations. They also coincide with those obtained by applying MSO interpretations to [1]. Alternately unfolding and applying inverse regular mappings to these graphs yields a strict hierarchy of classes of trees and graphs with a decidable MSO theory [9, 7] coinciding with the transition graphs of higher-order pushdown automata and capturing the solutions of safe higher-order program schemes, whose MSO decidability had already been established in [12]. (This hierarchy was extended to encompass unsafe schemes and collapsible automata, which are out of the scope of this paper; see [4, 6, 3] for recent results on the topic.) We will henceforth call this the pushdown hierarchy and the graphs at its -th level -pushdown graphs for simplicity.

Also well-known are the automatic and tree-automatic structures (see for instance [2]), whose vertices are represented by words or trees and whose edges are characterised using finite automata running over tuples of vertices. The decidability of first-order logic (FO) over these graphs stems from the well-known closure properties of regular word and tree languages, but it can also be related to Rabin’s result since tree-automatic graphs are precisely the class of graphs obtained from using finite-set interpretations [10], a generalisation of WMSO interpretations mapping structures with a decidable MSO theory to structures with a decidable FO theory. Applying finite-set interpretations to the whole pushdown hierarchy therefore yields an infinite hierarchy of graphs of decidable FO theory, which is proven in [10] to be strict.

Since prefix-recognisable graphs can be seen as word rewriting graphs, another variation is to consider similar rewriting systems over trees. This yields the class of ground tree rewriting graphs, which strictly contains that of real-time order 1 pushdown graphs. This class is orthogonal to the whole pushdown hierarchy since it contains at least one graph of undecidable MSO theory, for instance the infinite 2-dimensional grid. The transitive closures of ground tree rewriting systems can be represented using ground tree transducers, whose graphs were shown in [11] to have decidable FO[] theories by establishing their closure under iteration and then showing that any such graph is tree-automatic.

The purpose of this work is to propose a common extension to both higher-order stack operations and ground tree rewriting. We introduce a model of higher-order ground tree rewriting over trees labelled by higher-order stacks (henceforth called stack trees), which coincides, at order 1, with ordinary ground tree rewriting and, over unary trees, with the dynamics of higher-order pushdown automata. Following ideas from the works cited above, as well as the notion of recognisable sets and relations over higher-order stacks defined in [5], we introduce the class of ground (order ) stack tree rewriting systems, whose derivation relations are captured by ground stack tree transducers. Establishing that this class of relations is closed under iteration and can be finite-set interpreted in -pushdown graphs yields the decidability of their FO[] theories.

The remainder of this paper is organised as follows. Section 2 recalls some of the concepts used in the paper. Section 3 defines stack trees and stack tree rewriting systems. Section 4 explores a notion of recognisability for binary relations over stack trees. Section 5 proves the decidability of FO[] model checking over ground stack tree rewriting graphs. Finally, Section 6 presents some further perspectives.

## 2 Definitions and notations

#### Trees.

Given an arbitrary set , an ordered -labelled tree of arity at most is a partial function from to such that the domain of , is prefix-closed (if is in , then every prefix of is also in ) and left-closed (for all and , is defined only if is for every ). Node is called the -th child of its parent node . Additionally, the nodes of are totally ordered by the natural length-lexicographic ordering over . By abuse of notation, given a symbol , we simply denote by the tree reduced to a unique -labelled node. The frontier of is the set . Trees will always be drawn in such a way that the left-to-right placement of leaves respects . The set of trees labelled by is denoted by . In this paper we only consider finite trees, i.e. trees with finite domains.

Given nodes and , we write if is a prefix of , i.e. if there exists , . We will say that is an ancestor of or is above , and symmetrically that is below or is its descendant. We call the prefix of of length . For any , is called the label of node in and is the sub-tree of rooted at . For any , we call the arity of , i.e. its number of children. When is understood, we simply write . Given trees and a -tuple of positions , we denote by the tree obtained by replacing the sub-tree at each position in by , i.e. the tree in which any node not below any is labelled , and any node with is labelled . In the special case where is a -context, i.e. contains leaves labelled by special symbol , we omit and simply write .
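The tree notions above can be made concrete with a small sketch. It assumes, purely for illustration, that a tree is stored as a Python dictionary from positions (tuples of 1-based child indices) to labels; the function names `is_tree_domain`, `frontier`, `subtree` and `replace` are hypothetical and not part of the paper.

```python
def is_tree_domain(dom):
    """Check that a set of positions is non-empty, prefix-closed and left-closed."""
    dom = set(dom)
    if not dom:
        return False
    for u in dom:
        if u == ():
            continue
        parent, i = u[:-1], u[-1]
        # Prefix-closed: the parent must be present (induction covers all prefixes).
        if i < 1 or parent not in dom:
            return False
        # Left-closed: the sibling immediately to the left must be present.
        if i > 1 and parent + (i - 1,) not in dom:
            return False
    return True


def frontier(t):
    """Leaves of t, listed from left to right (lexicographic order on positions)."""
    return sorted(u for u in t if u + (1,) not in t)


def subtree(t, u):
    """Sub-tree of t rooted at position u."""
    n = len(u)
    return {v[n:]: a for v, a in t.items() if v[:n] == u}


def replace(t, u, s):
    """Tree obtained from t by replacing the sub-tree rooted at u by s."""
    n = len(u)
    out = {v: a for v, a in t.items() if v[:n] != u}
    out.update({u + v: a for v, a in s.items()})
    return out
```

Since no leaf is a prefix of another leaf, sorting the leaves as tuples gives the left-to-right order used for drawing trees.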

#### Directed Graphs.

A directed graph with edge labels in is a pair where is a set of vertices and is a set of edges. Given two vertices and , we write if , if there exists such that , and if there exists such that . There is a directed path in from to labelled by , written , if there are vertices such that , and for all , . We additionally write if there exists such that , and if there is such a path with . A directed graph is connected if there exists an undirected path between any two vertices and , meaning that . We omit from all these notations when it is clear from the context. A directed graph is acyclic, or is a DAG, if there is no such that . The empty DAG consisting of a single vertex (and no edge, hence its name) is denoted by . Given a DAG , we denote by its set of vertices of in-degree , called input vertices, and by its set of vertices of out-degree , called output vertices. The DAG is said to be of in-degree and of out-degree . We henceforth only consider finite DAGs.
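As a rough illustration of the DAG vocabulary used here, the following sketch computes input and output vertices and tests acyclicity. It assumes that edges are given as `(source, label, target)` triples and reads "input" as in-degree 0 and "output" as out-degree 0; the function names are hypothetical.

```python
from collections import defaultdict


def inputs_outputs(vertices, edges):
    """Return (input vertices, output vertices) of a graph, in the given vertex order."""
    has_in = {t for (_, _, t) in edges}
    has_out = {s for (s, _, _) in edges}
    ins = [v for v in vertices if v not in has_in]
    outs = [v for v in vertices if v not in has_out]
    return ins, outs


def is_acyclic(vertices, edges):
    """Kahn's algorithm: a graph is a DAG iff every vertex can be removed in topological order."""
    indeg = {v: 0 for v in vertices}
    succ = defaultdict(list)
    for s, _, t in edges:
        succ[s].append(t)
        indeg[t] += 1
    ready = [v for v in vertices if indeg[v] == 0]
    removed = 0
    while ready:
        v = ready.pop()
        removed += 1
        for w in succ[v]:
            indeg[w] -= 1
            if indeg[w] == 0:
                ready.append(w)
    return removed == len(indeg)
```

On the DAG with a single vertex and no edge, that vertex is both the unique input and the unique output.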

#### Rewriting Systems.

Let and be finite alphabets. A -labelled ground tree rewriting system (GTRS) is a finite set of triples called rewrite rules, with and finite -labelled trees and a label. The rewriting graph of is , where and . The rewriting relation associated to is , its derivation relation is . When restricted to words (or equivalently unary trees), such systems are usually called suffix (or prefix) word rewriting systems.
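One step of ground tree rewriting can be sketched as follows. The sketch assumes trees are dictionaries from positions to labels and that a rule is a triple `(lhs, label, rhs)`; the ordering of the triple and all names are assumptions made for illustration only.

```python
def subtree(t, u):
    """Sub-tree of t rooted at position u."""
    n = len(u)
    return {v[n:]: a for v, a in t.items() if v[:n] == u}


def replace(t, u, s):
    """Tree obtained from t by replacing the sub-tree rooted at u by s."""
    n = len(u)
    out = {v: a for v, a in t.items() if v[:n] != u}
    out.update({u + v: a for v, a in s.items()})
    return out


def rewrite_steps(t, rules):
    """Yield (label, t2) for each rule (lhs, label, rhs) and each position
    of t whose sub-tree is exactly lhs; t2 is t with that sub-tree replaced by rhs."""
    for u in sorted(t):
        sub = subtree(t, u)
        for lhs, label, rhs in rules:
            if sub == lhs:
                yield label, replace(t, u, rhs)
```

A rule is ground: it only fires where the whole sub-tree below a position equals its left-hand side, so with the single rule rewriting the leaf `a` into `f(a, a)`, the tree `f(a, a)` has exactly two successors.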

## 3 Higher-Order Stack Trees

### 3.1 Higher-Order Stacks

We briefly recall the notion of higher-order stacks (for details, see for instance [5]). In order to obtain a more straightforward extension from stacks to stack trees, we use a slightly tuned yet equivalent definition, whereby the hierarchy starts at level and uses a different set of basic operations.

In the remainder, will denote a fixed finite alphabet and a positive integer. We first define stacks of order (or -stacks). Let denote the set of -stacks. For , the set of -stacks is , the set of non-empty sequences of -stacks. When is understood, we simply write . For , we write , with and , for an -stack of size whose topmost -stack is . For example, is a -stack of size 2, whose topmost -stack contains three -stacks, etc.

#### Basic Stack Operations.

Given two letters , we define the partial function such that , if and is not defined otherwise. We also consider the identity function . For , the function is defined by , for every . As it is injective, we denote by its inverse (which is a partial function).

Each level operation is extended to any level stack by letting . The set of basic operations of level is defined as: , and for , .
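The following sketch illustrates one possible reading of these operations. It assumes that letters are level-0 stacks, that a stack of level i is a non-empty tuple of stacks of level i-1 with the topmost element last, and that the copy operation of level i duplicates the topmost stack of level i-1. The names `lift`, `rew`, `copy` and `copy_inv` are hypothetical, and `None` stands for "undefined".

```python
def lift(f, i):
    """Extend a partial function on level-i stacks to any level n >= i
    by applying it to the topmost level-i stack."""
    def op(s, n):
        if n == i:
            return f(s)
        top = op(s[-1], n - 1)
        return None if top is None else s[:-1] + (top,)
    return op


def rew(a, b):
    """Rewrite the topmost letter a into b; undefined on any other letter."""
    return lift(lambda x: b if x == a else None, 0)


def copy(i):
    """Duplicate the topmost level-(i-1) stack of the topmost level-i stack."""
    return lift(lambda s: s + (s[-1],), i)


def copy_inv(i):
    """Inverse of copy(i): defined only when the two topmost level-(i-1) stacks are equal."""
    return lift(lambda s: s[:-1] if len(s) >= 2 and s[-1] == s[-2] else None, i)
```

For instance, on the 2-stack `(('a', 'a'), ('b', 'a', 'b'))`, `copy(2)` appends a second copy of `('b', 'a', 'b')`, and `copy_inv(2)` undoes it.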

### 3.2 Stack Trees

We introduce the set (or simply when is understood) of -stack-trees. Observe that an -stack-tree of degree 1 is isomorphic to an -stack, and that . Figure 1 shows an example of a 3-stack tree. The notion of stack trees therefore subsumes both higher-order stacks and ordinary trees.

#### Basic Stack Tree Operations.

We now extend -stack operations to stack trees. There are in general several positions where one may perform a given operation on a tree. We thus first define the localised application of an operation to a specific position in the tree (given by the index of a leaf in the lexicographic ordering of leaves), and then derive a definition of stack tree operations as binary relations, or equivalently as partial functions from stack trees to sets of stack trees.

Any operation of is extended to as follows: given , and an integer , with , where is the leaf of the tree, with respect to the lexicographic order. If is not applicable to , is not defined. We define , i.e. the set of stack trees obtained by applying to a leaf of .

The -fold duplication of a stack tree leaf and its label is denoted by . Its application to the leaf of a tree is: , with . Let be the set of stack trees obtained by applying to a leaf of . The inverse operation, written , is such that if . We also define . Notice that if .
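A minimal sketch of localised application and leaf duplication is given below. It assumes that a stack tree is a dictionary from positions to labels, that leaves are numbered from 1 in left-to-right order, that a stack operation is any Python function returning `None` when undefined, and that duplication gives the chosen leaf k children carrying its own label. All function names are hypothetical.

```python
def leaves(t):
    """Leaves of t from left to right."""
    return sorted(u for u in t if u + (1,) not in t)


def apply_at_leaf(op, t, i):
    """Apply a stack operation to the label of the i-th leaf; None if undefined."""
    fr = leaves(t)
    if not 1 <= i <= len(fr):
        return None
    new = op(t[fr[i - 1]])
    if new is None:
        return None
    out = dict(t)
    out[fr[i - 1]] = new
    return out


def apply_anywhere(op, t):
    """All stack trees obtained by applying op to some leaf of t."""
    results = (apply_at_leaf(op, t, i) for i in range(1, len(leaves(t)) + 1))
    return [r for r in results if r is not None]


def dup_leaf(t, i, k):
    """Give the i-th leaf k children, each labelled like the leaf itself."""
    u = leaves(t)[i - 1]
    out = dict(t)
    for j in range(1, k + 1):
        out[u + (j,)] = t[u]
    return out


def undup_leaves(t, i, k):
    """Inverse of dup_leaf: remove leaves i..i+k-1 if they are exactly the k
    children of one node and all carry that node's label; None otherwise."""
    fr = leaves(t)
    if i < 1 or i + k - 1 > len(fr):
        return None
    group = fr[i - 1:i - 1 + k]
    if group[0] == ():
        return None
    parent = group[0][:-1]
    expected = [parent + (j,) for j in range(1, k + 1)]
    if group != expected or parent + (k + 1,) in t:
        return None
    if any(t[u] != t[parent] for u in group):
        return None
    return {u: a for u, a in t.items() if u not in group}
```

The inverse is deliberately partial: as soon as one of the copies has been modified, the duplication can no longer be undone.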

For simplicity, we will henceforth only consider the case where stack trees have arity at most and , but all results go through in the general case. We denote by the set of basic operations over .

### 3.3 Stack Tree Rewriting

As already mentioned, is the set of trees labelled by . In contrast with basic stack tree operations, a tree rewrite rule expresses the replacement of an arbitrarily large ground subtree of some tree into , yielding the tree . Contrary to the case of order 1 stacks (which are simply words), composing basic stack tree operations does not allow us to directly express such an operation, because there is no guarantee that two successive operations will be applied to the same part of a tree. We thus need to find a way to consider compositions of basic operations acting on a single sub-tree. In our notations, the effect of a ground tree rewrite rule could thus be seen as the localised application of a sequence of and operations followed by a sequence of and operations. The relative positions where these operations must be applied could be represented as a pair of trees with edge labels in .

From level 2 on, this is no longer possible. Indeed a localised sequence of operations may be used to perform introspection on the stack labelling a node without destroying it, by first performing a operation followed by a sequence of level 1 operations and a operation. It is thus impossible to directly represent such a transformation using pairs of trees labelled by stack tree operations. We therefore adopt a presentation of compound operations as DAGs, which allows us to specify the relative application positions of successive basic operations. However, not every DAG represents a valid compound operation, so we first need to define a suitable subclass of DAGs and associated concatenation operation. An example of the model we aim to define can be found in Fig. 2.

#### Concatenation of DAGs.

Given two DAGs and with and and two indices and with and , we denote by the unique DAG obtained by merging the -th output vertex of with the -th input vertex of for all such that both and exist. Formally, letting denote the number of merged vertices, we have where is the DAG whose set of vertices is and set of edges is , and if for some , and otherwise. We call the -concatenation of and . Note that the -concatenation of two connected DAGs remains connected.

#### Compound Operations

We represent compound operations as DAGs. We will refer in particular to the set of DAGs associated with basic operations, which are depicted in Fig. 3. Compound operations are inductively defined below, as depicted in Fig. 4.

###### Definition 1

A DAG is a compound operation (or simply an operation) if one of the following holds:

1. ;

2. , with and ;

3. , with ;

4. with ;

5. , with ;

where and are compound operations.

Additionally, the vertices of are ordered inductively in such a way that every vertex of in the above definition is smaller than the vertices of , the order over being the empty one. This induces in particular an order over the input vertices of , and one over its output vertices.

###### Definition 2

Given a compound operation , we define , its localised application starting at the -th leaf of a stack tree , as follows:

1. If , then .

2. If with ,

then .

3. If ,

then .

4. If ,

then .

5. If ,

then .

###### Remark 1

An operation may admit several different decompositions with respect to Def. 1. However, its application is well-defined, as one can show this process is locally confluent.

Given two stack trees , and an operation , we say that if there is a position such that . Figure 2 shows an example. We call the relation induced by : for any stack trees , if and only if . Finally, given a -tuple of operations of respective in-degrees and a -tuple of indices with for all , we denote by the parallel application of to , the set of all such applications and the induced relation.

Since the -concatenation of two operations as defined above is not necessarily a licit operation, we need to restrict ourselves to results which are well-formed according to Def. 1. Given and , we let . Given , we define , and let denote the set of iterations of . (This unusual definition is necessary because is not associative. For example, is in but not in .) These notations are naturally extended to sets of operations.

###### Proposition 1

is precisely the set of all well-formed compound operations.

###### Proof

Recall that denotes the set of DAGs associated with basic operations. By definition of iteration, any DAG in is an operation. Conversely, by Def. 1, any operation can be decomposed into a concatenation of DAGs of . ∎

#### Ground Stack Tree Rewriting Systems.

By analogy with order 1 trees, given some finite alphabet of labels , we call any finite subset of labelled operations in a labelled ground stack-tree rewriting system (GSTRS). We straightforwardly extend the notions of rewriting graph and derivation relation to these systems. Note that for , this class coincides with ordinary ground tree rewriting systems. Moreover, one can easily show that the rewriting graphs of ground stack-tree rewriting systems over unary -stack trees (trees containing only unary operations, i.e. no edge labelled by or ) are isomorphic to the configuration graphs of order pushdown automata performing a finite sequence of operations at each transition.

## 4 Operation Automata

In this section, in order to provide finite descriptions of possibly infinite sets of operations, in particular the derivation relations of GSTRS, we extend the notion of ground tree transducers (or GTT) of [11] to ground stack tree rewriting systems.

A GTT is given by a tuple of pairs of finite tree automata. A pair of trees is accepted by if and for some -context , where for all , and for some . It is also shown in [11] that, given a relation recognised by a GTT, there exists another GTT recognising its reflexive and transitive closure .

Directly extending this idea to ground stack tree rewriting systems is not straightforward: contrary to the case of trees, a given compound operation may be applicable to many different subtrees. Indeed, the only subtree to which a ground tree rewriting rule can be applied is the tree . On stack trees, this is no longer true, as depicted in Fig. 2: an operation does not entirely describe the labels of nodes of subtrees it can be applied to (as in the case of trees), and can therefore be applied to infinitely many different subtrees. We will thus express relations by describing sets of compound operations over stack trees. Following [5] where recognisable sets of higher-order stacks are defined, we introduce operation automata and recognisable sets of operations.

###### Definition 3

An automaton over is a tuple , where

• is a finite set of states,

• is a finite stack alphabet,

• is a set of initial states,

• is a set of final states,

• is a set of transitions.

An operation is accepted by if there is a labelling of its vertices by states of such that all input vertices are labelled by initial states, all output vertices by final states, and this labelling is consistent with , in the sense that for all , and respectively labelled by states , and , and for all ,

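$$x \xrightarrow{\theta} y \implies (p,\theta,q)\in\Delta, \qquad x \xrightarrow{1} y \,\wedge\, x \xrightarrow{2} z \implies (p,(q,r))\in\Delta, \qquad x \xrightarrow{\bar{1}} z \,\wedge\, y \xrightarrow{\bar{2}} z \implies ((p,q),r)\in\Delta.$$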
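The consistency condition can be checked mechanically on small examples. The sketch below assumes a well-formed operation given as a list of vertices and a list of `(source, label, target)` edges, where the labels `'1'`, `'2'`, `'~1'` and `'~2'` stand for the two branches of a duplication and of its inverse, and any other label is a unary basic operation. Acceptance is decided by brute force over all labellings, which is only meant for illustration; all names are hypothetical.

```python
from itertools import product


def consistent(vertices, edges, rho, delta, initial, final):
    """Check that the labelling rho (vertex -> state) satisfies the three
    implications above and the conditions on input and output vertices."""
    targets = {t for (_, _, t) in edges}
    sources = {s for (s, _, _) in edges}
    for v in vertices:
        if v not in targets and rho[v] not in initial:
            return False
        if v not in sources and rho[v] not in final:
            return False
    out = {(s, l): t for (s, l, t) in edges}
    inn = {(t, l): s for (s, l, t) in edges}
    for s, l, t in edges:
        if l == '1':
            # A '1' edge is assumed to come with a '2' edge from the same source.
            if (rho[s], (rho[t], rho[out[(s, '2')]])) not in delta:
                return False
        elif l == '~1':
            # A '~1' edge is assumed to come with a '~2' edge into the same target.
            if ((rho[s], rho[inn[(t, '~2')]]), rho[t]) not in delta:
                return False
        elif l in ('2', '~2'):
            continue  # already handled together with the matching '1' or '~1' edge
        elif (rho[s], l, rho[t]) not in delta:
            return False
    return True


def accepts(vertices, edges, states, delta, initial, final):
    """Brute-force search for a consistent labelling."""
    for choice in product(states, repeat=len(vertices)):
        rho = dict(zip(vertices, choice))
        if consistent(vertices, edges, rho, delta, initial, final):
            return True
    return False
```

The search is exponential in the number of vertices; it only serves to make the definition of acceptance executable.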

We denote by the set of operations recognised by . denotes the class of sets of operations recognised by operation automata. A pair of stack trees is in the relation defined by if for some there is a -tuple of operations in such that . At order , we have already seen that stack trees are simply trees, and that ground stack tree rewriting systems coincide with ground tree rewriting systems. Similarly, we also have the following:

###### Proposition 2

The classes of relations recognised by order operation automata and by ground tree transducers coincide.

At higher orders, the class and the corresponding binary relations retain several of the good closure properties of ground tree transductions.

###### Proposition 3

is closed under union, intersection and iterated concatenation. The class of relations defined by operation automata is closed under composition and iterated composition.

The construction of automata recognising the union and intersection of two recognisable sets, the iterated concatenation of a recognisable set, or the composition of two automata-definable relations, can be found in the appendix. Given automaton , the relation defined by the automaton accepting is .

#### Normalised automata.

Operations may perform “unnecessary” actions on a given stack tree, for instance duplicating a leaf with a operation and later destroying both copies with . Such operations which leave the input tree unchanged are referred to as loops. There are thus in general infinitely many operations representing the same relation over stack trees. It is therefore desirable to look for a canonical representative (a canonical operation) for each considered relation. The intuitive idea is to simplify operations by removing occurrences of successive mutually inverse basic operations. This process is a very classical tool in the literature of pushdown automata and related models, and was applied to higher-order stacks in [5]. Our notion of reduced operations is an adaptation of this work.
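The classical cancellation of adjacent mutually inverse operations can be sketched on plain sequences of operation names. This is only the word-level idea, under the assumption that the inverse of an operation `op` is written `'~' + op`; it ignores both the DAG structure of compound operations and the fact that cancelling a pair may change the domain of the operation.

```python
def inverse(op):
    """Name of the inverse operation (hypothetical '~' prefix convention)."""
    return op[1:] if op.startswith('~') else '~' + op


def reduce_word(word):
    """Repeatedly cancel adjacent pairs of mutually inverse operations.
    A single left-to-right pass with a stack is enough."""
    out = []
    for op in word:
        if out and out[-1] == inverse(op):
            out.pop()
        else:
            out.append(op)
    return out
```

For example, `reduce_word(['copy2', 'copy1', '~copy1', '~copy2'])` returns the empty list, since the inner pair cancels first and exposes the outer one.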

There are two main hurdles to overcome. First, as already mentioned, a compound operation can perform introspection on the label of a leaf without destroying it. If can be applied to a given stack tree , such a sequence of operations does not change the resulting stack tree . It does however forbid the application of to other stack trees by inspecting their node labels, hence removing this part of the computation would lead to an operation with a possibly strictly larger domain. To address this problem, and following [5], we use test operations ranging over regular sets of -stacks, which will allow us to handle non-destructive node-label introspection.

A second difficulty appears when an operation destroys a subtree and then reconstructs it identically, for instance a operation followed by . Trying to remove such a pattern would lead to a disconnected DAG, which does not describe a compound operation in our sense. We thus need to leave such occurrences intact. We can nevertheless bound the number of times a given position of the input stack tree is affected by the application of an operation by considering two phases: a destructive phase during which only and order basic operations (possibly including tests) are performed on the input stack-tree, and a constructive phase only consisting of and order basic operations, similarly to the way ground tree rewriting is performed at order 1.

Formally, a test over is the restriction of the identity operation to . (Regular sets of -stacks are obtained by considering regular sets of sequences of operations of applied to a given stack ; more details can be found in [5].) In other words, given , if ; otherwise, it is undefined. We denote by the set of test operations over . We enrich our basic operations over with . We also extend compound operations with edges labelled by tests. We denote by the set of basic operations with tests. We can now define the notion of reduced operation analogously to that of reduced instructions with tests in [5].

###### Definition 4

For , we define the set of words over as:

• ,

• For , ,

• .

###### Definition 5

An operation with tests is reduced if for every , if , then .

Observe that, in the decomposition of a reduced operation , case 5 of the inductive definition of compound operations (Def. 1) should never occur, as otherwise, there would be a path on which appears before , which contradicts the definition of reduced operation.

An automaton is said to be normalised if it only accepts reduced operations, and distinguished if there is no transition ending in an initial state or starting in a final state. The following proposition shows that any operation automaton can be normalised and distinguished.

###### Proposition 4

For every automaton , there exists a distinguished normalised automaton with tests such that .

The idea of the construction is to transform in several steps, each modifying the set of accepted operations but not the recognised relation. The proof relies on the closure properties of regular sets of -stacks and an analysis of the structure of . We show in particular, using a saturation technique, that the set of states of can be partitioned into destructive states (which label the destructive phase of the operation, which does not contain the operation) and the constructive states (which label the constructive phase, where no occurs). These sets are further divided into test states, which are reached after a test has been performed (and only then) and which are the source of no test-labelled transition, and the others. This transformation can be performed without altering the accepted relation over stack trees.

## 5 Rewriting Graphs of Stack Trees

In this section, we study the properties of ground stack tree rewriting graphs. Our goal is to show that the graph of any -labelled GSTRS has a decidable FO theory. We first state that there exists a distinguished and reduced automaton recognising the derivation relation of , and then show, following [10], that there exists a finite-set interpretation of and every for from a graph with decidable WMSO-theory.

###### Theorem 5.1

Given a -labelled GSTRS , has a decidable FO theory.

To prove this theorem, we show that the graph with and obtained by adding the relation to has a decidable FO theory. To do so, we show that is finite-set interpretable inside a structure with a decidable WMSO-theory, and conclude using Corollary 2.5 of [10]. Thus from Section 5.2 of the same article, it follows that the rewriting graphs of GSTRS are in the tree-automatic hierarchy.

Given a -labelled GSTRS over , we choose to interpret inside the order Treegraph over alphabet . Each vertex of this graph is an -stack, and there is an edge if and only if with . This graph belongs to the -th level of the pushdown hierarchy and has a decidable WMSO theory. (It is in fact a generator of this class of graphs via WMSO-interpretations; see [7] for additional details.)

Given a stack tree and a position , we denote by the -stack , where is obtained by adding the word at the top of the top-most 1-stack in , and . This stack is the encoding of the node at position in . Informally, it is obtained by storing in an -stack the sequence of -stacks labelling nodes from the root of to position , and adding at the top of each -stack the number of children of the corresponding node of and the next direction taken to reach node . Any stack tree is then encoded by the finite set of -stacks , i.e. the set of encodings of its leaves. Observe that this coding is injective.

###### Example 1

The coding of the stack tree depicted in Fig. 1 is:

$$
X_t = \left\{
\begin{array}{l}
[\,[[aa]_1[bab21]_1]_2\ [[aa]_1[aaa11]_1]_2\ [[ab]_1]_2\,]_3,\\
[\,[[aa]_1[bab22]_1]_2\ [[aa]_1[a]_1[b21]_1]_2\ [[ba]_1[ba]_1[b]_1]_2\,]_3,\\
[\,[[aa]_1[bab22]_1]_2\ [[aa]_1[a]_1[b22]_1]_2\ [[abb]_1[ab]_1]_2\,]_3
\end{array}
\right\}
$$
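The coding can be reproduced with a short sketch. It assumes that a 1-stack is written as a Python string of letters (topmost letter last), that a stack of higher level is a tuple of stacks of the level below, and that the stack tree is the one that can be read back from the three codes of Example 1 (the figure itself is not reproduced here, so the tree `T` below is a reconstruction). The names `push_top`, `arity`, `code` and `encode` are hypothetical.

```python
def push_top(s, letters):
    """Append letters on top of the topmost 1-stack of s."""
    if isinstance(s, str):           # s is a 1-stack
        return s + letters
    return s[:-1] + (push_top(s[-1], letters),)


def arity(t, u):
    """Number of children of node u in t."""
    k = 0
    while u + (k + 1,) in t:
        k += 1
    return k


def code(t, u):
    """Stack encoding node u: the labels along the path from the root to u,
    each strict ancestor being tagged with its arity and the direction taken."""
    parts = []
    for j in range(len(u)):
        v = u[:j]
        parts.append(push_top(t[v], str(arity(t, v)) + str(u[j])))
    parts.append(t[u])
    return tuple(parts)


def encode(t):
    """Finite set of codes of the leaves of t."""
    return {code(t, u) for u in t if u + (1,) not in t}


# Stack tree reconstructed from Example 1 (an assumption about Fig. 1).
T = {
    (): ("aa", "bab"),
    (1,): ("aa", "aaa"),
    (1, 1): ("ab",),
    (2,): ("aa", "a", "b"),
    (2, 1): ("ba", "ba", "b"),
    (2, 2): ("abb", "ab"),
}
```

With this representation, `encode(T)` contains `(("aa", "bab21"), ("aa", "aaa11"), ("ab",))`, which corresponds to the first element of the set in Example 1. Writing arities and directions as single characters only works for arities below 10, which is enough for this example.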

We now represent any relation between two stack trees as a WMSO-formula with two free second-order variables, which holds in over sets and if and only if .

###### Proposition 5

Given a -labelled GSTRS , there exist WMSO-formulæ and such that:

• if and only if ,

• if and only if for some ,

• if and only if .

First note that the intuitive idea behind this interpretation is to only work on those vertices of which are the encoding of some node in a stack-tree. Formula will distinguish, amongst all possible finite sets of vertices, those which correspond to the set of encodings of all leaves of a stack-tree. Formulæ and then respectively check the relationship through (resp. ) of a pair of stack-trees. We give here a quick sketch of the formulæ and a glimpse of their proof of correctness. More details can be found in appendix 0.C.

Let us first detail formula , which is of the form

$$\delta(X) = \mathrm{OnlyLeaves}(X) \wedge \mathrm{TreeDom}(X) \wedge \mathrm{UniqueLabel}(X).$$

holds if every element of codes for a leaf. holds if the induced domain is the domain of a tree and the arity of each node is consistent with the elements of . holds if for every position in the induced domain, all elements which include agree on its label.

From here on, variables and will respectively stand for the encoding of some input stack tree and output stack-tree . For each , is the disjunction of a family of formulæ for each . Each is defined by induction over , simulating each basic operation in , ensuring that they are applied according to their respective positions, and to a single closed subtree of (which simply corresponds to a subset of ), yielding .

Let us now turn to formula . Since the set of DAGs in is finite, it is recognisable by an operation automaton. Since is closed under iteration (cf. Sec. 4), one may build a distinguished normalised automaton accepting . What we thus really show is that given such an automaton , there exists a formula such that holds if and only if for some vector of DAGs accepted by . Formula is of the form

$$\phi(X,Y) = \exists \vec{Z},\ \mathrm{Init}(X,Y,\vec{Z}) \wedge \mathrm{Diff}(\vec{Z}) \wedge \mathrm{Trans}(\vec{Z}).$$

Following a common pattern in automata theory, this formula expresses the existence of an accepting run of over some tuple of reduced DAGs , and states that the operation corresponding to , when applied to , yields . Here, defines a labelling of a subset of with the states of the automaton, each element of representing the set of nodes labelled by a given control state . Sub-formula checks that only the elements of (representing the leaves of ) are labelled by initial states, and only those in (leaves of ) are labelled by final states. ensures that the whole labelling respects the transition rules of . For each component of , and since every basic operation constituting is applied locally and has an effect on a subtree of height and width at most , this amounts to a local consistency check between at most three vertices, encoding two nodes of a stack tree and their parent node. The relative positions where basic operations are applied are checked using the sets in , which represent the flow of control states at each step of the transformation of into . Finally, ensures that no stack is labelled by two states belonging to the same part (destructive, constructive, testing or non-testing) of the automaton, thus making sure we simulate a unique run of . This is necessary to ensure that no spurious run is generated, and is only possible because is normalised.

## 6 Perspectives

There are several open questions arising from this work. The first one is the strictness of the hierarchy, and the question of finding simple examples of graphs separating each of its levels from the corresponding levels of the pushdown and tree-automatic hierarchies. A second interesting question concerns the trace languages of stack tree rewriting graphs. It is known that the trace languages of higher-order pushdown automata are the indexed languages [8], that the class of languages recognised by automatic structures is that of the context-sensitive languages [15] and that those recognised by tree-automatic structures form the class Etime [13]. However there is to our knowledge no characterisation of the languages recognised by ground tree rewriting systems. It is not hard to define a 2-stack-tree rewriting graph whose path language between two specific vertices is , which we believe cannot be recognised using tree rewriting systems or higher-order pushdown automata. Here, denotes the shuffle product. For every and , ,