# On Transforming Narrowing Trees into Regular Tree Grammars Generating Ranges of Substitutions

The grammar representation of a narrowing tree for a syntactically deterministic conditional term rewriting system and a pair of terms is a regular tree grammar that generates expressions for substitutions obtained by all possible innermost-narrowing derivations that start with the pair and end with particular non-narrowable terms. In this paper, under a certain syntactic condition, we show a transformation of the grammar representation of a narrowing tree into another regular tree grammar that overapproximately generates the ranges of ground substitutions generated by the grammar representation. In our previous work, such a transformation was restricted to the ranges w.r.t. a given single variable, and thus its usefulness was limited. We extend the previous transformation by representing the range of a ground substitution as a tuple of terms, which is obtained by the coding for finite trees. We give a precise definition of the transformation and prove that the language of the transformed regular tree grammar is an overapproximation of the ranges of ground substitutions generated by the grammar representation. We leave an experimental evaluation of the usefulness of the transformation as future work.

## 1 Introduction

Conditional term rewriting [26, Chapter 7] is known to be more complicated than unconditional term rewriting in the sense of analyzing properties, e.g., operational termination [18], confluence [30], and reachability [6]. A popular approach to the analysis of conditional rewriting is to transform a conditional term rewriting system (a CTRS, for short) into an unconditional term rewriting system (a TRS, for short) that is in general an overapproximation of the CTRS in terms of reduction. This approach enables us to use existing techniques for the analysis of TRSs. For example, a CTRS is operationally terminating if the unraveled TRS [19, 26] is terminating [5]. To prove termination of the unraveled TRS, we can use many techniques for proving termination of TRSs (cf. [26]). On the other hand, it is not so easy to analyze reachability, which is relevant to, e.g., (in)feasibility of conditions.

Let us consider proving confluence of the following syntactically deterministic 3-CTRS [26, Example 7.1.5] defining the gcd operator over the natural numbers represented by and :

A transformational approach in [12, 11] does not succeed in proving confluence of . On the other hand, a direct approach to reachability analysis to prove infeasibility of the conditional critical pairs (i.e., non-existence of substitutions satisfying conditions), which is implemented in some confluence provers, does not prove confluence of well, either. Let us consider the critical pairs of :

 ⟨ s(x),  gcd(x−x,s(x)) ⟩⇐x

Note that the above critical pairs are symmetric because they are caused by overlaps at the root position only. An operationally terminating CTRS is confluent if all critical pairs of the CTRS are infeasible (cf. [2, 4]). Operational termination of can be proved by, e.g., AProVE [9]. To prove infeasibility of the critical pairs above, it suffices to show both (i) non-existence of terms such that , and (ii) non-existence of terms such that and . Thanks to the meaning of , it would be easy for a human to notice that such terms do not exist. However, it is not so easy to mechanize a way to show non-existence of . In fact, confluence provers for CTRSs, ConCon [29], CO3 [21], and CoScart [10], based on, e.g., transformations of CTRSs into TRSs or reachability analysis for infeasibility of conditional critical pairs, failed to prove confluence of (see Confluence Competition 2016, 2017, and 2018, 327.trs). In addition, a semantic approach in [17, 16] cannot prove confluence of using AGES [13], a tool for generating logical models of order-sorted first-order theories: non-existence of above cannot be proved via its web interface with default parameters. Timbuk 3.2 [8], which is based on tree automata techniques [7], cannot prove infeasibility of w.r.t. the rules for under its default settings.

The non-existence of a term with can be reduced to the non-existence of substitutions such that , where denotes the narrowing step [15]—for example, . In addition, the non-existence of such substitutions can be reduced to the emptiness of the set of the substitutions, i.e., the emptiness of . From this viewpoint, for a pair of terms, the enumeration of substitutions obtained by narrowing would be useful in analyzing rewriting that starts with instances of the pair. To analyze sets of substitutions derived by innermost narrowing, narrowing trees [24] are useful. For example, infeasibility of conditional critical pairs of some normal 1-CTRS can be proved by using the grammar representation of a narrowing tree [22]. Simplification of the grammar representation implies the non-existence of substitutions satisfying the conditional part of a critical pair. However, there are some examples (shown later) for which the simplification method in [22] does not succeed in converting grammar representations to those explicitly representing the empty set.

In this paper, under a certain syntactic condition, we show a transformation of the grammar representation of a narrowing tree into a regular tree grammar [3] (an RTG, for short) that overapproximately generates the ranges of ground substitutions generated by the grammar representation. The aim of the transformation is to simplify grammar representations as much as possible together with the existing one in [22].

Let be a syntactically deterministic 3-CTRS (a 3-SDCTRS, for short) that is a constructor system, a basic term, and a constructor term, where basic terms are of the form with a defined symbol and constructor terms . A narrowing tree [24, 22] of with the root pair is a finite representation that defines the set of substitutions such that the pair narrows to a particular ground term consisting of a special binary symbol and a special constant by innermost narrowing with a substitution (i.e., and thus ). Note that is considered a binary symbol, is assumed to be implicitly included in , and denotes the constructor-based rewriting step which applies rewrite rules to basic terms. Such a narrowing tree can be the enumeration of substitutions obtained by innermost narrowing of to ground terms consisting of and . The idea of narrowing trees has been extended to finite representations of SLD trees for logic programs [25].

Using narrowing trees, it is easy to see that there is no substitution such that , and hence the above four critical pairs with are infeasible. Let us now consider proving infeasibility of . A narrowing tree for can be represented by the following grammar representation [24, 22] that can be considered an RTG (see Section 4):

 Γx

We denote by the RTG with the initial non-terminal , the other non-terminals , and the above production rules. We also denote by the set of the above production rules, i.e., (1). Substitutions are considered constants, and the RTG generates terms over , , , rec, and substitutions. The binary symbols and are interpreted by standard composition and parallel composition [14, 27], respectively. Parallel composition of two substitutions returns a most general unifier of the substitutions if the substitutions are unifiable (see Definition 4.2). For example, returns and fails. The symbol rec is used for recursion, which is interpreted as standard composition of a renaming and a substitution recursively generated. To simplify the discussion in the remainder of this section, following the meaning of the operators, we simplify the rules of and as follows:

 Γx
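The behaviour of parallel composition described above can be sketched in Python. This is only an illustration under stated assumptions, not the definition used in the paper (Definition 4.2): the term encoding (variables as strings, function applications as tuples whose first component is the symbol) and the names `unify` and `parallel_compose` are introduced here for the example.

```python
# Terms: a variable is a str; an application f(t1,...,tn) is ("f", t1, ..., tn).
# Substitutions are dicts from variable names to terms (hypothetical encoding).

def walk(t, s):
    """Follow variable bindings in s until a non-bound term is reached."""
    while isinstance(t, str) and t in s:
        t = s[t]
    return t

def occurs(x, t, s):
    """Occurs check: does variable x appear in t under the bindings s?"""
    t = walk(t, s)
    if isinstance(t, str):
        return t == x
    return any(occurs(x, a, s) for a in t[1:])

def unify(equations):
    """Return a (triangular) unifier of the equations, or None on failure."""
    s, stack = {}, list(equations)
    while stack:
        a, b = stack.pop()
        a, b = walk(a, s), walk(b, s)
        if a == b:
            continue
        if isinstance(a, str):
            if occurs(a, b, s):
                return None
            s[a] = b
        elif isinstance(b, str):
            if occurs(b, a, s):
                return None
            s[b] = a
        elif a[0] == b[0] and len(a) == len(b):
            stack.extend(zip(a[1:], b[1:]))
        else:
            return None  # clash of function symbols
    return s

def resolve(t, s):
    """Fully apply the triangular substitution s to t."""
    t = walk(t, s)
    if isinstance(t, str):
        return t
    return (t[0],) + tuple(resolve(a, s) for a in t[1:])

def parallel_compose(theta1, theta2):
    """Unify the bindings of both substitutions, read as equations x = t."""
    eqs = [(x, t) for th in (theta1, theta2) for x, t in th.items()]
    s = unify(eqs)
    if s is None:
        return None  # the substitutions are not unifiable
    return {x: resolve(x, s) for x in s}
```

For instance, composing `{"x": ("s", "y")}` with `{"x": ("s", ("0",))}` in parallel binds both `x` and `y`, whereas composing `{"x": ("0",)}` with `{"x": ("s", "y")}` fails because the root symbols clash.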

In our previous work [22], to show the emptiness of the set of substitutions generated from, e.g., , we transform the grammar representation to an RTG that overapproximately generates the ranges of ground substitutions w.r.t. a single variable. For example, for , the production rules of (2) are transformed into the following ones:

 Γxx

Note that non-terminal generates arbitrary ground constructor terms. Since we focus on only, non-terminals and generate and , respectively, and we cannot prove that there is no substitution generated from .

In this paper, we aim at showing that there is no substitution generated by (2) from the initial non-terminal , i.e., showing that . To this end, under a certain syntactic condition, we show a transformation of the grammar representation of a narrowing tree into an RTG that overapproximately generates the ranges of ground substitutions generated by the grammar representation (Section 5). More precisely, using the idea of coding for tuples of ground terms [3, Section 3.2.1] (see Figure 1), we extend the transformation in [22] from a single variable to two variables. It is straightforward to further extend the transformation to three or more variables. We do not explain how to construct (the grammar representation of) a narrowing tree for a given constructor 3-SDCTRS; instead, we concentrate on how to transform a grammar representation into an RTG that generates the ranges of ground substitutions generated by the grammar representation.

#### Outline of Our Approach

Using the rules of (2), we briefly illustrate the outline of the transformation. Roughly speaking, we apply the coding for tuples of terms to the range of substitutions, e.g., and for . The rules for are transformed into

 Γ(x,y)x

where the non-terminal generates ground terms obtained by applying the coding to and ground constructor terms. The coding of and is . Variables are instantiated by substitutions generated from , and hence we replaced by . The rule for is transformed into

 Γ(x,y)y

Since are swapped by , we generate a new non-terminal and its rules as well as the above rules:

 Γ(y,x)x

where the non-terminal generates ground terms obtained by applying the coding to ground constructor terms and . Every ground term generated from contains , and every ground term generated from contains . Neither nor is shared by the languages of and , and hence there is no substitution which corresponds to an expression generated from . For this reason, we can transform of (1) into

 Γx

which means that there exists no constructor substitution satisfying the condition under the constructor-based rewriting.

One may think that tuples of terms are enough for our goal. However, substitutions are generated by standard compositions, and tuples force us to introduce composition of tuples. For example, the range of is represented as a tuple , where is a binary symbol for tuples of two terms. To apply to the tuple, we reconstruct a tuple from and . On the other hand, the coding of terms lets us avoid the reconstruction and use standard composition of substitutions to compute the range of a composed substitution. For example, and can be represented by and , respectively, where both and are considered single variables.

Using the rules for of (2), we further show the weakness of the above approach of using tuples. Let us try to transform the rules of into an RTG that generates . The first rule is transformed into with the rules of above. The second rule is transformed into with the rules of and above. These rules generate not only terms in but also other terms, e.g., . The term should not be generated because the term can be a common element generated by and , and then we cannot prove that does not generate any substitution.

## 2 Preliminaries

In this section, we briefly recall basic notions and notations of term rewriting [2, 26] and regular tree grammars [3]; familiarity with the basic notions of term rewriting is assumed.

### 2.1 Terms and Substitutions

Throughout the paper, we use as a countably infinite set of variables. Let be a signature, a finite set of function symbols each of which has its own fixed arity, denoted by . We often write instead of “an -ary symbol ”, and so on. The set of terms over and () is denoted by , and , the set of ground terms, is abbreviated to . The set of variables appearing in any of terms is denoted by . We denote the set of positions of a term by . For a term and a position of , the subterm of at is denoted by . The function symbol at the root position of a term is denoted by . Given terms and a position of , we denote by the term obtained from by replacing the subterm at by .
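The notions of positions, subterms, root symbols, and subterm replacement can be made concrete with a small Python sketch. The encoding is an assumption made for illustration only: a variable is a string, an application is a tuple whose first component is the function symbol, and a position is a tuple of 1-based argument indices (the empty tuple is the root). The function names are hypothetical.

```python
# Terms: a variable is a str; an application f(t1,...,tn) is ("f", t1, ..., tn).

def positions(t):
    """Yield all positions of t, the root position () first."""
    yield ()
    if not isinstance(t, str):
        for i, arg in enumerate(t[1:], start=1):
            for p in positions(arg):
                yield (i,) + p

def subterm(t, p):
    """Return the subterm of t at position p."""
    for i in p:
        t = t[i]  # index i of the tuple is the i-th argument
    return t

def root(t):
    """Return the root symbol of t (the variable itself for variables)."""
    return t if isinstance(t, str) else t[0]

def replace(t, p, u):
    """Return t with the subterm at position p replaced by u."""
    if not p:
        return u
    i = p[0]
    return t[:i] + (replace(t[i], p[1:], u),) + t[i + 1:]
```

Because terms are immutable tuples, `replace` builds a new term and leaves its argument unchanged.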

A substitution is a mapping from variables to terms such that the number of variables with is finite, and is naturally extended over terms. The domain and range of are denoted by and , respectively. The set of variables in is denoted by : . We may denote by if and for all . The identity substitution is denoted by . The set of substitutions that range over a signature and a set of variables is denoted by : . The application of a substitution to a term is abbreviated to , and is called an instance of . Given a set of variables, denotes the restricted substitution of w.r.t. : . A substitution is called a renaming if is a bijection on . The composition (simply ) of substitutions and is defined as . A substitution is called idempotent if (i.e., ). A substitution is called more general than a substitution , written by , if there exists a substitution such that . A finite set of term equations is called unifiable if there exists a unifier of such that for all term equations in . A most general unifier (mgu) of is denoted by if is unifiable. Terms and are called unifiable if is unifiable. The application of a substitution to , denoted by , is defined as .
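A minimal Python sketch of the substitution operations recalled above may help fix intuitions. It assumes a hypothetical encoding (variables as strings, applications as tuples, substitutions as dictionaries) and assumes that the composition of two substitutions applies the first one and then the second; the idempotence test uses the standard characterization that the domain and the variables of the range are disjoint.

```python
# Terms: a variable is a str; an application f(t1,...,tn) is ("f", t1, ..., tn).
# A substitution is a dict mapping finitely many variables to terms.

def apply(sigma, t):
    """Apply substitution sigma to term t."""
    if isinstance(t, str):
        return sigma.get(t, t)
    return (t[0],) + tuple(apply(sigma, a) for a in t[1:])

def compose(sigma, theta):
    """Substitution that behaves like applying sigma first and theta second."""
    out = {x: apply(theta, t) for x, t in sigma.items()}
    for x, t in theta.items():
        out.setdefault(x, t)
    # Drop identity bindings so that the domain stays minimal.
    return {x: t for x, t in out.items() if t != x}

def variables(t):
    """Set of variables occurring in t."""
    if isinstance(t, str):
        return {t}
    return set().union(*(variables(a) for a in t[1:]))

def is_idempotent(sigma):
    """True if no domain variable occurs in the range of sigma."""
    vran = set().union(*(variables(t) for t in sigma.values()))
    return vran.isdisjoint(sigma)

def restrict(sigma, xs):
    """Restriction of sigma to the set of variables xs."""
    return {x: t for x, t in sigma.items() if x in xs}
```

With these definitions, applying `compose(sigma, theta)` to a term gives the same result as applying `sigma` and then `theta`.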

### 2.2 Conditional Rewriting

An oriented conditional rewrite rule over a signature is a triple , denoted by , such that the left-hand side is a non-variable term in , the right-hand side is a term in , and the conditional part is a sequence of term pairs () where . In particular, a conditional rewrite rule is called unconditional if the conditional part is the empty sequence (i.e., ), and we may abbreviate it to . Variables in are called extra variables of the rule. An oriented conditional term rewriting system (a CTRS, for short) over is a set of oriented conditional rewrite rules over . A CTRS is called an (unconditional) term rewriting system (a TRS, for short) if every rule in the CTRS is unconditional and satisfies . The reduction relation of a CTRS is defined as , where , and for . To specify the position where the rule is applied, we may write instead of . The underlying unconditional system of is denoted by . A term is called a normal form (of ) if is irreducible w.r.t. . A substitution is called normalized (w.r.t. ) if is a normal form of for each variable . A CTRS is called Type 3 (3-CTRS, for short) if every rule satisfies that . for all .

The sets of defined symbols and constructors of a CTRS over a signature are denoted by and , respectively: and . Terms in are called constructor terms of . A substitution in is called a constructor substitution of . A term of the form with and is called basic. A CTRS is called a constructor system if for every rule in , is basic. A 3-DCTRS is called syntactically deterministic (an SDCTRS, for short) if for every rule , every is a constructor term or a ground normal form of .
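The distinction between defined symbols, constructor terms, and basic terms can be illustrated by the following Python sketch. The rule encoding (triples of left-hand side, right-hand side, and a list of conditions) and the two subtraction rules used in the usage note are hypothetical and chosen only for illustration; they are not claimed to be the rules of the system discussed in Section 1.

```python
# Terms: a variable is a str; an application f(t1,...,tn) is ("f", t1, ..., tn).
# A rule is a triple (lhs, rhs, conditions); this encoding is an assumption.

def symbols(t):
    """Set of function symbols occurring in t."""
    if isinstance(t, str):
        return set()
    return {t[0]}.union(*(symbols(a) for a in t[1:]))

def defined_symbols(rules):
    """Root symbols of the left-hand sides of the rules."""
    return {lhs[0] for lhs, _rhs, _conds in rules}

def is_constructor_term(t, defined):
    """True if t contains no defined symbol."""
    return symbols(t).isdisjoint(defined)

def is_basic(t, defined):
    """True if t is f(t1,...,tn) with f defined and all ti constructor terms."""
    return (not isinstance(t, str)
            and t[0] in defined
            and all(is_constructor_term(a, defined) for a in t[1:]))

def is_constructor_system(rules):
    """True if every left-hand side is basic."""
    defined = defined_symbols(rules)
    return all(is_basic(lhs, defined) for lhs, _rhs, _conds in rules)
```

For example, with the two illustrative rules `x - 0 -> x` and `s(x) - s(y) -> x - y`, the only defined symbol is `-`, the term `s(x) - 0` is basic, and `(x - y) - 0` is not basic because its first argument contains a defined symbol.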

A CTRS is called operationally terminating if there are no infinite well-formed trees in a certain logical inference system [18]—operational termination means that the evaluation of conditions must either successfully terminate or fail in finite time. Two terms and are said to be joinable, written as , if there exists a term such that . A CTRS is called confluent if for any terms such that .

### 2.3 Innermost Conditional Narrowing

We denote a pair of terms by (not an equation ) because we analyze conditions of rewrite rules and distinguish the left- and right-hand sides of . In addition, we deal with pairs of terms as terms by considering a binary function symbol. For this reason, we apply many notions for terms to pairs of terms without notice. For readability, when we deal with as a term, we often bracket it such as . As in [20], any CTRS in this paper is assumed to implicitly include the rule where is a special constant. The rule is used to test structural equivalence between two terms by means of .

To deal with a conjunction of pairs of terms ( is either or ) as a term, we write by using an associative binary symbol . We call such a term an equational term. Unlike [24], to avoid to be a defined symbol, we do not use any rule for , e.g., . Instead of derivations ending with , we consider derivations that end with terms in . We assume that none of , , or is included in the range of any substitution below.

In the following, for a constructor 3-SDCTRS , a pair of terms is called a goal of if the left-hand side is either a constructor term or a basic term and the right-hand side is a constructor term. An equational term is called a goal clause of if it is a conjunction of goals for . Note that for a goal clause , any instance with a constructor substitution is a goal clause.

###### Example 2.1

The equational term is a goal clause of .

The narrowing relation [28, 15] mainly extends rewriting by replacing matching with unification. This paper follows the formalization in [23], while we use the rule instead of the corresponding inference rule. Let be a CTRS. A goal clause with is said to conditionally narrow into an equational term at an innermost position, written as , if there exist a non-variable position of , a variant of a rule in , and a constructor substitution such that , is basic, and are unifiable, , and . Note that all extra variables of remain in as fresh variables which do not appear in . We assume that (i.e., is idempotent) and . We write to make the substitution explicit. An innermost narrowing derivation (and ) denotes a sequence of narrowing steps with an idempotent substitution. When we consider two (or more) narrowing derivations and , we assume that .

Innermost narrowing is a counterpart of constructor-based rewriting (cf. [23]). Following [23], we define constructor-based conditional rewriting on goal clauses as follows: for a goal clause with , we write if there exist a non-variable position of , a rule in , and a constructor substitution such that is basic, , and .

###### Theorem 2.2 ([22])

Let be a constructor SDCTRS, a goal clause, and .

• If , then (i.e., for all goals in ).

• For a constructor substitution , if , then there exists an idempotent constructor substitution such that and .

###### Example 2.3

Consider in Section 1 again. The following is an instance of innermost conditional narrowing of :

The following constructor-based rewriting derivation corresponds to the above narrowing derivation:

### 2.4 Regular Tree Grammars

A regular tree grammar (an RTG, for short) is a quadruple such that is a signature, is a finite set of non-terminals (constants not in ), , and is a finite set of production rules of the form with and . Given a non-terminal , the set is the language generated by from , denoted by . The initial non-terminal is not so relevant in this paper. A regular tree language is a language generated by an RTG from one of its non-terminals. The class of regular tree languages is equivalent to the class of recognizable tree languages which are recognized by tree automata. This means that the intersection (non-)emptiness problem for regular tree languages is decidable.
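The decidability of intersection emptiness can be illustrated by a product-style fixed-point computation. The sketch below is a minimal illustration, not an algorithm taken from the paper, and it assumes that the grammars are in a normalized form where every right-hand side is a function symbol applied to non-terminals; the dictionary encoding and the function names are hypothetical.

```python
# A normalized RTG is a dict: non-terminal -> list of (symbol, [child non-terminals]).

def _pair_derives(alts1, alts2, productive):
    """True if some pair of rules with the same symbol has productive child pairs."""
    return any(
        f == g
        and len(kids1) == len(kids2)
        and all(pair in productive for pair in zip(kids1, kids2))
        for f, kids1 in alts1
        for g, kids2 in alts2
    )

def intersection_nonempty(rules1, start1, rules2, start2):
    """Decide whether L(G1, start1) and L(G2, start2) share a ground term."""
    productive = set()  # pairs (A, B) known to generate a common ground term
    changed = True
    while changed:
        changed = False
        for a, alts1 in rules1.items():
            for b, alts2 in rules2.items():
                if (a, b) not in productive and _pair_derives(alts1, alts2, productive):
                    productive.add((a, b))
                    changed = True
    return (start1, start2) in productive
```

The loop terminates because the set of pairs of non-terminals is finite and only grows. A pair is added exactly when both non-terminals can produce the same root symbol above pairs already known to share a term, which is the usual emptiness test on the product construction.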

###### Example 2.4

The RTG

generates the sets of even and odd numbers over and from and , respectively: and .
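A membership test makes the even/odd example concrete. The grammar below is an assumed instance written for illustration (the non-terminal names `Even` and `Odd` and the unary-numeral encoding with `0` and `s` are hypothetical), using a normalized form in which every right-hand side is a function symbol applied to non-terminals.

```python
# Assumed grammar: Even -> 0 | s(Odd),  Odd -> s(Even).
RULES = {
    "Even": [("0", []), ("s", ["Odd"])],
    "Odd": [("s", ["Even"])],
}

def generates(rules, nonterminal, term):
    """True if the ground term is in the language of the given non-terminal."""
    return any(
        f == term[0]
        and len(kids) == len(term) - 1
        and all(generates(rules, k, arg) for k, arg in zip(kids, term[1:]))
        for f, kids in rules[nonterminal]
    )

def numeral(n):
    """Unary numeral s(s(...s(0)...)) with n occurrences of s."""
    t = ("0",)
    for _ in range(n):
        t = ("s", t)
    return t
```

The recursion follows the structure of the term, so it terminates on every ground term; `generates(RULES, "Even", numeral(4))` holds while `generates(RULES, "Odd", numeral(4))` does not.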

## 3 Coding of Tuples of Ground Terms

In this section, we introduce the notion of coding of tuples of ground terms [3, Section 3.2.1]. To simplify discussions, we consider pairs of terms.

Let be a signature. We prepare the signature , where is a new constant. For symbols , we denote the function symbol by , and the arity of is . The coding of pairs of ground terms, , is recursively defined as follows:

• if ,

• if ,

• , and

• .
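The coding of pairs can be sketched in Python as follows. The sketch assumes that the cases listed above follow the standard coding of [3, Section 3.2.1]: the two root symbols are paired, the arity of the paired symbol is the larger of the two arities, and missing arguments are padded with the new constant. The tuple encoding of ground terms and the names `BOT` and `code_pair` are hypothetical.

```python
# Ground terms: f(t1,...,tn) is ("f", t1, ..., tn); a constant c is ("c",).
BOT = ("⊥",)  # the new padding constant

def code_pair(t, u):
    """Overlay two ground terms into one term over paired symbols."""
    f, ts = t[0], t[1:]
    g, us = u[0], u[1:]
    n = max(len(ts), len(us))          # arity of the paired symbol (f, g)
    ts = ts + (BOT,) * (n - len(ts))   # pad the shorter argument list
    us = us + (BOT,) * (n - len(us))
    return ((f, g),) + tuple(code_pair(a, b) for a, b in zip(ts, us))

def project(c, i):
    """Recover the i-th component (0 or 1) of a coded pair."""
    sym = c[0][i]
    args = tuple(project(k, i) for k in c[1:])
    # Children whose i-th component is only padding are dropped.
    return (sym,) + tuple(a for a in args if a != BOT)
```

Coding `s(0)` against `0`, for example, yields a term whose root is the paired symbol `("s", "0")` and whose single argument has root `("0", "⊥")`. The helper `project` shows that both components can be read back from the coded term.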

Note that . Note also that for and for , if , then is complemented for