Revisiting Call-by-value Böhm trees in light of their Taylor expansion

09/07/2018 ∙ by Emma Kerinec, et al. ∙ Université Paris 13

The call-by-value lambda calculus can be endowed with permutation rules, arising from linear logic proof-nets, which have the advantage of unblocking some redexes that otherwise get stuck during the reduction. We show that such an extension allows one to define a satisfactory notion of Böhm(-like) tree and a theory of program approximation in the call-by-value setting. We prove that all lambda terms having the same Böhm tree are observationally equivalent, and characterize those Böhm-like trees arising as actual Böhm trees of lambda terms. We also compare this approach with Ehrhard's theory of program approximation based on the Taylor expansion of lambda terms, which translates each lambda term into a possibly infinite set of so-called resource terms. We provide necessary and sufficient conditions for a set of resource terms to be the Taylor expansion of a lambda term. Finally, we show that the normal form of the Taylor expansion of a lambda term can be computed by performing a normalized Taylor expansion of its Böhm tree. From this it follows that two lambda terms have the same Böhm tree if and only if the normal forms of their Taylor expansions coincide.


Introduction

In 1968, Böhm published a separability theorem, known as the Böhm Theorem, which is nowadays universally recognized as a fundamental theorem in λ-calculus [6]. Inspired by this result, Barendregt in 1977 proposed the definition of “Böhm tree of a λ-term” [3], a notion which played for decades a prominent role in the theory of program approximation. The Böhm tree of a λ-term M represents the evaluation of M as a possibly infinite labelled tree coinductively, but effectively, constructed by collecting the stable amounts of information coming out of the computation. Equating all λ-terms having the same Böhm tree is a necessary, although not sufficient, step in the quest for fully abstract models of λ-calculus.

In 2003, Ehrhard and Regnier, motivated by insights from Linear Logic, introduced the notion of “Taylor expansion of a λ-term” as an alternative way of approximating λ-terms [10]. The Taylor expansion translates a λ-term M as a possibly infinite set of multi-linear terms, each approximating a finite part of the behaviour of M. (Footnote 1: In its original definition, the Taylor expansion is a power series of multi-linear terms taking coefficients in the semiring of non-negative rational numbers. Following [16, 9, 7], in this paper we abuse language and call “Taylor expansion” the support (underlying set) of the actual Taylor expansion. This is done for good reasons, as we are interested in the usual observational equivalences between λ-terms that overlook such coefficients.) These terms populate a resource calculus [26] where λ-calculus application is replaced by the application of a term to a bag of resources that cannot be erased or duplicated and must be consumed during the reduction. The advantage of the Taylor expansion is that it exposes the amount of resources needed by a λ-term to produce (a finite part of) a value, quantitative information that does not appear in its Böhm tree. The relationship between these two notions of program approximation has been investigated in [11], where the authors show that the Taylor expansion can actually be seen as a resource-sensitive version of Böhm trees by demonstrating that the normal form of the Taylor expansion of M is actually equal to the Taylor expansion of its Böhm tree.

The notions of Böhm tree and Taylor expansion were first developed in the setting of call-by-name (CbN) λ-calculus [4]. However, many modern functional programming languages, like OCaml, adopt a call-by-value (CbV) reduction strategy: a redex of shape (λx.M)N is only contracted when N is a value, namely a variable or a λ-abstraction. The call-by-value λ-calculus λ_v was defined by Plotkin in 1975 [22], but its theory of program approximation is still unsatisfactory and constitutes an ongoing line of research [9, 8, 17]. For instance, it is unclear what should be the Böhm tree of a λ-term because of the possible presence of β-redexes that get stuck (waiting for a value) in the reduction. A paradigmatic example of this situation is the λ-term M = (λy.Δ)(xx)Δ, where Δ = λz.zz (see [1]). This term is a call-by-value normal form because the argument xx, which is not a value, blocks the evaluation (while one would expect M to behave as the divergent term Ω = ΔΔ). A significant advance in reducing the number of stuck redexes has been made in [8], where Carraro and Guerrieri introduce permutation rules (σ) naturally arising from the translation of λ-terms into Linear Logic proof-nets. Using σ-rules, the λ-term M above rewrites to (λy.ΔΔ)(xx), which in its turn rewrites to itself, thus giving rise to an infinite reduction sequence, as desired. In [13], Guerrieri et al. show that this extended calculus λ_v^σ still enjoys nice properties like confluence and standardization, and that adding the σ-rules preserves the operational semantics of Plotkin’s CbV λ-calculus as well as the observational equivalence.

In the present paper we show that σ-rules actually open the way to provide a meaningful notion of call-by-value Böhm trees (Definition 2.1). Rather than giving a coinductive definition, which turns out to be more complicated than expected, we provide an appropriate notion of approximants, namely λ-terms possibly containing a constant ⊥, that are in normal form w.r.t. the reduction rules of λ_v^σ (i.e., the σ-rules and the restriction of β to values). As usual, ⊥ represents the undefined and this intuition is reflected in the definition of a preorder ⊑ between approximants which is generated by ⊥ ⊑ A, for all approximants A. The next step is to associate with every λ-term M the set 𝒜(M) of its approximants and verify that they enjoy the following properties: the “external shape” of an approximant of M is stable under reduction (Lemma 2.1); two interconvertible λ-terms share the same set of approximants (cf., Lemma 2.1); the set of approximants of M is directed. Once this preliminary work is accomplished, it is possible to define the Böhm tree of M as the supremum of 𝒜(M), the result being a possibly infinite labelled tree BT(M), as expected.

More generally, it is possible to define the notion of (CbV) “Böhm-like” trees as those labelled trees that can be obtained by arbitrary superpositions of (compatible) approximants. The Böhm-like trees corresponding to CbV Böhm trees of λ-terms have specific properties, which are due to the fact that λ-calculus constitutes a model of computation. Indeed, since every λ-term M is finite, BT(M) can only contain a finite number of free variables and, since M represents a program, the tree BT(M) must be computable. In Theorem 2.1 we demonstrate that these conditions are actually sufficient, thus providing a characterization.

To show that our notion of Böhm tree is actually meaningful, we prove that all λ-terms having the same Böhm tree are operationally indistinguishable (Theorem 4.2) and we investigate the relationship between Böhm trees and Taylor expansion in the call-by-value setting. Indeed, as explained by Ehrhard in [9], the CbV analogues of resource calculus and of Taylor expansion are unproblematic to define, because they are driven by solid intuitions coming from Linear Logic: rather than using the CbN translation of intuitionistic arrow, it is enough to exploit Girard’s so-called “boring” translation, which transforms A → B into !(A ⊸ B) and is suitable for CbV. Following [7], we define a coherence relation between resource terms and prove that a set of such terms corresponds to the Taylor expansion of a λ-term if and only if it is an infinite clique having finite height. Subsequently, we focus on the dynamic aspects of the Taylor expansion by studying its normal form, which can always be calculated since the resource calculus enjoys strong normalization.

In [8], Carraro and Guerrieri propose to extend the CbV resource calculus with σ-rules to obtain a more refined normal form of the Taylor expansion of a λ-term M; this allows one to mimic the σ-reductions occurring in M at the level of its resource approximants. Even with this refinement, it turns out that the normal form of the Taylor expansion of M is different from the normal form of the Taylor expansion of BT(M), the latter containing approximants that are not normal, but whose normal form is however empty (they disappear along the reduction). Although the result from [11] does not hold verbatim in CbV, we show that it is possible to define the normalized Taylor expansion of a Böhm tree and prove in Theorem 4.1 that the normal form of the Taylor expansion of M coincides with the normalized Taylor expansion of BT(M), which is the main result of the paper. An interesting consequence, among others, is that all denotational models satisfying the Taylor expansion (e.g., the one in [8]) equate all λ-terms having the same Böhm tree.

Related works

To our knowledge, no notion of CbV Böhm tree appears in the literature. (Footnote 2: Even Paolini’s separability result in [20] for CbV λ-calculus does not rely on Böhm trees.) However, there have been attempts to develop syntactic bisimulation equivalences and theories of program approximation arising from denotational models. Lassen [15] coinductively defines a bisimulation equating all λ-terms having (recursively) the same “eager normal form”, but he mentions that no obvious tree representations of the equivalence classes are at hand. In [25], Ronchi della Rocca and Paolini study a filter model of CbV λ-calculus and, in order to prove an Approximation Theorem, they need to define sets of upper and lower approximants of a λ-term. By admission of the authors [24], these notions are not satisfactory because they correspond to an “over” (resp. “under”) approximation of its behaviour.

We end this section by recalling that most of the results we prove in this paper are the CbV analogues of results well-known in CbN and contained in [4, Ch. 10] (for Böhm trees), in [7] (for Taylor expansion) and [11] (for the relationship between the two notions).

General notations

We denote by ℕ the set of all natural numbers. Given a set X we denote by 𝒫(X) its powerset and by 𝒫_f(X) the set of all finite subsets of X.

1. Call-By-Value λ-Calculus

The call-by-value λ-calculus λ_v, introduced by Plotkin in [22], is a λ-calculus endowed with a reduction relation that allows the contraction of a redex (λx.M)N only when the argument N is a value, namely when N is a variable or an abstraction. In this section we briefly review its syntax and operational semantics. By extending its reduction with permutation rules σ, we obtain the calculus λ_v^σ introduced in [8], which will be our main subject of study.

1.1. Its syntax and operational semantics.

For the λ-calculus we mainly use the notions and notations from [4]. We consider fixed a denumerable set 𝕍 of variables.

The set Λ of λ-terms and the set Val of values are defined through the following simplified grammars (where x ∈ 𝕍). (Footnote 3: This basically means that parentheses are left implicit.)

    (Λ)    M, N, P, Q ::= V | MN
    (Val)  U, V ::= x | λx.M

As usual, we assume that application associates to the left and has higher precedence than -abstraction. For instance, . Given , we let stand for . Finally, we write for ( times).

The set FV(M) of free variables of M and the α-conversion are defined as in [4, §2.1]. A λ-term M is called closed, or a combinator, whenever FV(M) = ∅. The set of all combinators is denoted by Λ°. From now on, λ-terms are considered up to α-conversion, whence the symbol = represents syntactic equality possibly up to renaming of bound variables.

Concerning specific combinators, we define:

    I = λx.x    Δ = λx.xx    Ω = ΔΔ    B = λfgx.f(gx)    K = λxy.x    F = λxy.y
    Z = λf.(λy.f(λz.yyz))(λy.f(λz.yyz))    K* = ZK

where I is the identity, Ω is the paradigmatic looping combinator, B is the composition operator, K and F are the first and second projection (respectively), Z is Plotkin’s recursion operator, and K* is a λ-term producing an increasing amount of external abstractions.

Given M ∈ Λ and V ∈ Val we denote by M{V/x} the λ-term obtained by substituting V for every free occurrence of x in M, subject to the usual proviso of renaming bound variables in M to avoid capture of free variables in V. (Footnote 4: This notation actually differs from [4], where the same substitution was denoted M[x := V]. Similarly, Barendregt denotes the hole of a context by [ ]. We prefer to keep square brackets to denote “bags” (multisets) of terms, a notion that will be needed in Section 3.)

It is easy to check that the set Val is closed under substitution of values for free variables, namely U ∈ Val and V ∈ Val entail V{U/x} ∈ Val.

A context is a λ-term possibly containing occurrences of a distinguished algebraic variable, called hole and denoted by ⦇−⦈. In the present paper we consider, without loss of generality for our purposes, contexts having a single occurrence of ⦇−⦈.

A (single-hole) context C is generated by the simplified grammar:

    C ::= ⦇−⦈ | λx.C | CM | MC    (for M ∈ Λ)

A context C is called a head context if it has shape (λx1…xn.⦇−⦈)V1⋯Vn for V1,…,Vn ∈ Val.

Given M ∈ Λ, we write C⦇M⦈ for the λ-term obtained by replacing M for the hole in C, possibly with capture of free variables.

We consider a CbV λ-calculus endowed with the following notions of reductions. The β_v-reduction is the standard one, from [22], while the σ-reductions have been introduced in [23, 8] and are inspired by the translation of λ-calculus into linear logic proof-nets.

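The β_v-reduction →βv is the contextual closure of the following rule:

    (β_v)  (λx.M)V → M{V/x}    where V ∈ Val

The σ-reductions →σ1, →σ3 are the contextual closures of the following rules (for V ∈ Val):

    (σ1)  (λx.M)NP → (λx.MP)N    where x ∉ FV(P)
    (σ3)  V((λx.M)N) → (λx.VM)N    where x ∉ FV(V)

We also set →σ = →σ1 ∪ →σ3 and →v = →βv ∪ →σ.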

The λ-term at the left side of the arrow in the rule (β_v) (resp. (σ1), (σ3)) is called β_v- (resp. σ1-, σ3-) redex, while the λ-term at the right side is the corresponding contractum. Notice that the condition for contracting a σ1- (resp. σ3-) redex can always be satisfied by performing appropriate α-conversions.
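The three rules can be tried out with a minimal sketch in Python. Everything here is an assumption made for illustration and is not taken from the paper: the tuple encoding of terms, the helper names, the naive substitution (which assumes bound names never clash with the free variables of the substituted value), and the choice of (λy.Δ)(xx)Δ as a sample stuck term.

```python
# Terms as nested tuples (an encoding assumed for this sketch):
#   ("var", x) | ("lam", x, body) | ("app", fun, arg)

def var(x):
    return ("var", x)

def lam(x, body):
    return ("lam", x, body)

def app(f, *args):
    # Application associates to the left: app(f, a, b) = (f a) b.
    for a in args:
        f = ("app", f, a)
    return f

def is_value(t):
    # Values are variables and abstractions.
    return t[0] in ("var", "lam")

def free_vars(t):
    if t[0] == "var":
        return {t[1]}
    if t[0] == "lam":
        return free_vars(t[2]) - {t[1]}
    return free_vars(t[1]) | free_vars(t[2])

def subst(t, x, v):
    # Naive substitution t{v/x}. ASSUMPTION: bound names of t are distinct
    # from the free variables of v, so no renaming is ever needed.
    if t[0] == "var":
        return v if t[1] == x else t
    if t[0] == "lam":
        return t if t[1] == x else lam(t[1], subst(t[2], x, v))
    return ("app", subst(t[1], x, v), subst(t[2], x, v))

def root_step(t):
    """Contract t at the root with beta_v, sigma_1 or sigma_3; None if no rule applies."""
    if t[0] != "app":
        return None
    f, a = t[1], t[2]
    # beta_v: (\x.M) V -> M{V/x}
    if f[0] == "lam" and is_value(a):
        return subst(f[2], f[1], a)
    # sigma_1: (\x.M) N P -> (\x.M P) N, provided x is not free in P
    if f[0] == "app" and f[1][0] == "lam" and f[1][1] not in free_vars(a):
        x, m, n = f[1][1], f[1][2], f[2]
        return app(lam(x, app(m, a)), n)
    # sigma_3: V ((\x.M) N) -> (\x.V M) N, provided x is not free in V
    if is_value(f) and a[0] == "app" and a[1][0] == "lam" and a[1][1] not in free_vars(f):
        x, m, n = a[1][1], a[1][2], a[2]
        return app(lam(x, app(f, m)), n)
    return None

def step(t):
    """One v-step somewhere in t (root first, then left to right); None if t is v-normal."""
    r = root_step(t)
    if r is not None:
        return r
    if t[0] == "lam":
        b = step(t[2])
        return None if b is None else lam(t[1], b)
    if t[0] == "app":
        f = step(t[1])
        if f is not None:
            return ("app", f, t[2])
        a = step(t[2])
        if a is not None:
            return ("app", t[1], a)
    return None

delta = lam("z", app(var("z"), var("z")))
stuck = app(lam("y", delta), app(var("x"), var("x")), delta)  # (\y.D)(xx)D
unblocked = step(stuck)                                       # (\y.DD)(xx), via sigma_1
```

In this sketch the sample term has no β_v-redex, yet `step` fires σ1 and the resulting term then steps to itself, so iterating `step` never terminates. The search order inside `step` is only one possible strategy; the reduction relation itself is not deterministic.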

Each reduction relation →R generates the corresponding multistep relation ↠R by taking its transitive and reflexive closure, and conversion relation =R by taking its transitive, reflexive and symmetric closure. Moreover, we say that a λ-term M is in R-normal form (R-nf, for short) if there is no N ∈ Λ such that M →R N. We say that M has an R-normal form whenever M ↠R N for some N in R-nf, and in this case we denote N by nf_R(M).

  1. , while is a -normal form.

  2. , whence is a looping combinator in the CbV setting as well.

  3. is a -nf, but contains a -redex, indeed .

  4. For all values , we have with . So we get:

  5. .

  6. Let for , then we have:

The next lemma was proved independently by Guerrieri [12].

A λ-term M is in v-normal form if and only if M is a G-term generated by the following simplified grammar (for k ≥ 0):

    G ::= H | R
    H ::= x | λx.G | xHG1⋯Gk
    R ::= (λx.G)(yHG1⋯Gk)

Proof.

(⇒) Assume that M is in v-nf and proceed by structural induction. Recall that every λ-term M can be written as λx1…xm.M'M1⋯Mn for some m, n ≥ 0, where M' is either a variable or an abstraction. Moreover, the λ-terms M', M1,…,Mn must be in v-nf since M is in v-nf. Now, if m > 0 then M is of the form λx.P with P in v-nf and the result follows from the induction hypothesis. Hence, we assume m = 0 and split into cases depending on M' (there are indeed only two possibilities):

  • M' = x for some x ∈ 𝕍. If n = 0 then we are done since x is an H-term. If n > 0 then M = xM1⋯Mn where all the Mi’s are G-terms by induction hypothesis. Moreover, M1 cannot be an R-term for otherwise M would have a σ3-redex. Whence, M1 must be an H-term and M is of the form xHG1⋯Gk for k = n − 1.

  • M' = λx.P for some variable x and λ-term P in v-nf. In this case we must have n = 1 because M cannot have a σ1-redex. By induction hypothesis, P and M1 are G-terms, but M1 cannot be an R-term or a value for otherwise M would have a σ3- or a β_v-redex, respectively. We conclude that the only possibility for the shape of M1 is yHG1⋯Gk, whence M must be an R-term.

(⇐) By induction on the grammar generating M. The only interesting cases are the following.

  • M = xHG1⋯Gk could have a σ3-redex if H = (λy.P)Q, but this is impossible by definition of an H-term. As H, G1,…,Gk are in v-nf by induction hypothesis, so must be M.

  • M = (λx.G)(yHG1⋯Gk) where G, H, G1,…,Gk are in v-nf by induction hypothesis. In the previous item we established that yHG1⋯Gk is in v-nf. Thus, M could only have a β_v-redex if yHG1⋯Gk ∈ Val, but this is not the case by definition of Val.∎
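The grammar of v-normal forms used in the lemma above can be turned into a small membership test. The sketch below is only illustrative: it assumes the grammar G ::= H | R, H ::= x | λx.G | xHG1⋯Gk, R ::= (λx.G)(yHG1⋯Gk), and a tuple encoding of terms (`("var", x)`, `("lam", x, body)`, `("app", fun, arg)`) that is not taken from the paper.

```python
def spine(t):
    """Split t = h a1 ... an into (h, [a1, ..., an]) where h is not an application."""
    args = []
    while t[0] == "app":
        args.append(t[2])
        t = t[1]
    return t, args[::-1]

def is_G(t):
    # G ::= H | R
    return is_H(t) or is_R(t)

def is_H(t):
    # H ::= x | \x.G | x H G1 ... Gk
    if t[0] == "var":
        return True
    if t[0] == "lam":
        return is_G(t[2])
    head, args = spine(t)  # t is an application, so args is non-empty
    return head[0] == "var" and is_H(args[0]) and all(is_G(a) for a in args[1:])

def is_R(t):
    # R ::= (\x.G)(y H G1 ... Gk)
    if t[0] != "app" or t[1][0] != "lam":
        return False
    fun, arg = t[1], t[2]
    head, args = spine(arg)
    return (is_G(fun[2]) and head[0] == "var" and len(args) >= 1
            and is_H(args[0]) and all(is_G(a) for a in args[1:]))

x, y = ("var", "x"), ("var", "y")
identity = ("lam", "a", ("var", "a"))
delta = ("lam", "z", ("app", ("var", "z"), ("var", "z")))

blocked = ("app", identity, ("app", x, y))  # I(xy): an R-term
looping = ("app", delta, delta)             # DD: a beta_v-redex, not a G-term
```

Here `is_G(blocked)` holds because the argument `xy` is neither a value nor a redex-like term, while `is_G(looping)` fails because the argument of Δ is a value.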

Intuitively, in the grammar above, G stands for “general” normal form, R for “redex-like” normal form and H for “head” normal form. The following properties are well-established.

[Properties of reductions [22, 8]]

  1. The σ-reduction is confluent and strongly normalizing.

  2. The β_v- and v-reductions are confluent.

Lambda terms are classified into valuable, potentially valuable and non-potentially valuable, depending on their capability of producing a value in a suitable environment.

A λ-term M is valuable if M ↠βv V for some V ∈ Val. A λ-term M is potentially valuable if there exists a head context C = (λx1…xn.⦇−⦈)V1⋯Vn, where FV(M) = {x1,…,xn}, such that C⦇M⦈ is valuable. (Footnote 5: Equivalently, M is potentially valuable if there is a substitution ϑ : 𝕍 → Val such that ϑ(M) is valuable.)

It is easy to check that valuable entails potentially valuable and that, for , the two notions coincide. As shown in [13], a -term is valuable (resp. potentially valuable) if and only if (resp. ) for some . As a consequence, the calculus can be used as a tool for studying the operational semantics of the original calculus .

In [22], Plotkin defines the following observational equivalence.

The observational equivalence is defined as follows (for ):

For example, we have and (see Example 1.1(6)), while .

It is well known that, in order to check whether M ≡ N holds, it is enough to consider head contexts (cf. [19, 21]). In other words, M ≢ N if and only if there exists a head context C such that C⦇M⦈ is valuable, while C⦇N⦈ is not.

2. Call-by-value Böhm Trees

In the call-by-name setting there are several equivalent ways of defining Böhm trees. The most famous definition is coinductive [14] (Footnote 6: See also Definition 10.1.3 of [4], marked by Barendregt as ‘informal’ because at the time the coinduction principle was not as well-understood as today), while the formal one in Barendregt’s book exploits the notion of “effective Böhm-like trees”, which is not easy to handle in practice. The definition given in Amadio and Curien’s book [2, Def. 2.3.3] is formal, does not require coinductive techniques and, as it turns out, generalizes nicely to the CbV setting. The idea is to first define the set 𝒜(M) of approximants of a λ-term M, then show that it is directed w.r.t. some preorder ⊑ and, finally, define the Böhm tree of M as the supremum of 𝒜(M).

2.1. Böhm trees and approximants

Let be the set of -terms possibly containing a constant , representing the undefined, and let be the context-closed preorder on generated by setting for all . Given compatible777Recall that are compatible if there exists such that and . w.r.t. , we denote their least upper bound by . The reduction from Definition 1.1 generalizes to terms in in the obvious way (assuming that is not a value), moreover we define the -reduction relation as the contextual closure of the rules , and for all . Finally, the -reduction is defined by .
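The preorder generated by placing ⊥ below every term, together with the least upper bound of two compatible terms, can be sketched as follows. This is only an illustration under stated assumptions: the tuple encoding and the names `BOT`, `leq` and `sup` are hypothetical, and terms are compared syntactically rather than up to renaming of bound variables.

```python
# Terms with a constant for "undefined" (encoding assumed for this sketch):
#   ("bot",) | ("var", x) | ("lam", x, body) | ("app", fun, arg)
BOT = ("bot",)

def leq(m, n):
    """m is below n: n is obtained from m by replacing occurrences of BOT with terms."""
    if m == BOT:
        return True
    if m[0] != n[0]:
        return False
    if m[0] == "var":
        return m[1] == n[1]
    if m[0] == "lam":
        return m[1] == n[1] and leq(m[2], n[2])
    return leq(m[1], n[1]) and leq(m[2], n[2])

def sup(m, n):
    """Least upper bound of m and n, or None when they are not compatible."""
    if m == BOT:
        return n
    if n == BOT:
        return m
    if m[0] != n[0]:
        return None
    if m[0] == "var":
        return m if m[1] == n[1] else None
    if m[0] == "lam":
        if m[1] != n[1]:
            return None
        body = sup(m[2], n[2])
        return None if body is None else ("lam", m[1], body)
    fun, arg = sup(m[1], n[1]), sup(m[2], n[2])
    return None if fun is None or arg is None else ("app", fun, arg)

left = ("lam", "x", ("app", ("var", "x"), BOT))    # \x. x BOT
right = ("lam", "x", ("app", BOT, ("var", "y")))   # \x. BOT y
joined = sup(left, right)                          # \x. x y
```

Two terms are compatible exactly when `sup` returns a term, and in that case both arguments are below the result with respect to `leq`.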

A -context is a context possibly containing some occurrences of . We use for -contexts the same notations introduced for contexts in Section 1.1.

  1. A term is an approximant if either or is an -term generated by the following simplified grammar (for ):

    The approximant is called trivial and cannot be generated by the grammar above.

  2. Let be the set of all approximants, including the trivial one. In accordance with (1), we denote arbitrary approximants by and non-trivial ones by ().

  3. The set of free variables is extended to approximants by setting .

  4. Given M ∈ Λ, the set of approximants of M is defined as follows:

    𝒜(M) = { A ∈ 𝒜 | there exists N ∈ Λ such that M ↠v N and A ⊑ N }

  1. , and .

  2. . Neither nor belong to this set, because they are not valid approximants (by Definition 2.1(1)).

  3. .

  4. .

  5. The set of approximants of is particularly interesting to compute:

is in -normal form if and only if .

Proof.

The reasoning concerning -redexes is analogous to the proof of Lemma 1.1. Notice that no approximant containing -redexes can be generated by the grammar of Definition 2.1(1), since is not an -term. ∎

The following lemmas show that the “external shape” of an approximant is stable under -reduction. For instance, if then all non-trivial approximants have shape for some .

Let be a (single-hole) -context. Then and entail that there exists an such that and .

Proof.

Let . By Lemma 2.1, cannot have any -redex. Clearly, substituting for an occurrence of in does not create any new -redex, so if then the contracted redex must occur in . It is slightly trickier to check by induction on that such an operation does not introduce any -redex. The only interesting case is where is a -term. Indeed, since , would be a -redex for but this is impossible since is a -term and -terms cannot have this shape. The case is analogous. ∎

For and , and entail .

Proof.

If then can be obtained from by substituting each occurrence of for the appropriate subterm of . Hence, the redex contracted in must occur in a subterm of corresponding to an occurrence888An occurrence of a subterm in a -term is a (single-hole) context such that . of in . So we have and implies, by Lemma 2.1, that for an such that . We can therefore conclude that , as desired. ∎

For , entails .

Proof.

Straightforward from Definition 2.1(4) and Lemma 2.1. ∎

For all M ∈ Λ, the set 𝒜(M) is an ideal w.r.t. ⊑, namely it is non-empty, directed and downward closed.

Proof.

We check the three conditions:

  • 𝒜(M) is non-empty because it always contains ⊥.

  • To show that is directed, we need to prove that every have an upper bound .

    We proceed by induction on , the cases and being trivial.

    Case . In this case we must have with and for all such that . As , there exists a -term such that and . By Lemma 2.1 and Proposition 1.1(2) (confluence) we get . Whence, for some non-trivial approximants and for . Again, by confluence, there exist -terms satisfying . By Lemma 2.1, we obtain and for . By induction hypothesis, there exist such that and from which it follows that the upper bound of belongs to . By Lemma 2.1, we conclude that , as desired.

    Case . This entails that with , and for all such that . Reasoning as above, implies such that for some . Whence, where , and . By confluence, there exist such that and for . By Lemma 2.1, we obtain , and for . By induction hypothesis we get such that , such that and such that for , from which it follows that the upper bound of belongs to . By Lemma 2.1, we conclude .

    All other cases follow from Lemma 2.1, confluence of and the induction hypothesis.

  • To prove that 𝒜(M) is downward closed, we need to show that for all A1, A2 ∈ 𝒜, if A1 ⊑ A2 ∈ 𝒜(M) then A1 ∈ 𝒜(M), but this follows directly from its definition.∎

As a consequence, we can actually define the Böhm tree of a -term as the supremum of its approximants in .

  1. Let M ∈ Λ. The (call-by-value) Böhm tree of M, in symbols BT(M), is defined as follows:

    BT(M) = ⊔ 𝒜(M)

    Therefore, the resulting structure is a possibly infinite labelled tree .

  2. More generally, every ideal determines a so-called Böhm-like tree .

  3. Given a Böhm-like tree , the height of is defined as usual for trees (and can be if the tree is infinite). Moreover, we set .

The difference between the Böhm tree of a λ-term and a Böhm-like tree is that the former must be “computable” since it is λ-definable, while the latter can be arbitrary. (Footnote 9: The formal meaning of “computable” will be discussed in the rest of the section.) In particular, any Böhm tree is a Böhm-like tree but the converse does not hold.

Notice that if and only if . Moreover, and the inclusion can be strict as in . Any Böhm-like tree can be depicted using the following “building blocks”:

  • If we actually draw a node labelled .

  • If we use an abstraction node labelled “”:

  • If , we use an application node labelled by “”:

  • If we combine the application and abstraction nodes as imagined:

Notable examples of Böhm trees of -terms are given in Figure 1. Interestingly, the -term from Example 1.1(6) satisfying

(1)

is such that . Indeed, substituting for a in (1) never gives a non-trivial approximant belonging to (cf. the grammar of Definition 2.1(1)).

Figure 1. Examples of CbV Böhm trees.

For M, N ∈ Λ, if M =v N then BT(M) = BT(N).

Proof.

By Proposition 1.1(2) (i.e. confluence of →v), M =v N if and only if there exists a λ-term P such that M ↠v P and N ↠v P. By an iterated application of Lemma 2.1 we get 𝒜(M) = 𝒜(P) = 𝒜(N), so we conclude BT(M) = BT(N). ∎

Theorem 2.1 below provides a characterization of those Böhm-like trees arising as the Böhm tree of some λ-term, in the spirit of [4, Thm. 10.1.23]. To achieve this result, it will be convenient to consider a tree as a set of sequences closed under prefix.

We denote by the set of finite sequences of natural numbers. Given , the corresponding sequence of length is represented by . In particular, represents the empty sequence of length 0. Given as above and , we write for the sequence and for the sequence .

Given a tree , the sequence possibly determines a subtree that can be found going through the -th children of (if it exists) and then following the path . Of course this is only the case if actually belongs to the domain of the tree. The following definition formalizes this intuitive idea in the particular case of syntax trees of approximants.

Let . The subterm of at , written , is defined by:

As a matter of notation, given an approximant and a subset , we write whenever there exists such that is defined and .
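The idea of addressing nodes of a tree by finite sequences of natural numbers, and of viewing a tree as a prefix-closed set of such sequences, can be illustrated with a short sketch. The representation below is an assumption made for illustration: a tree is a pair `(label, children)`, children are numbered from 0, and the labels are informal strings rather than the paper's node labels.

```python
def subtree(tree, path):
    """Follow path (a tuple of child indices) from the root; None if it leaves the tree."""
    label, children = tree
    if not path:
        return tree
    index, rest = path[0], path[1:]
    if index >= len(children):
        return None
    return subtree(children[index], rest)

def domain(tree, prefix=()):
    """All paths that denote a node of the tree: a prefix-closed set of sequences."""
    label, children = tree
    paths = {prefix}
    for index, child in enumerate(children):
        paths |= domain(child, prefix + (index,))
    return paths

def is_prefix_closed(paths):
    """Check that every proper prefix of a path in the set is also in the set."""
    return all(p[:k] in paths for p in paths for k in range(len(p)))

# Syntax tree of an approximant of shape \x. x BOT, with informal labels.
example = ("lam x", [("app", [("x", []), ("bot", [])])])
```

With this representation `subtree(example, (0, 1))` returns the node labelled `"bot"`, while a path outside `domain(example)` yields `None`, matching the remark that a sequence determines a subtree only when it belongs to the domain of the tree.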

Let X ⊆ 𝒜 be a set of approximants. There exists M ∈ Λ such that 𝒜(M) = X if and only if the following three conditions hold:

  1. X is an ideal w.r.t. ⊑,

  2. X is r.e. (after coding),

  3. FV(X) is finite.

Proof sketch.

Let be such that , then (1) is satisfied by Proposition 2.1 and (3) by Remark 2.1. Concerning (2), let us fix an effective bijective encoding . Then the set is r.e. because it is semi-decidable to determine if (just enumerate all -reducts of and check whether is one of them), the set and the relation restricted to are decidable.

Assume that is a set of approximants satisfying conditions (1-3). Since is r.e., if and are effectively given then the condition is semi-decidable.

Consider now the function (depending on ) defined by:

The above definition of can be rendered effective by going over to the codes of sequences and -terms. Let be the numeral associated with under an effective encoding and be the quote of defined by Mogensen101010This encoding is particularly convenient because it is effective, defined on open terms by exploiting the fact that and works in the CbV setting as well (straightforward to check). in [18] (see also [5, §6.1] for a nice treatment). Recall that there exists an evaluator such that for all -terms . The CbV -calculus being Turing-complete, as shown by Paolini in [20], there exists a -term -defining , namely if , otherwise is not potentially valuable. It is now easy to check that the -term defined as