# Normalization by Evaluation for Call-by-Push-Value and Polarized Lambda-Calculus

We observe that normalization by evaluation for simply-typed lambda-calculus with weak coproducts can be carried out in a weak bi-cartesian closed category of presheaves equipped with a monad that allows us to perform case distinction on neutral terms of sum type. The placement of the monad influences the normal forms we obtain: for instance, placing the monad on coproducts gives us eta-long beta-pi normal forms where pi refers to permutation of case distinctions out of elimination positions. We further observe that placing the monad on every coproduct is rather wasteful, and an optimal placement of the monad can be determined by considering polarized simple types inspired by focalization. Polarization classifies types into positive and negative, and it is sufficient to place the monad at the embedding of positive types into negative ones. We consider two calculi based on polarized types: pure call-by-push-value (CBPV) and polarized lambda-calculus, the natural deduction calculus corresponding to focalized sequent calculus. For these two calculi, we present algorithms for normalization by evaluation. We further discuss different implementations of the monad and their relation to existing normalization proofs for lambda-calculus with sums. Our developments have been partially formalized in the Agda proof assistant.


## 1 Introduction

The idea behind normalization by evaluation (NbE) is to utilize a standard interpreter, usually evaluating closed terms, to compute the normal form of an open term. The normal form is obtained by a type-directed reification procedure after evaluating the open term to a semantic value, mapping (reflecting) the free variables to corresponding unknowns in the semantics. The literal use of a standard interpreter can be achieved for the pure simply-typed lambda-calculus [8, 13] by modelling uninterpreted base types as sets of neutral (aka atomic) terms, or more precisely, as presheaves or sets of neutral term families, in order to facilitate fresh bound variable generation during reification of functions to lambdas. Thanks to $\eta$-equality at function types, free variables of function type can be reflected into the semantics as functions applying the variable to their reified argument, forming a neutral term. This mechanism provides us with unknowns of function type which can be faithfully reified to normal forms.

Filinski studied NbE for Moggi’s computational lambda calculus, shedding light on the difference between call-by-name (CBN) and call-by-value (CBV) NbE, where Danvy’s type-directed partial evaluation falls into the latter class. The contribution of the computational lambda calculus is to make explicit where the monad is invoked during monadic evaluation, and this placement of the monad carries over to the NbE setting. Moggi’s studies were continued by Levy, who designed the call-by-push-value (CBPV) lambda-calculus to embed both the CBN and CBV lambda calculus.

In this work, we formulate NbE for CBPV (Section 3), with the aim to investigate later whether CBN and CBV NbE can be recovered from CBPV NbE via the standard translations of the CBN and CBV calculi into CBPV.

In contrast to the normal forms of CBN NbE, which is the algorithmic counterpart of the completeness proof for intuitionistic propositional logic (IPL) using Beth models, CBPV NbE gives us more restrained normal forms, where the production of a value via injections cannot be interrupted by more questions to the oracle. In the research field of focalization [5, 18] we speak of chaining non-invertible introductions. Invertible introductions are already chained in NbE thanks to extensionality ($\eta$) for function types and, more generally, negative types. Non-invertible eliminations also happen in a chain when building neutrals. What is missing from the picture is the chaining of invertible eliminations, i.e., case distinctions and, more generally, pattern matching. The picture is completed by extending NbE to polarized lambda calculus [25, 10, 23] in Section 4.

In our presentation of the various lambda calculi we ignore the concrete syntax and only consider the abstract syntax obtained by the Curry-Howard isomorphism. A term is simply a derivation tree whose nodes are rule invocations. Thus, an intrinsically typed, nameless syntax is most natural, and our syntactic classes are all presheaves over the category of typing contexts and renamings. The use of presheaves then smoothly extends to the semantic constructions [11, 3].

Concerning the presentation of polarized lambda calculus, we depart from Zeilberger, who employs a priori infinitary syntax, modelling a case tree as a meta-level function mapping well-typed patterns to branches. Instead, we use a graded monad representing complete pattern matching over a newly added hypothesis, which is in spirit akin to Filinski’s [14, Section 4] and Krishnaswami’s treatment of eager pattern matching using a separate context of variables to be matched on.

Our design choices were guided by an Agda formalization of sections 2 (complete) and 4 (partial), available at https://github.com/andreasabel/ipl. Agda was particularly helpful to correctly handle the renamings abundantly present when working with presheaves.

## 2 Normalization by Evaluation for the Simply-Typed Lambda Calculus with Sums

In this section, we review the normalization by evaluation (NbE) argument for the simply-typed lambda calculus (STLC) with weak sums, setting the stage for the later sections. We work in a constructive type-theoretic meta-language, with the basic judgement $t : T$ meaning that object $t$ is an inhabitant of type $T$. However, to avoid confusion with object-level types such as the simple types of lambda calculus, we will refer to meta-level types as sets. Consequently, the colon takes the role of elementhood in set theory, and we are free to reuse the symbol $\in$ for other purposes.

### 2.1 Contexts and indices

We adopt a categorical, aka de Bruijn, style for the abstract syntax of terms, which we conceive as intrinsically well-typed. In de Bruijn style, a context $\Gamma$ is just a snoc list of simple types $A$, meaning we write context extension as $\Gamma.A$, and the empty context as $\varepsilon$. Membership ($A \in \Gamma$) and sublist ($\Gamma \subseteq \Delta$) relations are given inductively by the following rules:

We consider the rules as introductions of the indexed types and and the rule names as constructors. For instance, for any , , and ; and if we read as unary number , then is exactly the (de Bruijn) index of in .

We can define and by recursion, meaning that the (proof-relevant) sublist relation is reflexive and transitive. Thus, lists form a category with morphisms , and the category laws hold propositionally, e.g., we have in propositional equality for all morphisms . The singleton weakening , also written or , is defined by .

The category allows us to consider as a presheaf over for any , witnessed by , which is the morphism part of functor from to , mapping object to the set of the indices of in . The associated functor laws and hold propositionally.
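The membership and sublist structure described in this subsection can be illustrated by a small untyped sketch. It is only an approximation under stated assumptions: contexts are Python lists (last element = most recently bound type), an index counts from the most recent binding, and a sublist witness $\tau : \Gamma \subseteq \Delta$ is encoded as a tuple of keep/skip flags. The names `identity`, `compose`, `wk`, `source` and `ren_index` are hypothetical and not taken from the formalization.

```python
from typing import List, Tuple

Ctx = List[str]          # snoc list: the last element is the most recently bound type
Ope = Tuple[bool, ...]   # one flag per entry of the larger context, most recent first
                         # True = entry is kept in the smaller context, False = skipped

def identity(ctx: Ctx) -> Ope:
    """Reflexivity: ctx is a sublist of itself."""
    return tuple(True for _ in ctx)

def wk(ctx: Ctx) -> Ope:
    """Singleton weakening from ctx into ctx extended by one more type."""
    return (False,) + identity(ctx)

def compose(tau1: Ope, tau2: Ope) -> Ope:
    """Transitivity: tau1 : Gamma <= Delta and tau2 : Delta <= Phi give Gamma <= Phi."""
    rest = list(tau1)
    out = []
    for keep in tau2:
        # An entry of Phi kept in Delta inherits its flag from tau1.
        out.append(rest.pop(0) if keep else False)
    assert not rest, "tau1 and tau2 do not meet in the same middle context"
    return tuple(out)

def source(tau: Ope, delta: Ctx) -> Ctx:
    """Recover the smaller context from tau and the larger context delta."""
    assert len(tau) == len(delta)
    return [a for a, keep in zip(delta, tau[::-1]) if keep]

def ren_index(tau: Ope, x: int) -> int:
    """Transport a de Bruijn index (0 = most recent) along tau."""
    seen = -1
    for pos, keep in enumerate(tau):
        if keep:
            seen += 1
            if seen == x:
                return pos
    raise IndexError("index out of scope")
```

In this encoding the functor laws for reindexing can be checked on examples: transporting along `identity` leaves an index unchanged, and transporting along `compose(t1, t2)` agrees with transporting along `t1` and then along `t2`.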

### 2.2 STLC and its normal forms

Simple types shall be distinguished into positive types $P$ and negative types $N$, depending on their root type former. Function ($A \Rightarrow B$) and product types ($1$ and $A \times B$) are negative, while base types ($o$) and sum types ($0$ and $A + B$) are positive.

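$$
\begin{array}{lcll}
A, B, C & ::= & P \mid N & \text{simple types} \\
P & ::= & 0 \mid A + B \mid o & \text{positive types} \\
N & ::= & 1 \mid A \times B \mid A \Rightarrow B & \text{negative types}
\end{array}
$$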

Intrinsically well-typed lambda-terms, in abstract syntax, are just inhabitants of the indexed set $A \dashv \Gamma$, inductively defined by the following rules.

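$$
\mathsf{unit}\;\frac{}{1 \dashv \Gamma}
\qquad
\mathsf{pair}\;\frac{A_1 \dashv \Gamma \quad A_2 \dashv \Gamma}{A_1 \times A_2 \dashv \Gamma}
\qquad
\mathsf{prj}_i\;\frac{A_1 \times A_2 \dashv \Gamma}{A_i \dashv \Gamma}
$$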

The skilled eye of the reader will immediately recognize the proof rules of intuitionistic propositional logic (IPL) under the Curry-Howard isomorphism, where $A \dashv \Gamma$ is to be read as “$A$ follows from $\Gamma$”. Using shorthand for the th variable, a term such as could in concrete syntax be rendered as We leave the exact connection to a printable syntax of the STLC to the imagination of the reader, as we shall not be concerned with considering concrete terms in this article.

Terms of type $A$ form a presheaf $A \dashv \_$, as witnessed by the standard weakening operation $\mathsf{ren}$, defined by recursion over terms, with functor laws for $\mathsf{ren}$ analogous to those for indices. (Here, $\mathsf{ren}$ is short for renaming, but in a nameless calculus we should better speak of reindexing, which could, a bit clumsily, also be abbreviated to $\mathsf{ren}$.)

Normal forms are logically characterized as those fulfilling the subformula property [22, 15]. (There is also a stronger notion of normal form, requiring that two extensionally equal lambda-terms, i.e., those that denote the same set-theoretical function, have the same normal form [20, 2, 24]. Such normal forms do not have a simple inductive definition, and we shall not consider them in this article.) Normal forms $\mathsf{Nf}\,A\,\Gamma$ are mutually defined with neutral normal forms $\mathsf{Ne}\,A\,\Gamma$. In the following inductive definition, we reuse the rule names from the term constructors.

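$$
\mathsf{inj}_i\;\frac{\mathsf{Nf}\,A_i\,\Gamma}{\mathsf{Nf}\,(A_1 + A_2)\,\Gamma}
\qquad
\mathsf{case}\;\frac{\mathsf{Ne}\,(A_1 + A_2)\,\Gamma \quad \mathsf{Nf}\,P\,(\Gamma.A_1) \quad \mathsf{Nf}\,P\,(\Gamma.A_2)}{\mathsf{Nf}\,P\,\Gamma}
\qquad
\mathsf{abort}\;\frac{\mathsf{Ne}\,0\,\Gamma}{\mathsf{Nf}\,P\,\Gamma}
$$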

These rules only allow the elimination of neutrals; this restriction guarantees the subformula property and prevents any kind of computational ($\beta$) redex. The new rule $\mathsf{ne}$ embeds $\mathsf{Ne}$ into $\mathsf{Nf}$, but only at base types [3, Section 3.3]. Further, case distinction via $\mathsf{case}$ and $\mathsf{abort}$ is restricted to positive types $P$. As a consequence, our normal forms are $\eta$-long, meaning that any normal inhabitant of a negative type is a respective introduction ($\mathsf{unit}$, $\mathsf{pair}$, or $\mathsf{abs}$). This justifies the attribute negative for these types: the construction of their inhabitants proceeds mechanically, without any choices. In contrast, constructing an inhabitant of a positive type involves choice: whether case distinction is required, and which introduction to pick in the end ($\mathsf{inj}_1$ or $\mathsf{inj}_2$).

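Needless to say, $\mathsf{Ne}$ and $\mathsf{Nf}$ are presheaves, i.e., support reindexing with $\mathsf{ren}$ just as terms do. From a normal form or a neutral we can extract the underlying term via an overloaded erasure function that discards constructor $\mathsf{ne}$ but keeps all other constructors. This erasure function naturally commutes with reindexing, making it a natural transformation from the presheaf of normal forms (neutrals, resp.) to the presheaf of terms. We shall write such presheaf morphisms with a dotted arrow $\dot\to$. (The point on the arrow is mnemonic for pointwise.) Slightly abusing notation, we shall extend this to $n$-ary morphisms, e.g., write $\mathcal{A} \mathrel{\dot\to} \mathcal{B} \mathrel{\dot\to} \mathcal{C}$ for a family of functions $\mathcal{A}\,\Gamma \to \mathcal{B}\,\Gamma \to \mathcal{C}\,\Gamma$ natural in $\Gamma$.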

While the coproduct eliminations $\mathsf{case}$ and $\mathsf{abort}$ are limited to normal forms of positive types $P$, their extensions $\mathsf{case}_B$ and $\mathsf{abort}_B$ to negative types are admissible, for instance:

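$$
\begin{array}{lcl}
\mathsf{abort}_B & : & \mathsf{Ne}\,0 \mathrel{\dot\to} \mathsf{Nf}\,B \\
\mathsf{abort}_1\,u & = & \mathsf{unit} \\
\mathsf{abort}_P\,u & = & \mathsf{abort}\,u \\
\mathsf{abort}_{A \times B}\,u & = & \mathsf{pair}\,(\mathsf{abort}_A\,u)\,(\mathsf{abort}_B\,u) \\
\mathsf{abort}_{A \Rightarrow B}\,u & = & \mathsf{abs}\,(\mathsf{abort}_B\,(\mathsf{ren}\,\mathsf{wk}_A\,u))
\end{array}
$$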

The elimination $\mathsf{case}_B$ generalizes analogously, with a bit of care when weakening the branches.

### 2.3 Normalization

Normalization is concerned with finding a normal form $v : \mathsf{Nf}\,A\,\Gamma$ for each term $t : A \dashv \Gamma$. The normal form should be sound, i.e., equal to the term with respect to an equational theory on terms (see Appendix A). Further, normalization should decide this theory, i.e., equal terms should have the same normal form. In this article, we implement only the normalization function, without proving its soundness and completeness. From a logical perspective, we will compute for each derivation of $A \dashv \Gamma$ a normal derivation $\mathsf{Nf}\,A\,\Gamma$.

Normalization by evaluation (NbE) decomposes normalization into evaluation in the identity environment followed by reification (aka quoting). The role of evaluation is to produce from a term the corresponding semantic (i.e., meta-theoretic) function, which is finally reified to a normal form. Since we are evaluating open terms $t : A \dashv \Gamma$, we need to supply an environment which will map the free indices of $t$ to corresponding unknowns. To accommodate unknowns in the semantics, types are mapped to presheaves (rather than just sets), and in particular each base type $o$ is mapped to the presheaf $\mathsf{Ne}\,o$ with the intention that the neutrals take the role of the unknowns. The mapping from neutrals to unknowns is called reflection (aka unquoting), and defined mutually with reification by induction on type $A$.

At this point, let us fix some notation for sets to prepare for some constructions of presheaves. Let $1$ denote the unit set and $()$ its unique inhabitant, $0$ the empty set and $\mathsf{magic}$ the ex falso quodlibet elimination into any set. Given sets $S_1$ and $S_2$, their Cartesian product is written $S_1 \times S_2$ with projections $\pi_i$, and their disjoint sum $S_1 + S_2$ with injections $\iota_i$ and elimination $[f_1, f_2]$ for arbitrary $f_i : S_i \to S$.

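Presheaf (co)products $\hat 1$, $\hat\times$, $\hat 0$, and $\hat +$ are constructed pointwise, e.g., $\hat 0\,\Gamma = 0$, and given two presheaves $\mathcal{A}$ and $\mathcal{B}$, $(\mathcal{A} \mathbin{\hat +} \mathcal{B})\,\Gamma = \mathcal{A}\,\Gamma + \mathcal{B}\,\Gamma$. For the exponential of presheaves, however, we need the Kripke function space $(\mathcal{A} \mathbin{\hat\Rightarrow} \mathcal{B})\,\Gamma = \forall \Delta.\ \Gamma \subseteq \Delta \to \mathcal{A}\,\Delta \to \mathcal{B}\,\Delta$.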

We will interpret simple types $A$ as corresponding presheaves $[\![A]\!]$. Let us start with the negative types, defining reflection $\uparrow^A$ and reification $\downarrow^A$ along the way.

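$$
\begin{array}{lcl}
[\![1]\!] & = & \hat 1 \\
\uparrow^{1}_{\Gamma} u & = & () \\
\downarrow^{1}_{\Gamma} () & = & \mathsf{unit} \\[1ex]
[\![A \times B]\!] & = & [\![A]\!] \mathbin{\hat\times} [\![B]\!] \\
\uparrow^{A \times B}_{\Gamma} u & = & (\uparrow^{A}_{\Gamma} (\mathsf{prj}_1\,u),\ \uparrow^{B}_{\Gamma} (\mathsf{prj}_2\,u)) \\
\downarrow^{A \times B}_{\Gamma} (a, b) & = & \mathsf{pair}\,(\downarrow^{A}_{\Gamma} a)\,(\downarrow^{B}_{\Gamma} b)
\end{array}
$$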

In the reification at function types $A \Rightarrow B$, the renaming $\mathsf{wk}_A$ makes room for a new variable of type $A$, which is reflected into $[\![A]\!]$ by $\mathsf{fresh}^A_\Gamma$. The ability to introduce fresh variables into a context, and to use semantic objects such as $f$ in such an extended context, is the reason for utilizing presheaves instead of just sets as semantic types.

Note also that in the equation for $\uparrow^{A \Rightarrow B}$, the neutral $u$ is transported into $\Delta$ via reindexing with $\tau$, in order to be applicable to the normal form reified from the semantic value $a$.
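The function-type clauses that the two preceding paragraphs refer to can be sketched as follows. This is a reconstruction from the prose and from the Kripke function space, not a verbatim quotation; the binder names $\tau$, $\Delta$, $f$, $a$, $u$ and the index constructor $\mathsf{zero}$ are assumptions.

```latex
\begin{array}{lcl}
[\![A \Rightarrow B]\!] & = & [\![A]\!] \mathbin{\hat\Rightarrow} [\![B]\!] \\
\uparrow^{A \Rightarrow B}_{\Gamma} u\ (\tau : \Gamma \subseteq \Delta)\ (a : [\![A]\!]\,\Delta)
  & = & \uparrow^{B}_{\Delta} (\mathsf{app}\,(\mathsf{ren}\,\tau\,u)\,(\downarrow^{A}_{\Delta} a)) \\
\downarrow^{A \Rightarrow B}_{\Gamma} f
  & = & \mathsf{abs}\,(\downarrow^{B}_{\Gamma.A} (f\ \mathsf{wk}_A\ \mathsf{fresh}^{A}_{\Gamma})) \\
\mathsf{fresh}^{A}_{\Gamma}
  & = & \uparrow^{A}_{\Gamma.A} (\mathsf{var}\,\mathsf{zero})
\end{array}
```

Under this reading, $\mathsf{fresh}^A_\Gamma$ inhabits $[\![A]\!]\,(\Gamma.A)$, which matches its use as the last component of the identity environment for $\Gamma.A$.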

A direct extension of our presheaf semantics to positive types cannot work. For instance, with $[\![0]\!] = \hat 0$, simply reflecting a variable of type $0$ would give us an inhabitant of the empty set, which means that reflection at the empty type would not be definable. Similarly, the setting $[\![A + B]\!] = [\![A]\!] \mathbin{\hat +} [\![B]\!]$ is refuted by reflection of a variable of type $A + B$, which would require us to decide whether $[\![A]\!]$ holds or $[\![B]\!]$ holds while only being given a hypothesis of type $A + B$. Not even the usual interpretation $[\![o]\!] = \mathsf{Ne}\,o$ of base types works in the presence of sums: we would not be able to interpret a term such as $\mathsf{abort}$ applied to a variable of type $0$ at base type $o$ in the context $\varepsilon.0$, as $\mathsf{Ne}\,o\,(\varepsilon.0)$ is empty. What is needed are case distinctions on neutrals in the semantics, allowing us to eliminate positive hypotheses before producing a semantic value, and we shall capture this capability in a strong monad $\mathcal{C}$ which can cover the cases.

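To recapitulate, a monad $\mathcal{C}$ on presheaves is first an endofunctor, i.e., it maps any presheaf $\mathcal{A}$ to the presheaf $\mathcal{C}\,\mathcal{A}$ and any presheaf morphism $f : \mathcal{A} \mathrel{\dot\to} \mathcal{B}$ to the morphism $\mathsf{map}^{\mathcal{C}} f : \mathcal{C}\,\mathcal{A} \mathrel{\dot\to} \mathcal{C}\,\mathcal{B}$ satisfying the functor laws for identity and composition. Then, there are natural transformations $\mathsf{return}^{\mathcal{C}} : \mathcal{A} \mathrel{\dot\to} \mathcal{C}\,\mathcal{A}$ (unit) and $\mathsf{join}^{\mathcal{C}} : \mathcal{C}\,(\mathcal{C}\,\mathcal{A}) \mathrel{\dot\to} \mathcal{C}\,\mathcal{A}$ (multiplication) satisfying the monad laws.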

We are looking for a cover monad that offers us these services:

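$$
\begin{array}{lcll}
\mathsf{abort}^{\mathcal{C}} & : & \mathsf{Ne}\,0 \mathrel{\dot\to} \mathcal{C}\,\mathcal{B} & \text{case on absurd neutral} \\
\mathsf{case}^{\mathcal{C}}_{\Gamma} & : & \mathsf{Ne}\,(A_1 + A_2)\,\Gamma \to \mathcal{C}\,\mathcal{B}\,(\Gamma.A_1) \to \mathcal{C}\,\mathcal{B}\,(\Gamma.A_2) \to \mathcal{C}\,\mathcal{B}\,\Gamma & \text{case on neutral} \\
\mathsf{runNf}^{\mathcal{C}} & : & \mathcal{C}\,(\mathsf{Nf}\,A) \mathrel{\dot\to} \mathsf{Nf}\,A & \text{run the monad } (\mathsf{Nf} \text{ only})
\end{array}
$$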

To make things concrete, we shall immediately construct an instance of such a cover monad: the free cover monad $\mathsf{Cov}$ defined as an inductive family with constructors $\mathsf{return}$, $\mathsf{case}$, and $\mathsf{abort}$. One can visualize an element $c : \mathsf{Cov}\,\mathcal{J}\,\Gamma$ as a binary case tree whose inner nodes ($\mathsf{case}$) are labeled by a neutral term of sum type $A_1 + A_2$ and its two branches by the context extensions $A_1$ and $A_2$, resp. Leaves are either labeled by a neutral term of empty type $0$ (see $\mathsf{abort}$), or by an element of $\mathcal{J}$ (see $\mathsf{return}$). Functoriality amounts to replacing the labels of the $\mathsf{return}$-leaves, and the monadic bind (aka Kleisli extension) replaces these leaves by further case trees. (The uninspiring $\mathsf{join}$ flattens a 2-level case tree, i.e., a case tree with case trees as leaves, into a single one.) Finally $\mathsf{runNf}$ is a simple recursion on the tree, replacing $\mathsf{case}$ and $\mathsf{abort}$ by the $\mathsf{case}$ and $\mathsf{abort}$ constructions on normal forms, and $\mathsf{return}$ by the identity.
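The case-tree reading of the free cover monad can be illustrated by a small untyped sketch. It deliberately ignores contexts and reindexing (in the presheaf setting the branches live in extended contexts and mapped functions must be weakened accordingly), so it shows only the tree shape. The class and function names (`Return`, `Case`, `Abort`, `map_cov`, `join_cov`, `bind_cov`, `run_nf`) and the tuple encoding of normal forms are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Any, Callable, Union

@dataclass
class Return:
    value: Any            # leaf carrying a payload

@dataclass
class Case:
    scrutinee: Any        # neutral term of sum type
    left: "Cov"           # branch for the first summand
    right: "Cov"          # branch for the second summand

@dataclass
class Abort:
    scrutinee: Any        # neutral term of empty type

Cov = Union[Return, Case, Abort]

def map_cov(f: Callable[[Any], Any], c: Cov) -> Cov:
    """Functoriality: replace the labels of the Return leaves."""
    if isinstance(c, Return):
        return Return(f(c.value))
    if isinstance(c, Case):
        return Case(c.scrutinee, map_cov(f, c.left), map_cov(f, c.right))
    return c  # Abort leaves carry no payload

def join_cov(c: Cov) -> Cov:
    """Flatten a case tree whose Return leaves are themselves case trees."""
    if isinstance(c, Return):
        return c.value
    if isinstance(c, Case):
        return Case(c.scrutinee, join_cov(c.left), join_cov(c.right))
    return c

def bind_cov(c: Cov, k: Callable[[Any], Cov]) -> Cov:
    """Kleisli extension: graft further case trees onto the Return leaves."""
    return join_cov(map_cov(k, c))

def run_nf(c: Cov) -> Any:
    """Turn a case tree of normal forms into a single normal form."""
    if isinstance(c, Return):
        return c.value
    if isinstance(c, Case):
        return ("case", c.scrutinee, run_nf(c.left), run_nf(c.right))
    return ("abort", c.scrutinee)
```

For instance, `run_nf(Case("x", Return(("inj", 1, ("unit",))), Abort("y")))` rebuilds the tree as a normal form with a `case` node at the root and an `abort` in the second branch.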

Using the services of a generic cover monad , we can complete our semantics:

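$$
\begin{array}{lclcl}
[\![o]\!] = \mathcal{C}\,(\mathsf{Ne}\,o) & \quad & \uparrow^{o} = \mathsf{return}^{\mathcal{C}} & \quad & \downarrow^{o} = \mathsf{runNf}^{\mathcal{C}} \circ \mathsf{map}^{\mathcal{C}}\,\mathsf{ne} \\
[\![0]\!] = \mathcal{C}\,\hat 0 & \quad & \uparrow^{0} = \mathsf{abort}^{\mathcal{C}} & \quad & \downarrow^{0} = \mathsf{runNf}^{\mathcal{C}} \circ \mathsf{map}^{\mathcal{C}}\,\mathsf{magic}
\end{array}
$$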

All semantic types fulfill the weak sheaf condition aka weak pasting, meaning there is a natural transformation $\mathsf{run}_A : \mathcal{C}\,[\![A]\!] \mathrel{\dot\to} [\![A]\!]$ for any simple type $A$. In other words, we can run the monad, pushing its effects into $[\![A]\!]$. We proceed by induction on $A$. Positive types are already monadic, and $\mathsf{run}$ is simply the $\mathsf{join}$ of the monad $\mathcal{C}$. At negative types we can recurse pointwise at a smaller type, exploiting that values of negative types are essentially (finite or infinite) tuples.

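$$
\begin{array}{lcl}
\mathsf{run}_A & : & \mathcal{C}\,[\![A]\!] \mathrel{\dot\to} [\![A]\!] \\
\mathsf{run}_1\,c & = & () \\
\mathsf{run}_0 & = & \mathsf{join}^{\mathcal{C}} \\
\mathsf{run}_{A \times B}\,c & = & (\mathsf{run}_A\,(\mathsf{map}^{\mathcal{C}}\,\pi_1\,c),\ \mathsf{run}_B\,(\mathsf{map}^{\mathcal{C}}\,\pi_2\,c)) \\
\mathsf{run}_{A + B} & = & \mathsf{join}^{\mathcal{C}} \\
\mathsf{run}_{A \Rightarrow B}\,c\;\tau\;a & = & \mathsf{run}_B\,(\widehat{\mathsf{map}}^{\mathcal{C}}\,(\lambda \tau'\,f.\ f\,\mathsf{id}\,(\mathsf{ren}\,\tau'\,a))\,(\mathsf{ren}\,\tau\,c)) \\
\mathsf{run}_o & = & \mathsf{join}^{\mathcal{C}}
\end{array}
$$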

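For the case of function types $A \Rightarrow B$, we require the monad $\mathcal{C}$ to be strong, which amounts to having $\widehat{\mathsf{map}}^{\mathcal{C}}$ already for a “local” presheaf morphism. The typings are $c : \mathcal{C}\,[\![A \Rightarrow B]\!]\,\Gamma$ and $\tau : \Gamma \subseteq \Delta$ and $a : [\![A]\!]\,\Delta$, and now we want to apply every function in the cover $c$ to argument $a$. Clearly, $\mathsf{map}^{\mathcal{C}}$ is not applicable since it would expect a global presheaf morphism $[\![A \Rightarrow B]\!] \mathrel{\dot\to} [\![B]\!]$, i.e., something that works in any context. However, applying a function $f$ from the cover to $a$ can only work in context $\Delta$ or any extension $\tau' : \Delta \subseteq \Delta'$, since we can transport $a$ to such a $\Delta'$ via $\mathsf{ren}\,\tau'\,a$ but not to a context unrelated to $\Delta$. We obtain our input to $\mathsf{run}_B$ of type $\mathcal{C}\,[\![B]\!]\,\Delta$ as an instance of $\widehat{\mathsf{map}}^{\mathcal{C}}$ applied to the local presheaf morphism $\lambda \tau'\,f.\ f\,\mathsf{id}\,(\mathsf{ren}\,\tau'\,a)$ and the transported cover $\mathsf{ren}\,\tau\,c$.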

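We extend the type interpretation pointwise to contexts, i.e., $[\![\varepsilon]\!] = \hat 1$ and $[\![\Gamma.A]\!] = [\![\Gamma]\!] \mathbin{\hat\times} [\![A]\!]$, and obtain a natural projection function $\mathsf{lookup}$ from the semantic environments. The evaluation function $(\!|t|\!)$ can now be defined by recursion on $t$. Herein, the environment $\gamma$ lives in $[\![\Gamma]\!]\,\Delta$, thus, $(\!|t|\!)\gamma : [\![A]\!]\,\Delta$ for $t : A \dashv \Gamma$.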

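$$
\begin{array}{lcl}
(\!|\mathsf{unit}|\!)\gamma & = & () \\
(\!|\mathsf{pair}\,t_1\,t_2|\!)\gamma & = & ((\!|t_1|\!)\gamma,\ (\!|t_2|\!)\gamma) \\
(\!|\mathsf{abs}\,t|\!)\gamma & = & \lambda(\!|t|\!)\gamma \\
(\!|\mathsf{inj}_i\,t|\!)\gamma & = & \iota_i\,(\!|t|\!)\gamma \\
(\!|\mathsf{var}\,x|\!)\gamma & = & \mathsf{lookup}\,x\,\gamma \\
(\!|\mathsf{prj}_i\,t|\!)\gamma & = & \pi_i\,(\!|t|\!)\gamma \\
(\!|\mathsf{app}\,t\,u|\!)\gamma & = & (\!|t|\!)\gamma\ \mathsf{id}\ (\!|u|\!)\gamma \\
(\!|\mathsf{case}\,u\,t_1\,t_2|\!)\gamma & = & (\!|\mathsf{case}|\!)\ ((\!|u|\!)\gamma)\ (\lambda(\!|t_1|\!)\gamma)\ (\lambda(\!|t_2|\!)\gamma) \\
(\!|\mathsf{abort}\,u|\!)\gamma & = & (\!|\mathsf{abort}|\!)\ ((\!|u|\!)\gamma)
\end{array}
$$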

For the interpretation of the binders $\mathsf{abs}$ and $\mathsf{case}$ we use the mutually defined $\lambda(\!|t|\!)$.

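$$
\begin{array}{l}
\lambda(\!|t : B \dashv \Gamma.A|\!) \;:\; [\![\Gamma]\!] \mathrel{\dot\to} [\![A \Rightarrow B]\!] \\
\lambda(\!|t|\!) \;=\; \lambda (\gamma : [\![\Gamma]\!]\,\Delta)\,(\tau : \Delta \subseteq \Phi)\,(a : [\![A]\!]\,\Phi).\ (\!|t|\!)(\mathsf{ren}\,\tau\,\gamma,\ a)
\end{array}
$$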

The coproduct eliminations $(\!|\mathsf{case}|\!)$ and $(\!|\mathsf{abort}|\!)$ targeting an arbitrary semantic type $[\![B]\!]$ are definable thanks to the weak sheaf property, i.e., the presence of pasting via $\mathsf{run}_B$ for any type $B$, and strong functoriality of $\mathcal{C}$.

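$$
\begin{array}{lcl}
(\!|\mathsf{abort}|\!)_B & : & [\![0]\!] \mathrel{\dot\to} [\![B]\!] \\
(\!|\mathsf{abort}|\!)_B & = & \mathsf{run}_B \circ \mathsf{map}^{\mathcal{C}}\,\mathsf{magic} \\[1ex]
(\!|\mathsf{case}|\!)_B & : & [\![A_1 + A_2]\!] \mathrel{\dot\to} [\![A_1 \Rightarrow B]\!] \mathrel{\dot\to} [\![A_2 \Rightarrow B]\!] \mathrel{\dot\to} [\![B]\!] \\
(\!|\mathsf{case}|\!)_B\ c\ f_1\ f_2 & = & \mathsf{run}_B\,(\widehat{\mathsf{map}}^{\mathcal{C}}\,(\lambda \tau.\ [f_1\,\tau,\ f_2\,\tau])\ c)
\end{array}
$$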

To complete the normalization function we define the identity environment $\mathsf{fresh}_\Gamma : [\![\Gamma]\!]\,\Gamma$, which maps each free index to its corresponding unknown in the semantics, by recursion on $\Gamma$:

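$$
\begin{array}{lcl}
\mathsf{fresh}_{\varepsilon} & = & () \\
\mathsf{fresh}_{\Gamma.A} & = & (\mathsf{ren}\,\mathsf{wk}_A\,\mathsf{fresh}_\Gamma,\ \mathsf{fresh}^{A}_{\Gamma})
\end{array}
$$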
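The overall pipeline (evaluate in the identity environment, then reify) can be sketched in a few lines for the negative fragment only (unit, products, functions) plus one base type `o`. This is not the presheaf construction of this section: instead of Kripke function spaces and reindexing, the sketch uses the "neutral term families" idea, where a neutral is a function from the current context length to a de Bruijn term. Sums and the cover monad are omitted. All names (`reflect`, `reify`, `evaluate`, `normalize`) and the tuple encoding of terms and types are assumptions.

```python
from typing import Any, Callable, List, Union

Ty = Union[str, tuple]        # "o", "1", ("fun", A, B), ("prod", A, B)
Tm = tuple                    # ("var", i), ("abs", t), ("app", t, u), ("pair", t, u),
                              # ("prj", i, t), ("unit",)   with de Bruijn indices
Family = Callable[[int], Tm]  # term family: context length -> term

def reflect(ty: Ty, ne: Family) -> Any:
    """Turn a neutral term family into a semantic value (unquoting)."""
    if ty == "o":
        return ne
    if ty == "1":
        return ()
    if ty[0] == "prod":
        return (reflect(ty[1], lambda n: ("prj", 1, ne(n))),
                reflect(ty[2], lambda n: ("prj", 2, ne(n))))
    if ty[0] == "fun":
        return lambda a: reflect(
            ty[2], lambda n: ("app", ne(n), reify(ty[1], a)(n)))
    raise ValueError(f"unknown type {ty!r}")

def reify(ty: Ty, v: Any) -> Family:
    """Turn a semantic value into a family of eta-long normal forms (quoting)."""
    if ty == "o":
        return v
    if ty == "1":
        return lambda n: ("unit",)
    if ty[0] == "prod":
        return lambda n: ("pair", reify(ty[1], v[0])(n), reify(ty[2], v[1])(n))
    if ty[0] == "fun":
        def body(n: int) -> Tm:
            # Fresh variable bound at level n; its index at length m is m - n - 1.
            x = reflect(ty[1], lambda m: ("var", m - n - 1))
            return ("abs", reify(ty[2], v(x))(n + 1))
        return body
    raise ValueError(f"unknown type {ty!r}")

def evaluate(t: Tm, env: List[Any]) -> Any:
    """Standard interpreter; env[0] is the most recently bound variable."""
    tag = t[0]
    if tag == "var":
        return env[t[1]]
    if tag == "abs":
        return lambda a: evaluate(t[1], [a] + env)
    if tag == "app":
        return evaluate(t[1], env)(evaluate(t[2], env))
    if tag == "pair":
        return (evaluate(t[1], env), evaluate(t[2], env))
    if tag == "prj":
        return evaluate(t[2], env)[t[1] - 1]
    if tag == "unit":
        return ()
    raise ValueError(f"unknown term {t!r}")

def normalize(ctx: List[Ty], ty: Ty, t: Tm) -> Tm:
    """Evaluate t in the identity environment for ctx (most recent type first), then reify."""
    n = len(ctx)
    env = [reflect(a, lambda m, k=n - 1 - i: ("var", m - 1 - k))
           for i, a in enumerate(ctx)]
    return reify(ty, evaluate(t, env))(n)
```

As a usage example, normalizing a free variable of type `o -> o` yields its eta-expansion: `normalize([("fun", "o", "o")], ("fun", "o", "o"), ("var", 0))` returns `("abs", ("app", ("var", 1), ("var", 0)))`.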

As already observed by Filinski [13, Section 5.4] [14, Section 3.2], normalization by evaluation can be carried out in the continuation monad. In our setting, we use a continuation monad on presheaves defined as

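$$
\mathcal{CC}\,\mathcal{J} \;=\; \forall A.\ (\mathcal{J} \mathbin{\hat\Rightarrow} \mathsf{Nf}\,A) \mathbin{\hat\Rightarrow} \mathsf{Nf}\,A.
$$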

The answer type of this continuation monad is always $\mathsf{Nf}$; however, we are polymorphic in the simple type $A$ of normal forms we produce.

Agda has been really helpful to produce the rather technical but straightforward evidence that $\mathcal{CC}$ is a strong monad. The method $\mathsf{runNf}$ exists by definition, using the identity continuation. In the following, we demonstrate that $\mathcal{CC}$ enables matching on neutrals:
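One way to spell out matching on neutrals in the continuation monad is sketched below. It is a reconstruction under assumptions rather than the definition from the formalization: $u$ is the neutral scrutinee in context $\Gamma$, $c_i : \mathcal{CC}\,\mathcal{J}\,(\Gamma.A_i)$ are the branches, $\tau : \Gamma \subseteq \Delta$ and $k : (\mathcal{J} \mathbin{\hat\Rightarrow} \mathsf{Nf}\,A)\,\Delta$ come from unfolding the Kripke function space, $\mathsf{lift}\,\tau : \Gamma.A_i \subseteq \Delta.A_i$ is a hypothetical name for lifting a renaming under a binder, and $\mathsf{case}_A$, $\mathsf{abort}_A$ are the admissible eliminations at an arbitrary type $A$ from Section 2.2.

```latex
\begin{array}{lcl}
\mathsf{abort}^{\mathcal{CC}}\,u\;\tau\;k
  & = & \mathsf{abort}_A\,(\mathsf{ren}\,\tau\,u) \\
\mathsf{case}^{\mathcal{CC}}\,u\;c_1\;c_2\;\tau\;k
  & = & \mathsf{case}_A\,(\mathsf{ren}\,\tau\,u)\,
        (c_1\,(\mathsf{lift}\,\tau)\,(\mathsf{ren}\,\mathsf{wk}_{A_1}\,k))\,
        (c_2\,(\mathsf{lift}\,\tau)\,(\mathsf{ren}\,\mathsf{wk}_{A_2}\,k))
\end{array}
```

The continuation $k$ is weakened into each branch because the branches of $\mathsf{case}_A$ live in the extended contexts $\Delta.A_1$ and $\Delta.A_2$.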

The NbE algorithm using $\mathcal{CC}$ is comparable to Danvy’s type-directed partial evaluation [12, Figure 8]. However, he uses shift-reset style continuations, which can be programmed in the continuation monad, and relies on Scheme’s gensym to produce fresh variable names rather than using Kripke function space / presheaves.

## 3 Normalization to Call-By-Push-Value

The placement of the monad in the type semantics of the previous section is a bit wasteful: each positive type is prefixed by $\mathcal{C}$. In our grammar of normal forms, this corresponds to the ability to perform case distinctions ($\mathsf{case}$, $\mathsf{abort}$) at any positive type $P$. In fact, our type interpretation corresponds to the translation of call-by-name (CBN) lambda-calculus into Moggi’s monadic meta-language [21, 17].

It would be sufficient to perform all necessary case distinctions when transitioning from a negative type to a positive type. Introduction of the function type adds hypotheses to the context, providing material for case distinctions, but introduction of positive types does not add anything in that respect. Thus, we could focus on positive introductions until we transition back to a negative type. Such focusing is present in the call-by-value (CBV) lambda-calculus, where positive introductions only operate on values, and variables stand only for values. This structure is even more clearly spelled out in Levy’s call-by-push-value (CBPV), as it comes with a deep classification of types into positive and negative ones. In the following, we shall utilize pure (i.e., effect-free) CBPV to achieve chaining of positive introductions.

### 3.1 Types and polarization

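CBPV calls positive types value types and negative types computation types, yet we shall stick to our terminology, which is common in publications on focalization. However, we shall use $\mathsf{Thunk}$ for the switch embedding negative types into positive ones and $\mathsf{Comp}$ for the switch embedding positive types into negative ones.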

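$$
\begin{array}{lcll}
\mathsf{Ty}^+ \ni P, Q & ::= & o^+ \mid 1 \mid P_1 \times P_2 \mid 0 \mid P_1 + P_2 \mid \mathsf{Thunk}\,N & \text{positive type} \\
\mathsf{Ty}^- \ni N, M & ::= & o^- \mid \top \mid N_1 \mathbin{\&} N_2 \mid P \Rightarrow N \mid \mathsf{Comp}\,P & \text{negative type}
\end{array}
$$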

CBPV uses $U$ for $\mathsf{Thunk}$ and $F$ for $\mathsf{Comp}$; however, we find these names uninspiring unless you have good knowledge of the intended model. Further, CBPV employs labeled sums and labeled records for up to countably infinite label sets while we only have finite sums ($0$, $P_1 + P_2$) and records ($\top$, $N_1 \mathbin{\&} N_2$). However, this difference is not essential: our treatment extends directly to the infinite case, since we are working in type theory, which allows infinitely branching inductive types. As a last difference, CBPV does not consider base types; in anticipation of the next section, we add them as both positive atoms ($o^+$) and negative atoms ($o^-$).

Getting a bit ahead of ourselves, let us consider the mutually defined interpretations $[\![P]\!]$ and $[\![N]\!]$ of positive and negative types as presheaves.

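$$
\begin{array}{lcl}
[\![1]\!] & = & \hat 1 \\
[\![P_1 \times P_2]\!] & = & [\![P_1]\!] \mathbin{\hat\times} [\![P_2]\!] \\
[\![0]\!] & = & \hat 0 \\
[\![P_1 + P_2]\!] & = & [\![P_1]\!] \mathbin{\hat +} [\![P_2]\!] \\
[\![\mathsf{Thunk}\,N]\!] & = & [\![N]\!] \\
[\![o^+]\!] & = & (o^+ \in \_) \\[1ex]
[\![\top]\!] & = & \hat 1 \\
[\![N_1 \mathbin{\&} N_2]\!] & = & [\![N_1]\!] \mathbin{\hat\times} [\![N_2]\!] \\
[\![P \Rightarrow N]\!] & = & [\![P]\!] \mathbin{\hat\Rightarrow} [\![N]\!] \\
[\![\mathsf{Comp}\,P]\!] & = & \mathcal{C}\,[\![P]\!] \\
[\![o^-]\!] & = & \mathcal{C}\,(\mathsf{Ne}\,o^-)
\end{array}
$$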

Semantically, we do not distinguish between positive and negative products. Notably, sum types can now be interpreted as plain (pointwise) presheaf sums. The marker $\mathsf{Thunk}$ is ignored, yet $\mathsf{Comp}$, marking the switch from the negative to the positive type interpretation, places the cover monad. Positive atoms, standing for value types without constructors, are only inhabited by variables $x : o^+ \in \Gamma$. Negative atoms stand for computation types without eliminations of their own; thus, their inhabitants stem only from eliminations of more complex types, made from positive eliminations captured in $\mathcal{C}$ and negative eliminations chained together as neutral $\mathsf{Ne}$, which we shall define below. The method $\mathsf{run}$ of the cover monad can be extended to $\mathsf{run}_N : \mathcal{C}\,[\![N]\!] \mathrel{\dot\to} [\![N]\!]$ for negative types $N$, by recursion on $N$. Informally speaking, this makes all negative types monadic.

Contexts are lists of positive types since in CBPV variables stand for values. Interpretation of contexts is again defined pointwise: $[\![\varepsilon]\!] = \hat 1$ and $[\![\Gamma.P]\!] = [\![\Gamma]\!] \mathbin{\hat\times} [\![P]\!]$.

### 3.2 Terms and evaluation

Assuming a family $\mathsf{Tm}\,N\,\Gamma$ of terms of negative type $N$ in context $\Gamma$, values $\mathsf{Val}\,P\,\Gamma$ of positive type $P$ shall be constructed by the following rules:

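$$
\mathsf{unit}^+\;\frac{}{\mathsf{Val}\,1\,\Gamma}
\qquad
\mathsf{pair}^+\;\frac{\mathsf{Val}\,P_1\,\Gamma \quad \mathsf{Val}\,P_2\,\Gamma}{\mathsf{Val}\,(P_1 \times P_2)\,\Gamma}
\qquad
\mathsf{inj}_i\;\frac{\mathsf{Val}\,P_i\,\Gamma}{\mathsf{Val}\,(P_1 + P_2)\,\Gamma}
$$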

The terms of pure CBPV are given by the inductive family $\mathsf{Tm}\,N\,\Gamma$. It repeats the introductions and eliminations of negative types, except that application is restricted to values. Values of type $\mathsf{Thunk}\,N$ are embedded via $\mathsf{force}$. Further, values of type $P$ can be embedded via $\mathsf{ret}$, producing a term of type $\mathsf{Comp}\,P$. Such terms are eliminated by $\mathsf{bind}$ which is, unlike the usual monadic bind, not only available for $\mathsf{Comp}$-types but for arbitrary negative types $N$. This is justified by the monadic character of negative types, by virtue of $\mathsf{run}_N$. Finally, there are eliminators ($\mathsf{split}$, $\mathsf{case}$, $\mathsf{abort}$) for values of positive product and sum types.

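$$
\mathsf{ret}\;\frac{\mathsf{Val}\,P\,\Gamma}{\mathsf{Tm}\,(\mathsf{Comp}\,P)\,\Gamma}
\qquad
\mathsf{abs}\;\frac{\mathsf{Tm}\,N\,(\Gamma.P)}{\mathsf{Tm}\,(P \Rightarrow N)\,\Gamma}
\qquad
\mathsf{pair}^-\;\frac{\mathsf{Tm}\,N_1\,\Gamma \quad \mathsf{Tm}\,N_2\,\Gamma}{\mathsf{Tm}\,(N_1 \mathbin{\&} N_2)\,\Gamma}
\qquad
\mathsf{unit}^-\;\frac{}{\mathsf{Tm}\,\top\,\Gamma}
$$

$$
\mathsf{force}\;\frac{\mathsf{Val}\,(\mathsf{Thunk}\,N)\,\Gamma}{\mathsf{Tm}\,N\,\Gamma}
\qquad
\mathsf{app}\;\frac{\mathsf{Tm}\,(P \Rightarrow N)\,\Gamma \quad \mathsf{Val}\,P\,\Gamma}{\mathsf{Tm}\,N\,\Gamma}
\qquad
\mathsf{prj}_i\;\frac{\mathsf{Tm}\,(N_1 \mathbin{\&} N_2)\,\Gamma}{\mathsf{Tm}\,N_i\,\Gamma}
$$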

Interpretation of values and terms is straightforward, thanks to the pioneering work of Moggi and Levy put into the design of CBPV.

Since $\mathsf{Thunk}$ serves only as an embedding of negative into positive types and has no semantic effect, we interpret thunking and forcing by the identity. The eliminations for positive types now deal only with values and thus need not reference the monad operations.

The use of the monad is confined to $\mathsf{ret}$ and $\mathsf{bind}$. Note the availability of $\mathsf{run}_N$ at any negative type $N$ for the interpretation of $\mathsf{bind}$.
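To make the shape of this interpretation tangible, here is a minimal evaluator sketch for closed terms of pure CBPV. It makes a simplifying assumption that is not part of the construction above: since closed evaluation never needs to match on a neutral, the cover monad is replaced by the identity monad, so `ret` and `bind` become trivial and `run` disappears. Thunking and forcing are interpreted by the identity, as in the text. The tuple encoding of values and terms and the function names are hypothetical.

```python
from typing import Any, List

# Values: ("var", i), ("unit",), ("pair", v1, v2), ("inj", i, v), ("thunk", t)
# Terms:  ("ret", v), ("bind", t, u), ("abs", t), ("app", t, v), ("force", v),
#         ("unitN",), ("pairN", t1, t2), ("prj", i, t),
#         ("split", v, t), ("case", v, t1, t2)
# env[0] is the most recently bound variable.

def eval_val(v: tuple, env: List[Any]) -> Any:
    tag = v[0]
    if tag == "var":
        return env[v[1]]
    if tag == "unit":
        return ()
    if tag == "pair":
        return (eval_val(v[1], env), eval_val(v[2], env))
    if tag == "inj":
        return (v[1], eval_val(v[2], env))        # tagged injection
    if tag == "thunk":
        return eval_tm(v[1], env)                 # Thunk is interpreted by the identity
    raise ValueError(f"unknown value {v!r}")

def eval_tm(t: tuple, env: List[Any]) -> Any:
    tag = t[0]
    if tag == "ret":
        return eval_val(t[1], env)                # return of the identity monad
    if tag == "bind":
        return eval_tm(t[2], [eval_tm(t[1], env)] + env)
    if tag == "abs":
        return lambda a: eval_tm(t[1], [a] + env)
    if tag == "app":
        return eval_tm(t[1], env)(eval_val(t[2], env))
    if tag == "force":
        return eval_val(t[1], env)                # forcing is the identity as well
    if tag == "unitN":
        return ()
    if tag == "pairN":
        return (eval_tm(t[1], env), eval_tm(t[2], env))
    if tag == "prj":
        return eval_tm(t[2], env)[t[1] - 1]
    if tag == "split":
        a, b = eval_val(t[1], env)
        return eval_tm(t[2], [b, a] + env)        # second component is bound last
    if tag == "case":
        i, a = eval_val(t[1], env)
        return eval_tm(t[1 + i], [a] + env)       # t[2] for inj 1, t[3] for inj 2
    raise ValueError(f"unknown term {t!r}")
```

With a genuine cover monad in place of the identity, only the `ret` and `bind` clauses would change, matching the remark that the use of the monad is confined to these two constructs.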

### 3.3 Normal forms and normalization

Positive normal forms are values referring only to atomic variables and whose thunks only contain negative normal forms.

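$$
\mathsf{unit}^+\;\frac{}{\mathsf{Vnf}\,1\,\Gamma}
\qquad
\mathsf{pair}^+\;\frac{\mathsf{Vnf}\,P_1\,\Gamma \quad \mathsf{Vnf}\,P_2\,\Gamma}{\mathsf{Vnf}\,(P_1 \times P_2)\,\Gamma}
\qquad
\mathsf{inj}_i\;\frac{\mathsf{Vnf}\,P_i\,\Gamma}{\mathsf{Vnf}\,(P_1 + P_2)\,\Gamma}
$$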

Neutral normal forms are negative eliminations starting from a forced rather than from variables of negative types (as those do not exist in CBPV). However, due to normality the cannot be a , but only a variable .

Variables are originally introduced by either $\mathsf{abs}$ or the binding (via $\mathsf{bind}$) of a neutral of type $\mathsf{Comp}\,P$ to a new variable of type $P$. Variables of composite value type can be broken down by pattern matching, introducing variables of smaller type. These positive eliminations plus $\mathsf{bind}$ are organized in the inductively defined strong monad $\mathsf{Cov}$.

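$$
\mathsf{return}\;\frac{\mathcal{J}\,\Gamma}{\mathsf{Cov}\,\mathcal{J}\,\Gamma}
\qquad
\mathsf{bind}\;\frac{\mathsf{Ne}\,(\mathsf{Comp}\,P)\,\Gamma \quad \mathsf{Cov}\,\mathcal{J}\,(\Gamma.P)}{\mathsf{Cov}\,\mathcal{J}\,\Gamma}
$$

$$
\mathsf{split}\;\frac{P_1 \times P_2 \in \Gamma \quad \mathsf{Cov}\,\mathcal{J}\,(\Gamma.P_1.P_2)}{\mathsf{Cov}\,\mathcal{J}\,\Gamma}
\qquad
\mathsf{case}\;\frac{P_1 + P_2 \in \Gamma \quad \mathsf{Cov}\,\mathcal{J}\,(\Gamma.P_1) \quad \mathsf{Cov}\,\mathcal{J}\,(\Gamma.P_2)}{\mathsf{Cov}\,\mathcal{J}\,\Gamma}
\qquad
\mathsf{abort}\;\frac{0 \in \Gamma}{\mathsf{Cov}\,\mathcal{J}\,\Gamma}
$$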

Finally, normal forms of negative types are defined as inductive family $\mathsf{Nf}\,N\,\Gamma$. They are generated by maximal negative introduction ($\mathsf{abs}$, $\mathsf{pair}^-$, $\mathsf{unit}^-$) until a negative atom or $\mathsf{Comp}\,P$ is reached. Then, elimination of neutrals and variables is possible through the $\mathsf{Cov}$ monad until an answer can be given in form of a base neutral ($\mathsf{Ne}\,o^-$) or a normal value.

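$$
\mathsf{ne}\;\frac{\mathsf{Cov}\,(\mathsf{Ne}\,o^-)\,\Gamma}{\mathsf{Nf}\,o^-\,\Gamma}
\qquad
\mathsf{ret}\;\frac{\mathsf{Cov}\,(\mathsf{Vnf}\,P)\,\Gamma}{\mathsf{Nf}\,(\mathsf{Comp}\,P)\,\Gamma}
$$

$$
\mathsf{unit}^-\;\frac{}{\mathsf{Nf}\,\top\,\Gamma}
\qquad
\mathsf{pair}^-\;\frac{\mathsf{Nf}\,N_1\,\Gamma \quad \mathsf{Nf}\,N_2\,\Gamma}{\mathsf{Nf}\,(N_1 \mathbin{\&} N_2)\,\Gamma}
\qquad
\mathsf{abs}\;\frac{\mathsf{Nf}\,N\,(\Gamma.P)}{\mathsf{Nf}\,(P \Rightarrow N)\,\Gamma}
$$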

We again can run the cover monad on normal forms, i. e., have , which extends to negative semantic values