Proof nets and the instantiation overflow property

03/25/2018 ∙ by Paolo Pistone, et al. ∙ Università Roma Tre

Instantiation overflow is the property of those second order types for which all instances of full comprehension can be deduced from instances of atomic comprehension. In other words, a type has instantiation overflow when one can type, by atomic polymorphism, "expansion terms" which realize instances of the full extraction rule applied to that type. This property was investigated in the case of the types arising from the well-known Russell-Prawitz translation of logical connectives into System F, but is not restricted to such types. Moreover, it can be related to functorial polymorphism, a well-known categorial approach to parametricity in System F. In this paper we investigate the instantiation overflow property by exploiting the representation of derivations by means of linear logic proof nets. We develop a geometric approach to instantiation overflow yielding a deeper understanding of the structure of expansion terms and Russell-Prawitz types. Our main result is a characterization of the class of types of the form ∀ XA, where A is a simple type, which enjoy the instantiation overflow property, by means of a generalization of Russell-Prawitz types.


1 Introduction

In his 1903 Principles of Mathematics, Bertrand Russell showed that the connectives can be expressed in the (⇒, ∀)-fragment of second order logic. Russell’s translation was later extended by Prawitz ([Pra65]) to derivations, providing an embedding of full second order logic into its (⇒, ∀)-fragment. The Russell-Prawitz translation (RP translation for short) can be described as a method which associates, with any connective defined by natural deduction introduction rules¹, a formula² belonging to this fragment.

¹ The picture can be extended to the case in which the rule I discharges a set of hypotheses, see [TPP17].

² Where indicates the type , for the list .

When restricting to intuitionistic logic, the (⇒, ∀)-fragment of second order logic corresponds to the polymorphic λ-calculus, or System F ([Gir72, Rey74]). The most characteristic rule of this system is the ∀-elimination rule

$$\frac{\Gamma \vdash t : \forall X A}{\Gamma \vdash t\,B : A[B/X]} \qquad (1)$$

also called extraction rule, which allows one to give type A[B/X], for any type B, to a term having type ∀XA. This rule expresses an impredicative comprehension principle and is responsible for the failure, in second order logic, of the subformula principle.

A salient feature of the types produced by the RP translation (let us call them RP types) is the so-called instantiation overflow property, first described in [Fer06]. A type of the form ∀XA has this property when any instance of the full extraction rule (1) can be deduced in System F_at ([FF13]), that is, the subsystem of F in which rule (1) is replaced by the atomic extraction rule below

$$\frac{\Gamma \vdash t : \forall X A}{\Gamma \vdash t\,Y : A[Y/X]} \qquad (2)$$

More precisely, the type ∀XA has instantiation overflow when, for any second order type B, there exists an “expansion term” which can be given type ∀XA ⇒ A[B/X] in F_at. In other words, this property amounts to the possibility, for a given type, to deduce full comprehension from atomic, hence predicative, comprehension.
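As an illustration, here is a worked sketch of an expansion term for one familiar case. The type chosen (the standard second order encoding of conjunction), the instance $Y_1 \Rightarrow Y_2$ and all variable names are assumptions made for this example and are not taken from the text above. The only instantiation used is at the type variable $Y_2$, so the atomic extraction rule suffices.

```latex
% Assumption: t has the second order encoding of conjunction as its type.
t : \forall X\,((A \Rightarrow B \Rightarrow X) \Rightarrow X)

% Atomic extraction at the variable Y_2:
t\,Y_2 : (A \Rightarrow B \Rightarrow Y_2) \Rightarrow Y_2

% Expansion term realizing the full instance X := Y_1 \Rightarrow Y_2:
\lambda f^{A \Rightarrow B \Rightarrow Y_1 \Rightarrow Y_2}.\
\lambda y^{Y_1}.\
t\,Y_2\,(\lambda a^{A}.\,\lambda b^{B}.\ f\,a\,b\,y)
\;:\; (A \Rightarrow B \Rightarrow (Y_1 \Rightarrow Y_2)) \Rightarrow (Y_1 \Rightarrow Y_2)
```

The argument $y$ is pushed inside the continuation passed to $t\,Y_2$, which is what lets the compound instance be reached without ever instantiating $X$ at a non-atomic type.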

The instantiation overflow property of RP types was exploited in [Fer06] and [FF13] to define a variant of the RP translation based on atomic polymorphism. However, instantiation overflow is not restricted to RP types: in [FD16] it is shown that it holds for all types , where and .

In [TPP16] instantiation overflow was related to functorial polymorphism ([BFSS90]), by exploiting a well-known connection between the RP translation and dinaturality. We recall that functorial polymorphism is the semantics of System F in which types are interpreted as functors (in a generalized, “multivariant”, sense, see [EK66]) over a cartesian closed category, and well-typed terms as dinatural transformations between such functors. This semantics was proposed as a formalization of parametric polymorphism, one of the most investigated aspects of System F (see [JFM96] for a historical survey on parametricity). In particular, all parametric models of System F are dinatural models, as parametricity implies dinaturality ([PA93]).

The fact that the type preserves all properties of the original connective corresponds to the dinaturality condition for the type . In categorial terms, this means that the RP translation preserves universal properties of connectives only in parametric models of System F ([PA93, Has09]). In proof-theoretic terms, this means that the RP translation, while preserving β-equivalence in all models, preserves η-equivalence and permuting conversions only up to the equational theory generated by dinaturality ([TPP17]).

When is an RP type, the expansion terms realizing instantiation overflow can be described in a “functorial” way, by considering the fact that must be of the form , where the only contain positive occurrences of . Such correspond then to covariant endofunctors over the category generated by derivations: given a derivation of hypothesis and conclusion , one can construct a derivation , of hypothesis and conclusion . Then, for any , one can construct a derivation of as illustrated in figure 1, by exploiting the functoriality of the over the derivation of hypothesis and conclusion (where is of the form ), made only of elimination rules. When is the RP translation of disjunction, conjunction or absurdity, such derivations correspond exactly to those described in [FF13]. Moreover, in the equational theory generated by dinaturality, the expansion terms just described are equivalent to the derivations consisting only of one instance of the full extraction rule ([TPP16]). This means in particular that expansion terms and instances of full extraction have the same denotations in all parametric models of System F. In other words, atomic polymorphism and full polymorphism for RP types are indistinguishable modulo dinaturality/parametricity.

Figure 1: Instantiation Overflow for Russell-Prawitz types

In this paper we investigate the instantiation overflow property by exploiting, in addition to the functorial intuition, the representation of derivations by means of linear logic proof nets. Proof nets can be considered as a unified framework for structural and categorial proof theory, as they provide a well-known bridge between the sequent calculus of linear logic and the language of symmetric monoidal closed as well as *-autonomous categories ([See89, Blu93, BCST96]), refining a paradigm originating in Lambek’s investigations on categories as deductive systems [Lam69]. Proof nets for Intuitionistic Multiplicative Linear Logic (without units), , essentially correspond to Eilenberg-Kelly-MacLane graphs (see [EK66, Blu93, Hug12]), a graphical formalism playing a central role in several coherence theorems (see [KM71, KL80]). Moreover, types (that we call linear types) can be described as multivariant functors over the category generated by proof nets/allowable graphs.

We develop a geometric approach to instantiation overflow yielding a deeper understanding of the structure of expansion terms and Russell-Prawitz types. Our main result is a characterization of instantiation overflow for the types of the form ∀XA, where A is a simple type (theorem 7): we define a class of types which generalize the RP translation and we show that, when A is a simple type, ∀XA has instantiation overflow if and only if it is either derivable or logically equivalent to a product of types belonging to this class.

We use proof nets to investigate the expansion property for the types of the linear simply typed λ-calculus . A linear type is expansible when, for all , there exists a variable and a proof net of hypothesis and conclusion , for some variable . When considering RP types³ in (called linear RP types), the expansion terms, such as the one in figure 1, correspond to proof nets called “simple expansion graphs”. Figure 2 shows the simple expansion graph for the type (associated to the RP translation of the multiplicative conjunction ). Similarly to instantiation overflow, the expansion property is not limited to linear RP types: for instance the types and are expansible but are not linear Russell-Prawitz types.

³ When , is a type, we say that is in (in short, ).

Simple expansion graphs can be defined for any type having an equal number of positive and negative occurrences of a variable . However, such graphs need not be proof nets, that is, they need not satisfy the correction criterion. We show that the correctness of such graphs depends on the possibility of pairing the occurrences of following a particular pattern (called an internal pairing). This property leads us to introduce, for any variable , the class of generalized Russell-Prawitz types in ( types), which capture the geometrical properties of RP types. We prove that a linear type is expansible if and only if it is logically equivalent to a type. For instance, the type above and (as soon as intuitionistic implication is replaced by linear implication) all types introduced in [FD16] are ; the type above is not , but logically equivalent to the type .

The result just stated is actually a bit stronger, as it exploits a strict notion of logical equivalence, called collapse, related to Craig interpolation: a type collapses into a type when is an interpolant of a derivation of . For instance, the type above collapses into the type . Proof net interpolation algorithms are known from the literature ([BdG96, Car97]). As our results involve the implicational fragment of some intuitionistic systems, we had to consider the well-known fact that such fragments satisfy interpolation in a weaker form (see [Kan06]). To implement weak proof net interpolation in , we adapted the algorithm in [BdG96].

The characterization of expansible linear types is extended to the simply typed λ-calculus , by exploiting a folklore linearization argument relating simply typed λ-terms and proof nets. The characterization of expansible simple types is slightly different as one must consider that, if a type is derivable (that is, if there exists a closed term of that type), then, by weakening, it is also expansible, and that weak interpolation for is considerably more complex than in the case of . We prove that a simple type is expansible iff it is either derivable or logically equivalent to the product of a finite family of types.

We finally adapt these results to : we show that a suitable extension of the expansion property yields a similar characterization of instantiation overflow for the types of the form , where is a simple type: as mentioned above, such types are either derivable or logically equivalent (in ) to the product of a finite family of types (i.e. types of the form , where is ).

There are many natural questions which are left open by the present investigations. In particular, we do not know whether the instantiation overflow property is decidable (as our characterization depends on the notions of derivability and logical equivalence, which are both undecidable in the case of ), nor how the ideas and techniques here presented can be extended to the case of an arbitrary second order type of the form . Finally, the relation between types and the translation should be investigated in more detail. We briefly discuss some of these questions at the end of the paper.

Figure 2: Simple expansion graph for the translation of

Related work

The instantiation overflow phenomenon was first introduced in [Fer06] and later investigated in [FF13] as a property of the Russell-Prawitz translation of disjunction. In particular, it is shown there that the usual Russell-Prawitz translation of logical connectives into System F can be transformed into a translation into System F_at by exploiting instantiation overflow. Similar results were independently proved in [San08].

The first investigation of the general class of formulas enjoying instantiation overflow is in [FD16], where the “Prawitz formulas of level ” are introduced. The results are the following: (1) Prawitz formulas of level 2 have instantiation overflow, (2) there exist Prawitz formulas of arbitrary level (the formulas mentioned above) having instantiation overflow, (3) the formula does not have instantiation overflow. Such results can be deduced from our characterization, since (1) Prawitz formulas of level 2 correspond to Russell-Prawitz types, (2) the formulas correspond to generalized Russell-Prawitz types and (3) is not logically equivalent to any product of generalized Russell-Prawitz types.

As already mentioned, the functorial formulation of instantiation overflow as well as the result that the instantiation overflow derivations are equivalent to instances of full extraction modulo dinaturality first appeared in [TPP16] and will appear in a sequel paper to the journal version [TPP17]. These papers present the functorial interpretation of Russell-Prawitz types and their relation with dinaturality within a natural deduction frame.

Categories of allowable graphs are well-known in the literature since [KM71] and are used to establish coherence results (see [KL80]). Several proof net formalisms for have been used to establish coherence for symmetric monoidal categories, *-autonomous categories and weakly distributive categories ([BCST96, LS04]). The main technical delicacy in such approaches involves the treatment of the multiplicative units and . For this reason we limited ourselves to the system and its -fragment . [Hug12] shows that this approach can be extended to treat , yielding a representation of the free *-autonomous category. Following [HHS17], allowable graphs for should yield a representation of the free symmetric semi-monoidal closed category. Our category of allowable graphs essentially follows [Hug12, HHS17]. A major difference is that we define shapes as rooted DAGs, incorporating the correctness criterion for Lamarche's essential nets ([Lam08, MO03]).

Interpolation for linear logic and proof nets was investigated in [Roo91], [BdG96] and [Car97]. Weak interpolation for the implicational fragment of intuitionistic logic was investigated in [Wro84, Pen97, Kan06]. Our proof of weak interpolation for (in appendix A) is essentially a variant of the one in [BdG96].

Structure of the paper

The paper can be divided into two parts. The first part, from section 2 to section 4, is preliminary to the treatment of instantiation overflow: we first introduce type systems, proof nets and their categorial and functorial interpretations, and then we discuss proof net interpolation and some useful applications. The second part, from section 5 to section 7, is devoted to Russell-Prawitz types and the characterization of the expansion property and instantiation overflow.

In more detail, in section 2 we recall the four type systems () used in the paper and we describe the syntactic categories they generate as well as their functorial interpretations. Moreover, we introduce a graphical representation of linear terms through a category of allowable graphs (similarly to [Hug12]), corresponding to essential nets ([Lam08, MO03]). In section 3 we recall previous results on interpolation in and we prove a weak interpolation result for the fragment . Then we exploit this result to prove the positivity lemma 3.3, a fundamental result which allows one to extract, through interpolation, a type containing only positive occurrences of a variable from any type for which a “functorial” action on arrows is defined. In section 4 we extend these results to , by exploiting a linearization theorem.

In section 5 we describe instantiation overflow and Russell-Prawitz types and their relationship with functorial polymorphism. We also introduce generalized Russell-Prawitz types and the expansion property, which are investigated in the last two sections. In section 6 we investigate the expansion property for linear types. We prove our first “density theorem”: a linear type is expansible iff it collapses into a linear generalized Russell-Prawitz type. This section contains our geometrical investigation of instantiation overflow through simple expansion graphs. In section 7 we prove a similar “density theorem” for simple types and we apply it to characterize simple types enjoying instantiation overflow.

Finally, in section 8 we discuss some open problems and further directions.

2 λ-terms, proof nets and categories

We recall the type systems which will be used in the paper and we introduce proof nets for Intuitionistic Multiplicative Linear Logic without units , by defining a category of allowable graphs similarly to [Hug12]. Then, we recall the syntactic categories generated by simply typed λ-terms and System F typable terms and their functorial interpretation, which will be exploited in section 5 to describe the instantiation overflow property for Russell-Prawitz types.

2.1 Type systems

We introduce the four type systems which will be used throughout the text:

  • the simply typed λ-calculus;

  • the linear simply typed λ-calculus;

  • the polymorphic λ-calculus or System F ([Gir72, Rey74]);

  • the atomically polymorphic λ-calculus or System F_at ([FF13]).

Given a basic set of types , built over a set of variables , we will consider two notions of -terms:

  1. -terms, defined by the grammar below

    where ; is linear in if occurs exactly once free in ;

  2. -terms, defined by the grammar below

    where and .

Observe that the definitions above depend on the choice of . This dependence will be often omitted, if it can be deduced from the context.
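The linearity condition mentioned above (a variable occurring exactly once free in a term) can be made concrete with a small sketch. The term representation below (`Var`, `Lam`, `App`) and the function names are hypothetical, introduced only for illustration; the grammar of the paper itself is not reproduced here, and types are ignored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class Lam:
    param: str
    body: object


@dataclass(frozen=True)
class App:
    fun: object
    arg: object


def free_occurrences(term, x):
    """Count the free occurrences of the variable named x in term."""
    if isinstance(term, Var):
        return 1 if term.name == x else 0
    if isinstance(term, Lam):
        # An inner binder with the same name shadows x.
        return 0 if term.param == x else free_occurrences(term.body, x)
    return free_occurrences(term.fun, x) + free_occurrences(term.arg, x)


def is_linear_in(term, x):
    """term is linear in x when x occurs exactly once free in term."""
    return free_occurrences(term, x) == 1


def all_binders_linear(term):
    """Check that every abstraction binds a variable used exactly once."""
    if isinstance(term, Var):
        return True
    if isinstance(term, Lam):
        return is_linear_in(term.body, term.param) and all_binders_linear(term.body)
    return all_binders_linear(term.fun) and all_binders_linear(term.arg)
```

For instance, the identity `Lam("x", Var("x"))` passes the check, while a term that duplicates or discards its bound variable does not.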

-terms and -terms are considered up to renaming of bound variables, as usual. Given a -term (resp. -term) , we let indicate the set of its free term (resp. term and type) variables, and indicate the set of its bound term (resp. term and type) variables.

For -terms and -terms we let indicate usual βη-equivalences, generated by the schemas in figure 3. By a normal -term (resp. -term) we indicate a term to which no β-reduction can be applied. Following [Bar85], by a -theory (resp. a -theory) we indicate any set of equations over (resp. ) terms such that , where is obtained by adding the β and η equivalences as well as the usual axioms and rules of the λ-calculus.

Figure 3: β and η equivalences

For any normal -term , we define the set of its subterms as follows: if has no application, then ; otherwise, , then . We call a subterm proper if .

We introduce now the type systems. By a context in we indicate a list of type declarations , where the are types of and the are pairwise distinct term variables. We will indicate contexts as . Concatenation of contexts is indicated by comma .

All systems below include the exchange rule

where is a context in , is a type in and indicates a context obtained from by permuting the order of its elements.

By a partition of a context , we indicate a list of contexts such that .

()

the set of linear types is generated by the grammar ; the typing rules for linear λ-terms are and those shown in figure 4(a);

()

the set of simple types is generated by the grammar ; the typing rules for λ-terms are and those in figure 4(b);

()

the set of second order types is generated by the grammar ; the typing rules for -terms are those of plus those shown in figure 4(c);

()

same types as ; the typing rules for -terms are those of plus those shown in figure 4(d);

Figure 4: Type systems rules, panels (a) to (d), one per system

Observe that the usual rules of contraction and weakening are derivable in . For any type in any of the systems above, we let (resp. ) indicate the set of its free (resp. bound) variables. There exist obvious inverse translations from and , given by , , and . If is derivable in , then is derivable in , where , for .

A type will be generally written , where is shorthand for a finite, possibly empty, sequence of quantifications .

Given any of the systems above, we say that a type is derivable if there exists a closed term having type . We say that two types are logically equivalent if there exist closed terms having type , respectively, in the case of , and , respectively, in all other cases. If, moreover, and , then and are called isomorphic. Finally, given types , we say that is logically equivalent to the product of when there exist closed terms having types , respectively (in the case of ) and types , respectively, in all other cases.

In any of the systems above, given a normal -term such that , we introduce the following terminology:

  • any variable occurring free or bound in is assigned a unique type that we indicate by ;

  • any is assigned a unique type, that we indicate by ;

  • is said to be in η-long normal form when for any , if or , then , for some variable and term .

Observe that, if is in η-long normal form, then for any type occurring positively (resp. negatively) in there exists (resp. ) such that (resp. ).
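To make the notion concrete, here is a small worked example of η-long normal forms. The atomic types $X, Y, Z$ and the variable names are assumptions chosen for illustration; the example follows the standard notion of η-long form and is not taken from the text.

```latex
% A variable of functional type is normal but not \eta-long:
f : (X \Rightarrow Y) \Rightarrow Z

% First \eta-expansion, exposing the argument of f:
\lambda g^{X \Rightarrow Y}.\ f\,g

% The argument g still has functional type, so it is expanded in turn:
\lambda g^{X \Rightarrow Y}.\ f\,(\lambda x^{X}.\ g\,x) \;:\; (X \Rightarrow Y) \Rightarrow Z
```

In the last term every subterm of functional type is either an abstraction or applied to an argument, which is the property the definition above asks for.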

2.2 Proof nets and the category of allowable graphs

We introduce proof nets for Intuitionistic Multiplicative Linear Logic without units, , that is, the system obtained by adding the ⊗ connective to . Typed λ-calculi for can be found in the literature (see [Abr93, BBdPH93]).

We recall that proof nets for and its subsystems can be considered as a graphical representation of λ-terms or as a graphical formalism for arrows in free monoidal closed categories. Our definition merges the two viewpoints: on the one hand, it corresponds to the usual definition of essential nets for ([Lam08, MO03]), characterizing linearly typable λ-terms; on the other hand, we introduce proof-structures by means of a category of graphs following [Hug12]; in particular, the correction criterion of essential nets generates the sub-category of allowable graphs.

We first define a category of graphs, which are defined as certain morphisms between signed sets, i.e. sets whose elements are assigned a polarity . Then we introduce shapes as certain rooted DAGs whose leaves form a signed set and we define a category of allowable graphs, corresponding to graphs in satisfying the correction criterion. [Hug12] shows that this category can be extended to treat , yielding a representation of the free *-autonomous category. Following [HHS17], might be seen as a representation of the free symmetric semi-monoidal closed category.

The objects of are signed sets, i.e. finite sets whose elements are assigned a polarity (i.e. edges , where ). Given , we let be the opposite polarity; given a signed set , we let be the signed set whose underlying set is the same as and whose polarities are reversed. Given two signed sets , we let denote their disjoint union.

Arrows in , called graphs, are bijections (where indicates disjoint union). Equivalently, a graph is a set of disjoint edges, i.e. disjoint pairs of elements of which can be of three types:

Type I:

, where , , for ;

Type II:

, where ;

Type III:

, where .

A graph can be illustrated as a directed acyclic graph (as in figure 5(a)) by orienting edges from positive to negative. We will call a graph pure when it only consists of type I edges.

Composition of graphs is finite directed path composition (see [Hug12]), as illustrated in figure 5(b). More precisely, given and , is the bijection where

The definition of relies on the following:

Lemma 2.1.

If and , then for each (resp. ) there exists an such that (resp. ).

Proof.

By a maximal chain in we indicate a sequence of even length obtained by alternating a type III edge and a type II edge such that, for , if is , then is and, moreover, , where . If is the cardinality of , then any chain in must have length . Now, if , then either (then put ), or is the start of a chain in . Then for some , the chain ends in some , and then .

Figure 5: Examples of graphs and labeled graphs: (a) a graph; (b) composition of graphs; (c) a labeled graph
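Path composition of graphs can be sketched in a few lines. This is a minimal illustration under simplifying assumptions: a graph is stored as a symmetric lookup table over tagged nodes such as `("X", "x")`, polarities are left out entirely, and the names `matching` and `compose` are hypothetical. It is not the definition used in the paper, whose formula is not reproduced here.

```python
def matching(pairs):
    """Build a symmetric lookup table from a list of disjoint edges."""
    table = {}
    for a, b in pairs:
        table[a] = b
        table[b] = a
    return table


def compose(f, g):
    """Compose f : X -> Y with g : Y -> Z by following alternating paths.

    f matches nodes tagged "X" or "Y"; g matches nodes tagged "Y" or "Z".
    A path starts in X or Z and alternates between f and g for as long as
    it lands on a node of the shared middle set Y.
    """
    result = {}
    starts = [n for n in f if n[0] == "X"] + [n for n in g if n[0] == "Z"]
    for start in starts:
        if start in result:
            continue  # already reached as the end point of an earlier path
        in_f = start[0] == "X"
        current = (f if in_f else g)[start]
        while current[0] == "Y":
            in_f = not in_f  # switch to the other graph at each Y node
            current = (f if in_f else g)[current]
        result[start] = current
        result[current] = start
    return result


# Usage: a path x -> y1 -> y2 -> y3 -> z alternating between f and g.
f = matching([(("X", "x"), ("Y", "y1")), (("Y", "y2"), ("Y", "y3"))])
g = matching([(("Y", "y1"), ("Y", "y2")), (("Y", "y3"), ("Z", "z"))])
composite = compose(f, g)
```

Because both tables are perfect matchings, a path that starts outside the middle set can never revisit a node, so the loop terminates; this mirrors the role played by Lemma 2.1.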

In order to introduce allowable graphs, we first define shapes. A shape corresponds to the switching of the syntactic tree of a type. Hence, on the one hand the leaves of the shape form a signed set, so that an arrow between two shapes corresponds to a graph between the associated signed sets; on the other hand, by joining the graph with the shapes, we obtain a correction graph on which we can check the essential nets correction criterion ([MO03]).

Definition 2.1 (shape).

A shape is a rooted and labeled DAG whose leaves form a signed set , called the variable set of . The nodes of a shape are either leaves (hence labeled by or ) or labeled by (resp. ) or (resp. ). The root of is called the conclusion of . If is a shape, by we indicate the shape obtained from by reversing the sign of its leaves and replacing all labels - resp. - by - resp. . Shapes are defined inductively as follows:

  • is the shape , , ;

  • if are shapes, is the shape in figure 6(a), and ; and are called, respectively, left and right premiss of the node .

  • if are shapes, is the shape in figure 6(b), and ; and are called, respectively, left and right premiss of the node .

  • if are shapes, is the shape in figure 6(c), and ; and are called, respectively, left and right premiss of the node .

  • if are shapes, is the shape in figure 6(d), and ; and are called, respectively, left and right premiss of the node .

Figure 6: Definition of shapes, panels (a) to (d)

Figure 7: A correction graph

Given shapes , by a graph we indicate a graph . More generally, given a (non-empty) list of shapes and a shape , by a graph we indicate a graph . Clearly, for any shape there exists a pure graph . Given graphs , we let be the graph .

For any graph , the correction graph of , noted , is the rooted directed graph , with root , called the conclusion of .

Definition 2.2 (allowable graph).

Let be shapes and be a graph. is allowable (or correct) for if satisfies:

(acyclicity)

is a connected DAG;

(functionality)

for every node of , every path going from the conclusion to the left premiss of the node passes through the node.

In figure 7 the correction graph is illustrated, where is the shape .
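The two conditions of Definition 2.2 can be checked mechanically on any rooted directed graph. The sketch below assumes the correction graph has already been built and is given as an adjacency dictionary (every node is a key, mapped to the list of its successors); the construction of that graph from shapes is not reproduced, connectedness is not checked, and the function names are hypothetical.

```python
def is_acyclic(graph):
    """Depth-first search with three colours; a grey successor means a cycle."""
    WHITE, GREY, BLACK = 0, 1, 2
    colour = {node: WHITE for node in graph}

    def visit(node):
        colour[node] = GREY
        for succ in graph[node]:
            if colour[succ] == GREY:
                return False
            if colour[succ] == WHITE and not visit(succ):
                return False
        colour[node] = BLACK
        return True

    return all(visit(n) for n in graph if colour[n] == WHITE)


def reachable(graph, start, removed=None):
    """Nodes reachable from start when the node `removed` is deleted."""
    if start == removed:
        return set()
    seen = {start}
    stack = [start]
    while stack:
        node = stack.pop()
        for succ in graph[node]:
            if succ != removed and succ not in seen:
                seen.add(succ)
                stack.append(succ)
    return seen


def passes_through(graph, root, target, via):
    """True when every path from root to target goes through the node via."""
    if via in (root, target):
        return True
    return target not in reachable(graph, root, removed=via)
```

The functionality condition is a dominance check: deleting the node and testing whether its left premiss is still reachable from the conclusion avoids enumerating paths explicitly.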

The category of allowable graphs has shapes as objects and allowable graphs as morphisms (with composition defined as in ). can also be presented as a symmetric multicategory ([Lei04]) whose objects are shapes and whose multiarrows are graphs , where is a (possibly empty) list of shapes. Multicomposition is defined as follows: given (multi)arrows , , where the indicate finite lists of shapes, one can define as . Observe that, in one can consider arrows , corresponding to closed proofs. Due to the absence of the tensor unit in , such arrows do not exist in . In the following we will often confuse and .

We let (resp. ) indicate the subcategory of (resp. the sub-multicategory of ) whose shapes do not contain and .

We let be the language given by the grammar , where . Any linear type is obviously assigned a shape and a labeling, i.e. a map associating the leaves of with a variable. is a labeled graph (or, simply, a graph when no ambiguity occurs) if and, by letting , . A labeled graph (illustrated in figure 5(c)) can be thought of as a graph over signed multisets of variables. Given , if and , then we say that the is over . Clearly, for any , if , then (which we will note simply ) is a correct pure labeled graph from to . In the following we will often confuse a linear type with its associated shape. We will also often confuse the context of linear types with the list of shapes .

If is any (not necessarily correct) graph, then for any , induces a -pairing of , i.e. a partition of all occurrences of in into pairs whose elements have opposite polarity.
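The notion of pairing can be illustrated by enumerating, for an implicational type, all ways of matching the positive occurrences of a variable with its negative ones. This is a sketch under assumptions: types are built from hypothetical `Var` and `Arrow` constructors, an occurrence is identified by its path of `l`/`r` steps in the syntactic tree, and polarity follows the usual convention that the left of an arrow flips the sign. Which pairings yield correct graphs is not decided here.

```python
from dataclasses import dataclass
from itertools import permutations


@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class Arrow:
    left: object
    right: object


def occurrences(ty, polarity=1, path=""):
    """List (variable name, polarity, path) for every leaf of the type."""
    if isinstance(ty, Var):
        return [(ty.name, polarity, path)]
    return (occurrences(ty.left, -polarity, path + "l")
            + occurrences(ty.right, polarity, path + "r"))


def pairings(ty, x):
    """All partitions of the occurrences of x into opposite-polarity pairs."""
    occ = occurrences(ty)
    pos = [p for (name, sign, p) in occ if name == x and sign > 0]
    neg = [p for (name, sign, p) in occ if name == x and sign < 0]
    if len(pos) != len(neg):
        return []  # no pairing exists when the counts differ
    return [list(zip(pos, perm)) for perm in permutations(neg)]


# Usage: the type (A -> B -> X) -> X has exactly one pairing of X.
tensor_like = Arrow(Arrow(Var("A"), Arrow(Var("B"), Var("X"))), Var("X"))
only_pairing = pairings(tensor_like, "X")
```

A type with two positive and two negative occurrences of the same variable, such as `(X -> X) -> (X -> X)`, admits two pairings, which is where the choice of pairing starts to matter.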

We say that two types are isomorphic when there exist correct graphs and such that , .

We conclude the presentation of allowable graphs by showing how to associate to any normal linear λ-term such that is derivable in , an allowable graph ⁴. The definition of actually depends on and , so it should be written more pedantically as , as different typings of the same λ-term give rise to different labeled graphs⁵.

⁴ Observe that we are here confusing the list with the context . We will often confuse them, if it creates no ambiguity.

⁵ Indeed all such graphs can be obtained by suitable expansions from the graph , where is a principal typing of .

For any linear type we let (where stands for “left” and stands for “right”) be the set of all paths, i.e. all finite sequences of elements of , leading to variables in the syntactic tree of . Given a context and a linear type , any element of , where and , is uniquely determined by a pair made of an index (by letting ) and a path such that . Let be the set of such pairs. There exists then a bijection where is the node corresponding to the path in the syntactic tree of . In case , then there is a canonical bijection such that is the node corresponding to in the syntactic tree of . The translation can then be defined inductively as follows:

  1. if , then , , and we have . Then ;

  2. if , then we have , so by induction hypothesis, the graph is defined. Observe that is defined by when and , , when and , when . We put then .

  3. if , where , and , where the form a partition of , then by induction hypotheses the graphs are defined and we have bijections and