Towards a Semantic Measure of the Execution Time in Call-by-Value lambda-Calculus

04/23/2019 ∙ by Giulio Guerrieri, et al. ∙ University of Bath 0

We investigate the possibility of a semantic account of the execution time (i.e. the number of beta-steps leading to the normal form, if any) for the shuffling calculus, an extension of Plotkin's call-by-value lambda-calculus. For this purpose, we use a linear logic based denotational model that can be seen as a non-idempotent intersection type system: relational semantics. Our investigation is inspired by similar ones for linear logic proof-nets and untyped call-by-name lambda-calculus. We first prove a qualitative result: a (possibly open) term is normalizable for weak reduction (which does not reduce under abstractions) if and only if its interpretation is not empty. We then show that the size of type derivations can be used to measure the execution time. Finally, we show that, differently from the case of linear logic and call-by-name lambda-calculus, the quantitative information enclosed in type derivations does not lift to types (i.e. to the interpretation of terms). To get a truly semantic measure of execution time in a call-by-value setting, we conjecture that a refinement of its syntax and operational semantics is needed.



There are no comments yet.


page 1

page 2

page 3

page 4

This week in AI

Get the week's most popular data science and artificial intelligence research sent straight to your inbox every Saturday.

1 Introduction

Type systems enforce properties of programs, such as termination or deadlock-freedom. The guarantee provided by most type systems for the -calculus is termination.

Intersection types have been introduced as a way of extending simple types for the -calculus to “finite polymorphism”, by adding a new type constructor and new typing rules governing it. Contrary to simple types, intersection types provide a sound and complete characterization of termination: not only typed programs terminate, but all terminating programs are typable as well (see [DBLP:journals/aml/CoppoD78, DBLP:journals/ndjfl/CoppoD80, Pottinger80, DBLP:books/daglib/0071545] where different intersection type systems characterize different notions of normalization). Intersection types are idempotent, that is, they verify the equation . This corresponds to an interpretation of a typed term as “ can be used both as data of type and as data of type ”.

More recently [DBLP:conf/tacs/Gardner94, DBLP:journals/logcom/Kfoury00, DBLP:conf/icfp/NeergaardM04, Carvalho07, deCarvalho18] (a survey can be found in [DBLP:journals/igpl/BucciarelliKV17]), non-idempotent variants of intersection types have been introduced: they are obtained by dropping the equation . In a non-idempotent setting, the meaning of the typed term is refined as “ can be used twice as data of type and once as data of type ”. This could give to programmers a way to keep control on the performance of their code and to count resource consumption. Finite multisets are the natural setting to interpret the associative, commutative and non-idempotent connective : if and are non-idempotent intersection types, the multiset represents the non-idempotent intersection type .

Non-idempotent intersection types have two main features, both enlightened by de Carvalho [Carvalho07, deCarvalho18]:

  1. Bounds on the execution time: they go beyond simply qualitative characterisations of termination, as type derivations provide quantitative bounds on the execution time (i.e. on the number of -steps to reach the -normal form). Therefore, non-idempotent intersection types give intensional insights on programs, and seem to provide a tool to reason about complexity of programs. The approach is defining a measure for type derivations and showing that the measure gives (a bound to) the length of the evaluation of typed terms.

  2. Linear logic interpretation: non-idempotent intersection types are deeply linked to linear logic () [DBLP:journals/tcs/Girard87]. Relational semantics [DBLP:journals/apal/Girard88, DBLP:journals/apal/BucciarelliE01] — the category of sets and relations endowed with the comonad of finite multisets — is a sort of “canonical” denotational model of ; the Kleisli category of the comonad is a CCC and then provides a denotational model of the ordinary (i.e. call-by-name) -calculus. Non-idempotent intersection types can be seen as a syntactic presentation of : the semantics of a term is the set of conclusions of all type derivations of .

These two facts together have a potential, fascinating consequence: denotational semantics may provide abstract tools for complexity analysis, that are theoretically solid, being grounded on .

Starting from [Carvalho07, deCarvalho18], research on relational semantics/non-idempotent intersection types has proliferated: various works in the literature explore their power in bounding the execution time or in characterizing normalization [DBLP:journals/tcs/CarvalhoPF11, DBLP:journals/apal/BucciarelliEM12, DBLP:journals/corr/BernadetL13, DBLP:conf/ictac/KesnerV15, DBLP:journals/iandc/BenedettiR16, DBLP:journals/iandc/CarvalhoF16, DBLP:journals/mscs/PaoliniPR17, DBLP:conf/rta/KesnerV17, DBLP:journals/igpl/BucciarelliKV17, DBLP:journals/pacmpl/MazzaPV18, DBLP:journals/pacmpl/AccattoliGK18]. All these works study relational semantics/non-idempotent intersection types either in proof-nets (the graphical representation of proofs in ), or in some variant of ordinary (i.e. call-by-name) -calculus. In the second case, the construction of the relational model sketched above essentially relies on Girard’s call-by-name translation of intuitionistic logic into , which decomposes the intuitionistic arrow as .

Ehrhard [DBLP:conf/csl/Ehrhard12] showed that the relational semantics of induces also a denotational model for the call-by-value -calculus111In call-by-value evaluation , function’s arguments are evaluated before being passed to the function, so that -redexes can fire only when their arguments are values, i.e. abstractions or variables. The idea is that only values can be erased or duplicated. Call-by-value evaluation is the most common parameter passing mechanism used by programming languages. that can still be viewed as a non-idempotent intersection type system. The syntactic counterpart of this construction is Girard’s (“boring”) call-by-value translation of intuitionistic logic into [DBLP:journals/tcs/Girard87], which decomposes the intuitionistic arrow as . Just few works have started the study of relational semantics/non-idempotent intersection types in a call-by-value setting [DBLP:conf/csl/Ehrhard12, DBLP:conf/lfcs/Diaz-CaroMP13, DBLP:conf/fossacs/CarraroG14, DBLP:conf/ppdp/EhrhardG16], and no one investigates their bounding power on the execution time in such a framework. Our paper aims to fill this gap and study the information enclosed in relational semantics/non-idempotent intersection types concerning the execution time in the call-by-value -calculus.

A difficulty arises immediately in the qualitative characterization of call-by-value normalization via the relational model. One would expect that the semantics of a term is non-empty if and only if is (strongly) normalizable for (some restriction of) the call-by-value evaluation , but it is impossible to get this result in Plotkin’s original call-by-value -calculus [DBLP:journals/tcs/Plotkin75]. Indeed, the terms and below are -normal but their semantics in the relational model are empty:


Actually, and should behave like the famous divergent term , since in they are observationally equivalent to with respect all closing contexts and have the same semantics as in all non-trivial denotational models of Plotkin’s .

The reason of this mismatching is that in there are stuck -redexes such as in Eq. (1), i.e. -redexes that -reduction will never fire because their argument is normal but not a value (nor will it ever become one). The real problem with stuck -redexes is that they may prevent the creation of other -redexes, providing “premature” -normal forms like and in Eq. (1). The issue affects termination and thus can impact on the study of observational equivalence and other operational properties in .

In a call-by-value setting, the issue of stuck -redexes and then of premature -normal forms arises only with open terms (in particular, when the reduction under abstractions is allowed, since it forces to deal with “locally open” terms). Even if to model functional programming languages with a call-by-value parameter passing, such as OCaml, it is usually enough to just consider closed terms and weak evaluation (i.e. not reducing under abstractions: function bodies are evaluated only when all parameters are supplied), the importance to consider open terms in a call-by-value setting can be found, for example, in partial evaluation (which evaluates a function when not all parameters are supplied, see [Jones:1993:PEA:153676]), in the theory of proof assistants such as Coq (in particular, for type checking in a system based on dependent types, see [DBLP:conf/icfp/GregoireL02]), or to reason about (denotational or operational) equivalences of terms in that are congruences, or about other theoretical properties of such as separability or solvability [DBLP:conf/ictcs/Paolini01, DBLP:series/txtcs/RoccaP04, DBLP:conf/flops/AccattoliP12, DBLP:conf/fossacs/CarraroG14].

To overcome the issue of stuck -redexes, we study relational semantics/non-idempotent intersection types in the shuffling calculus , a conservative extension of Plotkin’s proposed in [DBLP:conf/fossacs/CarraroG14] and further studied in [DBLP:conf/rta/Guerrieri15, DBLP:conf/tlca/GuerrieriPR15, DBLP:conf/aplas/AccattoliG16, DBLP:journals/lmcs/GuerrieriPR17]. It keeps the same term syntax as and adds to -reduction two commutation rules, and , which “shuffle” constructors in order to move stuck -redexes: they unblock -redexes that are hidden by the “hyper-sequential structure” of terms. These commutation rules (referred also as -reduction rules) are similar to Regnier’s -rules for the call-by-name -calculus [Reg:Thesis:92, DBLP:journals/tcs/Regnier94] and are inspired by the aforementioned translation of the -calculus into proof-nets.

Following the same approach used in [deCarvalho18] for the call-by-name -calculus and in [DBLP:journals/tcs/CarvalhoPF11] for proof-nets, we prove that in the shuffling calculus :

  1. (qualitative result) relational semantics is adequate for , i.e. a possibly open term is normalizable for weak reduction (not reducing under ’s) if and only if its interpretation in relational semantics is not empty (Thm. 16); this result was already proven in [DBLP:conf/fossacs/CarraroG14] using different techniques;

  2. (quantiative result) the size of type derivations can be used to measure the execution time, i.e. the number of -steps (and not -steps) to reach the normal form of the weak reduction (Prop. 21).

Finally, we show that, differently from the case of and call-by-name -calculus, we are not able to lift the quantitative information enclosed in type derivations to types (i.e. to the interpretation of terms) following the same technique used in [deCarvalho18, DBLP:journals/tcs/CarvalhoPF11], as our Ex. 28 shows. In order to get a genuine semantic measure of execution time in a call-by-value setting, we conjecture that a refinement of its syntax and operational semantics is needed.

Even if our main goal has not yet been achieved, this investigation led to new interesting results:

  1. all normalizing weak reduction sequences (if any) in from a given term have the same number of -steps (Cor. 22); this is not obvious, as we shall explain in Ex. 23;

  2. terms whose weak reduction in ends in a value has an elegant semantic characterization (Prop. 18), and the number of -steps needed to reach their normal form can be computed in a simple way from a specific type derivation (Thm. 24).

  3. all our qualitative and quantitative results for are still valid in Plotkin’s restricted to closed terms (which models functional programming languages), see Thm. 25, Cor. 26 and Thm. 27.

Proofs are omitted. They can be found in [Guerrieri18], the extended version of this paper.

1.1 Preliminaries and notations

The set of -terms is denoted by . We set and . Let .

  • The reflexive-transitive closure of is denoted by . The -equivalence is the reflexive-transitive and symmetric closure of .

  • Let be a term: is -normal if there is no term such that ; is -normalizable if there is a -normal term such that , and we then say that is a -normal form of ; is strongly -normalizable if there is no infinite sequence of terms such that and for all . Finally, is strongly normalizing if every is strongly -normalizable.

  • is confluent if . From confluence it follows that: iff for some term ; and any -normalizable term has a unique -normal form.

2 The shuffling calculus

Balanced contexts:
Figure 1: The shuffling -calculus [DBLP:conf/fossacs/CarraroG14].

In this section we introduce the shuffling calculus , namely the call-by-value -calculus defined in [DBLP:conf/fossacs/CarraroG14] and further studied in [DBLP:conf/rta/Guerrieri15, DBLP:conf/tlca/GuerrieriPR15, DBLP:conf/aplas/AccattoliG16, DBLP:journals/lmcs/GuerrieriPR17]: it adds two commutation rules — the - and -reductions — to Plotkin’s pure (i.e. without constants) call-by-value -calculus [DBLP:journals/tcs/Plotkin75]. The syntax for terms of is the same as Plotkin’s and then the same as the ordinary (i.e. call-by-name) -calculus, see Fig. 1.

Clearly, . All terms are considered up to -conversion (i.e. renaming of bound variables). The set of free variables of a term is denoted by : is open if , closed otherwise. Given , denotes the term obtained by the capture-avoiding substitution of for each free occurrence of in the term . Note that values are closed under substitution: if then .

One-hole contexts are defined as usual, see Fig. 1. We use for the term obtained by the capture-allowing substitution of the term for the hole in the context . In Fig. 1 we define also a special kind of contexts, balanced contexts .

Reductions in the shuffling calculus are defined in Fig. 1 as follows: given a root-step rule , we define the -reduction (resp. -reduction ) as the closure of under contexts (resp. balanced contexts). The -reduction is non-deterministic and — because of balanced contexts — can reduce under abstractions, but it is “morally” weak: it reduces under a only when the is applied to an argument. Clearly, since can freely reduce under ’s.

The root-steps used in the shuffling calculus are (the reduction rule in Plotkin’s ), the commutation rules and , and and . The side conditions for and in Fig. 1 can be always fulfilled by -renaming. For any , if then is a -redex and is its -contractum. A term of the shape is a -redex. Clearly, any -redex is a -redex but the converse does not hold: is a -redex but not a -redex. Redexes of different kind may overlap: for instance, the term is a -redex and contains the -redex ; the term is a -redex and contains the -redex , which contains in turn the -redex .

From definitions in Fig. 1 it follows that and , as well as and . The shuffling (resp. balanced shuffling) calculus (resp. ) is the set of terms endowed with the reduction (resp. ). The set endowed with the reduction is Plotkin’s pure call-by-value -calculus [DBLP:journals/tcs/Plotkin75], a sub-calculus of .

Proposition 1 (Basic properties of reductions, [DBLP:journals/tcs/Plotkin75, DBLP:conf/fossacs/CarraroG14]).

The - and -reductions are confluent and strongly normalizing. The -, -, - and -reductions are confluent.

Example 2.

Recall the terms and in Eq. (1): and are the only possible -reduction paths from and respectively: and are not -normalizable and . But and are -normal ( is a stuck -redex) and different, so by confluence of (Prop. 1). Thus, .

Example 2 shows how -reduction shuffles constructors and moves stuck -redex in order to unblock -redexes which are hidden by the “hyper-sequential structure” of terms, avoiding “premature” normal forms. An alternative approach to circumvent the issue of stuck -redexes is given by , the call-by-value -calculus with explicit substitutions introduced in [DBLP:conf/flops/AccattoliP12], where hidden -redexes are reduced using rules acting at a distance. In [DBLP:conf/aplas/AccattoliG16] it has been shown that and can be embedded in each other preserving termination and divergence. Interestingly, both calculi are inspired by an analysis of Girard’s “boring” call-by-value translation of -terms into linear logic proof-nets [DBLP:journals/tcs/Girard87, DBLP:journals/tcs/Accattoli15] according to the linear recursive type , or equivalently . In this translation, -reduction corresponds to cut-elimination, more precisely -steps (resp. -steps) correspond to exponential (resp. multiplicative) cut-elimination steps; -reduction corresponds to cut-elimination at depth .

Consider the two subsets of terms defined by mutual induction (notice that ):

Any is neither a value nor a -redex, but an open applicative term with a free “head variable”.

Proposition 3 (Syntactic characterization on -normal forms).

Let be a term:

  • is -normal iff ;

  • is -normal and is neither a value nor a -redex iff .

Stuck -redexes correspond to -normal forms of the shape . As a consequence of Prop. 3, the behaviour of closed terms with respect to -reduction (resp. -reduction) is quite simple: either they diverge or they -normalize (resp. -normalize) to a closed value. Indeed:

Corollary 4 (Syntactic characterization of closed - and -normal forms).

Let be a closed term: is -normal iff is -normal iff is a value iff for some term with .

3 A non-idempotent intersection type system

We recall the non-idempotent intersection type system introduced by Ehrhard [DBLP:conf/csl/Ehrhard12] (nothing but the call-by-value version of de Carvalho’s system [Carvalho07, deCarvalho18]). We use it to characterize the (strong) normalizable terms for the reduction . Types are positive or negative, defined by mutual induction as follows:

Negative Types: Positive Types:

where is a (possibly empty) finite multiset of negative types; in particular the empty multiset (obtained for ) is the only atomic (positive) type. A positive type has to be intended as a conjunction of negative types , for a commutative and associative conjunction connective that is not idempotent and whose neutral element is .

The derivation rules for the non-idempotent intersection type system are in Fig. 2. In this typing system, judgments have the shape where is a term, is a positive type and is an environment (i.e. a total function from variables to positive types whose domain is finite). The sum of environments is defined pointwise via multiset sum: . An environment such that with and for all is often written as . In particular, and (where ) are the same environment; and stands for the judgment where is the empty environment, i.e.  (that is, for any variable ). Note that the sum of environments is commutative, associative and its neutral element is the empty environment: given an environment , one has iff . The notation means that is a derivation with conclusion the judgment . We write if is such that for some environment and positive type .


Figure 2: Non-idempotent intersection type system for the shuffling calculus.

It is worth noticing that the type system in Fig. 2 is syntax oriented: for each type judgment there is a unique derivation rule whose conclusion matches the judgment .

The size of a type derivation is just the the number of rules in . Note that judgments play no role in the size of a derivation.

Example 5.

Let . The derivations (typing and with same type and same environment)

are such that and . Note that and .

The following lemma (whose proof is quite technical) will play a crucial role to prove the substitution lemma (Lemma 7) and the subject reduction (Prop. 8) and expansion (Prop. 10).

Lemma 6 (Judgment decomposition for values).

Let , be an environment, and be positive types (for some ). There is a derivation iff for all there are an environment and a derivation such that . Moreover, .

The left-to-right direction of Lemma 6 means that, given , for every and every decomposition of the positive type into a multiset sum of positive types , there are environments such that is derivable for all .

Lemma 7 (Substitution).

Let and . If and , then there exists such that .

We can now prove the subject reduction, with a quantitative flavour about the size of type derivations in order to extract information about the execution time.

Proposition 8 (Quantitative balanced subject reduction).

Let and .

  1. Shrinkage under -step: If then and there exists a derivation with conclusion such that .

  2. Size invariance under -step: If then and there exists a derivation with conclusion such that .

In Prop. 8, the fact that does not reduce under ’s is crucial to get the quantitative information, otherwise one can have a term such that every derivation is such that (and then there is no derivation with conclusion such that ): this is the case, for example, for . This shows that the quantitative study for evaluation reducing under ’s is subtler.

In order to prove the quantitative subject expansion (Prop. 10), we first need the following technical lemma stating the commutation of abstraction with abstraction and application.

Lemma 9 (Abstraction commutation).
  1. Abstraction vs. abstraction: Let . If and , then there is such that .

  2. Application vs. abstraction: If then there exists a derivation such that .

Proposition 10 (Quantitative balanced subject expansion).

Let and .

  1. Enlargement under anti--step: If then there is with .

  2. Size invariance under anti--step: If then and there is with .

Actually, subject reduction and expansion hold for the whole -reduction , not only for the balanced -reduction . The drawback for is that the quantitative information about the size of the derivation is lost in the case of a -step, see the comments just after Prop. 8 and Lemma 12.

Lemma 11 (Subject reduction).

Let and .

  1. Shrinkage under -step: If then there is with .

  2. Size invariance under -step: If then there is such that .

Lemma 12 (Subject expansion).

Let and .

  1. Enlargement under anti--step: If then there is with .

  2. Size invariance under anti--step: If then there is such that .

In Lemmas 11.1 and 12.1

it is impossible to estimate more precisely the relationship between

and . Indeed, Ex. 5 has shown that there are and such that and (where ). So, given , consider the derivations and below:

Clearly, (but ) and the (resp. ) is the only derivation typing (resp. ) with the same type and environment as (resp. ). One has and , thus the difference of size of the derivations and can be arbitrarely large (since ); in particular , so for the size of derivations does not even strictly decrease.

4 Relational semantics: qualitative results

Lemmas 11 and 12 have an important consequence: the non-idempotent intersection type system of Fig. 2 defines a denotational model for the shuffling calculus (Thm. 14 below).

Definition 13 (Suitable list of variables for a term, semantics of a term).

Let and let be pairwise distinct variables, for some .

If , then we say that the list is suitable for .

If is suitable for , the (relational) semantics, or interpretation, of for is

Essentially, the semantics of a term for a suitable list of variables is the set of judgments for and that can be derived in the non-idempotent intersection type system of Fig. 2.

If we identify the negative type with the pair and if we set where:

( is the set of finite multisets over the set )

then, for any and any suitable list for , one has ; in particular, if is closed and , then (up to an obvious isomorphism). Note that : [DBLP:conf/csl/Ehrhard12, DBLP:conf/fossacs/CarraroG14] proved that the latter identity is enough to have a denotational model for . We can also prove it explicitly using Lemmas 11 and 12.

Theorem 14 (Invariance under -equivalence).

Let , let and let be a suitable list of variables for and . If then .

An interesting property of relational semantics is that all -normal forms have a non-empty interpretation (Lemma 15). To prove that we use the syntactic characterization of -normal forms (Prop. 3). Note that a stronger statement (Lemma 15.1) is required for -normal forms belonging to , in order to handle the case where the -normal form is a -redex.

Lemma 15 (Semantics and typability of -normal forms).

Let be a term, let and let be a list of variables suitable for .

  1. If then for every positive type there exist positive types and a derivation .

  2. If then there are positive types and a derivation .

  3. If is -normal then .

A consequence of Prop. 8 (and Thm. 14 and Lemma 15) is a qualitative result: a semantic and logical (if we consider our non-idempotent type system as a logical framework) characterization of (strong) -normalizable terms (Thm. 16). In this theorem, the main equivalences are between Points 1, 3 and 5, already proven in [DBLP:conf/fossacs/CarraroG14] using different techniques. Points 2 and 4 can be seen as “intermediate stages” in the proof of the main equivalences, which are informative enough to deserve to be explicitely stated.

Theorem 16 (Semantic and logical characterization of -normalization).

Let and let be a suitable list of variables for . The following are equivalent:

  1. Normalizability: is -normalizable;

  2. Completeness: for some -normal ;

  3. Adequacy: ;

  4. Derivability: there is a derivation for some positive types ;

  5. Strong normalizabilty: is strongly -normalizable.

As implication (5)(1) is trivial, the proof of Thm. 16 follows the structure (1)(2)(3)(4)(5): essentially, non-idempotent intersection types are used to prove that normalization implies strong normalization for -reduction. Equivalence (5)(1) means that normalization and strong normalization are equivalent for -reduction, thus in studying the termination of -reduction no intricacy arises from its non-determinism. Although does not evaluate under ’s, this result is not trivial because does not enjoy any form of (quasi-)diamond property, as we show in Ex. 23 below. Equivalence (1)(2) says that -reduction is complete with respect to -equivalence to get -normal forms; in particular, this entails that every -normalizable term is -normalizable. Equivalence (1)(2) is the analogue of a well-known theorem [Barendregt84, Thm. 8.3.11] for ordinary (i.e. call-by-name) -calculus relating head -reduction and -equivalence: this corroborates the idea that -reduction is the “head reduction” in a call-by-value setting, despite its non-determinism. The equivalence (3)(4) holds by definition of relational semantics.

Implication (1)(3) (or equivalently (1)(4), i.e. “normalizable typable”) does not hold in Plotkin’s : indeed, the (open) terms and in Eq. (1) (see also Ex. 2) are -normal (because of a stuck -redex) but . Equivalences such as the ones in Thm. 16 hold in a call-by-value setting provided that -reduction is extended, e.g. by adding -reduction. In [DBLP:conf/aplas/AccattoliG16], is proved to be termination equivalent to other extensions of (in the framework Open Call-by-Value, where evaluation is call-by-value and weak, on possibly open terms) such as the fireball calculus [DBLP:series/txtcs/RoccaP04, DBLP:conf/icfp/GregoireL02, DBLP:conf/lics/AccattoliC15] and the value substitution calculus [DBLP:conf/flops/AccattoliP12], so Thm. 16 is a general result characterizing termination in those calculi as well.

Lemma 17 (Uniqueness of the derivation with empty types; Semantic and logical characterization of values).

Let be -normal.

  1. If and , then , , and . More precisely, consists of a rule if is a variable, otherwise is an abstraction and consists of a 0-ary rule .

  2. Given a list of variables suitable for , the following are equivalent:

    1. is a value;

    2.  ;

    3. there exists  ;

    4. there exists such that .

Qualitatively, Lemma 17 allows us to refine the semantic and logical characterization given by Thm. 16 for a specific class of terms: the valuable ones, i.e. the terms that -normalize to a value. Valuable terms are all and only the terms whose semantics contains a specific element: the point with only empty types.

Proposition 18 (Logical and semantic characterization of valuability).

Let be a term and be a suitable list of variables for . The following are equivalent:

  1. Valuability: is -normalizable and the -normal form of is a value;

  2. Empty point in the semantics: ;

  3. Derivability with empty types: there exists a derivation .

5 The quantitative side of type derivations

By the quantitative subject reduction (Prop. 8), the size of any derivation typing a (-normalizable) term is an upper bound on the number