Doubly F-Bounded Generics

08/18/2018 ∙ by Moez A. AbdelGawad, et al. ∙ Rice University

In this paper we suggest how f-bounded generics in nominally-typed OOP can be extended to the more general notion we call ‘doubly f-bounded generics’ and how doubly f-bounded generics can be reasoned about.

1. Introduction

F-bounded generics, as found in mainstream OO programming languages such as Java, C#, Scala and Kotlin, allow a type variable to be used in defining its own upper bound. Examples of f-bounded generic class declarations include the following.

class C<T> {} // used in definition of class D

class D<T extends C<T>> {} // T used to define its own upper bound

class E<T extends E<T>> {} // E & T used to define the bound of T
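As a minimal sketch of what such a self-referential bound buys in practice, the following Java code uses the same shape as class E above. The names Ordered, Version and FBoundedDemo are hypothetical and are not taken from the paper.

```java
// Hypothetical illustration: Ordered, Version and FBoundedDemo are assumed names.
interface Ordered<T extends Ordered<T>> {
    // T appears in its own upper bound, so 'other' has the implementing type.
    boolean atMost(T other);
}

final class Version implements Ordered<Version> {
    final int number;

    Version(int number) {
        this.number = number;
    }

    @Override
    public boolean atMost(Version other) {
        return number <= other.number;
    }
}

final class FBoundedDemo {
    // Version is accepted for T because Version is a subtype of Ordered<Version>.
    // A type such as String would be rejected, since String is not a
    // subtype of Ordered<String>.
    static <T extends Ordered<T>> T maxOf(T a, T b) {
        return a.atMost(b) ? b : a;
    }
}
```

Here the bound of T is not a fixed type: for each candidate type argument A, the bound is Ordered<A>, which is the sense in which the bound is a function of the type variable.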

1.1. Doubly F-Bounded Generics

Simply stated, doubly f-bounded generics allows a type variable to be used in defining both an upper bound and a lower bound of the type variable.

Examples of doubly f-bounded generic class declarations include the following ones.

class C<T> {} // used in definitions below

class D<T> extends C<T> {} // used in definitions below

class E<T> extends D<T> {} // used in definitions below

class F<E<T> extends T extends C<T>> {} // T has lower & upper bounds

[Footnote 1: Some may prefer this declaration to be written as class F<T extends C<T> super E<T>> {}, as suggested, for example, in earlier literature.]

class G<G<T> extends T extends C<T>> extends D<T> {} // G & T used

// to define lower bound of T. T also used to define upper bound

class H<J<T> extends T extends H<T>> {} // H & T used

// to define upper bound of T. T also used to define lower bound

class I<T> extends H<T> {} // used in definition of class J

class J<T> extends I<T> {} // used in definition of class H

(Note that a declaration such as

class F<F<T> extends T extends F<T>> {}

is a useless declaration. No type argument can be used to instantiate class F, since no type argument can be simultaneously a subtype and a supertype of the same type yet be unequal to it. [Footnote 2: If such a declaration were allowed, the necessary antisymmetry property of subtyping forces T to be equal to F<T>, but only infinite types T can satisfy this equality. The nominality of subtyping, which necessitates the explicit declaration of inheritance/subtyping relations between classes, and the prohibition of expressing circular inheritance/subtyping relations between classes prohibit the explicit expression of subtyping relations that involve infinite types (since only finite types can be expressed explicitly).])

2. Illustrating Example

To better understand f-bounded generics and doubly f-bounded generics, let’s recall that the term ‘f-bounded generics’ actually means ‘function-bounded generics’ (or, more precisely, using category-theoretic language, it means ‘functor-bounded generics’). This means that a (lower or upper) bound of a type variable of some generic class is not a constant type (even if an infinite one) but that the bound varies with the value of the type variable that gets passed to the class. This in turn means that each type argument that may instantiate the generic class has two corresponding bounding types defined by the functions specified as the bounding functions. The type argument is a valid type argument if the type argument is a subtype of the corresponding upper bounding type and a supertype of the corresponding lower bounding type.

2.1. Unbounded Functions

To illustrate more vividly how we view f-bounded generics, and more generally how doubly f-bounded generics can be modeled, let’s consider functions from analysis, i.e., functions from real numbers to real numbers (extended with −∞ and +∞).

Example 1.

Consider the function plotted in Figure 2.1. Function is defined over all real numbers such that . For our purposes it is more convenient to include and in and to define and . Thus the domain of is the closed interval (i.e., is defined for ). [Footnote 3: The infinite values −∞ and +∞ here play a role similar to the role played by types Null and Object, respectively, in the OO subtyping relation.]

Figure 2.1. Function .

2.2. Doubly (Constant) Bounded Functions

To get a step closer to our model of doubly f-bounded generics, we first consider restricting the domain of a function using constants (also sometimes called ‘constant functions’, i.e., functions whose output value is independent of their input argument).

Example 2.

Consider restricting the domain of the function (of Example 1) to be the closed interval . This domain-restricted function can be expressed as

Figure 2.2 is a plot of this domain-restricted function.

Figure 2.2. Function for .

2.3. Doubly F-Bounded Functions

More interestingly, we can consider restricting or bounding the domain of using two (non-constant) functions over .

Example 3.

Consider the function

whose parameter is f-bounded (i.e., function-bounded) by the two functions (for lower bound) and (for upper bound), plotted in Figure 2.3.

Figure 2.3. Function for .

Notice that for plotting we had to first decide which values for are valid arguments to , i.e., which values simultaneously satisfy the two inequalities and .

Using simple reasoning, it is easy to see that both inequalities are satisfied only for values of (check Figure 2.4 where valid values of are those for which the corresponding green dotted line lies above the red line and below the blue line). Hence the plot of in Figure 2.3. It should be noted that the plot of can be made only after the domain of (i.e., valid ranges for arguments of ) is decided (e.g., using the plots of and ).

Figure 2.4. Functions and , together with .
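The membership test described above (an argument is valid when it lies above its lower bound and below its upper bound, both computed from the argument itself) can be sketched in Java as follows. The bounding functions and the body used in example() are hypothetical stand-ins chosen for easy arithmetic. They are not the functions of Example 3.

```java
import java.util.OptionalDouble;
import java.util.function.DoubleUnaryOperator;

// Hypothetical sketch of a doubly f-bounded function over the reals.
final class DoublyBoundedFunction {
    private final DoubleUnaryOperator lower; // lower bounding function
    private final DoubleUnaryOperator upper; // upper bounding function
    private final DoubleUnaryOperator body;  // the unrestricted expression

    DoublyBoundedFunction(DoubleUnaryOperator lower,
                          DoubleUnaryOperator upper,
                          DoubleUnaryOperator body) {
        this.lower = lower;
        this.upper = upper;
        this.body = body;
    }

    // x is a valid argument iff lower(x) <= x <= upper(x).
    boolean inDomain(double x) {
        return lower.applyAsDouble(x) <= x && x <= upper.applyAsDouble(x);
    }

    // The value is defined only for valid arguments.
    OptionalDouble apply(double x) {
        return inDomain(x)
                ? OptionalDouble.of(body.applyAsDouble(x))
                : OptionalDouble.empty();
    }

    // Assumed example: lower(x) = x^2 - 2, upper(x) = 2x, body(x) = x^3.
    // lower(x) <= x holds on [-1, 2] and x <= upper(x) holds for x >= 0,
    // so the domain is the closed interval [0, 2].
    static DoublyBoundedFunction example() {
        return new DoublyBoundedFunction(
                x -> x * x - 2,
                x -> 2 * x,
                x -> x * x * x);
    }
}
```

As in the text, the domain has to be decided first (here by inDomain) before any value of the restricted function can be produced.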

To make things even more interesting and more “realistic”, we can use slightly more complex bounding functions.

Example 4.

Consider the f-bounded function

plotted in Figure 2.5. The approximate domain of can be decided using Figure 2.6. Approximately, the valid values of are (using the quadratic formula, valid values of precisely are ).

Figure 2.5. Function for .
Figure 2.6. Functions and , together with .

Finally, to make things even more interesting, we consider an f-bounded function whose restricted domain is the union of multiple intervals of real numbers.

Example 5.

Consider the f-bounded function

where

and

plotted in Figure 2.7. The approximate domain of can be decided using Figure 2.8. Approximately, valid values of are . (The precise valid values of can be found using Cardano’s formula).

Figure 2.7. Function for .
Figure 2.8. Functions and , together with .

From the curves in Figure 2.8, and their crossing points, we deduce that no other intervals are included in the domain of (as noted earlier, valid values of must have the corresponding red curve below the dotted green line and the corresponding blue curve above the dotted green line, but one or both of these two conditions are not true in all intervals lying outside and ).
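When the domain is a union of intervals, a coarse numerical scan can play the role of reading the crossing points off a plot. The sketch below is only an approximation on a sampling grid, and the bounds in example() are hypothetical (lower bound x − 1, upper bound x³ − 3x), not those of Example 5.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.DoublePredicate;

// Hypothetical sketch: approximate the domain as a list of sampled intervals.
final class DomainScan {
    // Returns the maximal runs [start, end] of consecutive sample points
    // from + i * step (0 <= i < samples) that satisfy 'valid'.
    static List<double[]> intervals(DoublePredicate valid,
                                    double from, double step, int samples) {
        List<double[]> result = new ArrayList<>();
        double start = 0;
        double previous = 0;
        boolean open = false;
        for (int i = 0; i < samples; i++) {
            double x = from + i * step;
            if (valid.test(x)) {
                if (!open) {
                    start = x;
                    open = true;
                }
                previous = x;
            } else if (open) {
                result.add(new double[] {start, previous});
                open = false;
            }
        }
        if (open) {
            result.add(new double[] {start, previous});
        }
        return result;
    }

    // Assumed bounds: x - 1 <= x always holds, and x <= x^3 - 3x holds
    // exactly when x(x - 2)(x + 2) >= 0, i.e., on [-2, 0] and on [2, +inf).
    static List<double[]> example() {
        DoublePredicate valid = x -> (x - 1) <= x && x <= x * x * x - 3 * x;
        return intervals(valid, -5.0, 0.25, 41); // samples -5.0, -4.75, ..., 5.0
    }
}
```

On the sampled range [−5, 5] the scan reports two intervals, [−2, 0] and [2, 5]; the second is cut off by the scan range, not by the bounds.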

3. Bounded Generics

Understanding the simple example of f-bounded functions over the real numbers that we presented in Section 2, particularly how the domain of these functions is decided, is key to understanding how we view doubly f-bounded generics.

It should be noted that in all functions considered in Section 2 we had a fixed “template”

that got filled/instantiated with different pairs of functions and that define the lower and upper bound for each value of , respectively.

As the reader may have intuitively guessed by now, the two most significant differences between f-bounded functions and our model of doubly f-bounded generics are, firstly, switching from the totally ordered set of real numbers (ordered by less-than-or-equals, ≤) to the partially ordered set of ground generic types (ordered by subtyping, <:), then, secondly, switching from functions over real numbers (which map real numbers to real numbers) to “functions” (more accurately, generic classes/type constructors) over types (which map types to types).

The definition of a function over a partially ordered f-bounded domain may not be as visually intuitive as its totally ordered counterparts (as illustrated in Section 2), yet the abstract non-visual understanding of how such functions are defined can be almost as simple as understanding the definitions of the example functions (defined over the totally ordered set of real numbers) we presented in Section 2.

The iterative construction of the graph of <: (the subtyping relation between ground generic types in nominally-typed OOP) was presented in [1], using the graph theoretic notion of partial Cartesian graph products [2]. Similar to how the different domains of function were decided in the examples of Section 2, a type is valid as a type argument to some doubly f-bounded generic class if the bounding ground types and define an interval type in [3]. More precisely, a type is a valid type argument if there exists a path in the graph of that goes from the lower bound type to the upper bound type passing through , or equivalently, if both of and are interval types in . [Footnote 4: While referring to the different plots of and in Section 2 (in which a dotted green line represents the identity function , a red curve represents the lower bounding function , and a blue curve represents the upper bounding function ), it should be noted that this condition corresponds to (i.e., is the partial order counterpart of) the condition that the dotted green line is above (i.e., ≥) the red curve and below (i.e., ≤) the blue curve.]
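The path condition above can be sketched on a toy hierarchy. The following Java code is a deliberately simplified model: it ignores type arguments and treats each class as a single node, whereas the paper's graph is over ground generic types. The class table and the names NominalIntervals, isSubtype and inInterval are assumptions made for illustration.

```java
import java.util.Map;

// Hypothetical sketch: interval membership in a toy nominal subtyping graph.
final class NominalIntervals {
    // Declared direct superclass of each class; Object is the root.
    // C, D and E mirror the chain E extends D extends C; X is unrelated.
    static final Map<String, String> SUPER = Map.of(
            "C", "Object",
            "D", "C",
            "E", "D",
            "X", "Object");

    // Subtyping is the reflexive, transitive closure of the declared edges.
    static boolean isSubtype(String sub, String sup) {
        for (String t = sub; t != null; t = SUPER.get(t)) {
            if (t.equals(sup)) {
                return true;
            }
        }
        return false;
    }

    // t lies in the interval [lower, upper] iff a path lower -> t -> upper exists.
    static boolean inInterval(String lower, String t, String upper) {
        return isSubtype(lower, t) && isSubtype(t, upper);
    }
}
```

With lower bound E and upper bound C, the types E, D and C pass the check, while X and Object do not: no path from E reaches X, and Object is not a subtype of C.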

4. Input-Side Recursion

The usefulness and value of the example of functions from analysis lies not only in providing a means to present (doubly) f-bounded functions in a simpler setting (i.e., that of a totally ordered set) but also in it possibly offering inspiration when answering questions that may seem hard in the context of doubly f-bounded generics but are simpler to answer in the context of functions in analysis, as illustrated by the following example.

Example 6.

Consider the generic class declaration

class Enum<T extends Enum<T>>.

This declaration is considered, by many OO software developers, to be among the most confusing class declarations, not only because of the use of type variable T in its own bound (which is the defining feature of f-bounded generics) but also because the very class getting declared (namely, class Enum) is also used to define the bound of the type variable T.

Fortified with the examples presented in Section 2, however, it should now be clear that this declaration is similar to the domain-restricted function =. Pondering a little over this definition of , it can be easily seen that the definition states that is defined only for values of that are less than the unbounded function , which (as if accidentally) happens to have the same expression as itself (but not the same domain).

Given the plot of in Figure 4.1 (which, except for the additional dotted green line for the identity function, is the same as the plot in Figure 2.1), we can see that is defined for values of (i.e., values of for which the green dotted line in Figure 4.1 is below the curve of ), and, accordingly, that has the graph plotted in Figure 4.2.

Figure 4.1. Function , together with .
Figure 4.2. Function for .
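The reading of input-side self-reference given above can be sketched as follows: the bound is evaluated with the unrestricted expression, so deciding the domain involves no recursion at all. The expression 2 − x² is a hypothetical stand-in and is not the function of Example 6.

```java
// Hypothetical sketch of a function bounded from above by "itself".
final class InputSideRecursion {
    // The unrestricted expression, defined for every real x (assumed example).
    static double unrestricted(double x) {
        return 2 - x * x;
    }

    // x is valid iff x <= unrestricted(x), i.e., (x + 2)(x - 1) <= 0,
    // which holds exactly on the closed interval [-2, 1].
    static boolean inDomain(double x) {
        return x <= unrestricted(x);
    }

    // Same expression as 'unrestricted', but defined only on the domain.
    static double restricted(double x) {
        if (!inDomain(x)) {
            throw new IllegalArgumentException("argument outside the domain: " + x);
        }
        return unrestricted(x);
    }
}
```

Note that inDomain never calls restricted: the self-reference in the bound only selects valid inputs and plays no part in computing the output value.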

It may be argued, for good reasons, that the in the bound of (in the definition of ) should actually be interpreted, as is customary, as a recursive definition of (i.e., one that involves a self-reference) and thus that the definition of should rather be written as and that the domain (i.e., valid values of ) should be decided accordingly. However, it is our claim that for our purposes (namely, deciding valid values of , i.e., deciding the domain of ) this would make no difference (i.e., that the resulting domain of will be the same).

The reason behind our claim (which is corroborated by the example in Section 2, as well as many examples one can think of) is that self-references in genuine recursive definitions of functions affect the value of the function itself (i.e., the “return/output value” of the function, e.g., as in the recursive definitions of the factorial/Gamma function and the Fibonacci function), unlike the case we have at hand (i.e., f-bounded functions and f-bounded generics), where the self-reference plays a different role and is used only to decide valid input values to the function. We tentatively call these two uses of self-reference ‘recursion on the output/codomain side of the function’ (customary recursion) and ‘recursion on the input/domain side of the function’ (i.e., input-side recursion/self-reference), respectively.

[Footnote 5: Can our claim be proven? We believe it can, and we believe the proof, even for general functions on partially-ordered sets, will likely be simple. As such we believe we may be able to produce this proof soon, instead of having to depend on corroborating examples (and the lack of counterexamples) to support our claim. (See Appendix A for a proof attempt.)]

4.1. Valid Type Arguments and Admittable Type Arguments

An immediate implication of our claim for type checking and subtype checking in Java (and similar nominally-typed OO programming languages) is the following. When checking whether a type argument to a generic class with input-side recursion is a valid type argument to the class (i.e., checking that the type argument is a subtype of its upper bound and a supertype of its lower bound), no recursive reference back to the subtyping relation (involving the same particular pair of types) is necessary. According to our model, all type arguments passed to the bounding functions in such a case are treated as valid type arguments that, as long as they are well-formed types, need no validity checking.

Let us illustrate this with an example.

Example 7.

Consider the Java class declarations

class Enum<T extends Enum<T>> {}

class Color extends Enum<Color> {}.

While type checking a program containing these declarations, and in particular when checking whether a type argument (such as Object or Color) is a valid type argument to class Enum (i.e., whether Enum<Object> or Enum<Color> is a valid type), the type checking algorithm must confirm that the type argument satisfies its bound(s) (i.e., that Object is a subtype of Enum<Object>, or that Color is a subtype of Enum<Color>). By our model and claim, these second instantiations of class Enum (i.e., types Enum<Object> and Enum<Color>), which appear while checking the validity of type arguments to Enum, need not be checked for the validity of their own type arguments (i.e., types Object and Color). The reason is that, similar to the expression in Example 6 of Section 4, class Enum is treated as having unrestricted/unbounded type parameters in this one context, where the type checking algorithm is checking the validity of a type argument to the class. These second instantiations of Enum are thus taken to be valid types (i.e., in no need of validation themselves).

Given that class Object (the standard class) does not extend class Enum, and thus type Object is not a subtype of Enum<Object> (the second instantiation), the type checking algorithm concludes that the type Enum<Object> (i.e., the original/first instantiation that we started with during type checking) is not a valid type. On the other hand, given the extends clause in the declaration of class Color, type Color is a subtype of Enum<Color> (the second instantiation), and thus the type checking algorithm concludes that the first instantiation Enum<Color> is a valid type.
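The reasoning of this example can be sketched as a small, non-recursive check. The Java code below is a toy model, not a description of how javac works: types are plain strings, the class table holds only the declaration of Color, and the names EnumValidity, upperBound, isSubtype and isValidEnumArgument are hypothetical.

```java
import java.util.Map;

// Hypothetical sketch of the validity check for arguments of class Enum.
final class EnumValidity {
    // Declared direct supertypes (toy class table, assumed):
    // class Color extends Enum<Color> {}
    static final Map<String, String> DECLARED_SUPER = Map.of("Color", "Enum<Color>");

    // The upper bounding function of Enum's type parameter: T -> Enum<T>.
    // Its result is only admitted here, never validated.
    static String upperBound(String typeArg) {
        return "Enum<" + typeArg + ">";
    }

    // Nominal subtyping: follow declared supertypes; every type is below Object.
    static boolean isSubtype(String sub, String sup) {
        if (sup.equals("Object")) {
            return true;
        }
        for (String t = sub; t != null; t = DECLARED_SUPER.get(t)) {
            if (t.equals(sup)) {
                return true;
            }
        }
        return false;
    }

    // Enum<typeArg> is a valid type iff typeArg is a subtype of Enum<typeArg>.
    static boolean isValidEnumArgument(String typeArg) {
        return isSubtype(typeArg, upperBound(typeArg));
    }
}
```

A checker that instead insisted on validating the bound Enum<Color> before using it would have to check Color against Enum<Color> again, and so on without end; treating the bound as merely admitted is what keeps the check finite.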

It should be noted that the reasoning method used above (suggested by our model of f-bounded generics) differs significantly from the reasoning method upon which current implementations of type checking in OO compilers and OO type systems are based, which, although reaching the same decisions regarding class Enum as those we reached above, resort to much more complex infinite/coinductive logical arguments to justify such typing/subtyping decisions.

Given the discussion and the example above, to formalize our reasoning method we make a distinction regarding type arguments, where we differentiate between admittable type arguments of a class and valid type arguments of the class.

In particular, for any generic class G a type TA is an admittable type argument of class G as long as TA is a well-formed (reference/object) type, particularly disregarding any declared bounds on the corresponding type variable in G. On the other hand, in all but one of the program contexts where a parameterized type can occur, an admittable type argument TA of G is a valid type argument if TA also satisfies the bounds declared in G on the corresponding type variable (i.e., if TA is a supertype of its lower bound and a subtype of its upper bound). That is, in all such contexts G<TA> should be accepted by the type checker as a valid parameterized type. In the context where bounds of the type variable(s) of G are declared, however, our model of f-bounded generics necessitates that all admittable type arguments of G are also considered valid type arguments.

In other words, our model of f-bounded generics (including doubly F-bounded generics) states that all valid type arguments of a generic class G are (by definition) admittable ones, in all contexts, and it requires that the converse (i.e., that admittable arguments are valid) holds in the special context of declaring bounds of type variables of G. In all other contexts, an admittable type argument of G is valid if and only if it also satisfies the declared bounds in G.

5. Discussion

In this paper, using a notion we call ‘f-bounded functions’ from analysis, we illustrated that a bound of a type variable in f-bounded generics is a function (over types, i.e., is of type , where is the set of ground types) that specifies a bound for each value of the type variable, which in turn decides whether the value (i.e., a type argument) is a valid type argument.

Our illustration immediately suggested how f-bounded generics can be generalized to doubly f-bounded generics, where both an upper and a lower bounding function (over types) can be specified.

Our illustrating example further allowed us to consider how we may reason about functions (in analysis) that have (what we call) ‘input-side recursion,’ i.e., functions where the definition of a function specifies that the value of the function at some input value is an (upper or lower) bound of the input value.

Accordingly, we suggested how we can reason, in the same way, about the declaration of a generic class with input-side recursion (i.e., where the instantiation of the generic class having the type variable as the type argument is a bound of the type argument, e.g., as in the class declaration class C<T extends C<T>>, where the particular instantiation of class C whose type argument is T is an upper bound of T).

We finally also discussed one of the possible implications of our model of f-bounded generics on the type checking algorithm of nominally-typed OOP languages.

References

Appendix A On Deciding the Domains of Doubly F-bounded Functions Over Partially-Ordered Sets

In this appendix we analyze deciding the domain of doubly f-bounded functions (dfbfs, for short) defined over partially-ordered sets. We mathematically prove that the domain of dfbfs can be decided without resorting to any coinductive arguments (other than inside our proof itself), even in cases where a self-reference may exist in the definition of the domain of such functions. Our proof has immediate implications on supporting doubly F-bounded generics in nominally-typed OOP languages, and on the behavior of the type checking algorithm used in these languages when it checks the validity of parameterized types.

A.1. Motivation

To illustrate how doubly f-bounded generics for nominally-typed OOP may be defined, in the main body of this paper we presented the notion of doubly f-bounded functions that are defined over partially ordered sets. In doubly f-bounded generics, a type variable of a generic class can be lower bounded and upper bounded by instantiations of (other) generic classes that take the type variable as their type argument. As such, doubly f-bounded generic classes can be considered as instances of doubly f-bounded functions where the partial order these functions are defined over is, specifically, the subtyping relation between ground generic types (which is a reflexive, antisymmetric and transitive relation, thus defining a poset over the set of ground generic types).

The question arose, during our presentation, of how to decide the domain of these functions, and whether it can be mathematically proven (probably using a coinductive argument [4]) that the domains can be decided easily (i.e., without explicitly resorting to coinductive arguments in the decision procedure). Hence this appendix. [Footnote 6: Informally, as a proof principle, coinduction states that a property holds if there is no good reason for the property not to hold.]

A.2. Preliminaries

Let be a partially-ordered set. Let be a function defined over (i.e., whose domain and codomain are the same, thus sometimes also called an endofunction or endomap over ). Let , be two other endofunctions over .

In this paper we consider restricting the domain of , using functions and . In particular, we stipulate that a value in the domain of has to be greater than or equal to the value of function at , i.e., that , and that it, i.e., , has to be smaller than or equal to the value of function at , i.e., that . This restricted-domain function can be expressed succinctly as

We call such restricted-domain functions doubly f-bounded functions (or, dfbfs, for short).

A.3. Deciding Domains of Doubly F-bounded Functions

In the main body of this paper we gave examples that illustrate how the domain of dfbfs from analysis (i.e., defined over the real numbers) can be decided, seemingly easily using the plots of the functions involved. That included even examples for the special cases (of practical interest) where the defined function is itself one of the two bounds of its own parameter (but not both), i.e., the cases [Footnote 7: We call these dfbfs ones with ‘input-side recursion’ or with ‘input-side self-reference’.] where the definition of can be expressed as

It should be noted that if is used as the bounding functions for both bounds of , then the restricted-domain will be defined only for the fixed points of (since then can be expressed as which then is equivalent to , which states that is defined only for its own fixed points.)

To the best of our mathematical knowledge (as of today), fixed points of functions can be found, iteratively, if is a complete partial order (CPO) and the function being defined is monotonic (i.e., if ). But for general functions (i.e., ones that may not be monotonic) defined over general (i.e., not necessarily complete) partial orders, no general method for finding fixed points exists.

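Further, if the partial order is a pointed CPO (i.e., has a least member, ⊥, usually called ‘bottom’) and the function is monotonic, then a least fixed point of the function is guaranteed to exist (a standard order-theoretic result; the Knaster-Tarski theorem gives it for complete lattices). If the function is moreover continuous, then by Kleene's fixed-point theorem the least fixed point (lfp) can be found by iterating the application of the function starting from ⊥, i.e., by computing the sequence that starts at ⊥ and applies the function once, twice, and so on. On a finite poset this amounts to iterating until a fixed point is found (i.e., until two successive values in the sequence are the same).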
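The iteration from bottom can be sketched on a finite poset, where it is guaranteed to terminate for a monotonic function. The sketch below uses subsets of a small node set, encoded as bitmasks and ordered by inclusion, with the empty set as bottom. The function step and its edges are hypothetical.

```java
import java.util.function.IntUnaryOperator;

// Hypothetical sketch: least fixed point by iteration from bottom.
final class LeastFixedPoint {
    // Iterates f from 'bottom' until two successive values coincide.
    // Termination is assumed: f is monotonic on a finite poset.
    static int lfp(IntUnaryOperator f, int bottom) {
        int current = bottom;
        while (true) {
            int next = f.applyAsInt(current);
            if (next == current) {
                return current;
            }
            current = next;
        }
    }

    // Assumed monotonic function on subsets of {0, ..., 4}: adds node 0 and
    // the successors of the nodes in the set, for edges 0->1, 1->2, 3->4.
    static int step(int set) {
        int result = set | 0b00001;                  // always contains node 0
        if ((set & 0b00001) != 0) result |= 0b00010; // edge 0 -> 1
        if ((set & 0b00010) != 0) result |= 0b00100; // edge 1 -> 2
        if ((set & 0b01000) != 0) result |= 0b10000; // edge 3 -> 4
        return result;
    }
}
```

Starting from the empty set, the iteration visits {0}, {0, 1} and {0, 1, 2} and then stops. The full set {0, ..., 4} is also a fixed point of step, but it is not the least one.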

However, while deciding the domain of dfbfs we are not explicitly seeking fixed points. We therefore conjecture that our problem (i.e., deciding the domains of dfbfs) may be simpler than finding fixed points, and thus needs neither a completeness condition on the partial order, nor a monotonicity condition on the function, nor an explicit coinductive argument in solving it (as suggested by the illustrating dfbfs from analysis).

A further reason for us to not consider seeking fixed points is the context of our application (i.e., the context in which we wish to apply our result). As we pointed out in Footnote 2 in the main body of this paper, in doubly F-bounded generics, due to nominal subtyping (i.e., that subtyping has to be explicitly declared), it is impossible for any type T to be equal to the instantiation of a generic class C with type T as the type argument of the class (i.e., in generic nominally-typed OOP, for no type T can we have T=C<T>).

As such we can safely, i.e., without loss of generality, restrict our attention to finding domains of dfbfs having definitions of the form

(without an equality possibility) whose domain (a subset of ) we call (the subset of having values of that are valid as arguments to ).

It is our assertion that the domain of such a function is the same as the domain of a dfbf (over ) with a definition of the form

where and are functions that have the same “expression” as , but where has/gives valid values corresponding to all elements of (i.e., the domain of is the whole of , and is not restricted to a subset of it). [Footnote 8: Like the type Enum<Object>, the values produces for ‘admittable but invalid values of ’ (i.e., for ) are also called ‘admittable but invalid values for ’, i.e., ones that can be obtained by not restricting the domain of (i.e., are validly obtainable from ) but that cannot be (validly) obtained from .]

Theorem 1.

The f-bounded functions and define the same function, i.e., .

Proof.

First, we prove that functions and have the same domains.

We reason by cases as follows:

If (i.e., is in the set of valid arguments to ) then , and thus we also have (since ).

If , then is undefined, or, more precisely, is an “invalid value”, and, by coinductive reasoning [4], we know that and thus, by the definition of (i.e., the domain of ), we have . [Footnote 9: If we had then, by coinductive reasoning [4], we would also have (since the invalidity of the value is not a good reason for not to hold), and thus, by the definition of , we would have , which is a contradiction. Thus, we have and, by the definition of , . We believe coinductive reasoning is correctly used here (even though, as usual, it sounds like ‘a sleight of hand’ [4], and although we do not present an explicit coinductive step), and that coinductive reasoning is used here once and for all, i.e., that there is no need for coinductive reasoning to be used (or even mentioned) outside this proof. In particular, we believe Theorem 1 should be used directly (e.g., in analysis, in doubly F-bounded generics, or elsewhere), without need to reference its coinductive proof.]

As such, we have and , and thus .

Secondly, since, by our choice of , we have , then, using the extensionality of functions, given that and have the same domain (and codomain), we have

An immediate consequence of proving Theorem 1 is that the reasoning method (i.e., assuming the bounding functions of dfbfs to have unbounded domains) that we used in Section 4 of the main body of this paper (to decide the domains of doubly F-bounded functions and doubly F-bounded generics, including even ones with input-side recursion) is mathematically sound.

A practical consequence of the proof, which we also discussed in the main body of this paper, is that the Java type checker (i.e., during the compilation of Java programs) does not need to resort to infinite types or to explicit coinductive arguments when it is checking the validity of type arguments of generic classes (e.g., during its checking of the validity of parameterized types).