A resource is a value that is hard to copy or dispose of. A typical resource is a large data structure, a file handle, a socket, a lock, a value from a cache, an exclusive access to a mutable, a value originating from a different runtime, or a continuation. It is also any data structure composed from such resources, such as a list of resources or a closure containing a resource. Support for resource management in programming languages (PLs) is a concern for safety, efficiency, interoperability and expressiveness.
This is a proposal for a resource-management model compatible in broad strokes with the OCaml (https://ocaml.org/) philosophy and runtime model. By abstracting a few low-level technical details, it can also be read more generally as a model for other languages in the ML family.
It considers new values and types that own or borrow resources, similar to ownership/borrowing in C++11 (https://isocpp.org/) and Rust (https://www.rust-lang.org/), in addition to the current GC types. It is motivated by addressing issues that arose during discussions with several Serious Industrial OCaml Users concerning safety, efficiency, interoperability, and expressiveness in the presence of resources: the inadequacy of finalizers for timely disposal of large pools of resources, the unpredictability and limitations of unboxing, the difficulty of interfacing with resource-intensive libraries such as Qt (https://www.qt.io/), the difficulty of cleaning up resources reliably with fibers in ocaml-multicore (https://github.com/ocamllabs/ocaml-multicore), the need for affine closures with effect handlers, and more.
By OCaml philosophy, what is meant is reaching a sweet spot combining:
1. A safe type system that helps instead of hindering,
2. Lightweight and expressive abstractions,
3. An efficient runtime.
This proposal focuses on levels 2 and 3 above. It is deliberately vague about level 1, because resource-friendliness is deeply rooted in the computational behaviour (i.e. level 3), as will become clear. Let us put aside the idea that one can start from ideas for a type system and expect interesting computational behaviour to suddenly appear; instead, the type system should come second, in service of a convincing design for computational aspects. The first challenge for OCaml, tackled here, is to get level 2 right, such that level 3 can become a realistic, conservative, and useful extension of the current runtime. Level 1 is expected to require substantial effort; hopefully the proposal provides sufficient motivation for such an effort. Besides, we will see that there is ample prior work addressing various questions for a practical type system, already at work separately in the languages Rust and Alms.
The model is inspired by Stroustrup’s RAII (Stroustrup, 1994) and Hinnant et al.’s move semantics in C++11 (Hinnant, Dimov, and Abrahams, 2002). RAII (“Resource acquisition is initialisation”) proposes to integrate error handling and resource management by attaching destructors to types: clean-up functions that are called automatically and predictably when a scope ends, whether by returning or due to an exception being raised. It is an essential ingredient in the basic exception-safety guarantee (Stroustrup, 2001) which requires that functions that raise an exception do not leak resources and leave all data in a valid state. In the words of Ramananandro, Dos Reis, and Leroy (2012), RAII enforces invariants about the construction and the destruction of resources predictably and reliably.
In contrast, finalizers as currently used in OCaml, that is, clean-up functions called by the garbage collector, are not predictable, are not guaranteed to be run, and allow making values reachable again (Minsky, Madhavapeddy, and Hickey, 2013, Chapter 21; similar points are made for finalizers in other languages in Stroustrup, Sutter, and Dos Reis, 2015). In OCaml, which thread calls the finalizer is even explicitly unspecified (http://caml.inria.fr/pub/docs/manual-ocaml/libref/Gc.html). Finalizers appear to be commonly considered inappropriate for managing resources.
In addition, while the original resource-management model of C++ was criticised for its over-reliance on deeply copying values (among others), Hinnant et al. proposed to introduce a new kind of types in C++ (rvalue references) that made it possible to express the moving of resources. In particular:
it supported a polymorphism of resource management: the management of a type is by default deduced from its components,
it supported the conservative extension of data structures and algorithms from the standard library to operate on resources,
all the while retaining backwards compatibility with existing code (sometimes even speeding it up by removing unnecessary copies).
Together with the extensive use of (unsafe) passing by reference and a rudimentary form of reference-counting garbage collection expressible with RAII (shared_ptr), this is advocated as the new resource-management model of C++11 (Stroustrup, Sutter, and Dos Reis, 2015). Ownership types, and regions as in MLKit (Tofte and Birkedal, 1998) and Cyclone (Jim et al., 2002; Grossman et al., 2002), have also been proposed as abstractions amenable to static analyses, which has inspired Rust’s ownership-and-borrowing model strengthening the C++11 model with static safety guarantees and a novel design for preventing data races (Anderson et al., 2016).
This proposal, however, should not be seen as just trying to extend OCaml with C++ idioms. Its starting point was the similarities between C++’s resource polymorphism and polarisation in proof theory, as well as a rational reconstruction of destructors in the linear call-by-push-value categorical model (Curien, Fiore, and Munch-Maccagnoni, 2016). This suggested several aspects of this proposal, by bringing to light the deep compatibility of the C++11/Rust resource-management model with functional programming. This continues a synthesis of systems programming’s resources and linear logic’s resources initiated in a series of rarely-mentioned articles by Baker (1994a, b, 1995).
In the end, polishing RAII brings additional similarities with Rust, a runtime model that fits that of OCaml, and applications that go beyond a simple replacement of finalizers. This proposal is an element in a broader thesis that RAII hides a fundamental computational structure that has not yet been given the exposure it deserves.
1.1 The relevance of C++11 for OCaml
The proposal will be reminiscent of linear types, regions, uniqueness types, ownership types, and borrowing, à la Linear Lisp, Clean, Cyclone, Rust, etc. (sec:Comparison-with-existing offers a more detailed comparison.) But in comparison to these languages, the experience of the move from C++98 to C++11 stands out for OCaml for three reasons:
Both are established languages that need to preserve the meaning and performance of large amounts of legacy code.
Both have to deal with exceptions, a core part of their design, much more prominently than in Rust in which exceptions (panic) are restrictive and discouraged by design, or than in other languages in which they are absent.
Both are designed around light, efficient, and predictable abstractions.
Of course, C++11 and OCaml differ greatly in other technical and principal aspects, and their communities do not overlap much, which would explain why, if there is any value to this proposal, it has not been proposed before, besides the example set by Rust. In addition, some experience with mathematical models of PLs in the categorical tradition has helped (the author at least) reading between the lines of the C++11 specification and of various idioms that arose from it, and extracting its substance. The inputs from semantics are explained in the next section.
While the crucial ideas for this proposal are RAII and move semantics from Stroustrup and Hinnant et al., the end result is probably closer to Rust. This is because Rust itself was inspired by both C++ and ML among others (Anderson et al., 2016). Compared to C++, Rust offered language support for isolating the “unsafe” parts of the code in libraries. This means that in most of user code, what are merely best practices in C++11 are enforced by the Rust compiler, thereby providing strong static safety guarantees. Moreover, by treating mutable state as a resource, Rust tracks aliasing, ensuring that no data races are due to bugs in “safe” mode. This tour de force for an industrial language drew the attention of the academic PL community. C++ in turn follows the example of Rust with the ongoing Core Guidelines initiative (Stroustrup and Sutter, 2015) aimed at standardizing and tooling a “smaller, simpler, safer language”, with ideas similar to Rust's but an emphasis on easy migration of legacy code.
This proposal suggests that a similar path is possible for OCaml, where “small, simple” (and efficient) is retained from the OCaml that everyone likes, and where “safer” is achieved for resources and concurrency, without sacrificing the expressiveness of a functional, GCed language. This proposal focuses on resource safety; Rust also tries to solve the problem of data races. In the current proposal, OCaml’s unrestricted shared mutable state is kept for the sake of backwards compatibility. Proposing a solution to data races in OCaml à la Rust in one go would be ambitious. The requirement of backwards compatibility can seem a convenient excuse to not propose one right away, but in fact it shows an opportunity to first integrate some language features that are already useful for resource management, while at the same time providing a richer playground for tackling concurrency problems in the future. Nevertheless, subsec:Languages-with-control returns to this suggestion on an optimistic note.
1.2 RAII, resource polymorphism, and GC, from a semantic point of view
RAII is an idiom in which a destructor is attached to a resource, according to its type. The destructor is called predictably when the variable goes out of scope, including due to an exception being raised. With RAII, one can allocate resources on the heap, and have the resources automatically collected predictably and reliably, bypassing garbage collection. The destructor deallocates memory, and can be customised by the language or the user to perform other clean-up duties.
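The scope-tied behaviour described above can be approximated in current OCaml with `Fun.protect` from the standard library (OCaml 4.08 and later). The following is only a sketch: `with_resource`, `acquire`, `release`, and the event log are hypothetical names introduced for illustration, and the clean-up is attached by hand at the use site rather than to the type as the proposal intends.

```ocaml
(* Emulating a scope-tied destructor with Fun.protect.
   [with_resource] is a hypothetical helper, not part of the proposal. *)
let with_resource ~(acquire : unit -> 'r) ~(release : 'r -> unit)
    (body : 'r -> 'a) : 'a =
  let r = acquire () in
  (* [release r] runs when [body r] returns and also when it raises. *)
  Fun.protect ~finally:(fun () -> release r) (fun () -> body r)

(* A log of events, used only to observe the order of operations. *)
let log : string list ref = ref []
let record s = log := s :: !log

let acquire () = record "acquire"; 42
let release _ = record "release"

(* Normal return: the clean-up runs after the body. *)
let normal = with_resource ~acquire ~release (fun r -> record "use"; r + 1)

(* Exceptional exit: the clean-up still runs, then the exception propagates. *)
let raised =
  try with_resource ~acquire ~release (fun _ -> record "use"; failwith "boom")
  with Failure _ -> -1

let events = List.rev !log
```

In both runs the log records the acquisition, the use, and then the release, which is the predictability that finalizers do not offer.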
Three intuitions arising from mathematical models were the starting points of this proposal:
Hinnant et al.’s resource polymorphism for C++11 coincides with Girard’s polarity tables in proof theory, which describe compound connectives in terms of basic ones (Girard, 1991, 1993). In proof theory, the goal of polarity tables is to minimize the number of modalities, so as to maximize the number of type isomorphisms (Girard, 1991) and “$\eta$” conversions (Danos, Joinet, and Schellinx, 1997); for PLs these concerns have an immediate application in making programs easier to reason about. (Girard 1993 describes the mixing of linear, intuitionistic, and classical logics in the same system, which in fact corresponds to a mixing of resources and continuations.) For the purposes of this proposal, a polarity is a type of types closed under constructions. This view is close to that of “kinds as calling convention” (Bolingbroke and Peyton Jones, 2009; Eisenberg and Peyton Jones, 2017), itself inspired in part by Levy (Bolingbroke and Peyton Jones).
Both Girard and Hinnant et al. describe systems in which, instead of a non-copyable pair $A \otimes B$ distinct from a copyable one (maybe written differently, $A \times B$), there is a single pair $A \otimes B$, whose polarity (e.g. whether it is copyable or not) is deduced from its components: $A \otimes B$ is copyable by default as soon as both $A$ and $B$ are; in this case the copy operation consists in the sequence of copies of members, which turns out to have an abstract description in the form of a canonical algebraic structure (Bierman, 1995). (With linearity interpreted as counting uses, a similar idea was proposed in some form by the Clean language, and it has reappeared several times. See subsec:Aside-Origins-of.)
We have noticed, in joint work with Guillaume Combette, that the notion of scope-tied destructors arises naturally when modelling exceptions and local control (return, break…) in the linear call-by-push-value (LCBPV) model of effects and resources. CBPV (Levy, 1999, 2004) can be seen as an idealised model of ML-style languages (higher-order, typed, with strict data types and effects) refining the call-by-value $\lambda$-calculus, and LCBPV is a natural decomposition of it generalizing linear logic (Girard, 1987) with hopes that it could serve to model the interaction of effects and resources. Nothing prefigured that the model had to do with idioms from a systems programming language.
The lesson is that trying to model exceptions or local control in LCBPV naturally leads to the rediscovery of scope-based destructors, and of several of their peculiarities which are described below. It provides the perspective that the interpretation of resources as affine types is not at odds with linear logic, but in fact arises from it. In contrast, LCBPV does not justify other resource-management idioms: it attributes no meaning to finalizers, and try…finally (e.g. Java) is as ad-hoc as one expects. This suggests that RAII is a fundamental computing concept, similarly to CPS given that it arises from the same kind of algebraic considerations.
While the previous points suggest that a practical resource-management model has to mix different resource-management techniques as determined by the type, it still remains to explain how an ownership-and-borrowing model can be integrated in a language with a GC. For this, let us draw from the consensus that the GC should not have to perform non-trivial finalization, and make it a definitional principle:
the GC is a run-time optimisation that either delays or anticipates the collection of values that can be trivially disposed of.
From this definition, we will derive a way of mixing GCed values with resources, alternative to using finalizers.
In any case, there is a leap of faith between the linear call-by-push-value models as they currently stand and the current proposal. It would be pointless to get into more mathematical details at this point. Then, what is the value of the abstract point of view? To begin with, a contribution of the semantic view on RAII is to reassure about various peculiarities which might otherwise seem ad-hoc:
It gives rise to a notion of affine types that, instead of looking at odds with the linear logic narrative (why affine rather than linear?), arises naturally from it, and happens to match idioms used in successful industrial languages.
It sheds a natural light on complicated rules, such as the rules for automatic generation of destructors (for instance in C++ if two types $A$ and $B$ have destructors $d_A$ and $d_B$ then the pair of $A$ and $B$ has a destructor which performs $d_A$ and $d_B$ in sequence). Such rules actually describe a canonical mathematical construction and therefore enjoy good properties not necessarily visible from the surface of a language.
It attributes no meaning to exceptions escaping from destructors. C++ likewise gives them no useful meaning: if a destructor throws during stack unwinding, the program is terminated (std::terminate), since otherwise several exceptions could end up being raised at the same time.
It explains peculiarities of pattern-matching in the presence of ownership: the common explanation in terms of universal properties seems to recover the intuitive fact that assuming ownership during pattern matching is only possible if the destructor of the pattern is the default one. This predicts an integration of ownership with pattern matching already explored by Rust.
The value of the model is therefore to encourage a bold change, to which one would not necessarily come in small steps by trying to fix separately the various issues related to resource management in OCaml. We hope to explain the (modest) mathematical aspects in greater detail elsewhere.
sec:Integrating-ownership-and describes the integration of ownership and borrowing with a tracing garbage collector based on a notion of resource polymorphism. subsec:Aside-Origins-of comes back to move semantics and resource polymorphism from a historical perspective. From sec:Ownership-types-affine to sec:Additional-thoughts, the proposed new abstractions are described and examples using them are given. Implementations of the examples are given in current OCaml through a whole-program translation that clarifies the computational meaning of the abstractions and highlights the current limitations of the runtime and type system. In sec:Comparison-with-existing, the proposal is compared to existing PLs.
Thanks to Raphaël Proust for introducing me to the topic of resource management in PLs, and to Leo White for freely sharing thoughts on the situation in OCaml. Thanks to Guillaume Combette: a milestone for this proposal was an elementary reconstruction of RAII in LCBPV which we obtained during his visit at LS2N, and which motivated several aspects of it. Thanks to Frédéric Bour, Thomas Braibant, and François Pottier for freely sharing their interest and experience with this topic. Thanks to Gaëtan Gilbert, Adrien Guatto, Jacques-Henri Jourdan, Gabriel Scherer, and Leo White, for comments and questions about an earlier version of this proposal.
2 Integrating ownership and borrowing with a GC
Let us now describe the integration of a GC in an ownership-and-borrowing system à la C++/Cyclone/Rust, following the definition of the GC as a runtime optimisation for the collection of trivially destructible values. (For the moment the unboxing optimisations are not considered, but they do not fundamentally change the model, see subsec:Unboxing.)
2.1 Owned, borrowed and GCed values
There are GCed values typed with GC types, which never have a destructor. They can be passed and returned freely. These are those already present in OCaml.
Let us add types for resources, which are not managed by the GC but with RAII, and which do have destructors that are called in a predictable fashion. A value of the latter type is owned. Owned values can be moved, which transfers ownership, for instance to the caller, to a callee, or to a data structure. This means that responsibility for calling the destructor is transferred along with it. Ownership types combine in order to form other ownership types (for instance a list of ownership type is an ownership type). Static analysis ensures that a moved value is no longer accessed by the previous owner, for instance with an affine type system.
In addition, owned values can be borrowed. A borrowed value is a copy which is given a borrow type, denoting that the responsibility of calling the destructor belongs to somebody else. Borrow types combine to form other borrow types. There is no restriction on the number of times a borrowed value can be passed, but it should not be accessed after the original value has been disposed of. This can be ensured by a static analysis (typically inspired by type-and-effect systems as introduced in Tofte and Talpin, 1994); practical static analyses combining this idea with ownership/linear types have already been experimented with in Cyclone and Rust (see in particular Fluet, Morrisett, and Ahmed, 2006). Thus borrow types will need to carry annotations similar to Rust’s lifetimes.
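To make the distinction between owning and borrowing concrete, here is a minimal sketch in current OCaml. All names (`owned`, `borrowed`, `create`, `borrow`, `read`, `dispose`, `Use_after_dispose`) are hypothetical. The rule that a borrow must not be used after the original has been disposed of is checked dynamically here, as a stand-in for the static analysis the proposal calls for.

```ocaml
(* Hypothetical sketch: an owned resource and borrows of it, with the
   lifetime rule checked at runtime instead of statically. *)
type 'a owned = { mutable contents : 'a option }  (* None once disposed of *)
type 'a borrowed = Borrow of 'a owned             (* a copy of the pointer *)

exception Use_after_dispose

let create (x : 'a) : 'a owned = { contents = Some x }
let borrow (o : 'a owned) : 'a borrowed = Borrow o

(* Reading through a borrow is only valid while the original is alive. *)
let read (Borrow o : 'a borrowed) : 'a =
  match o.contents with
  | Some x -> x
  | None -> raise Use_after_dispose

(* Only the owner disposes; a borrow carries no such responsibility. *)
let dispose (o : 'a owned) : unit = o.contents <- None

let owner = create "payload"
let b1 = borrow owner
let b2 = b1                 (* a borrow can be passed any number of times *)
let before = read b1 ^ "/" ^ read b2
let () = dispose owner
(* A type system with lifetimes would reject this access statically. *)
let after = try Some (read b2) with Use_after_dispose -> None
```

The dynamic check costs a test on every access; the point of lifetime annotations is to discharge it at compile time.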
There are now three modes of management:
GCed values and GC types
Owned values and ownership types
Borrowed values and borrow types
Each has pros and cons:
GCed values can be copied freely, but cannot have destructors.
Owned values can only be moved, which allows them to support destructors, to be used in a producer/consumer interaction (such as between components of a program, or to receive and pass values from/to other runtimes), to denote uniqueness, and to deal with large structures without impacting the cost of tracing.
Borrowed values can be copied, subject to the restriction that they do not outlive the resource they originate from.
Such a diversity is not realistic without a plausible notion of resource polymorphism:
for the conciseness of the language,
for the expressiveness when mixing GC types and non-GC types,
at the level of types and their meaning, for simplicity and clarity for the user, and
at runtime, for a simple and efficient implementation.
The example of C++11 shows that such a notion of resource polymorphism also helps for backwards compatibility and extensibility of libraries.
The core part of the design is to understand GCed values polymorphically both as borrowed values and as owned values.
2.2 Resource polymorphism and runtime representation (level 3)
RAII is a notion tightly integrated into the runtime. Let us start there. In addition to traced pointers (with lowest bit set to 0), let us use untraced pointers (with lowest bit set to 1). The latter are allocated in the major heap and deallocated using RAII. An untraced pointer can either be borrowed or owned. If borrowed, there is nothing to deallocate. If owned, a compiler-generated destructor is called at the end of the scope, which deallocates the memory.
The following invariant is maintained throughout: any live GCed value is reachable either from the stack/registers, or from registered roots. In order to use a GCed value as a sub-value of an owned value, the GCed value is registered as a root at allocation. The compiler-generated destructor is then in charge of unregistering the root. Thus RAII is essential for the absence of leaks in the presence of exceptions. If the sub-value is in the minor heap, the new pointer is registered as a major-to-minor pointer and treated as such.
This leads to a first table for combining G, O, B types according to whether the resulting value is allocated by GC or not (in order: the strict pair, the type of lists, and the type of borrows, the latter of which is an addition of the proposal):
Structures comprised of owned values are allocated with RAII,
Discarding borrowed values is trivial, so structures comprised of borrowed values and/or GCed values are allocated with the GC,
Any combination of GCed values and owned values is owned; RAII is used to register and unregister the GCed value as a root as explained above.
Thus, the runtime representation of structures alternates GCed phases and non-GCed phases: a GCed value can contain a non-GCed value by borrowing, and a non-GCed value can contain GCed values by rooting. Notice that without borrowing, the heap would have a much simpler structure consisting of a RAII phase with leaves pointing to a GCed phase. From this angle it is clear that borrowing is essential for expressiveness.
Passing (copying) a borrowed or GCed value is done by copying the pointer. Owned values cannot be copied; instead, passing (moving) an owned value involves copying the pointer and setting the original to zero. Recording the move by setting the pointer to zero is required because destructor calls are determined statically, and since resources are affine, it cannot be known statically whether a resource has moved: for instance there can be branching code paths in which only one path moves the resource. Destructors therefore need to know at runtime whether a resource has moved. Of course, this imposes a linear (affine) treatment of owned values.
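The need for a runtime record of moves can be illustrated with a small sketch in current OCaml, where `None` stands for the zeroed pointer. The names `slot`, `move`, `has_moved`, and `maybe_consume` are hypothetical, and the affine discipline is checked dynamically here rather than by the type system.

```ocaml
(* Hypothetical sketch of moving an owned value: the source slot is reset
   (here to [None], standing for the zeroed pointer). *)
type 'a slot = { mutable ptr : 'a option }

exception Moved

(* Transfer ownership out of [s]; moving twice is an error. *)
let move (s : 'a slot) : 'a =
  match s.ptr with
  | Some x -> s.ptr <- None; x
  | None -> raise Moved

let has_moved (s : 'a slot) : bool = s.ptr = None

(* Only one branch moves the resource, so whether [s] still owns it at the
   end of the scope is only known at runtime. *)
let maybe_consume (flag : bool) (s : string slot) (sink : string list ref)
    : unit =
  if flag then sink := move s :: !sink

let sink : string list ref = ref []
let kept = { ptr = Some "a" }
let given = { ptr = Some "b" }
let () = maybe_consume false kept sink   (* [kept] still owns its value *)
let () = maybe_consume true given sink   (* ownership went to [sink] *)
```

At the end of the scope, a destructor would have to clean up `kept` but not `given`, and only the reset slot tells the two cases apart.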
Thus, for each type, the compiler generates a destructor (whose representation can be defunctionalised), which: 1) tests for zero to detect whether the variable has moved, 2) if not, applies user-supplied destructors, and 3) deallocates. On the back-end, modular implicits (or at least their back-end) can be re-used for generating the destructor, which in turn allows the proposal to scale to abstract types and polymorphic functions: functions polymorphic in an Ownership type variable take an implicit module as an argument, which contains the compiler-generated destructor.
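The three steps, and the passing of destructors as modules, can be sketched in current OCaml with explicit first-class modules standing in for implicit ones. Everything below is hypothetical (`DROP`, `Pair`, `var`, `destroy`, `Name`): it is a hand-written approximation of what the compiler would generate, and deallocation is only modelled by resetting the slot.

```ocaml
(* Hypothetical sketch: destructors passed as explicit modules, standing in
   for the implicit modules mentioned above. *)
module type DROP = sig
  type t
  val drop : t -> unit          (* user-supplied clean-up for one value *)
end

(* Destructor of a pair, derived from those of its components. *)
module Pair (A : DROP) (B : DROP) : DROP with type t = A.t * B.t = struct
  type t = A.t * B.t
  let drop (a, b) = A.drop a; B.drop b
end

(* An owned variable; [None] stands for the zeroed pointer of a moved value. *)
type 'a var = { mutable ptr : 'a option }

(* End-of-scope destructor: 1) test whether the value has moved, 2) if not,
   run the clean-up, 3) release the slot (deallocation in a real runtime). *)
let destroy (type a) (module D : DROP with type t = a) (v : a var) : unit =
  match v.ptr with
  | None -> ()
  | Some x -> D.drop x; v.ptr <- None

let log : string list ref = ref []

module Name = struct
  type t = string
  let drop s = log := ("drop " ^ s) :: !log
end

module Names = Pair (Name) (Name)

let v = { ptr = Some ("a", "b") }
let () = destroy (module Names) v
let () = destroy (module Names) v        (* no second clean-up *)
let moved : (string * string) var = { ptr = None }
let () = destroy (module Names) moved    (* nothing to do for a moved value *)
```

A function polymorphic in an ownership type would take the `DROP` module as an extra argument in the same way `destroy` does, which is the role the text assigns to modular implicits.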
Reference counting can be used for optimising the case where several roots point to the same value. However, this is not the form of reference counting that has been criticised for garbage collection. It avoids the well-known drawbacks of reference counting: the difficulty of collecting cycles, the up-front cost, and cascades of reference count updates. Indeed, the collection is still ultimately performed by tracing, and the cascades are avoided for two reasons:
only the pointer at the interface between the RAII phase and the GCed phase is reference-counted, and
copying roots can only happen by borrowing, and in this case it is not necessary to update the reference count since the reference does not outlive its resource.
All in all, GCed values are determined to be reachable by a mix of tracing and reference counting. There was a definite influence in the design of this proposal from the thesis that all GCs lie in a spectrum between tracing and reference counting (Bacon et al., 2004).
2.3 Resource polymorphism, types, and meaning (level 2)
Let us now provide an explanation to the runtime model in terms of types.
A GCed value is typed by a GC type, an owned value by an ownership type and a borrowed value by a borrow type. The mode of resource management (or polarity) of a type is determined by induction according to the following polarity table:
The modes G, O, B are closed under constructions.
A GC type can be used to form both ownership types (in combination with ownership types) and borrow types (in combination with borrow types).
The type of borrows is a borrow type, except for the type of borrows to a GC type which is a GC type (itself, obviously).
In other words, a GC type can be seen both as an ownership type and a borrow type. Indeed:
A GCed value is owned in the sense that holding the pointer (i.e. copying it to the stack) is sufficient to prolong the life of the value. Moreover, GCed values can be moved in the same way as owned values are moved. The fact that GCed values can be used like owned values can therefore be reflected in the language, so that it is possible to give a single resource-polymorphic implementation to the function which takes two lists as argument and merges them, for instance.
A GC type can also be seen as a borrow type, in the sense that GCed values can be copied without restriction. There is no difference between a GCed value and a resource with trivial destructor allocated at the largest region of the program, if the GC is considered as an optimisation anticipating its collection.
For instance, a GCed value can be extracted from inside a borrowed value and passed in an owned context, prolonging its lifetime.
Lastly, notice that if no borrow or ownership types appear in a type, then it is GC. Substituting “GC” with “default-copiable”, this is the same design that made the addition of non-copiable classes in C++11 backwards-compatible with C++98.
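The rules stated in this section can be written down as a small function on modes. This is a sketch under assumptions: the names `mode`, `pair`, `list_of`, and `borrow_of` are hypothetical, only the cases spelled out in the text are encoded, and the case of a pair mixing an owned and a borrowed component, which the text above does not state, is deliberately left undecided.

```ocaml
(* Hypothetical encoding of the three modes and of the rules stated above. *)
type mode = G | O | B

(* Mode of a strict pair, deduced from the modes of its components. *)
let pair (a : mode) (b : mode) : mode option =
  match a, b with
  | G, m | m, G -> Some m     (* a GC type behaves as either O or B *)
  | O, O -> Some O
  | B, B -> Some B
  | O, B | B, O -> None       (* not stated in the text; left undecided *)

(* Lists inherit the mode of their elements (closure under constructions). *)
let list_of (m : mode) : mode = m

(* Mode of the type of borrows of a value of mode [m]. *)
let borrow_of (m : mode) : mode =
  match m with
  | G -> G                    (* a borrow of a GC type is that GC type *)
  | O | B -> B
```

In particular, `pair G G` is `Some G`: a type built only from GC types stays a GC type, which is the backwards-compatibility property noted above.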
3 Aside: history of move semantics and resource polymorphism
3.1 The promises of linear logic
It has been suggested from the beginning that the design and implementation of functional programming languages could take inspiration from linear logic (Girard, 1987) and its decomposition of intuitionistic logic. Lafont (1988) took inspiration from intuitionistic linear logic to propose the mixing of strict and lazy evaluation as well as GC-less automatic memory allocation, and justify in-place update of linear values. Safety of parallel and concurrent programs by means of static typing has been another promise of linear logic (Abramsky, 1993). These applications are closely related to continuation-passing-style models (Berdine et al., 2000) and syntactic control of interference (Reynolds, 1978; O’Hearn et al., 1999).
In a series of visionary articles, Baker (1994a, b, 1995) has proposed the integration in functional programming of concepts and implementation techniques from systems programming, using abstractions directly inspired from linear logic and supported by an extensive bibliography of implementation techniques spanning more than three decades. He described many ideas that are now at the basis of the C++11 and Rust resource-management models.
Moving values, Baker (1994a) argues, is a more fundamental operation than copying, and can be implemented by permutations of the stack reminiscent of the structural rules of linear logic. Copying should be explicit, or disabled when meaningless. It is noticed that this linear treatment supports C++-like destructors, and helps avoid synchronisation. It is also noticed that tail-call optimisation, far from being hindered, is in fact the default behaviour in this model.
Baker (1994b) describes what is essentially the modern usage of reference-counted pointers in C++11 and Rust, in which an alternation of moving, borrowing, and deferred copying, is used to minimise reference-count updates. Moreover it is suggested that similar linearity considerations can also be useful for tracing GCs.
In Baker (1995), linear values are advocated as a modular abstraction for resources in the sense of systems programming. Swapping with an empty value is mentioned as an alternative to permuting the stack. Linear abstract data types are proposed as a way to enforce the linearity of types that mix linear and non-linear components by protecting the underlying representation. Linearity of continuations justifies efficient implementations of control operators. The compatibility of the model with exceptions and non-local exits similarly to C++ destructors is mentioned, as well as the added expressiveness of moving resources compared to old C++. And more: only limited accounts of these rich texts can be offered here.
It is clear, at least, that Baker has made the connection between resource management and linear logic, including the compatibility with RAII, and invented move semantics in the process.
Baker admits that the linear discipline is heavy. The only way to pass an argument is to move it, and functions have to return unconsumed arguments alongside the return value in a tuple. Borrowing, which had been considered for reference counting, has not been considered for resources. Minsky (1996) proposed the unique pointer, precursor to the C++11 unique_ptr, which relaxes the move-only discipline by allowing non-consumable parameters, essentially the possibility to pass the resource by copiable reference. In order to ensure the absence of use after free, references to unique pointers are subject to drastic syntactic usage restrictions. Static analyses in the style of Cyclone (Jim et al., 2002) had not emerged yet. Hinnant et al. (2002) managed to integrate move semantics in a backwards-compatible extension of C++ including unique_ptr.
These works of Baker were perhaps too far ahead of their time. We have found no mention of them in the rest of the literature about applications of linearity in functional programming. Elsewhere, when they were mentioned, it was to succinctly point out their limitations (Minsky, 1996; Clarke et al., 1998; Clarke and Wrigstad, 2003, and other articles remote from the current discussion, mostly in the context of control of aliasing, often in the context of object-oriented programming). They do not appear to have been recognised for what they are: considerations of language design for resource management, inspired by a connection between foundational works in logic and the practice in PL and systems implementations. Here, too, resource management is seen as more general than control of aliasing, and linear logic inspires aspects of language design that come before the type system.
3.2 Resource polymorphism
Many lessons on substructural type systems are summarised in Walker (2005), such as the practical imperative of a notion of polymorphism for the various linearity restrictions, or the interpretation of the exponential modality of linear logic as reference-counting by Chirimar et al. (1996). The latter interpretation suggests an analogy between resource modalities and smart pointers, which hinted at a further understanding of practical resource management from the point of view of Girard’s polarities.
To sum up, the proposed notion of resource polymorphism is supported by three features:
1. Resource management modes are polarities.
2. Polarity tables define a polymorphism of data types.
3. A notion of subtyping between polarities extends polymorphism: especially, GCed values are both owned and borrowed.
This phrasing uses concepts from Girard (1991, 1993), but these three features are at the basis of C++’s RAII and its later extension with move semantics. For instance, (1.) in C++, default destructors and copy operations are automatically defined, and (2.) in C++11 the non-copiable character of a type is inherited. In addition, (3.) containers such as std::vector are polymorphic in the resource management mode: copy operations are disabled at compilation when they are meaningless, using the SFINAE idiom. These features are also at work in Rust (where traits such as Copy, Drop, or Sized play the role of polarities), and, beyond RAII-based languages, they can be seen to some extent in Clean, ATS (http://www.ats-lang.org; Zhu and Xi, 2005), and others.
The view of GC types as simultaneously owning and borrowing can be approximated with reference-counting pointers in C++ and Rust, although this usage has the practical issues of reference-counted garbage collection, in addition to being syntactically heavy.
Polarity tables can be considered an automatic and predictable selection of the best resource-management mode for a value. In this sense, resource polymorphism is a way to fill the static-automatic gap (Proust, 2016) in the design space of resource management, using abstractions that are compositional.
An originality of C++ and Rust’s take on linearity is to emphasize the how? instead of the how many?, by assigning computational content to the copy, move, and drop operations. It is old folklore in linear logic that distinct exponential modalities can coexist, so any interpretation in terms of “counting the uses” has to miss part of the message. Strikingly, RAII can be seen as arising from shifting attention from can this value be disposed of? to how is this value to be disposed of?.
In contrast, many investigations into linear type systems interpret linearity as counting uses, starting with Wadler (1990). Among the works that are of close interest to this proposal, this is the case in Kobayashi (1999), Hofmann (2000), Shi and Xi (2013) and Tov and Pucella (2011). Despite this limitation, they all present interesting use cases of linear/affine types, such as capabilities, optimisations and finer memory management. As an exception, there is a qualitative (as opposed to quantitative) interpretation of substructural type systems with type classes in Gan, Tov, and Morrisett (2014), which mentions the analogy with C++’s custom copy and destruction operators, although it lacks developed examples and is not designed for exception-safety. In this proposal, the source of inspiration for paying attention to the qualitative vs. quantitative aspects is the construction of actual models of linear logic where this computational content appears, such as those of Bierman or Lafont (see Melliès, 2009, for a survey). From this angle, C++ move operators are analogous to monoidal symmetry.
In C++ and Rust, parametric resource-polymorphism (3.) is obtained with templates, with the usual limitations and drawbacks (duplication, lack of scalability to richer type systems, and in the case of C++, poor error messages and slow compilation). In a language with proper parametric polymorphism and abstract types, type variables must be given polarities. Cyclone proposes a notion of subkinding (Grossman, 2006), and a similar feature closer to our context was further explored in Alms (Tov and Pucella, 2011). A notion of polymorphism of calling conventions, expressed as a polymorphism of kinds (as opposed to types), was developed in Eisenberg and Peyton Jones (2017), and later reused for multiplicity polymorphism in Bernardy, Boespflug, Newton, Peyton Jones, and Spiwack (2018).
The current proposal requires an approach that scales to the qualitative interpretation of linearity; essentially, destructors need to be passed when instantiating ownership type variables. Here, functions polymorphic in an ownership type are proposed to depend on an implicit module supplying the destructor. This idea, for which the designs of Grossman and of Tov and Pucella are the most fitting, was explored in Gan et al. (2014) with type classes. In terms of a runtime model, this idea is also similar in essence to the tag-free approach in Morrisett (1995), though applied to destructors only. Tov and Pucella (2011) is also a source of inspiration for showing the existence of principal usage qualifiers with the subkinding approach, and it will come back several times in the rest of this proposal.
4 Ownership types: affine types with destructors
Let us now describe additions to OCaml and give examples of uses. The code given throughout is of two kinds.
The model describes the observational behaviour, but does not respect the runtime model: it is heavy and inefficient. Also it does not check lifetimes, and linearity must be enforced by hand, so it does not model a type system. It models level 2, the language abstractions. It is similar in spirit to Stroustrup’s dynamic model of ownership alluded to in Stroustrup et al. (2015): “useless”, “inefficient”, “incompatible”, but “useful for thinking about ownership”. As such, it also underlines the limitations of the current OCaml runtime and type system.
4.1 Declaring a custom ownership type
Consider a new type declaration for affine types, given as a pair of a base type and a user-specified destructor.
Let us make more precise what is meant by such a pair. In Rust’s terminology, consider a Drop trait.
Example: an input file.
This declaration creates a new type file_in. It has the same runtime representation as in_channel. However, being a new type has two consequences: one cannot pass an in_channel to a function that expects a file_in, and one can define two different types of resource with different destructors over the same base type.
We do the same in the dynamic model:
To each affine type, associate a module as follows:
The runtime of the dynamic model is as follows:
An affine type is now declared as follows:
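To fix ideas, here is a minimal plain-OCaml sketch of such a dynamic model. It is an illustration under assumptions, not the proposal's own code: the names Droppable, Owned, and Fake_file are hypothetical, moves are checked at runtime with an option slot, and a counter stands in for closing a real file handle.

```ocaml
(* Hypothetical dynamic model: all names here are illustrative. *)
module type Droppable = sig
  type t
  val drop : t -> unit (* user-specified destructor *)
end

(* An owned cell: the resource is present until it is moved or destroyed. *)
module Owned (D : Droppable) = struct
  type t = { mutable slot : D.t option }

  let create v = { slot = Some v }

  (* Moving empties the source, so a second move is detected at runtime. *)
  let move o =
    match o.slot with
    | Some v -> o.slot <- None; { slot = Some v }
    | None -> invalid_arg "Owned.move: value already moved"

  (* Destroying runs the destructor at most once; moved values are ignored. *)
  let destroy o =
    match o.slot with
    | Some v -> o.slot <- None; D.drop v
    | None -> ()
end

(* Example instance: a string stands in for a file handle. *)
let closed = ref 0

module Fake_file = Owned (struct
  type t = string
  let drop _ = incr closed
end)
```

As in the model described above, linearity is not enforced statically here: a violation only shows up as an Invalid_argument exception at runtime.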
4.2 Creating an owned value
To create a file_in from an in_channel, consider some new syntax:
Example: create an opened file and return it as a resource.
Returning a resource transfers ownership of the resource to the caller.
An owned value is destroyed when it goes out of scope without being moved.
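The scoping rule can be approximated in current OCaml with Fun.protect. The sketch below is only an assumption-laden illustration: the type owned and the functions make, move, drop, and scope are hypothetical names, and the "has it been moved?" test is done at runtime rather than by a type system.

```ocaml
(* Hypothetical model of an owned value carrying its own destructor. *)
type 'a owned = { mutable slot : 'a option; destroy : 'a -> unit }

let make destroy v = { slot = Some v; destroy }

(* Transfer ownership: the source is emptied, so its scope exit is a no-op. *)
let move o =
  match o.slot with
  | Some v -> o.slot <- None; { slot = Some v; destroy = o.destroy }
  | None -> invalid_arg "move: value already moved"

(* Run the destructor if the value is still owned here. *)
let drop o =
  match o.slot with
  | Some v -> o.slot <- None; o.destroy v
  | None -> ()

(* A scope owning [o]: [o] is dropped on exit, normal or exceptional,
   unless [body] has moved it out. *)
let scope o body = Fun.protect ~finally:(fun () -> drop o) (fun () -> body o)
```

Returning move a from the body of scope models the transfer of ownership to the caller: the destructor does not run when the scope ends, and it becomes the caller's responsibility.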
4.3 Moving an owned value
An affine value can be moved but not copied.
If one copies instead of moving, one can have other kinds of errors. The following two have no source equivalent:
A resource cannot be copied, but it can be borrowed. A borrowed value can be copied without restriction, but it cannot be used after its resource is destroyed. A borrowed value does not destroy the resource when it goes out of scope. It is created with the following syntax:
The borrow type satisfies
and, for any GC type t, &t = t.
At runtime, &x and x have the same representation. When x is owned, the difference lies in &x being always copied and x being always moved.
5.1 Example: safe reading from a file
Example (compare with the equivalent one in https://ocaml.org/learn/tutorials/file_manipulation.html which has one try/with and two explicit calls to close_in):
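For comparison, the closest approximation in plain OCaml today relies on Fun.protect. The sketch below is not the proposal's syntax: with_file_in and first_line are hypothetical helpers, and nothing prevents the channel from escaping the callback, which is precisely what lifetime checking would rule out.

```ocaml
(* Plain-OCaml approximation; [with_file_in] is a hypothetical helper. *)
let with_file_in path body =
  (* If opening fails, the exception is raised before any resource exists. *)
  let ic = open_in path in
  (* The channel is closed exactly once, on normal or exceptional exit. *)
  Fun.protect ~finally:(fun () -> close_in_noerr ic) (fun () -> body ic)

(* Read the first line of a file, if any. *)
let first_line path =
  with_file_in path (fun ic ->
      try Some (input_line ic) with End_of_file -> None)
```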
5.2 Example: use-after-free
Borrowing can induce more subtle bugs than linearity violations, which requires checking that borrows do not outlive their resource. Cyclone and Rust propose a compositional analysis inspired by type-and-effect systems that assigns lifetime annotations to borrow types.
In OCaml, applying input_line on a closed in_channel gives:
Sys_error "Bad file descriptor"
In contrast, open_file enforces that a file_in is always open (if it is not possible to open it, it safely raises an exception before creating the resource). Thus, this error should not arise.
First, we need to hide the definition of file_in. Indeed, as long as it is known that &file_in = in_channel, then it is possible to let the file handle escape in the old way, given that in_channel is GCed and can therefore be taken possession of freely (it needs to be so, because we are interfacing with the legacy OCaml library: this is just the unsafety of old in_channel surfacing).
The following program tries to use the resource after it has been freed, and is expected to fail at compilation.
The dynamic model does not perform this static analysis, and is therefore allowed to violate the user’s invariant:
6 Polarity tables and pattern matching
6.1 Pair of owned
Recall the polarity tables from Section 1. The pair (x,y) is affine as soon as either x or y is affine. Its compiler-generated destructor is obtained by combining those of x and y in reverse order of creation (after testing for zero, and before deallocating the cell).
Implicit modules are not available yet, so the dynamic model uses plain modules with explicit instantiation.
Example: in the following example, if the first open_file raises an exception, the second one is closed automatically. If the result of this function is later dropped, both files are closed then.
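The behaviour described for pairs can be spelled out by hand in plain OCaml. This is a sketch under assumptions: acquire, release, acquire_pair, and drop_pair are hypothetical names, strings stand in for file handles, and the acquisition order (second component first) is written explicitly to mirror the text rather than left to the evaluation order of tuples.

```ocaml
(* Records the order in which resources are released, most recent first. *)
let release_log : string list ref = ref []

(* Stand-ins for opening and closing a file; the empty name fails to open. *)
let acquire name = if name = "" then failwith "cannot open" else name
let release name = release_log := name :: !release_log

(* The second component is acquired first; if acquiring the first one then
   fails, the second is released before the exception propagates. *)
let acquire_pair a b =
  let y = acquire b in
  let x = try acquire a with e -> release y; raise e in
  (x, y)

(* Destructor of the pair: components go in reverse order of creation. *)
let drop_pair (x, y) = release x; release y
```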
6.2 Heterogeneous pair
If the pair is heterogeneous, e.g. x is owned and y is GCed or borrowed, then an implicit coercion of y from GCed to owned is introduced, by registering a root and assigning a destructor that unregisters the root. The runtime representation of the value remains the same. These runtime details do not appear in the model, and the coercion from GCed to affine here is trivial.
One can pattern-match on an affine tensor with default destructor, and this involves no additional operation compared to usual pattern-matching. For instance, the following function takes an affine type and returns the string:
Consistently with the type which indicates that the result is not a resource, the result value can be copied without restrictions.
In the model, this has been expressed by defining Tensor.t and Affine_of_GCd.t non-abstractly.
When returning the second component instead, the resource y is still alive after the destructor of (x,y) runs: indeed, when that one runs, y has already moved, and in the proposed runtime, all that the destructor sees in place of y is a null pointer, which it ignores.
In contrast, one cannot pattern-match on a custom affine value whose underlying type is a tensor, but only on borrows of such values.
One can define sum, option and list types similarly:
6.4 Example: a resource-safe interface to Mutex
Using RAII one can ensure that all locks are released. This example could be given earlier, except for try_unlock which returns an affine option type.
First we recall the mutex signature from the library.
6.5 Example: try-locking a list of mutexes and releasing them reliably
(While it can be polymorphic in the type of lockable values, it is not done here for simplicity.)
In particular the locks are released in reverse order. While not important for mutexes, the ability to enforce the order of destruction can be important for some other resources (think of transactions that need to be rolled back).
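The release discipline can be illustrated without threads. In the sketch below, the record type lock and its operations are hypothetical stand-ins for mutexes, not the Mutex API; the point is only that an accumulator built during acquisition yields the reverse release order for free.

```ocaml
(* Hypothetical single-threaded stand-in for a mutex. *)
type lock = { name : string; mutable held : bool }

let try_lock l = if l.held then false else (l.held <- true; true)
let unlock l = l.held <- false

(* Try to take every lock in order. On failure, release those already taken,
   most recent first, and return None. On success, return the taken locks,
   most recent first. *)
let try_lock_all locks =
  let rec go acquired = function
    | [] -> Some acquired
    | l :: rest ->
      if try_lock l then go (l :: acquired) rest
      else (List.iter unlock acquired; None)
  in
  go [] locks

(* Releasing the result of [try_lock_all] goes in reverse acquisition order. *)
let unlock_all acquired = List.iter unlock acquired
```

In the proposal, the role of unlock_all would be played by the compiler-generated destructor of the owned list.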
6.6 Borrowed values and pattern-matching
When borrowing a pair, one gets a pair of borrows.
In other words, & is a homomorphism from affine to copiable types. (Remember that &string = string due to its G polarity.) This allows us to pattern-match on borrows.
In the above example, the value of type &b * &a * &b, of polarity B, is allocated with the GC (see the table on runtime representation). Thus, a pair of borrows can either be allocated with RAII, or with the GC: the allocation method of a value of polarity B is not always statically known. In particular,
Indeed, string * in_channel is of polarity G, and always allocated with the GC, whereas string * &file_in can be obtained by borrowing an owned pair. The equation
only holds in outermost position in the type, as in:
Other data types are treated similarly. The following are implicit definitions in the language.
From the point of view of types, this design is likely to raise interesting questions for type inference, with variants to investigate.
6.7 Example: Zipper
As an example of application of this design, it is possible to explore an owned tree with a Zipper (Huet, 1997). Its implementation is not reproduced here because it is identical to the original one, up to checking of lifetimes; we also assume it to be polymorphic in a sense made more precise in Section 7. Then one can have an owned Zipper that takes ownership of the tree. But one can also take a borrowed Zipper to explore the owned tree. Then the initial Zipper is obtained at no cost by borrowing the owned tree. It is therefore entirely allocated with RAII. Subsequent Zippers are obtained by allocating new values with the GC, and therefore they are allocated in part with the GC and in part with RAII. Thus, the same polymorphic Zipper can be used both with owned and borrowed resources, with in both cases the cost properties expected from a Zipper.
6.8 Other data types
Extending this approach to all OCaml data types raises interesting questions, for instance with abstract types and GADTs. By adopting a design à la Tov and Pucella (2011), in which the polarity can depend on type variables with an operator <’a> (“the polarity of ’a”), it is possible to declare that an abstract type has the polarity of its argument:
or infer that the equality type is GCed even if it is an equality between ownership types:
It is also possible to reflect the polarity of phantom types in the type of the GADT with the following idea:
where owned_unit is a dummy type of Owned polarity.
It remains to be seen how polarities scale to all OCaml data-types; but in case of stumbling blocks, polarity tables always leave the option of disabling certain combinations.
6.9 Example: capabilities
One application of an affine GADT is to encode an existential type, which can be used to tie a capability to a data structure (although this was not the original intent of the proposal). The following example is from Tov and Pucella (2011).
Nothing spectacular happens inside the dynamic model, because capabilities do not require RAII.
The explicit threading in set and get is reminiscent of the shortcoming of ownership without borrowing in Baker (1995). A variation on this idea is to consider affine borrows &mut as in Rust. In Rust, &mut is used for read-write operations whereas & is usually restricted to be read-only. Values cannot be borrowed both with &mut and & at the same time. Together with aliasing control provided by linearity, this prevents data races on resources, and similar issues such as iterator invalidation. There is no obstacle to including &mut in the current proposal. We come back to aliasing control later.
7 Parametric resource polymorphism
Let us go back to the drop example and make it polymorphic.
In the proposed runtime model, drop already knows how to move or borrow resources of type ’a, because these operations are the same for all types. However, drop needs to know the compiler-generated destructor for ’a. Therefore, the universal quantification on an affine type requires an implicit module supplying the destructor (conversely, abstract types need to supply the destructor).
Since any GCed value can trivially be seen as an owned value, drop above is also polymorphic in GCed values and behaves as expected. In this case, a special null value for the destructor indicates at runtime that there is no destructor to run and that allocations have to be done with the GC.
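The role of the implicit destructor argument can be modelled by passing it explicitly. In the following sketch, which is an assumption-based illustration and not the proposed runtime, the type destructor is hypothetical and None plays the part of the null destructor supplied for GCed values. The order in which list_destructor visits elements is an arbitrary choice made for the sketch.

```ocaml
(* Hypothetical model of the implicit argument: [None] is the null
   destructor passed when the type argument is GCed. *)
type 'a destructor = ('a -> unit) option

(* Polymorphic drop: runs the destructor if there is one. *)
let drop (d : 'a destructor) (x : 'a) : unit =
  match d with
  | Some destroy -> destroy x
  | None -> () (* GCed value: nothing to run *)

(* Destructor for lists derived from the destructor of the elements.
   A list of GCed values is itself GCed, hence the null case. *)
let list_destructor (d : 'a destructor) : 'a list destructor =
  match d with
  | None -> None
  | Some destroy -> Some (List.iter destroy)
```

Calling drop None on a string does nothing, which matches the expectation that existing polymorphic code keeps its meaning on GCed values.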
To distinguish between borrowed and owned variables during type inference, one possibility is that by default inputs are inferred to be borrows, and outputs to be owned. Given that GCed values are polymorphically borrowed and owned, this preserves the meaning of current polymorphic code. Then the user can explicitly mark an owned input. Below, the symbol * is used, but we avoid going into details of syntactic choices. (See “open questions” for more on this matter.)
7.1 Example: merging two ordered lists
7.2 Design and implementation
The combination of polarities and parametric polymorphism gives rise to interesting and crucial questions of type inference, principal typing, and polarity specification for abstract types. These questions are addressed by Tov and Pucella (2011) using subkinding and dependent kinds (subtyping of polarities and polarities depending on the polarities of type variables), extending an ML-like design. The hope for the type system (level 1) is that it can be extended without major stumbling blocks.
One question is whether all four polarities are needed for type variables, or if only two polarities need to be presented to the user (affine/copiable, as in Alms). The difference between ’a:G and ’a:B is that the equation &t=t for t:G can be used during type-checking, e.g. (’a:G) &list = (’a:G) list. It remains to be seen whether this is necessary for expressiveness in concrete situations. But this question might only be superficial: in all cases one wants that the lifetime of a type ’a t : is deduced from the lifetime of ’a.
As for the efficiency, one will likely want a guarantee that no efficiency is lost due to passing a null destructor to polymorphic functions every time a G is coerced to O. This case can be optimised by treating polymorphic functions on a “write once, compile twice” basis: whenever static knowledge allows, one can call a specialised version that does not require the implicit argument, that replaces moves with copies, and that always allocates with the GC. The specialised version corresponds to what OCaml would compile to currently. This optimisation is likely to apply often: it will preserve the efficiency of current OCaml programs, and will also apply in all cases involving borrow types.
8 Owned mutable state
Let us implement a resource-aware Stack module.
8.1 Owned mutable cells
Take OCaml’s type of stacks:
’a is now allowed to be a resource: polarity tables are the same with or without the mutable keyword.
In addition, if a is owning, then the borrow type a &t is:
where &mutable is a new keyword. In other words the computation of & stops at the mutable field. Both mutable and &mutable cells are lvalues of the specified type. In addition, when used as an rvalue, the &mutable cell has the borrow type of its contents:
such that (&x).c and &(x.c) are (intuitively) equivalent.
As in Rust and Alms, the primitive operation is swapping between two lvalues:
Indeed, if the location contains a resource, swapping is more expressive than (<-) because it does not involve any destruction of resource. (s1.c <- l) can be expressed as follows:
A mutable cell owns its resource, and therefore (<-) proceeds with the destruction of the previous value. (Unfortunately, defining it as a function of <-> is impossible without adding support in OCaml for lvalues as arguments of functions).
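The relationship between swapping and assignment can be sketched with ordinary references. This is only a model under assumptions: swap and assign are hypothetical functions on ref cells, whereas the proposal has swapping operate on lvalues, and the destructor is passed explicitly instead of being supplied by the compiler.

```ocaml
(* Exchange the contents of two cells; nothing is copied or destroyed. *)
let swap (r1 : 'a ref) (r2 : 'a ref) : unit =
  let tmp = !r1 in
  r1 := !r2;
  r2 := tmp

(* Assignment derived from swap: the new value goes in, the previous value
   comes out, and only then is it destroyed. *)
let assign ~(destroy : 'a -> unit) (cell : 'a ref) (v : 'a) : unit =
  let incoming = ref v in
  swap cell incoming;
  destroy !incoming
```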
8.2 Example: Stack
One idiom in Rust for dealing with mutable structures is to temporarily take ownership of the contents, as with take and swap below. Making use of this idiom in push, pop, and clear is the only substantial difference with the original OCaml Stack module. (One can also compare it with a similar data structure written in Rust: http://cglab.ca/abeinges/blah/too-many-lists/book/first-final.html.)
In the example below, ’a is considered to be of most generic polarity (O+B), although for backwards-compatibility one will need to reserve ’a for variables of polarity G, and find another notation (''a or ’a : A).
In the dynamic model, let us implement a resource-polymorphic ref following this model. It is essentially aref from Alms enriched with RAII support and a swap function that operates on borrowed values.
For a performance-critical library such as Stack, it is preferable to have an unsafe…end block à la Rust, and write the more efficient:
In an unsafe block, the checking of lifetimes and ownership is left to the user. The programmer can (and is encouraged to) reason about the correctness of their code by showing it equivalent to the previous one that uses take. In addition to being more efficient, this code coincides on GCed values with the one in OCaml’s Stack implementation. unsafe…end can be considered without altering the rest of the resource-management model.
When will the following partial application destroy x?
when drop2 is applied to x, or when the scope of f ends? One criterion is that currying preserves the meaning. Then, partial application must have the meaning of:
and therefore drop x only once y has been passed.
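This reading can be observed in plain OCaml with a logging stand-in for the destructor. The sketch is illustrative only: log, drop, drop2, and partial are hypothetical definitions, and drop merely records its argument instead of destroying a resource.

```ocaml
(* Records which values have been dropped, most recent first. *)
let log : string list ref = ref []
let drop x = log := x :: !log

(* Hypothetical two-argument function that consumes its first argument. *)
let drop2 x y = drop x; y

(* Partial application read as a closure that captures [x]: nothing is
   dropped until the second argument arrives. *)
let partial x = fun y -> drop2 x y
```

After let f = partial "x", the log is still empty; it is only the call f 1 that records "x".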
9.1 Owning closures
As in C++/Rust, closures can take possession of their variables. A function such as the above where the captured variable is not used is not necessarily a fabricated corner case: in RAII it is common to use guards, that is values whose sole purpose is to exist for the duration of a scope and not be used otherwise.
These languages provide a syntax to decide whether a variable has to be moved, copied, or borrowed in a closure. This is always expressible with lets (as illustrated above), so the problem of expressing which affine variables are captured is reduced to a question of syntactic sugar (into which this proposal does not go).
A consequence of closures owning their resources is that function values have a polarity determined by the closure (and neither by their argument nor by their return type). This is predicted by the linear call-by-push-value model. In semantic terms, a linear call-by-value function has polarised type:
(readers of Levy might write its non-linear version , the is implicit because it can be deduced from the context).
Here is the type constructor of closures. In linear call-by-push-value, there are several types of closures: linear (), copiable (). The current OCaml function type, copiable at will, is:
This leads to the introduction of an affine function type, as in Alms, or as FnOnce in Rust. Affine means that the function has to be used at most once (but it can use its argument as many times as the argument’s polarity allows).
Thus drop2 can be given type:
denoting that the second closure holds a resource which will be consumed, in this case the first argument.
Now what if an affine function value only uses its closure by borrowing? In that case, although the closure is affine, it makes sense to call the function several times. A third function space is introduced, and is similar to except that borrowed values of type &(a b) can be applied to values of type a as well. This is Fn in Rust. (Let us hope this is enough function spaces!)
9.2 Functions with static closures
What is the meaning of LCBPV’s negative then? Interpret it as the type of functions not yet wrapped into (dynamic) closures, as a distinction reminiscent of the one between function pointers and closures seen in C++ and Rust.
The introduction of the closure can be statically delayed until a value is actually required for the function, that is, when the function is passed to another function, wrapped into a data structure, or fully applied. (A full application is one which results in an expression with positive type.) At either of these points, the set of free variables used in the expression is known statically, and therefore introducing the closure can be done then.
There are at least three reasons for introducing the negative function space. The first is that, if the function uses a resource, introducing the closure forces the resource to be moved. Introducing it early can move the resource earlier than necessary. For instance:
Delaying the introduction of the closure until the last line lets us accept this program.
The second reason is that by typing differently returned functions that do not need a closure yet, one reduces the number of intermediate closures in call-by-value, statically and compositionally, unlike the (non-compositional) nested redex optimisation (Danvy and Nielsen, 2005) or in the (dynamic) ZINC machine (Leroy, 1990).
This idea sheds light on the right-to-left evaluation order of arguments seen in the ZINC. Indeed, it naturally leads to a right-to-left order, because in the double application:
the type of (f e1) is negative and requires waiting for the value of e2, and because left-to-right evaluation:
is not macro-expressible given that its size is linear in the number of arguments. (The idea of relating optimisations of closure introduction in call-by-value to call-by-push-value is not new, a sketch of a relationship with the nested redex optimisation and the ZINC machine is given in Spiwack, 2014, and a different computational description for the CBPV arrow is sketched.)
The third reason is an extension of the second one, in the presence of affine closures. Tov and Pucella (2011) have noticed that curried functions tend to accumulate annotations, often in a predictable manner, e.g.:
where denotes the type of closures with the same polarity as . This makes dependent kinds necessary in their approach. This happens because a function obtained by currying just adds the variable to the closure, as opposed to a function that performs some computation before returning a closure. By introducing the closure in a delayed manner, such a curried function can be typed without annotations:
Then, the closure is created, its variables moved, and its polarity determined, either after full application, or when one tries to pass or return the result of a partial application.
10 Tail-call optimisation and control operators
Tail-call optimisation (TCO) suffers from a bad interaction with destructors. Given that a resource must be destroyed at the end of a scope, it has to remain on the stack and prevents any TCO.
There are three separate issues:
1) This can lead to surprises, when writing polymorphic code or during refactoring. Hence, the user should be allowed to specify that a call is expected to be tail, and get an error if this is not the case.
2) This can prevent desired tail calls. The solution is to have a convenient way to express that resources must be destroyed before the call, rather than after.
3) The last condition being met, one must be able to actually implement TCO.
1) and 2) are largely a matter of syntactic choices and 1) is implemented in OCaml with the [@tailcall] attribute.
10.1 Example: List.rev_map
Below is a solution for 2) and 3) involving an operator tail_call that destroys remaining resources before doing a tail call. In the dynamic model it is implemented with a combination of an exception and usual TCO.
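One way such a combination of an exception and ordinary TCO could look is sketched here. Everything in it is an assumption made for illustration: Tail_call, scoped, tail_call, and cleanups are hypothetical names, the result type is fixed to int list because an OCaml exception cannot carry a polymorphic continuation, and a counter stands in for destroying the resources of the scope.

```ocaml
(* The call to perform once the current scope has been cleaned up. *)
exception Tail_call of (unit -> int list)

(* Counts how many scopes have been cleaned up. *)
let cleanups = ref 0

(* Request a tail call: unwinds to the enclosing [scoped]. *)
let tail_call k = raise (Tail_call k)

(* Run [body] in a scope. If it requests a tail call, the scope is cleaned
   up first, and the call is then made from the handler, in tail position. *)
let scoped ~(cleanup : unit -> unit) (body : unit -> int list) : int list =
  match body () with
  | result -> cleanup (); result
  | exception Tail_call k -> cleanup (); k ()

(* rev_map where each iteration owns a scope that is destroyed before the
   recursive call rather than after it. *)
let rec rev_map f acc = function
  | [] -> acc
  | x :: rest ->
    scoped ~cleanup:(fun () -> incr cleanups) (fun () ->
        let y = f x in
        tail_call (fun () -> rev_map f (y :: acc) rest))
```

Because k () is the last action of the handler, the recursive call does not sit under the scope it replaces, so the stack does not grow with the length of the list.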