Efficient Gradual Typing

Gradual typing combines static and dynamic typing in the same program. One would hope that the performance in a gradually typed language would range between that of a dynamically typed language and a statically typed language. Existing implementations of gradually typed languages have not achieved this goal due to overheads associated with runtime casts. Takikawa et al. (2016) report up to 100× slowdowns for partially typed programs. In this paper we present a compiler, named Grift, for evaluating implementation techniques for gradual typing. We take a straightforward but surprisingly unexplored implementation approach for gradual typing, that is, ahead-of-time compilation to native assembly code with carefully chosen runtime representations and space-efficient coercions. Our experiments show that this approach achieves performance on par with OCaml on statically typed programs and performance between that of Gambit and Racket on untyped programs. On partially typed code, the geometric mean ranges from 0.42× to 2.36× that of (untyped) Racket across the benchmarks. We implement casts using the coercions of Siek, Thiemann, and Wadler (2015). This technique eliminates all catastrophic slowdowns without introducing significant overhead. Across the benchmarks, coercions range from 15 [...]. We also implement the monotonic references of Siek et al. (2015). Monotonic references eliminate all overhead in statically typed code, and for partially typed code, they are faster than proxied references, sometimes up to 1.48×.

1. Introduction

Gradual typing combines static and dynamic type checking in the same program, giving the programmer control over which typing discipline is used for each region of code (Anderson:2002kd; Siek:2006bh; Tobin-Hochstadt:2006fk; Gronski:2006uq; Matthews:2007zr). We would like gradually typed languages to be efficient, sound, and interoperable. Regarding efficiency, we would like the performance of gradual typing to range from that of a dynamically typed language to that of a statically typed language. Regarding soundness, programmers (and compilers) would like to trust type annotations and know that runtime values respect their compile-time types. Regarding interoperability, static and dynamic regions of code should interoperate seamlessly.

To date, implementations of gradual typing have delivered only two of these three properties. For example, Typed Racket (Tobin-Hochstadt:2008lr) provides soundness and interoperability but suffers from slowdowns of up to 100× (Takikawa:2015aa; Takikawa:2016aa) on a partially typed program. Thorn (Bloom:2009aa; Wrigstad:2010fk) and Safe TypeScript (Rastogi:2014aa) provide better performance but limit interoperability. TypeScript (Hejlsberg:2012aa; Bierman:2014aa) and Gradualtalk (Allende:2011fk; Allende:2012aa; Allende:2013aa) provide seamless interoperability but not soundness, and their performance is on par with dynamic languages rather than static ones.

Several papers at OOPSLA 2017 begin to address the efficiency concerns for gradually typed languages that are committed to soundness and interoperability. Bauman:2017aa demonstrate that a tracing JIT can eliminate 90% of the overheads in Typed Racket due to gradual typing. Richards:2017aa augment the Higgs JIT compiler and virtual machine (VM) (Chevalier-Boisvert:2015aa) for JavaScript, re-purposing the VM’s notion of shape to implement monotonic references (Siek:2015aa). Richards:2017aa report that this reduces the worst slowdowns to 45%, with an average slowdown of just 7%. Meanwhile, Muehlboeck:2017aa show that for nominally-typed object-oriented languages, efficiency is less of a problem.

In this paper we demonstrate that efficient gradual typing can be achieved in structurally-typed languages by relatively straightforward means. We build and evaluate an ahead-of-time compiler that uses carefully chosen runtime representations and implements two important ideas from the theory of gradual typing. It uses space-efficient coercions (Siek:2009ys; Herman:2010aa; Garcia:2013fk; Siek:2015ab) to implement casts, and it reduces overhead in statically typed code by using monotonic references (Siek:2015aa).

Contributions

This paper makes these contributions.

  • A space-efficient semantics for monotonic references and lazy-D coercions (Section 3).

  • The first ahead-of-time compiler, named Grift, for a gradually typed language that targets native assembly code. The compiler is the first to implement space-efficient coercions (Section LABEL:sec:implementation).

  • Experiments (Section LABEL:sec:external-comparison) showing

    • performance on statically typed code that is on par with OCaml,

    • performance on dynamically typed code that is between Gambit and Racket, and

    • performance on partially typed code with a geometric mean ranging from 0.42× to 2.36× that of Racket.

  • Experiments showing that coercions eliminate catastrophic slowdowns without adding significant overhead (Section LABEL:sec:cost-space-efficiency).

  • Experiments showing that monotonic references eliminate overhead in statically typed code (Section LABEL:sec:monotonic-versus-proxied).

Section 2 provides background on gradual typing, focusing on runtime casts and the tension between efficiency, soundness, and interoperability.

2. Tensions in Gradual Typing

From a language design perspective, gradual typing touches both the type system and the operational semantics. The key to the type system is the consistency relation on types, which enables implicit casts to and from the unknown type, here written Dyn, while still catching static type errors (Anderson:2002kd; Siek:2006bh; Gronski:2006uq). The dynamic semantics for gradual typing is based on the semantics of contracts (Findler:2002es; Gray:2005ij), coercions (Henglein:1994nz), and interlanguage migration (Tobin-Hochstadt:2006fk; Matthews:2007zr). Because of the shared mechanisms with these other lines of research, much of the ongoing research in those areas benefits the theory of gradual typing, and vice versa (Guha:2007kl; Matthews:2008qr; Greenberg:2010lq; Dimoulas:2011fk; Strickland:2012fk; Chitil:2012aa; Dimoulas:2012fk; Greenberg:2015ab). In the following we give a brief introduction to gradual typing by way of an example that emphasizes the three main goals of gradual typing: supporting interoperability, soundness, and efficiency.

Interoperability and Evolution

Consider the example program in Figure 1, written in a variant of Typed Racket that we have extended to support fine-grained gradual typing. On the left side of the figure we have an untyped function for the extended greatest common divisor. With gradual typing, unannotated parameters are dynamically typed and therefore assigned the type Dyn. On the right side of the figure is the same function at a later point in time in which the parameter types have been specified (Int) but not the return type. With gradual typing, both programs are well typed because implicit casts are allowed to and from Dyn. For example, on the left we have the expression (modulo b a), so b and a are implicitly cast from Dyn to Int. On the right, there is an implicit cast around (list b 0 1) from (List Int) to Dyn. The reason that gradual typing allows implicit casts both to and from Dyn is to enable evolution. As a programmer adds or removes type annotations, the program continues to type check and also exhibits the same behavior up to cast errors, a property called the gradual guarantee (Siek:2015ac).

Left (untyped):

(define (egcd a b)
  (if (= a 0)
      (list b 0 1)
      (let ([r (egcd (modulo b a) a)])
        (list (car r)
              (- (caddr r) (* (/ b a) (cadr r)))
              (cadr r)))))

Right (parameter types annotated):

(define (egcd [a:Int] [b:Int])
  (if (= a 0)
      (list b 0 1)
      (let ([r (egcd (modulo b a) a)])
        (list (car r)
              (- (caddr r) (* (/ b a) (cadr r)))
              (cadr r)))))

Figure 1. Two gradually typed versions of extended GCD.

Soundness

Next consider the function modinv defined below, which computes the modular inverse using the second version of egcd. What happens when the client code that follows the definition forgets to convert the input string from (read) to an integer before passing it to modinv?

(define (modinv a m)
  (let ([r (egcd a m)])
    (if (not (= (car r) 1))
        (error ...)
        (modulo (cadr r) m))))
(let ([input (read)])
  (modinv 42 input))

Parameter m of modinv has type Dyn, but b of egcd has type Int, so there is an implicit cast from Dyn to Int. With gradual typing, this implicit cast comes with a runtime cast that will trigger an error if the input to this program is a string. This runtime cast is required to ensure soundness: without it a string could flow into egcd and masquerade as an Int. Soundness is important not only for software engineering reasons; it also impacts efficiency, both positively and negatively.
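The runtime check at this boundary can be sketched in Python. This is only an illustration under stated assumptions, not Grift's implementation: the names CastError and cast_dyn_to_int and the blame-label strings are hypothetical, and (/ b a) is rendered as integer division.

```python
class CastError(Exception):
    """A failed runtime cast, tagged with a (hypothetical) blame label."""
    def __init__(self, label, value, target):
        super().__init__(f"{label}: cannot cast {value!r} to {target}")
        self.label = label

def cast_dyn_to_int(value, label):
    # bool is a subclass of int in Python; reject it so only integers pass.
    if isinstance(value, int) and not isinstance(value, bool):
        return value
    raise CastError(label, value, "Int")

def egcd(a, b):
    """Typed egcd: relies on a and b already being integers."""
    if a == 0:
        return [b, 0, 1]
    r = egcd(b % a, a)
    # Assumption: (/ b a) in the figure is read as integer division.
    return [r[0], r[2] - (b // a) * r[1], r[1]]

def modinv(a, m):
    # a and m are dynamically typed; the calls into egcd are guarded by casts.
    r = egcd(cast_dyn_to_int(a, "modinv:a"), cast_dyn_to_int(m, "modinv:m"))
    if r[0] != 1:
        raise ValueError("no modular inverse")
    return r[1] % m
```

With these definitions, modinv(3, 7) goes through the casts unchanged, while modinv(42, "17") is stopped at the boundary with a CastError before the string can reach the arithmetic inside egcd.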

Ensuring soundness in the presence of first-class functions and mutable references is nontrivial. When a function is cast from Dyn to a function type that returns Int, such as (Int => Int), it is not possible for the cast to know whether the function will return an integer on all inputs. Instead, the standard approach is to wrap the function in a proxy that checks the return value each time the function is called (Findler:2002es). Similarly, when a mutable reference is cast, e.g., from (Ref Int) to (Ref Dyn), the reference is wrapped in a proxy that casts from Int to Dyn on every read and from Dyn to Int on every write (Herman:2006uq; Herman:2010aa).
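The two kinds of proxy described above can be sketched as follows. This is a minimal Python illustration, not the compiler's runtime representation. All names (check_int, cast_to_int_returning, Ref, RefProxy, cast_ref_int_to_dyn) are hypothetical, and the reference cast from Ref Int to Ref Dyn is chosen only as an example.

```python
class CastError(Exception):
    """A failed runtime check, tagged with the blame label of the cast."""
    def __init__(self, label):
        super().__init__(f"cast failed, blame {label}")
        self.label = label

def check_int(value, label):
    if isinstance(value, int) and not isinstance(value, bool):
        return value
    raise CastError(label)

def cast_to_int_returning(f, label):
    """Cast a dynamically typed f to a function type whose result is Int."""
    if not callable(f):
        raise CastError(label)              # not a function: fail immediately
    def proxy(*args):
        return check_int(f(*args), label)   # result is checked on every call
    return proxy

class Ref:
    """A plain mutable reference cell."""
    def __init__(self, value):
        self.value = value
    def read(self):
        return self.value
    def write(self, value):
        self.value = value

class RefProxy:
    """A reference viewed at another element type; casts on each access."""
    def __init__(self, ref, on_read, on_write):
        self.ref, self.on_read, self.on_write = ref, on_read, on_write
    def read(self):
        return self.on_read(self.ref.read())
    def write(self, value):
        self.ref.write(self.on_write(value))

def cast_ref_int_to_dyn(ref, label):
    # Reads go from Int to Dyn and cannot fail; writes go from Dyn to Int
    # and must be checked so the underlying cell only ever holds integers.
    return RefProxy(ref, on_read=lambda v: v,
                    on_write=lambda v: check_int(v, label))
```

Note that neither cast inspects the function body or the future contents of the cell. The check is deferred to each call, read, or write, which is the source of both the soundness guarantee and the runtime overhead.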

Efficiency

Ideally, statically typed code within a gradually typed program should execute without overhead. Likewise, partially typed or untyped code should execute with no more overhead than is typical of dynamically typed languages. Consider the egcd on the right side of Figure 1. Inside this egcd, the expression (modulo b a) should simply compile to an idiv instruction (on x86). However, if the language did not ensure soundness as discussed above, then this efficient compilation strategy would result in undefined behavior (segmentation faults at best, hacked systems at worst). It is soundness that enables type-based specialization. However, soundness comes at the cost of the runtime casts at the boundaries of static and dynamic code.

3. Semantics of a Gradual Language

The type system of Grift’s input language is the standard one for the gradually typed lambda calculus (Siek:2006bh; Siek:2008aa; Herman:2010aa). The operational semantics, as usual, is expressed by a translation to an intermediate language with explicit casts.

Source Program:

(let ([add1 : (Int => Int)
       (lambda ([x : Int]) (+ x 1))])
  (let ([f : (Dyn => Dyn) add1])
    (: (f 41) Int)))

After Cast Insertion:

(let ([add1 (lambda (x) (+ x 1))])
  (let ([f (cast add1 (Int => Int) (Dyn => Dyn) L1)])
    (cast (f (cast 41 Int Dyn L2)) Dyn Int L3)))

Figure 2. An example of the Grift compiler inserting casts. The L1, L2, etc. are blame labels that identify source code location.

Consider the source program in Figure 2 which calculates the value 42 by applying the add1 function, by way of variable f, to the integer value 41. The type of add1 does not exactly match the type annotation on f (which is Dyn => Dyn) so the compiler inserts the cast:

(cast add1 (Int => Int) (Dyn => Dyn) L1)

The application of f to 41 requires a cast on 41 from Int to Dyn. Also, the return type of f is Dyn, so the compiler inserts a cast to convert the returned value from Dyn to Int to satisfy the type ascription.
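To make the behavior of the inserted casts concrete, here is a small Python sketch of a type-based cast function applied to the program of Figure 2. It is an illustration under assumptions, not the paper's definition: types are encoded as strings and tuples, a value injected into Dyn remembers its source type, and blame tracking is simplified to a single label per cast with no polarity.

```python
from dataclasses import dataclass

INT, DYN = "Int", "Dyn"

def fun(dom, cod):
    return ("=>", dom, cod)

def is_fun(t):
    return isinstance(t, tuple) and t[0] == "=>"

@dataclass
class Injected:
    """A value of type Dyn, tagged with the type it was injected from."""
    value: object
    source: object

class CastError(Exception):
    def __init__(self, label):
        super().__init__(f"blame {label}")
        self.label = label

def cast(v, src, tgt, label):
    if src == tgt:
        return v
    if tgt == DYN:
        return Injected(v, src)                     # injection into Dyn
    if src == DYN:
        return cast(v.value, v.source, tgt, label)  # projection out of Dyn
    if is_fun(src) and is_fun(tgt):
        _, s_dom, s_cod = src
        _, t_dom, t_cod = tgt
        def proxy(x):
            # Argument is cast backwards, result is cast forwards.
            return cast(v(cast(x, t_dom, s_dom, label)), s_cod, t_cod, label)
        return proxy
    raise CastError(label)

# The program of Figure 2 after cast insertion.
add1 = lambda x: x + 1
f = cast(add1, fun(INT, INT), fun(DYN, DYN), "L1")
result = cast(f(cast(41, INT, DYN, "L2")), DYN, INT, "L3")
```

Passing f something other than an injected integer, for instance a function injected into Dyn, fails inside the proxy when the argument is projected to Int.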

In this paper we consider two approaches to the implementation of runtime casts: traditional casts, which we refer to as type-based casts, and coercions. Type-based casts provide the most straightforward implementation, but the proxies they generate can accumulate and consume an unbounded amount of space (Herman:2010aa). The coercions of Henglein:1994nz solve the space problem with a representation that enables the compression of higher-order casts (Herman:2010aa).
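The space problem and its remedy can be illustrated with a deliberately small Python sketch. It assumes only the types Int and Dyn and functions between them, and it is not the normal form or composition operator used in the paper. A first-order coercion is a tuple of steps, and composition cancels an injection that is immediately followed by a projection. The names INJ, proj, compose, FunCoercion, and bounce are hypothetical.

```python
from typing import NamedTuple

INJ = ("Int!",)                      # inject an Int into Dyn

def proj(label):
    return ("Int?", label)           # project Dyn to Int, blaming label on failure

def compose(c1, c2):
    """Sequence two first-order coercions and normalize the result."""
    out = []
    for step in c1 + c2:
        if out and out[-1] == INJ and step[0] == "Int?":
            out.pop()                # Int! followed by Int? is the identity
        else:
            out.append(step)
    return tuple(out)

class FunCoercion(NamedTuple):
    dom: tuple                       # coercion applied to the argument
    cod: tuple                       # coercion applied to the result

def compose_fun(f1, f2):
    # Apply f1 first, then f2: arguments flow through f2.dom and then f1.dom,
    # results flow through f1.cod and then f2.cod.
    return FunCoercion(compose(f2.dom, f1.dom), compose(f1.cod, f2.cod))

TO_DYN = FunCoercion(dom=(proj("L1"),), cod=(INJ,))   # (Int => Int) to (Dyn => Dyn)
TO_INT = FunCoercion(dom=(INJ,), cod=(proj("L2"),))   # (Dyn => Dyn) to (Int => Int)

def bounce(n):
    """Cast a function n times, alternating between the two function types.

    Returns the number of proxies a type-based cast would stack up and the
    size of the single composed coercion.
    """
    proxy_depth = 0
    pending = FunCoercion((), ())
    for i in range(n):
        step = TO_DYN if i % 2 == 0 else TO_INT
        proxy_depth += 1             # one more wrapper per type-based cast
        pending = compose_fun(pending, step)
    return proxy_depth, len(pending.dom) + len(pending.cod)
```

In this sketch the proxy count grows linearly with the number of casts, while the composed coercion never holds more than one step on each side.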

For type-based casts, the dynamic semantics that we use is mostly covered in the literature. We use the lazy-D cast semantics, which is described by Siek:2012uq. (It was originally described using coercions by Siek:2009rt.) The distinction between lazy-D and the more commonly used lazy-UD semantics (Wadler:2009qv) is not well-known, so to summarize the difference: in lazy-D, values of arbitrary type may be directly injected into type Dyn, whereas in lazy-UD, only values of a ground type may be directly injected into Dyn. For example, Int and (Dyn => Dyn) are ground types, but (Int => Int) is not.

The one missing piece for our purposes is the set of reduction rules for proxied references, which we adapt from the coercion-based version by Herman:2010aa. In this setting, proxied references are values of the form . The following are the reduction rules for reading and writing to a proxied reference.
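As a sketch of what such rules look like, assume a proxied reference is written $v\langle \mathsf{Ref}\ T_1 \Rightarrow \mathsf{Ref}\ T_2\rangle$. The notation is an assumption made here for illustration, and blame labels are omitted. Following the description in Section 2, a read casts the stored value from the source element type to the target element type, and a write casts the incoming value in the opposite direction:

```latex
\begin{align*}
!\,\bigl(v\langle \mathsf{Ref}\ T_1 \Rightarrow \mathsf{Ref}\ T_2\rangle\bigr)
  &\longrightarrow (!\,v)\langle T_1 \Rightarrow T_2\rangle \\
\bigl(v_1\langle \mathsf{Ref}\ T_1 \Rightarrow \mathsf{Ref}\ T_2\rangle\bigr) := v_2
  &\longrightarrow v_1 := \bigl(v_2\langle T_2 \Rightarrow T_1\rangle\bigr)
\end{align*}
```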

For monotonic references with type-based casts, the dynamic semantics for lazy-D is given by Siek:2015aa.

Regarding coercions, the dynamic semantics that we use is less well covered in the literature. Again, we use the lazy-D semantics of Siek:2009rt, but that work, despite using coercions, did not define a space-efficient semantics. On the other hand, Siek:2015ab give a space-efficient semantics with coercions, but for the lazy-UD semantics. To this end, they define a normal form for coercions and a composition operator that compresses coercions. Here we adapt that approach to lazy-D, which requires some changes to the normal forms and to the composition operator. Also, that work did not consider mutable references, so here we add support for both proxied and monotonic references. Regarding monotonic references, Siek:2015aa define the lazy-D semantics, but again, they did not define a space-efficient semantics. Here we make it space-efficient by defining the normal forms for reference coercions and the composition operation on them.

Figure (contents not recovered): types and coercions; consistency; meet operation (greatest lower bound).
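Consistency and meet are standard notions in the gradual typing literature, so they can be sketched independently of the definitions in this section. The Python below is an illustration restricted to Int, Dyn, and function types (reference types are left out), with a hypothetical encoding of types as strings and tuples. Two types are consistent when they agree wherever neither is Dyn, and the meet takes the more precise side at each position.

```python
INT, DYN = "Int", "Dyn"

def fun(dom, cod):
    return ("=>", dom, cod)

def is_fun(t):
    return isinstance(t, tuple) and t[0] == "=>"

def consistent(s, t):
    """True when s and t agree everywhere that neither side is Dyn."""
    if s == DYN or t == DYN:
        return True
    if is_fun(s) and is_fun(t):
        return consistent(s[1], t[1]) and consistent(s[2], t[2])
    return s == t

def meet(s, t):
    """Greatest lower bound with respect to precision.

    Returns None when s and t are inconsistent, so no meet exists.
    """
    if s == DYN:
        return t
    if t == DYN:
        return s
    if is_fun(s) and is_fun(t):
        dom, cod = meet(s[1], t[1]), meet(s[2], t[2])
        return None if dom is None or cod is None else fun(dom, cod)
    return s if s == t else None
```

For example, (Dyn => Int) and (Int => Dyn) are consistent and their meet is (Int => Int), whereas Int and (Int => Int) are inconsistent. Consistency is not transitive: each of those two types is consistent with Dyn, yet they are not consistent with each other.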