1. Introduction
The lambda-calculus is a simple yet rich model of computation. It relies on a single mechanism for activating a function, beta-reduction, which replaces a function's formal parameters with actual input. While beta-reduction can be applied in an unrestricted way in the lambda-calculus itself, evaluation strategies determine how beta-reduction is applied when the lambda-calculus is used as a programming language. Evaluation strategies often dictate how intermediate results are copied, discarded, cached or reused. For example, in the call-by-name strategy, an argument is re-evaluated each time it is requested. In the call-by-need strategy, once a function requests its input, the input is evaluated and the result is cached for later use. The call-by-value strategy evaluates a function's input and caches the result even if the function does not require it.
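These caching behaviours can be illustrated with a small Python sketch (ours, not part of the paper's formal development): the argument is passed as a thunk, and we count how often it is evaluated under each strategy.

```python
# Contrast of call-by-name, call-by-need and call-by-value via thunks.
# All names here are our own illustration, not the paper's machine.

calls = {"count": 0}

def expensive_arg():
    calls["count"] += 1          # count how often the argument is evaluated
    return 21

def run(strategy: str) -> int:
    """Apply a function to expensive_arg under a strategy; return the
    number of times the argument was evaluated."""
    calls["count"] = 0
    double = lambda x: x() + x()          # uses its argument twice
    if strategy == "name":
        # call-by-name: pass the bare thunk; it re-runs on every use
        double(expensive_arg)
    elif strategy == "need":
        # call-by-need: memoise the thunk; it runs at most once, on demand
        cache = []
        def memo():
            if not cache:
                cache.append(expensive_arg())
            return cache[0]
        double(memo)
    elif strategy == "value":
        # call-by-value: evaluate up front, even if the function ignores it
        v = expensive_arg()
        (lambda x: 0)(lambda: v)
    return calls["count"]
```

Running `run("name")` evaluates the argument twice, `run("need")` once, and `run("value")` once even though the function discards it.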
The implementation of any evaluation strategy must, first of all, be correct: it has to produce the results stipulated by the strategy. Once correctness is assured, the next concern is efficiency. One may prefer better space efficiency or better time efficiency, and it is well known that one can be traded off for the other. For example, time efficiency can be improved by caching more intermediate results, which increases space cost. Conversely, bounding space requires repeating computations, which adds to the time cost. Whereas correctness is well defined for any evaluation strategy, there is a certain freedom in managing efficiency. The challenge is to produce a unified framework flexible enough to analyse and guide the choices required by this trade-off. Recent studies by Accattoli et al. [AD16, ABM14, Acc17] establish clear classes of efficiency for a given evaluation strategy. They characterise efficiency by the number of beta-reduction applications required by the strategy, and introduce two efficiency classes, namely "efficient" and "reasonable". The expected efficiency of an abstract machine gives us a starting point for quantitatively analysing the trade-offs required in an implementation.
1.1. Token-Passing GoI
We employ Girard's Geometry of Interaction (GoI) [Gir89], a semantics of linear logic proofs, as a framework for studying the trade-off between time and space efficiency. In particular we focus on the token-passing style of GoI, which gives abstract machines for the lambda-calculus, pioneered by Danos and Regnier [DR96] and Mackie [Mac95]. These machines evaluate a term of the lambda-calculus by translating it to a graph, a network of simple transducers, which executes by passing a data-carrying token around.
Token-passing GoI decomposes higher-order computation into local token actions, or low-level interactions of simple components. It can give strikingly innovative implementation techniques for functional programs, such as Mackie's Geometry of Implementation compiler [Mac95], Ghica's Geometry of Synthesis (GoS) high-level synthesis tool [Ghi07], and Schöpp's resource-aware program transformation to a low-level language [Sch14b]. The interaction-based approach is also convenient for the complexity analysis of programs, e.g. Dal Lago and Schöpp's IntML type system of logarithmic-space evaluation [DS16], and Dal Lago et al.'s linear dependent type system of polynomial-time evaluation [DG11, DP12].
Fixed-space execution is essential for GoS: in the case of digital circuits, the memory footprint of the program must be known at compile-time and fixed. Using a restricted version of the call-by-name language Idealised Algol [GS11], not only the graph but also the token itself can be given a fixed size. Surprisingly, this technique also allows the compilation of recursive programs [GSS11]. The GoS compiler shows both the usefulness of GoI as a guideline for unconventional compilation and the natural affinity between its space-efficient abstract machine and call-by-name evaluation. These practical considerations match the prior theoretical understanding of this connection [DR96].
The token passed around the graph simulates graph rewriting without actually rewriting; this is in fact an extremal instance of the trade-off mentioned above. Token-passing GoI keeps the underlying graph fixed and uses the data stored in the token to route it. It therefore favours space efficiency at the cost of time efficiency: the same computation is repeated when intermediate results could instead have been cached by saving copies of certain subgraphs representing values.
1.2. Interleaving Token Passing with Graph Rewriting
Our intention is to lift token-passing GoI to a framework for analysing this efficiency trade-off, by strategically interleaving it with graph rewriting. We present the framework as an abstract machine that interleaves token passing with graph rewriting. The machine, called the Dynamic GoI Machine (DGoIM), is defined as a state transition system with transitions for token passing as well as for graph rewriting. The key idea is that the token controls graph rewriting, by visiting redexes and triggering the rewrite transitions.
Graph rewriting offers fine control over caching and sharing of intermediate results. Through graph rewriting, the DGoIM can reduce subgraphs visited by the token, avoiding repeated token actions and improving time efficiency. However, fetching cached results can increase the size of the graph. In short, introducing graph rewriting sacrifices space in favour of time efficiency. We expect that the flexibility afforded by fine-grained control over interleaving will enable a careful balance between space and time efficiency.
As a first step in our exploration of the flexibility of this machine, we consider the two extremal cases of interleaving. The first extremal case is "passes-only", in which the DGoIM never triggers graph rewriting, yielding an ordinary token-passing abstract machine. As a typical example, a function application is evaluated like this:

1. A token enters the graph at the bottom open edge.

2. The token visits and goes through the left subgraph, which translates the function part.

3. Whenever the token detects an occurrence of the bound variable in the function body, it traverses the right subgraph translating the argument, then returns carrying information about the resulting value of the argument.

4. The token finally exits the graph at the bottom open edge.

Step 3 is repeated whenever the argument needs to be re-evaluated. This passes-only strategy of interleaving corresponds to call-by-name evaluation.
The other extreme is "rewrites-first", in which the DGoIM interleaves token passing with as much, and as early, graph rewriting as possible, guided by the token. This corresponds to both call-by-value and call-by-need evaluation, with different trajectories of the token. In the case of left-to-right call-by-value, the token enters the graph from the bottom, traverses the left-hand-side subgraph, which happens to be already a value, then visits the argument subgraph even before the bound variable is used in a call. The token causes rewrites while traversing the argument subgraph, and when it exits, it leaves behind a graph corresponding to the value to which the argument reduces. For right-to-left call-by-value, the token visits the argument subgraph straight away after entering the whole graph, reduces it to the graph of a value, and then visits the left-hand-side subgraph. The difference with call-by-need is that the token visits and reduces the argument subgraph only when the bound variable is encountered in the function body.
In our framework, all three of these evaluations involve similar tactics for caching intermediate results. Their only difference, the timing of cache creation, is realised by different trajectories of the token. Cached values are fetched in the same way: namely, if repeated evaluation is required, the subgraph now corresponding to the value is copied. One copy can be further rewritten, if needed, while the original is kept for later reference.
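This fetching tactic can be sketched with a mutable structure standing in for a cached value subgraph (a hypothetical miniature of ours, not the machine's actual representation): each fetch takes a copy, the copy may be rewritten further, and the original stays intact.

```python
import copy

# A dict standing in for a value subgraph (a box) cached after evaluation.
cached_value = {"node": "lambda", "body": ["x"]}

def fetch():
    # Fetching = copying: the caller may rewrite the copy in place,
    # while the original remains available for later fetches.
    return copy.deepcopy(cached_value)

first_use = fetch()
first_use["body"].append("rewritten")   # rewrite the copy
second_use = fetch()                    # the cache is untouched
```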
1.3. Contributions
This work presents a token-guided graph-rewriting abstract machine for call-by-need, left-to-right call-by-value, and right-to-left call-by-value evaluation. The abstract machine is given by the rewrites-first strategy of the DGoIM, which turns out to be as natural as the passes-only strategy is for call-by-name evaluation. It switches between the evaluations simply by using different nodes corresponding to the three different evaluations, rather than by modifying the behaviour of a single node to suit different evaluation demands. This can be seen as a case study illustrating the flexibility of the DGoIM, achieved through controlled interleaving of rewriting and token passing, and through changing graph representations of terms.
We prove the soundness and completeness of the extended machine with respect to the three evaluations separately, using a "sub-machine" semantics, where the prefix "sub" indicates both a focus on substitution and the semantics' status as an intermediate representation. The sub-machine semantics is based on Sinot's "token-passing" semantics [Sin05, Sin06], which makes explicit the two main tasks of abstract machines: searching for redexes and substituting variables.
The time-cost analysis classifies the machine as "efficient" in Accattoli's taxonomy of abstract machines [Acc17]. We follow Accattoli et al.'s general methodology for the quantitative analysis of abstract machines [ABM14, Acc17]; however, the method cannot be used "off the shelf". Our machine is a more refined transition system with more transition steps, and therefore does not satisfy one of their assumptions [Acc17, Sec. 3], which requires a one-to-one correspondence of transition steps. We overcome this technical difficulty by building a weak simulation of the sub-machine semantics, which is also used in the proof of soundness and completeness. The sub-machine semantics resembles Danvy and Zerny's storeless abstract machine [DZ13], to which the general recipe of cost analysis does apply.
Finally, an online visualiser (https://kokom.github.io/GoIVisualiser/) is implemented, in which our machine can be executed on arbitrary closed (untyped) lambda-terms. The visualiser also supports some existing abstract machines based on token-passing GoI, which will be discussed later, to illustrate the various resource usage of abstract machines.
2. A Term Calculus with Sub-Machine Semantics
We use an untyped term calculus that accommodates three evaluation strategies of the lambda-calculus by means of dedicated constructors for function application: one each for call-by-need, left-to-right call-by-value and right-to-left call-by-value. The calculus includes all three strategies so that we do not have to present three almost identical calculi; however, we are interested not in their interaction but in each strategy separately. In the rest of the paper we therefore assume that each term contains function applications of a single strategy. As shown at the top of Fig. 1, the calculus also accommodates explicit substitutions, written t[x ← u]. A term with no explicit substitutions is said to be "pure".
The sub-machine semantics is used to establish the soundness of the graph-rewriting abstract machine. It is an adaptation of Sinot's lambda-term rewriting system [Sin05, Sin06], used to analyse a token-guided rewriting system for interaction nets. It imitates an abstract machine by explicitly searching for a redex and by decomposing meta-level substitution into on-demand linear substitution, also resembling a storeless abstract machine (e.g. [DMMZ12, Fig. 8]). However, the semantics is still too "abstract" to be considered an abstract machine, in the sense that it works modulo alpha-equivalence to avoid variable capture.
Fig. 1 defines the sub-machine semantics of our calculus. It is given by labelled relations between enriched terms. In an enriched term, a subterm is not plugged directly into the evaluation context but into a "window", which makes it syntactically obvious where the reduction context is situated. Forgetting the window turns an enriched term into an ordinary term. Basic rules are labelled with β, σ or ϵ. The basic rules (2), (5) and (8), labelled with β, apply beta-reduction and delay substitution of a bound variable. Substitution is done one occurrence at a time, and on demand, by the basic rule (10) with label σ. Each application of the basic rule (10) replaces exactly one occurrence of a bound variable with a value, and keeps a copy of the value for later use. All other basic rules, with label ϵ, search for a redex by moving the window without changing the underlying term. Finally, reduction is defined as the congruence closure of the basic rules with respect to evaluation contexts, and labelled accordingly. Basic rules and reductions indeed relate enriched terms, because the window is never duplicated or discarded.
[Fig. 1: grammar of terms, values, answer contexts and evaluation contexts, followed by the basic rules (1)-(10) of the sub-machine semantics.]
An evaluation of a pure term (i.e. a term with no explicit substitutions) is a sequence of reductions starting from the term placed in the window. In any evaluation, the subterm in the window is always pure.
3. The Token-Guided Graph-Rewriting Machine
In the initial presentation of this work [MG17a], we used proof nets of the multiplicative and exponential fragment of linear logic [Gir87] to implement the call-by-need evaluation strategy. Aiming additionally at the two call-by-value evaluation strategies, we here use graphs that are closer to syntax trees but are still augmented with the box structures taken from proof nets. Moving towards syntax trees allows us to accommodate the two call-by-value evaluations in a uniform way. The box structures specify duplicable subgraphs, and help with the time-cost analysis of implementations.
A graph is given by a set of nodes and a set of directed edges. Nodes are classified into proper nodes and link nodes. Each edge is directed, and at least one of its two endpoints is a link node. The interface of a graph is given by two sets of link nodes, namely inputs and outputs. Each link node is the source of at most one edge and the target of at most one edge. Input links are the only links that are not the target of any edge, and output links are the only ones that are not the source of any edge. When a graph has a given number of input and output link nodes, we sometimes annotate it with these two numbers to emphasise its interface. If a graph has exactly one input, we refer to the input link node as its "root".
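The definition above can be sketched as a data structure; the names and the well-formedness checks are ours, and only the constraints stated in the text are enforced.

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class Link:
    out_edge: Optional["Edge"] = None   # a link is the source of at most one edge
    in_edge: Optional["Edge"] = None    # ... and the target of at most one edge

@dataclass
class Node:
    label: str                          # proper nodes carry labels

@dataclass
class Edge:
    source: object
    target: object
    def __post_init__(self):
        # at least one endpoint of every edge must be a link
        assert isinstance(self.source, Link) or isinstance(self.target, Link)
        if isinstance(self.source, Link):
            assert self.source.out_edge is None
            self.source.out_edge = self
        if isinstance(self.target, Link):
            assert self.target.in_edge is None
            self.target.in_edge = self

@dataclass
class Graph:
    inputs: list = field(default_factory=list)    # interface: input links
    outputs: list = field(default_factory=list)   # interface: output links
    def root(self) -> Link:
        assert len(self.inputs) == 1              # "root" = the unique input
        return self.inputs[0]
```

For example, a one-input graph is built by creating a root link, a node, and an edge between them.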
The idea of using link nodes, as distinguished from proper nodes, comes from a graphical formalisation of string diagrams [Kis12]. (Our link nodes should not be confused with the terminology "link" of proof nets, which refers to a counterpart of our proper nodes.) String diagrams consist of "boxes" connected to each other by "wires", and may have dangling or looping wires. In the formalisation, boxes are modelled by "box-vertices" (corresponding to our proper nodes), and wires are modelled by consecutive edges connected via "wire-vertices" (corresponding to our link nodes). It is link nodes that allow dangling or looping wires to be modelled properly. The segmentation of wires into edges can introduce an arbitrary number of consecutive link nodes; however, these consecutive link nodes are identified by the notion of "wire homeomorphism". We will later discuss these consecutive link nodes from the perspective of the graph-rewriting machine. From now on we simply call a proper node a "node", and a link node a "link".
In drawing graphs, we follow the convention that input links are placed at the bottom and output links at the top; links are usually not drawn explicitly. The latter means that edges are simply drawn from node to node, with intermediate links omitted. In particular, if an edge is connected to an interface link, the edge is drawn as an open edge missing an endpoint. Additionally, we use a bold-stroke edge/node to represent a bunch of parallel edges/nodes.
Nodes are labelled, and a node with label ℓ is called an "ℓ-node". We use two sorts of labels. One sort corresponds to the constructors of the calculus presented in Sec. 2: λ (abstraction) and one application label for each of call-by-need, left-to-right call-by-value and right-to-left call-by-value. These three application nodes are the novelty of this work. The token, travelling in a graph, reacts to these nodes in different ways, and hence implements different evaluation orders. We believe that this is a more extensible way to accommodate different evaluation orders than letting the token react to the same node in different ways depending on the situation. The other sort consists of !, ?, D and C_n for any natural number n, used in the management of copying subgraphs. This sort is inspired by proof nets of the multiplicative and exponential fragment of linear logic [Gir87], and C_n-nodes generalise the standard binary contraction and subsume weakening.
The number of inputs/outputs and incoming/outgoing edges of a node is determined by its label, as indicated in Fig. 2. We distinguish the two outputs of an application node, calling one the "composition output" and the other the "argument output" (cf. [AG09]); a bullet in the figure marks the composition (function) output. The dashed box indicates a subgraph (a "box") that is connected to one !-node (its "principal door") and to ?-nodes (its "auxiliary doors"). This box structure, taken from proof nets, assists the management of duplication by specifying the subgraphs that can be copied. (Our formalisation of graphs is based on the view of proof nets as string diagrams, and hence of boxes as functorial boxes [Mel06].)
We define the graph-rewriting abstract machine as a labelled transition system between graph states.

Definition (Graph states). A graph state is formed of a graph with a distinguished link, and token data that consists of:

- a direction, either upwards or downwards,

- a rewrite flag, which is either absent or one of λ and !,

- a computation stack of symbols, and

- a box stack, whose elements are symbols and links of the graph.
The distinguished link is called the "position" of the token. The token reacts to a node in the graph using its data, which determines its path. Given a graph with a root, the initial state on it places the token at the root, pointing upwards, with no rewrite flag and empty stacks; in the final state the token is back at the root, pointing downwards. An execution on a graph is a sequence of transitions starting from the initial state.
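A minimal sketch of graph states and of the dispatch between pass and rewrite transitions, under placeholder names of our own for directions and flags:

```python
from dataclasses import dataclass, field

@dataclass
class TokenData:
    direction: str = "up"                 # "up" | "down"
    flag: str = "none"                    # rewrite flag: "none" | "lambda" | "!"
    comp_stack: list = field(default_factory=list)
    box_stack: list = field(default_factory=list)   # may hold links

@dataclass
class GraphState:
    graph: object
    position: object                      # the distinguished link
    token: TokenData = field(default_factory=TokenData)

def initial_state(graph, root) -> GraphState:
    # token at the root, pointing upwards, no flag, empty stacks
    return GraphState(graph, root)

def next_transition_kind(state: GraphState) -> str:
    # a rewrite fires as soon as a flag is raised; otherwise the token
    # is passed over one node
    return "rewrite" if state.token.flag != "none" else "pass"
```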
Each transition between graph states is labelled with β, σ or ϵ. Transitions are deterministic, and are classified into pass transitions, which search for redexes and trigger rewriting, and rewrite transitions, which actually rewrite the graph as soon as a redex is found.
A pass transition, always labelled with ϵ, applies to a state whose rewrite flag is not raised. It simply moves the token over one node, and updates the token data by modifying the top elements of the stacks, while keeping the underlying graph unchanged. When the token passes a λ-node or a !-node, the rewrite flag is changed to λ or ! respectively, which triggers a rewrite transition. Fig. 3 defines pass transitions, showing only the relevant node for each transition. The position of the token is drawn as a black triangle pointing in the direction of travel. In the figures, n is a natural number. The pass transition over a C_n-node pushes the old position, a link drawn as a bullet, onto the box stack.
The way the token reacts to the three application nodes corresponds to the way the window moves when evaluating these function applications in the sub-machine semantics (Fig. 1). When the token moves on to the composition output of an application node, the top element of the computation stack is one of two markers. One marker makes the token return from the λ-node it reaches, which corresponds to reducing the function part of the application to a value (i.e. an abstraction). The other marker lets the token proceed at the λ-node and raises the rewrite flag λ, hence triggering the rewrite transition that corresponds to beta-reduction. The call-by-value application nodes send the token to their argument output, pushing a dedicated marker onto the box stack. This marker makes the token bounce at a !-node and return to the application node, which corresponds to evaluating the argument part of the application to a value. Finally, pass transitions through ?-nodes, C_n-nodes and D-nodes prepare the copying of values, and eventually raise the rewrite flag ! that triggers on-demand duplication.
A rewrite transition, labelled with β, σ or ϵ, applies to a state whose rewrite flag is λ or !. It changes a specific subgraph while keeping its interface, changes the token position accordingly, and pops an element from the box stack. Fig. 4 defines rewrite transitions by showing the subgraph (the "redex") to be rewritten. Before going through each rewrite transition, we note that rewrite transitions are not exhaustive in general, as a graph may fail to match a redex even though a rewrite flag is raised. However, we will see that no transition fails in implementing the term calculus.
The first rewrite transition in Fig. 4, with label β, occurs when the rewrite flag is λ. It implements beta-reduction by eliminating a pair of a λ-node and an application node. The outputs of the λ-node are required to be connected to arbitrary nodes, so that no edges between links are introduced. The other rewrite transitions are for the rewrite flag !, and together they realise the copying process of a subgraph (namely a box). The second rewrite transition in Fig. 4, labelled with ϵ, finishes off each copying process by eliminating all doors of a box. It replaces the interface of the box with the output links of the auxiliary doors and the input link of the !-node, which becomes the new position of the token, and pops the top element of the box stack. Again, no edges between links are introduced.
The last rewrite transition in the figure, with label σ, actually copies a box. It requires the top element of the old box stack to be one of the input links of the C_n-node (for a natural number n). This link is popped from the box stack and becomes the new position of the token, and the C_n-node becomes a C_{n-1}-node, keeping all its inputs except this link. The redex also contains a subgraph consisting of parallel contraction nodes; some of their inputs are connected to auxiliary doors of the box, and the rest are connected to nodes that are not in the redex. This subgraph is updated by introducing new inputs to its contraction nodes: if an auxiliary door of the box is connected to one of these contraction nodes, the two copies of that auxiliary door are both connected to the corresponding contraction node in the updated subgraph. The updated subgraph therefore consists of the same number of contraction nodes, whose in-degrees may have increased, while the inputs connected to nodes outside the redex are kept unchanged. Fig. 5 shows an example of this copying process.
All pass and rewrite transitions are well-defined. The following "subgraph" property is essential in the time-cost analysis, because it bounds the size of duplicable subgraphs (i.e. boxes) in an execution.

Lemma (Subgraph property). In any execution, each box of each graph arising in the execution appears as a subgraph of the initial graph.
Proof.
Rewrite transitions can only copy or discard a box; they cannot introduce, expand or shrink a box. Therefore, any box of a graph arising in an execution must already be a box of the initial graph. ∎
When a graph has an edge between two links, the token is simply passed along it. With this pass transition over a link at hand, the equivalence relation on graphs that identifies consecutive links with a single link (the so-called "wire homeomorphism" [Kis12]) lifts to a weak bisimulation between graph states. Therefore, behaviourally, we can safely ignore consecutive links. From the perspective of time-cost analysis, we benefit from the fact that rewrite transitions are designed not to introduce any edges between links. This means that, by assuming an execution starts with a graph with no consecutive links, we can analyse the time cost of the execution without accounting for the extra pass transitions over links.
4. Implementation of Evaluation Strategies
The implementation of the term calculus by means of the dynamic GoI starts with translating (enriched) terms into graphs. The definition of the translation uses multisets of variables to track how many times each variable occurs in a term. We assume that terms are alpha-converted into a form in which all binders introduce distinct variables.

Definition (Multisets). The empty multiset is denoted by ∅, and the sum of two multisets M and N is denoted by M + N. We write x ∈_n M if the multiplicity of x in the multiset M is n. Removing all occurrences of x from M yields the multiset written M \ x. We abuse notation and write a multiset consisting of finitely many copies of x simply as x.

Definition (Free variables). The map fv from terms to multisets of variables is defined inductively by: fv(x) = x; fv(λx.t) = fv(t) \ x; fv(t u) = fv(t) + fv(u) for each of the three applications; and fv(t[x ← u]) = (fv(t) \ x) + fv(u).
For a multiset of variables, the map from evaluation contexts to multisets of variables is defined analogously, with the hole of the context mapped to the given multiset.
A term t is said to be closed if fv(t) = ∅. Consequences of the above definitions are decomposition equations for the free variables of a term plugged into a context, which hold whenever the plugged variable is not captured by the context.
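Assuming a small tuple encoding of terms (our own, with explicit substitution written ("sub", t, x, u) for t[x ← u]), the free-variable multiset can be computed with Python's Counter, which models multiset sum and binder removal directly:

```python
from collections import Counter

def fv(t):
    """Free-variable multiset of a term in our tuple encoding:
    ("var", x) | ("lam", x, t) | ("app", t, u) | ("sub", t, x, u)."""
    tag = t[0]
    if tag == "var":
        return Counter([t[1]])
    if tag == "lam":                  # binder removes all occurrences of x
        c = fv(t[2]); del c[t[1]]; return c
    if tag == "app":                  # multiset sum of both sides
        return fv(t[1]) + fv(t[2])
    if tag == "sub":                  # t[x <- u]: x is bound in t only
        c = fv(t[1]); del c[t[2]]; return c + fv(t[3])
    raise ValueError(tag)

term = ("app", ("lam", "x", ("app", ("var", "x"), ("var", "y"))), ("var", "y"))
```

Here `fv(term)` is the multiset containing y twice, since y occurs free on both sides of the application.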
We give translations of terms, answer contexts, and evaluation contexts separately. Fig. 8 defines two mutually recursive translations, the first one for terms and answer contexts, and the second one for evaluation contexts. In the figures, n denotes the multiplicity of a variable in the relevant multiset. The general form of the translations is as shown on the right.
The annotation of bold-stroke edges means that each edge of the bunch is labelled with an element of the annotating multiset, in a one-to-one manner. In particular, if a bold-stroke edge is annotated by a variable, all edges in the bunch are annotated by that variable. These annotations are only used to define the translations, and are subsequently ignored during execution.
The translations are based on the so-called "call-by-value" translation of linear logic into intuitionistic logic (e.g. [MOTW99]). Only the translation of an abstraction is accompanied by a box, which captures the fact that only values (i.e. abstractions) can be duplicated (see the basic rule (10) in Fig. 1). Note that only one contraction node is introduced for each bound variable. This is vital to achieve constant cost in looking up a variable, namely in realising the basic rule (9) in Fig. 1.
The two mutually recursive translations are related by the decompositions in Fig. 8, which can be checked by straightforward induction. In the third decomposition, the plugged variable is not captured by the context. Note that, in general, the translation of a term in an evaluation context cannot be decomposed into the separate translations of the term and the context. This is because the translation of an evaluation context lacks a box structure, compared to the translation of a term.
The inductive translations lift to a binary relation between closed enriched terms and graph states.

Definition (Binary relation). A closed enriched term is related to a graph state when: (i) the underlying graph is the translation of the term, with no edges between links, and (ii) there is an execution reaching the state in which the token position appears only in the last state of the sequence. A special case relates the starting points of an evaluation and an execution. We require the graph to have no edges between links, based on the discussion at the end of Sec. 3; this is essential for the time-cost analysis. Although the definition of the translations relies on edges between links, we can safely replace any consecutive links in the composed translations with a single link, yielding a graph with no consecutive links.
The binary relation gives a weak simulation of the sub-machine semantics by the graph-rewriting machine. The weakness, i.e. the extra transitions compared with reductions, comes from the locality of pass transitions and the bureaucracy of managing boxes.

Theorem (Weak simulation with global bound).

- If a closed enriched term is related to a graph state and the term makes a reduction, then there exists a globally bounded number of transitions, with matching label, taking the graph state to one related to the reduct.

- In the special case relating starting points, the graph state is initial, and only a single transition is possible from it.
Proof outline.
The second half of the theorem is straightforward. For the first half, Fig. 9, Fig. 10 and Fig. 11 illustrate how the graph-rewriting machine simulates each reduction of the sub-machine semantics. Annotations of edges are omitted. The figures altogether contain ten sequences of transitions, of which only the first and last graph states are shown. Each sequence simulates a single reduction, and is preceded by a number (e.g. (1)) that corresponds to the basic rule applied by the reduction (see Fig. 1). Some sequences involve equations that apply the four decomposition properties of the translations given earlier in this section. These equations rely on the fact that reductions with labels β and σ work modulo alpha-equivalence to avoid name capture. This means that (i) free variables of the argument are never captured by the binder in the beta rules (2), (5) and (8), (ii) the substituted variable is never captured by the surrounding contexts, and (iii) free variables of the substituted value are never captured either. In particular, in the simulation of the reduction (9), the variable is not captured by the evaluation context, and therefore the first token position is in fact an input of the relevant contraction node. ∎
5. Time-Cost Analysis
We analyse how time-efficiently the token-guided graph-rewriting machine implements evaluation strategies, following the methodology developed by Accattoli et al. [ABM14, AS14, Acc17]. The time-cost analysis focuses on how efficiently an abstract machine implements a given evaluation strategy. In other words, we are not interested in minimising the number of reduction steps simulated by the abstract machine. Our aim is to see whether the number of transitions of the abstract machine is "reasonable", compared to the number of reduction steps required by the given evaluation strategy.
Accattoli's methodology assumes that an abstract machine has three groups of transitions: 1) transitions that correspond to beta-reduction in which substitution is delayed, 2) transitions that perform substitution, and 3) other "overhead" transitions. We incorporate this classification using the labels β, σ and ϵ of transitions.
Another assumption of the methodology is that each step of reduction is simulated by a single transition of the abstract machine, and so is the substitution of each occurrence of a variable. This is satisfied by many known abstract machines, including Danvy and Zerny's storeless abstract machine [DZ13], which our sub-machine semantics resembles, but not by the token-guided graph-rewriting abstract machine. Our machine has "finer" transitions and can take several transitions to simulate a single reduction step, as we can observe in Thm. 8. In spite of this mismatch we can still follow the methodology, thanks to the weak simulation. It discloses exactly which transitions of the machine correspond to reduction and substitution, and gives a concrete bound on the number of overhead transitions the machine needs to simulate reduction and substitution.
The methodology of time-cost analysis has four steps: (I) bound the number of transitions required to implement the evaluation strategies; (II) estimate the time cost of each transition; (III) bound the overall time cost of implementing the evaluation strategies, by multiplying the number of transitions by the time cost of each transition; and finally (IV) classify the abstract machine according to its execution time cost. Consider now the following taxonomy of abstract machines introduced in [Acc17, Def. 7.1].
- An abstract machine is efficient if its execution time cost is linear in both the input size and the number of β-transitions.

- An abstract machine is reasonable if its execution time cost is polynomial in the input size and the number of β-transitions.

- An abstract machine is unreasonable if it is not reasonable.
In our case, the input size is given by the size |t| of the term t, defined inductively by: |x| = 1, |λx.t| = |t| + 1, |t u| = |t| + |u| + 1 for each of the three applications, and |t[x ← u]| = |t| + |u| + 1.
The number of β-transitions is simply the number of transitions labelled with β, which corresponds exactly to the number of reductions labelled with β, thanks to Thm. 8.
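On a small tuple encoding of terms (our own illustration), the size function can be sketched as:

```python
def size(t):
    """Size |t| of a term in our tuple encoding:
    ("var", x) | ("lam", x, t) | ("app", t, u) | ("sub", t, x, u).
    Each constructor contributes 1."""
    tag = t[0]
    if tag == "var":
        return 1
    if tag == "lam":
        return 1 + size(t[2])
    if tag == "app":
        return 1 + size(t[1]) + size(t[2])
    if tag == "sub":                  # explicit substitution t[x <- u]
        return 1 + size(t[1]) + size(t[3])
    raise ValueError(tag)

# (lambda x. x) (lambda y. y): 1 application + 2 abstractions + 2 variables
identity_app = ("app", ("lam", "x", ("var", "x")), ("lam", "y", ("var", "y")))
```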
Given an evaluation, we write the number of occurrences of a label in it using the label as a subscript. The sub-machine semantics comes with the following quantitative bounds. For any terminating evaluation, the number of σ-reductions is bounded in terms of the number of β-reductions, and the number of ϵ-reductions is bounded in terms of the numbers of β- and σ-reductions together with the size of the initial term.
Proof outline.
A term uses a single evaluation strategy: either call-by-need, left-to-right call-by-value, or right-to-left call-by-value. The proof proceeds by developing a one-to-one correspondence between an evaluation in the sub-machine semantics and a "derivation" in the linear substitution calculus. This goes the same way as Accattoli et al.'s analysis of various abstract machines [ABM14], especially the proof of the second equation [ABM14, Thm. 11.3 & Thm. 11.5]. The first equation is a direct application of known bounds for the linear substitution calculus [AS14, Cor. 1 & Thm. 2]. ∎
We use the same notation , as for an evaluation, to denote the number of occurrences of each label in an execution . Additionally, the number of rewrite transitions with the label is denoted by . The following proposition completes the first step of the cost analysis. [Soundness & completeness, with number bounds] For any pure closed term , an evaluation terminates with the enriched term if and only if an execution terminates with the graph . Moreover, the numbers of transitions are bounded by , , , .
Proof.
The next step in the cost analysis is to estimate the time cost of each transition. We assume that graphs are implemented in the following way. Each link is given by two pointers, to its child and to its parent, and each node is given by its label and pointers to its outputs. Abstraction nodes and application nodes have two distinguished pointers, and all other nodes have only one pointer, to their unique output. Additionally, each node has pointers to the inputs of its associated nodes, to represent a box structure. Accordingly, a position of the token is a pointer to a link, a direction and a rewrite flag are symbols, a computation stack is a stack of symbols, and finally a box stack is a stack of symbols and pointers to links.
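These implementation assumptions can be rendered concretely as in the following sketch; all field names are illustrative, not taken from an actual implementation of the machine:

```python
# Sketch of the assumed graph representation (field names are illustrative).
class Link:
    def __init__(self):
        self.child = None    # pointer to the node below this link
        self.parent = None   # pointer to the node above this link

class Node:
    def __init__(self, label, outputs):
        self.label = label        # node label, e.g. an abstraction or application
        self.outputs = outputs    # one pointer, or two distinguished ones
        self.box_doors = []       # pointers to inputs of associated nodes (box structure)

# Token data: a position (pointer to a link), a direction and a rewrite
# flag (symbols), a computation stack of symbols, and a box stack that
# may hold both symbols and pointers to links.
class Token:
    def __init__(self, position):
        self.position = position
        self.direction = "down"
        self.rewrite_flag = None
        self.computation_stack = []
        self.box_stack = []
```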
Using these implementation assumptions, we estimate the time cost of each transition. All pass transitions have constant cost: each looks up one node and its outputs (of which there are one or two) next to the current position, and involves a fixed number of elements of the token data. Rewrite transitions with the label have constant cost, as they change a constant number of nodes and links, and only the rewrite flag of the token data. Rewrite transitions with the label remove a box structure, and hence have cost bounded by the number of auxiliary doors. Finally, rewrite transitions with the label copy a box structure. The copying cost is bounded by the size of the box, i.e. the number of nodes and links in the box. The cost of updating the subgraph (see Fig. 4) is bounded by the number of auxiliary doors, which is less than the size of the copied box. The assumption on the implementation of graphs lets us conclude that the cost of updating the node is constant.
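The dominant, non-constant cost is box copying. Assuming a box is given by its lists of nodes, links and auxiliary doors, the cost model sketched above can be written as follows (a hypothetical helper, not part of the machine):

```python
def copy_cost(box_nodes, box_links, auxiliary_doors):
    """Cost-model sketch: copying a box is linear in its size (number of
    nodes plus links), and updating the surrounding subgraph is linear in
    the number of auxiliary doors, which never exceeds the box size."""
    copying = len(box_nodes) + len(box_links)
    updating = len(auxiliary_doors)   # bounded above by the box size
    return copying + updating
```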
With the results of the previous two steps, we can now give the overall time cost of executions and classify our abstract machine. [Soundness & completeness, with cost bounds] For any pure closed term , an evaluation terminates with the enriched term if and only if an execution terminates with the graph . The overall time cost of the execution is bounded by .
Proof.
Non-constant costs of rewrite transitions are either the number of auxiliary doors of a box or the size of a box. The former can be bounded by the latter, which is no more than the size of the initial graph , by Lem. 5. The size of the initial graph can be bounded by the size of the initial term. Therefore any non-constant cost of a rewrite transition in the execution can also be bounded by . By Prop. 5, the overall time cost of rewrite transitions labelled with is , and that of the other rewrite transitions and pass transitions is . ∎
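Putting the two steps together, the overall time is the sum over all transitions of each transition's cost, with every non-constant cost bounded by the input size. A toy rendering of this composition (the cost model is a hypothetical sketch, with unit constants):

```python
def overall_cost(n, num_pass, num_const_rewrites, num_box_rewrites):
    """Overall-time sketch: pass transitions and constant-cost rewrites
    contribute O(1) each; box-copying/elimination rewrites contribute
    O(n) each, since every box is bounded by the input size n
    (the subgraph property, Lem. 5)."""
    return (num_pass + num_const_rewrites) * 1 + num_box_rewrites * n
```

The result is linear in both the input size and the number of transitions, which is exactly what the "efficient" classification of Def. 5 requires.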
The token-guided graph-rewriting machine is an efficient abstract machine implementing the call-by-need, left-to-right call-by-value and right-to-left call-by-value evaluation strategies, in the sense of Def. 5.
Cor. 5 classifies the graph-rewriting machine as not just "reasonable" but in fact "efficient". In terms of token passing, this efficiency benefits from the graphical representation of environments (i.e. explicit substitutions in our setting). The graphical representation ensures that each bound variable is associated with exactly one node, which is guaranteed by the translations and and the rewrite transition . Excluding any two sequentially-connected nodes is essential to achieving the "efficient" classification, because it yields a constant cost for looking up a bound variable and its associated computation.
As for graph rewriting, the "efficient" classification shows that introducing graph rewriting into token passing does not bring in any inefficiency. In our setting, graph rewriting introduces two kinds of non-constant cost. One is the cost of duplicating a sub-graph, indicated by a box; the other is the cost of eliminating a box that delimits an abstraction. Unlike the duplication cost, the elimination cost is a non-trivial cost that abstract machines in the literature usually do not have. Namely, our graph-rewriting machine simulates a reduction step, in which an abstraction constructor is eliminated and substitution is delayed, at a non-constant cost depending on the size of the abstraction. The time-cost analysis confirms that the duplication cost and this unusual elimination cost have the same impact on the overall time cost as the cost of token passing. What is vital here is the sub-graph property (Lem. 5), which ensures that the cost of each duplication and elimination of a box is always linear in the input size.
6. Rewriting vs. Jumping
The starting point of our development is the GoI-style token-passing abstract machines for call-by-name evaluation, given by Danos and Regnier [DR96] and by Mackie [Mac95]. Fig. 12 recalls these token-passing machines as a version of the DGoIM with the passes-only interleaving strategy (i.e. the DGoIM with only pass transitions). It follows the convention of Fig. 3, but a black triangle in the figure points along (resp. against) the direction of an edge if the token direction is (resp. ). Note that this version uses different token data, to which we will return later.
Token-passing GoI keeps the underlying graph fixed, and re-evaluates a term by repeating token moves. It therefore favours space efficiency at the cost of time efficiency. Repeated token actions pose a challenge for evaluations in which duplicated computation must not lead to repeated evaluation, especially call-by-value evaluation [FM02, Sch14a, HMH14, DFVY15]. Moreover, in call-by-value, repeated token actions raise the additional technical challenge of avoiding repetition of any associated computational effects [Sch11, MHH16, DFVY17]. A partial solution to this conundrum is to focus on the soundness of the equational theory, while deliberately ignoring the time costs [MHH16]. The introduction of graph reduction, the key idea of the DGoIM, is one complete solution, in the sense that it avoids repeated token moves and also improves the time efficiency of token-passing GoI. Another such solution in the literature is the introduction of jumps. We discuss how these two solutions affect machine design and space efficiency.
The greediest way of introducing graph reduction, namely the rewrites-first interleaving studied in this work, simplifies machine design in terms of the variety of pass transitions and of token data. First, some token moves become irrelevant to an execution; this is why Fig. 3, for the rewrites-first interleaving, has fewer pass transitions than Fig. 12, for the passes-only interleaving. Certain nodes, like '', are always eliminated before being visited by the token in the rewrites-first interleaving. Accordingly, the token data can be simplified. The box stack and the environment stack used in Fig. 12 are integrated into the single box stack used in Fig. 3. The integrated stack does not need to carry the exponential signatures: these ensure that the token exits boxes appropriately in token-passing GoI, by maintaining binary tree structures, but the token never exits boxes with the rewrites-first interleaving. Although the rewrites-first interleaving simplifies token data, rewriting itself, especially the duplication of sub-graphs, becomes the source of space inefficiency.
A jumping mechanism can be added on top of token-passing GoI, enabling the token to jump along the path it would otherwise follow step by step. Although no quantitative analysis is provided, it gives time-efficient implementations of evaluation strategies, namely call-by-name [DR96] and call-by-value [FM02] evaluation. Jumping can reduce the variety of pass transitions, like rewriting, by letting some nodes always be jumped over. Making a jump merely changes the token position, so jumping can be described as a variation of pass transitions, unlike rewriting. However, the introduction of jumping complicates the token data. Namely, it requires partial duplication of token data, which not only complicates machine design but also damages space efficiency. The duplications effectively represent virtual copies of sub-graphs, and accumulate during an execution. Tracking virtual copies is the trade-off of keeping the underlying graph fixed. Some jumps that do not involve virtual copies can be described as a form of graph rewriting that eliminates nodes.
Finally, we give a quantitative comparison of space usage between rewriting and jumping. As a case study, we focus on implementations of call-by-name/need evaluation, namely the passes-only DGoIM recalled in Fig. 12, our rewrites-first DGoIM, and the passes-only DGoIM equipped with jumping, recalled in Fig. 13. A similar comparison is possible for left-to-right call-by-value evaluation, between our rewrites-first DGoIM and the jumping machine given by Fernández and Mackie [FM02].
Fig. 13 recalls Danos and Regnier's token-passing machine equipped with jumping [DR96], which is proved to be isomorphic to the Krivine abstract machine [Kri07] for call-by-name evaluation. The machine has pass transitions as well as the jump transition, which lets the token jump to a remote position. Compared with token-passing GoI (Fig. 12), the pass transitions for nodes related to boxes are reduced and changed, so that the jumping mechanism imitates rewrites involving boxes. The token remembers its old position, together with its current environment stack, when passing a node upwards. The token uses this information to make a jump back in the jump transition, in which it exits a box at the principal door (node) and changes its position to the remembered link .
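The recording-and-restoring behaviour can be rendered as a toy model (all names are ours, and the model abstracts away everything except the bookkeeping of return points):

```python
class JumpingToken:
    """Toy model of the jump mechanism: passing a node upwards records a
    return point; a jump pops it and restores position and environment."""
    def __init__(self, position):
        self.position = position
        self.environment = []
        self.return_points = []   # stack of (position, copy of environment)

    def pass_up(self, new_position):
        # Record the old position with a *copy* of the environment stack:
        # this partial duplication is what accumulates during an execution.
        self.return_points.append((self.position, list(self.environment)))
        self.position = new_position

    def jump(self):
        # Exit a box at its principal door: restore the remembered link
        # and the environment stack recorded with it.
        self.position, self.environment = self.return_points.pop()
```

Note how each `pass_up` copies the environment stack: these copies are the "virtual copies of sub-graphs" discussed above, and their accumulation is the source of the space cost analysed next.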
The quantitative comparison, whose result is stated below, shows that partial duplication of token data impacts space usage much more than duplication of sub-graphs, and therefore rewriting has asymptotically better space usage than jumping. After transitions from an initial state of a graph of size , the space usage of the three versions of the DGoIM is bounded as in the table below.
machines | token-passing only (Fig. 12) | rewriting added (Fig. 3 & Fig. 4) | jumping added (Fig. 13)
evaluations implemented | call-by-name | call-by-need | call-by-name
size of graph | | |
size of token position | | |
size of token data | | |
Proof.
The size of the underlying graph after transitions can be estimated from the size of the initial graph. Our rewrites-first DGoIM is the only one that changes the underlying graph during an execution. Thanks to the sub-graph property (Lem. 5), the size can be bounded as , where is the number of labelled transitions among the transitions. In the token-passing machines with and without jumping (Fig. 12 and Fig. 13), clearly . In any of the three machines, the token position can be represented in size .
Next we estimate the size of the token data. Because stacks can have links of the underlying graph as elements, the size of the token data after transitions depends on . Both in the token-passing machine (Fig. 12) and in our rewrites-first DGoIM, at most one element is pushed per transition, so the size of the token data is bounded by . In the jumping machine (Fig. 13), on the other hand, the size of the token data, especially of the box stack and the environment stack, can grow exponentially because of the partial duplication; hence the token data has size . For example, a term with many expansions, like , causes exponential growth of the box stack in the jumping machine. ∎
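The contrast can be seen in a toy simulation: pushing at most one element per transition keeps the token data linear in the number of transitions, whereas duplicating the current stack at each recording step grows it exponentially. The simulation below is illustrative only, and models the worst case where every jumping transition duplicates the whole stack:

```python
def rewriting_stack_growth(T):
    """At most one element pushed per transition: size is O(T)."""
    stack = []
    for _ in range(T):
        stack.append("*")
    return len(stack)

def jumping_stack_growth(T):
    """Partial duplication, worst case: each recording step appends a copy
    of the current stack, so the size doubles at every step: O(2^T)."""
    stack = ["*"]
    for _ in range(T):
        stack = stack + list(stack)   # duplicate the whole stack
    return len(stack)
```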
7. Related Work and Conclusion
In an abstract machine for any functional programming language, computations assigned to variables have to be stored for later use. Potentially multiple, conflicting computations can be assigned to a single variable, primarily because of multiple uses of a function with different arguments. Different solutions to this conflict lead to different representations of the storage, some of which are examined by Accattoli and Barras [AB17] from the perspective of time-cost analysis. We recall below a few solutions that seem relevant to our token-guided graph rewriting.
One solution is to allow at most one assignment to each variable. This is typically achieved by renaming bound variables during execution, possibly symbolically. Examples for call-by-need evaluation are Sestoft's abstract machines [Ses97], and the storeless and store-based abstract machines studied by Danvy and Zerny [DZ13]. Our graph-rewriting abstract machine gives another example, as shown by its simulation of the sub-machine semantics, which resembles the storeless abstract machine mentioned above. Variable renaming is trivial in our machine, thanks to the use of graphs, in which variables are represented by mere edges.
Another solution is to allow multiple assignments to a variable, with restricted visibility. The common approach is to pair a sub-term with its own "environment", mapping its free variables to their assigned computations and forming a so-called "closure". Conflicting assignments are distributed to distinct localised environments. Examples include Crégut's lazy variant [Cré07] of Krivine's abstract machine for call-by-need evaluation, and Landin's SECD machine [Lan64] for call-by-value evaluation. Fernández and Siafakas [FS09] refine this approach for call-by-name and call-by-value evaluation, based on closed reduction [FMS05], which restricts beta-reduction to closed function arguments. This suggests that the approach with localised environments can be modelled in our setting by implementing closed reduction. The implementation would require an extension of the rewrite transitions and a different strategy to trigger them, namely eliminating the auxiliary doors of a box.
Finally, Fernández and Siafakas [FS09] propose another approach to multiple assignments, in which assignments are augmented with binary strings so that each occurrence of a variable can only refer to one of them. This approach is inspired by token-passing GoI, namely a token-passing abstract machine for call-by-value evaluation designed by Fernández and Mackie [FM02]. The augmenting binary strings come from paths through trees of binary contractions, which the token-passing machine uses to represent shared assignments. In our graph-rewriting machine, trees of binary contractions are replaced with single generalised contraction nodes of arbitrary arity, to achieve time efficiency. The counterpart of a path over binary contractions is therefore simply a connection over a single generalised contraction node.
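The difference in lookup cost can be illustrated as follows; both representations are hypothetical sketches, with a chain-shaped binary tree taken as the worst case:

```python
# Toy comparison of variable-occurrence lookup: a chain of binary
# contraction nodes versus one generalised contraction node.

def binary_chain_depth(i):
    """In a chain-shaped tree of binary contractions, reaching the i-th
    shared occurrence (0-indexed) traverses i + 1 binary nodes, so the
    binary-string address grows with the number of occurrences."""
    return i + 1

def generalised_depth(i):
    """A single contraction node of arbitrary arity reaches any
    occurrence through exactly one node."""
    return 1
```

This constant-depth access is what the replacement of binary contraction trees buys for the time bounds discussed earlier.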
To wrap up, we introduced the DGoIM, which can interleave token-passing GoI with graph rewriting, using token passing as a guide. As a case study, we showed how the DGoIM with the rewrites-first interleaving can time-efficiently implement three evaluation strategies: call-by-need, left-to-right call-by-value and right-to-left call-by-value. These strategies exercise different control over the caching of intermediate results. The difference boils down to different routing of the token in the DGoIM, which is achieved simply by switching the graph representations of terms (namely, the nodes modelling function application).
The idea of using the token as a guide for graph rewriting was also proposed by Sinot [Sin05, Sin06] for interaction nets. He shows how a token can make the rewriting system implement the call-by-name, call-by-need and call-by-value evaluation strategies. Our development in this work can be seen as a realisation of that rewriting system as an abstract machine, in particular with explicit control over the copying of sub-graphs.
Token-guided graph rewriting is a flexible framework in which we can study the space-time trade-off in abstract machines for various evaluation strategies of the lambda-calculus. Our focus in this work was primarily on time efficiency. This complements existing work on operational semantics given by token-passing GoI, which usually achieves space efficiency, and also confirms that introducing graph rewriting into the semantics does not bring in any hidden inefficiency. We believe that further refinements, not only of the interleaving of token routing and graph reduction but also of the graph representation, can be formulated to serve particular objectives in the space-time trade-off of execution efficiency, such as fully lazy evaluation, as hinted at by Sinot [Sin05].
As a final remark, the flexibility of our framework also allows us to handle the operational semantics of exotic language features, especially dataflow features. One such feature turns a parameterised dataflow network into an ordinary function that takes parameters as an argument and returns the network, which we model using token-guided graph rewriting [CDG18]. This feature can support a common programming idiom of machine-learning tasks, in which a dataflow network is constructed as a program and then modified at runtime by updating the values of parameters embedded in the network.
Acknowledgement
We are grateful to Ugo Dal Lago and anonymous reviewers for encouraging and insightful comments on earlier versions of this work. We thank Steven Cheung for helping us implement the online visualiser. The second author is grateful to Michele Pagani for stimulating discussions in the very early stages of this work.
References
 [AB17] Beniamino Accattoli and Bruno Barras. Environments and the complexity of abstract machines. In PPDP 2017, pages 4–16. ACM, 2017.
 [ABM14] Beniamino Accattoli, Pablo Barenbaum, and Damiano Mazza. Distilling abstract machines. In ICFP 2014, pages 363–376. ACM, 2014.
 [Acc17] Beniamino Accattoli. The complexity of abstract machines. In WPTE 2016, volume 235 of EPTCS, pages 1–15, 2017.
 [AD16] Beniamino Accattoli and Ugo Dal Lago. (Leftmost-outermost) beta reduction is invariant, indeed. Logical Methods in Comp. Sci., 12(1), 2016.
 [AG09] Beniamino Accattoli and Stefano Guerrini. Jumping boxes. In CSL 2009, volume 5771 of Lect. Notes Comp. Sci., pages 55–70. Springer, 2009.
 [AS14] Beniamino Accattoli and Claudio Sacerdoti Coen. On the value of variables. In WoLLIC 2014, volume 8652 of Lect. Notes Comp. Sci., pages 36–50. Springer, 2014.
 [CDG18] Steven Cheung, Victor Darvariu, Dan R. Ghica, Koko Muroya, and Reuben N. S. Rowe. A functional perspective on machine learning via programmable induction and abduction. In FLOPS 2018, 2018. To appear.
 [Cré07] Pierre Crégut. Strongly reducing variants of the Krivine abstract machine. Higher-Order and Symbolic Computation, 20(3):209–230, 2007.
 [DFVY15] Ugo Dal Lago, Claudia Faggian, Benoît Valiron, and Akira Yoshimizu. Parallelism and synchronization in an infinitary context. In LICS 2015, pages 559–572. IEEE, 2015.
 [DFVY17] Ugo Dal Lago, Claudia Faggian, Benoît Valiron, and Akira Yoshimizu. The Geometry of Parallelism: classical, probabilistic, and quantum effects. In POPL 2017, pages 833–845. ACM, 2017.
 [DG11] Ugo Dal Lago and Marco Gaboardi. Linear dependent types and relative completeness. In LICS 2011, pages 133–142. IEEE Computer Society, 2011.
 [DMMZ12] Olivier Danvy, Kevin Millikin, Johan Munk, and Ian Zerny. On inter-deriving small-step and big-step semantics: a case study for storeless call-by-need evaluation. Theor. Comp. Sci., 435:21–42, 2012.
 [DP12] Ugo Dal Lago and Barbara Petit. Linear dependent types in a call-by-value scenario. In PPDP 2012, pages 115–126. ACM, 2012.
 [DR96] Vincent Danos and Laurent Regnier. Reversible, irreversible and optimal lambda-machines. Elect. Notes in Theor. Comp. Sci., 3:40–60, 1996.
 [DS16] Ugo Dal Lago and Ulrich Schöpp. Computation by interaction for space-bounded functional programming. Inf. Comput., 248:150–194, 2016.
 [DZ13] Olivier Danvy and Ian Zerny. A synthetic operational account of call-by-need evaluation. In PPDP 2013, pages 97–108. ACM, 2013.
 [FM02] Maribel Fernández and Ian Mackie. Call-by-value lambda-graph rewriting without rewriting. In ICGT 2002, volume 2505 of Lect. Notes Comp. Sci., pages 75–89. Springer, 2002.
 [FMS05] Maribel Fernández, Ian Mackie, and François-Régis Sinot. Closed reduction: explicit substitutions without alpha-conversion. Math. Struct. in Comp. Sci., 15(2):343–381, 2005.
 [FS09] Maribel Fernández and Nikolaos Siafakas. New developments in environment machines. Elect. Notes in Theor. Comp. Sci., 237:57–73, 2009.
 [Ghi07] Dan R. Ghica. Geometry of Synthesis: a structured approach to VLSI design. In POPL 2007, pages 363–375. ACM, 2007.
 [Gir87] Jean-Yves Girard. Linear logic. Theor. Comp. Sci., 50:1–102, 1987.
 [Gir89] Jean-Yves Girard. Geometry of Interaction I: interpretation of system F. In Logic Colloquium 1988, volume 127 of Studies in Logic & Found. Math., pages 221–260. Elsevier, 1989.
 [GS11] Dan R. Ghica and Alex Smith. Geometry of Synthesis III: resource management through type inference. In POPL 2011, pages 345–356. ACM, 2011.
 [GSS11] Dan R. Ghica, Alex Smith, and Satnam Singh. Geometry of Synthesis IV: compiling affine recursion into static hardware. In ICFP 2011, pages 221–233. ACM, 2011.
 [HMH14] Naohiko Hoshino, Koko Muroya, and Ichiro Hasuo. Memoryful Geometry of Interaction: from coalgebraic components to algebraic effects. In CSL-LICS 2014, pages 52:1–52:10. ACM, 2014.
 [Kis12] Aleks Kissinger. Pictures of processes: automated graph rewriting for monoidal categories and applications to quantum computing. CoRR, abs/1203.0202, 2012.
 [Kri07] Jean-Louis Krivine. A call-by-name lambda-calculus machine. Higher-Order and Symbolic Computation, 20(3):199–207, 2007.
 [Lan64] Peter Landin. The mechanical evaluation of expressions. The Comp. Journ., 6(4):308–320, 1964.
 [Mac95] Ian Mackie. The Geometry of Interaction machine. In POPL 1995, pages 198–208. ACM, 1995.
 [Mel06] Paul-André Melliès. Functorial boxes in string diagrams. In CSL 2006, volume 4207 of Lect. Notes Comp. Sci., pages 1–30. Springer, 2006.
 [MG17a] Koko Muroya and Dan R. Ghica. The dynamic Geometry of Interaction machine: a call-by-need graph rewriter. In CSL 2017, volume 82 of LIPIcs, pages 32:1–32:15, 2017.
 [MG17b] Koko Muroya and Dan R. Ghica. Efficient implementation of evaluation strategies via token-guided graph rewriting. In WPTE 2017, 2017.
 [MHH16] Koko Muroya, Naohiko Hoshino, and Ichiro Hasuo. Memoryful Geometry of Interaction II: recursion and adequacy. In POPL 2016, pages 748–760. ACM, 2016.
 [MOTW99] John Maraist, Martin Odersky, David N. Turner, and Philip Wadler. Call-by-name, call-by-value, call-by-need and the linear lambda calculus. Theor. Comp. Sci., 228(1-2):175–210, 1999.
 [Sch11] Ulrich Schöpp. Computation-by-interaction with effects. In APLAS 2011, volume 7078 of Lect. Notes Comp. Sci., pages 305–321. Springer, 2011.
 [Sch14a] Ulrich Schöpp. Call-by-value in a basic logic for interaction. In APLAS 2014, volume 8858 of Lect. Notes Comp. Sci., pages 428–448. Springer, 2014.
 [Sch14b] Ulrich Schöpp. Organising low-level programs using higher types. In PPDP 2014, pages 199–210. ACM, 2014.
 [Ses97] Peter Sestoft. Deriving a lazy abstract machine. J. Funct. Program., 7(3):231–264, 1997.
 [Sin05] François-Régis Sinot. Call-by-name and call-by-value as token-passing interaction nets. In TLCA 2005, volume 3461 of Lect. Notes Comp. Sci., pages 386–400. Springer, 2005.
 [Sin06] François-Régis Sinot. Call-by-need in token-passing nets. Math. Struct. in Comp. Sci., 16(4):639–666, 2006.