1. Introduction
Equality graphs (“Egraphs”) are data structures originally developed in the automated theorem proving community to implement congruence closure procedures. At a high level, Egraphs store terms in a union-find-like data structure (Nelson, 1980; Nieuwenhuis and Oliveras, 2005), maintaining two additional invariants: (1) the equivalence relation stored by the union-find is also a congruence relation (that is, if a ≡ b, then f(a) ≡ f(b)), and (2) equivalent terms are stored without duplication, i.e., equivalent subterms are shared whenever possible.
Over the past decade, several projects have demonstrated how Egraphs can be used for program synthesis and optimization. These projects have roughly followed the strategy of equality saturation (Joshi et al., 2002; Tate et al., 2009; Stepp et al., 2011). First, an initial Egraph is constructed which represents the input program’s AST. Then a set of directed rewrite rules is used to expand the Egraph. Each rewrite includes a pattern to find and a pattern to instantiate and merge with the matched subterm. Repeatedly applying rewrites adds many equivalences to the Egraph, so that it comes to represent a number of programs exponential in its own size. Critically, rewrites in an Egraph are not destructive, so many rewrites can be applied simultaneously without risking taking a “wrong turn” with a poor rewrite choice. After the Egraph reaches a fixed point with respect to the rewrite rule database (or a timeout is reached), a final extraction procedure analyzes the Egraph and selects the optimal program with respect to a user-provided cost function.
While promising, equality saturation is hampered by the fact that Egraphs are ultimately still designed for use in theorem provers. For example, DPLL requires the ability to backtrack and undo unifications. Egraphs inside theorem provers accomplish this by maintaining several linked lists that enable quick splicing of facts in or out of the Egraph. The equality saturation use case does not require backtracking, and thus a simpler, faster data structure could be used. Theorem provers also subject the Egraph to a very different workload than equality saturation: theorem provers assert and query facts continually, whereas the rewrite-driven approach of equality saturation has distinct phases of searching for patterns and then asserting the right-hand sides. Finally, while theorem provers are general-purpose tools used in various domains, equality saturation engines are not. Equality saturation users have frequently reimplemented Egraphs from scratch due to the need for additional “interpreted” reasoning ability that the purely syntactic approach of rewrite rules does not allow for.
This paper presents egg (Egraphs good), an open-source library (https://github.com/mwillsey/egg) for easy, efficient, and extensible Egraphs. egg specifically targets equality saturation, taking advantage of its unique workload to provide Egraphs optimized for program synthesis and optimization. egg uses an efficient, cache-friendly representation since it does not need to support backtracking. egg also uses a novel technique called rebuilding that loosens some of the key invariants of the Egraph data structure, enforcing them only at key parts of the equality saturation algorithm. This amortizes the cost of maintaining those invariants, leading to significant performance improvements.
In addition to performance, egg prioritizes flexibility so that users across domains need not reimplement Egraphs from scratch. One key feature giving that flexibility is metadata, a new technique that annotates eclasses with additional data, allowing the user to implement analyses not possible with purely syntactic rewrite rules. Our case studies demonstrate how metadata facilitates analyses like constant folding, which previous equality saturation implementations baked into their Egraph implementation.
In summary, the contributions of this paper include:

Rebuilding, a technique that enforces the Egraph data structure invariants at selected points in the equality saturation algorithm. Our evaluation demonstrates that rebuilding provides a superlinear speedup over existing techniques.

Metadata, a technique for maintaining additional information in eclasses that makes it possible to integrate complex, semantic analyses like constant folding that cannot be expressed as syntactic rewrites.

A fast, extensible implementation of an Egraph library, egg.

egg has already been used in several successful projects across domains such as floating-point accuracy, linear algebra optimization, and CAD program synthesis. We present these projects as case studies demonstrating applications of egg in deductive synthesis and program optimization.
2. Overview
Many problems in program optimization, theorem proving, and other domains have a similar shape: given some input expression(s), search for a “better” equivalent expression. This paper puts forward the case that, with our proposed advances, equality saturation is now the right tool for the job in many cases like these.
We will work through an extended example around optimizing the expression (a × 2) / 2 and discover the benefits of Egraphs and equality saturation, current limitations of using and implementing this approach, and how egg addresses those limitations.
2.1. Term Rewriting
Term rewriting (Dershowitz, 1993) is a time-tested approach for equational reasoning in program optimization (Tate et al., 2009), theorem proving (Detlefs et al., 2005; De Moura and Bjørner, 2008), and program transformation (Andries et al., 1999). In this setting, a tool repeatedly chooses one of a set of rewrites (a.k.a. axioms), searches for matches of the left-hand pattern in the given expression, and replaces matching instances with the substituted right-hand side.
Term rewriting is typically destructive. Consider applying a simple strength reduction rewrite, x × 2 → x ≪ 1, to our example: (a × 2) / 2 → (a ≪ 1) / 2. The result is a new term that carries no information about the initial term. Applying the strength reduction at this point is plainly the wrong choice, as we have now lost the chance to cancel out the 2’s. This classically tricky question of when to apply which rewrite is referred to as the phase ordering problem.
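The loss of information can be made concrete with a small sketch. It assumes a hypothetical `Expr` type and two hand-written rewrite functions (`strength_reduce` and `cancel`, neither of which is part of egg), and uses (a × 2) / 2 as an illustrative input. Once the multiplication has been destructively replaced by a shift, the cancellation rewrite no longer matches.

```rust
// A tiny expression language (hypothetical; not egg's representation).
#[derive(Debug, Clone, PartialEq)]
enum Expr {
    Var(String),
    Num(i64),
    Mul(Box<Expr>, Box<Expr>),
    Div(Box<Expr>, Box<Expr>),
    Shl(Box<Expr>, Box<Expr>),
}

// Destructive strength reduction: every `x * 2` becomes `x << 1`.
// The original multiplication is not kept anywhere.
fn strength_reduce(e: Expr) -> Expr {
    match e {
        Expr::Mul(a, b) => {
            let a = strength_reduce(*a);
            let b = strength_reduce(*b);
            if b == Expr::Num(2) {
                Expr::Shl(Box::new(a), Box::new(Expr::Num(1)))
            } else {
                Expr::Mul(Box::new(a), Box::new(b))
            }
        }
        Expr::Div(a, b) => Expr::Div(
            Box::new(strength_reduce(*a)),
            Box::new(strength_reduce(*b)),
        ),
        Expr::Shl(a, b) => Expr::Shl(
            Box::new(strength_reduce(*a)),
            Box::new(strength_reduce(*b)),
        ),
        leaf => leaf,
    }
}

// Cancellation at the root: `(x * c) / c` becomes `x`.
// Any other shape is returned unchanged.
fn cancel(e: Expr) -> Expr {
    if let Expr::Div(num, den) = &e {
        if let Expr::Mul(x, c) = num.as_ref() {
            if c == den {
                return x.as_ref().clone();
            }
        }
    }
    e
}
```

Calling `cancel` on the original term yields `a`, whereas calling it on the output of `strength_reduce` returns its input unchanged: the rewrite order decided the outcome.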
2.2. Egraphs and Equality Saturation
One solution to the phase ordering problem would simply apply all rewrites simultaneously, keeping track of every expression seen. This eliminates the problem of choosing the right rule, but a naive implementation would require space exponential in the number of rewrites. Equality saturation (Tate et al., 2009; Stepp et al., 2011) is a technique to do this rewriting efficiently using an Egraph (Nelson, 1980), a data structure originally developed for maintaining congruence closure in theorem provers (Detlefs et al., 2005; De Moura and Bjørner, 2008).
An Egraph consists of eclasses containing equivalent enodes. Figure 1(a) shows an Egraph that represents our example expression (a × 2) / 2. Edges connect an enode to its children eclasses.
An eclass c represents a term t if c contains an enode n that matches the root of t and the children of n represent the children of t. An Egraph represents a term if one of its eclasses does. As eclasses grow, the Egraph represents exponentially many terms, essentially one for every choice of representative enode for each eclass. The Egraph in Figure 1(a) is essentially an AST with sharing, since each eclass is a singleton.
Egraphs bear many similarities to the union-find data structure (Galler and Fisher, 1964), and much of the terminology is inherited. Egraphs are manipulated by two main operations: adding new enodes (into new eclasses) and merging eclasses (sometimes called asserting equivalences). These operations maintain two important invariants, which we will call the Egraph invariants:

Deduplication: The Egraph must not contain two enodes with the same operator and equivalent children.

Congruence: The equivalence relation on terms must also be a congruence relation, i.e., if a ≡ b then f(a) ≡ f(b).
Deduplication is typically maintained by hashconsing, or memoizing, the add operation. If the user tries to add an enode that is already represented in the Egraph, the Egraph should simply return the eclass of the enode instead of inserting it. Congruence is traditionally maintained by keeping a list of parent pointers for each eclass that stores which enodes have that eclass as children. On the merge operation, the parents of the merged classes must be checked to see if any pairs of them became equivalent, proceeding recursively until no additional equivalences are found.
The add and merge operations can be composed to perform rewriting over the Egraph in a way that does not “forget” the initial term. To apply a rewrite ℓ → r to an Egraph, the user first searches for eclasses c where there are substitutions σ such that c represents ℓ[σ]. Then, for each substitution σ found at eclass c, the user adds r[σ] to the Egraph, returning a new eclass c′ with a single enode, and then finally merges eclasses c and c′.
Figure 1(b) shows an Egraph after performing two simple rewrites. Note how the process is only additive; our initial term is still represented in the Egraph. This not only solves the rule choice problem, but also handles rules like commutativity that can be troublesome to a conventional rewrite system. If the user tries to apply a rewrite whose instantiated right-hand side is already represented in the Egraph, adding the right-hand side is essentially a no-op, which the Egraph can detect in order to stop applying that rule. When no rule adds anything new, the Egraph is saturated, meaning that it has learned every possible equivalence derivable from the given rewrites.
Equality saturation is the process of creating an initial Egraph from a given term, running a set of rewrite rules until the Egraph is saturated (or a timeout is reached), and finally extracting the optimal represented term according to some cost function. For simple cost functions, a bottom-up, greedy traversal of the Egraph suffices to find the best term. Other extraction procedures have been explored for more complex cost functions (Wang et al., 2020; Wu et al., 2019).
By eliminating the tedious and often error-prone tasks of choosing when and which rewrites to apply, and of proving the optimality of the result, equality saturation promises an appealingly simple workflow: state the relevant rewrites for the language, create an initial Egraph from a given expression, fire the rules until saturation, and finally extract the optimal equivalent expression. Unfortunately, the technique remains ad hoc; prospective equality saturation users must implement their own Egraphs customized to their language, avoiding performance pitfalls and hacking in the capabilities to do non-syntactic analyses. egg aims to address each of these difficulties.
2.3. egg: Easy
egg provides Egraphs that are generic over the language of interest. Using egg as a Rust library, users define the language by giving its operators, as shown in Figure 2. Note that these are the operators only; a term is an operator paired with zero or more children. The user may also annotate their language (not shown) such that egg can automatically derive a parser and pretty-printer, allowing easy creation of terms.
The user is free to manipulate egg’s Egraphs using the API for adding enodes and merging eclasses. Typically, however, the user will want to search for and apply rewrites. Most rewrites are purely syntactic, and egg lets one concisely define such rewrites with the rewrite! macro. egg additionally supports more complex rewrites with conditions or arbitrary code (Section 4.2).
egg has built-in support for equality saturation, so the typical user need not worry about manually firing rewrite rules. Conceptually, the “outer loop” of equality saturation is simple and should not vary across domains: search for each rewrite, apply the matches, and loop until saturation. However, our experience implementing several of these loops has shown that it requires a significant amount of bug-prone boilerplate code to handle practicalities like timeouts and Egraph size limits, saturation detection, statistics reporting, and rule scheduling. egg’s Runner API provides these features in a configurable way, obviating the need to reimplement the outer loop in most cases.
Finally, egg also provides functionality for extracting the best represented term from an Egraph according to some cost function. The Extractor feature is generic over the cost function, but egg provides simple ones like AstSize that work in many cases. egg’s extraction performs a greedy search, which will yield the optimal term if the cost function is monotonic. More complicated extraction techniques (like those in (Wu et al., 2019; Wang et al., 2020)) can be done manually in egg as well.
2.4. egg: Efficient
egg is, to our knowledge, the first general-purpose, reusable Egraph implementation. This has allowed focused effort on optimization, knowing that any benefits will be seen across use cases as opposed to a single, ad hoc instance. egg combines systems programming best practices with novel techniques to make Egraphs, and the equality saturation use case in particular, more efficient.
egg is implemented in Rust (17), and all its components are generic over the user-provided language, cost functions, etc., giving the compiler freedom to monomorphize and inline user-written code. This is especially important since egg frequently jumps between library code (e.g., searching for rewrites) and user code (e.g., comparing operators). Furthermore, egg is designed from the ground up to use cache-friendly, flat buffers with minimal indirection for most internal data structures. This is in sharp contrast to the traditional Lispy representation of Egraphs (Nelson, 1980; Detlefs et al., 2005) that contains many tree- and linked-list-like data structures. egg additionally compiles patterns to be executed by a small virtual machine (de Moura and Bjørner, 2007), as opposed to recursively walking the tree-like representation of patterns.
egg’s Egraphs are designed specifically for the equality saturation workflow. Prior Egraph implementations (Panchekha et al., 2015) spent the vast majority of their runtime maintaining the Egraph invariants of deduplication and congruence. egg uses a novel technique called rebuilding to loosen that requirement, enforcing the invariants only when absolutely necessary (Section 3). This also allows egg to simplify internal data structures, leading to even further speedups.
2.5. egg: Extensible
Typically, most of the rewrites used in equality saturation are purely syntactic. Since egg is generic over the user-defined language, it supports this uninterpreted style of reasoning. In some cases, however, it is useful to support interpreted reasoning that depends on the values in the Egraph. For example, after applying associativity rewrites, the Egraph in Figure 1 will contain the term 2 / 2, which we would of course like to simplify to 1. One approach would be to rewrite terms containing constants to their evaluation, a form of constant folding over the Egraph. This cannot be done in general with syntactic rewrites, since it requires the ability to “see” the matching substitution and compute over it. (In this case, a purely syntactic cancellation rewrite would suffice: x / x → 1. Other cases require evaluation. A real implementation would support cancellation of symbolic terms and evaluation of concrete ones.)
Most previous implementations of equality saturation support constant folding, as it is essential to finding more equalities and extracting simpler terms. However, those implementations were already specialized to their application domains, and the constant folding support is typically implemented as a manual pass over the Egraph.
egg supports constant folding over the Egraph and much more with two general techniques that work across user-defined languages:

Conditional and dynamic rewrites (Section 4.2) where the right-hand side of the rewrite can be conditioned on or even computed from the match of the left-hand side.

Metadata (Section 4.3), a novel, powerful system that allows the user to attach arbitrary lattice-like data to each eclass.
We have used these techniques to implement not only constant folding, but several other features which are discussed in the case studies (Section 5). Importantly, egg supports all of these as a library: the user need not modify anything about the Egraph or egg’s internals to implement these advanced analyses.
3. Rebuilding: Amortized Invariant Maintenance
Among egg’s optimizations (Section 2.4), rebuilding is perhaps the most important. Rebuilding is a novel technique that lies at the heart of egg’s modified equality saturation algorithm. This crucial technique specializes the Egraph data structure to the equality saturation workload, yielding substantial performance improvements.
Figure 3 shows both the traditional and egg’s modified equality saturation loop. The key distinction is when the Egraph invariants of deduplication and congruence are maintained. In traditional Egraphs, as with many data structures, the invariants always hold. In egg, by contrast, mutating an Egraph may violate the invariants, causing equalities to be not “seen” when searching for patterns. egg lets the user (or the algorithm) choose when to restore the invariants by calling the rebuild method. Rebuilding leads to a lower amortized cost of maintaining the Egraph invariants.
3.1. Hashconsing and Upward Merging
Traditional Egraphs constantly maintain the data structure invariants while adding new enodes and merging existing eclasses (pseudocode in Figure 4). In the equality saturation algorithm (Figure 3), these modifications take place on lines 15-16.
3.1.1. Hashconsing
Adding an enode does not add any new equivalences, so congruence holds automatically. To maintain the deduplication invariant, Egraphs use a technique called hashconsing (“cons” because the technique originates from early Lisp implementations) to ensure that enodes share as much structure as possible. The hashcons data structure essentially maps enodes (which consist of an operator and eclass children) to the eclass in which the enode resides.
Consider adding enodes f(a) and f(b) where a is equivalent to b. The two enodes are plainly equivalent, but a naive hashcons would not return a “hit” on the second add. Canonicalization ensures deduplication across equivalent but not identical enodes. The hashcons maintains the invariant that the enode keys in the map are canonical, i.e., the children of those enodes are leaders in the Egraph’s union-find. This allows the add procedure to quickly detect whether the given enode is equivalent to one already in the Egraph.
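A minimal sketch of a canonicalizing hashcons follows. The names (`EGraph`, `ENode`, `enode`, `canonicalize`) and the layout are assumptions made for illustration; they are not taken from egg or from Figure 4. The `union` here updates only the union-find, so hashcons keys inserted before a union can become stale, which is exactly the problem that upward merging or rebuilding has to solve.

```rust
use std::collections::HashMap;

type Id = usize;

// An enode: an operator name plus the eclass ids of its children.
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
struct ENode {
    op: String,
    children: Vec<Id>,
}

// Convenience constructor (hypothetical helper).
fn enode(op: &str, children: &[Id]) -> ENode {
    ENode { op: op.to_string(), children: children.to_vec() }
}

#[derive(Default)]
struct EGraph {
    parent: Vec<Id>,              // union-find: parent[i] == i marks a leader
    hashcons: HashMap<ENode, Id>, // canonical enode -> eclass id
}

impl EGraph {
    // Follow parent pointers to the leader of an eclass.
    fn find(&self, mut id: Id) -> Id {
        while self.parent[id] != id {
            id = self.parent[id];
        }
        id
    }

    // Replace every child id by its leader.
    fn canonicalize(&self, node: &ENode) -> ENode {
        ENode {
            op: node.op.clone(),
            children: node.children.iter().map(|&c| self.find(c)).collect(),
        }
    }

    // Hashconsed add: return the existing eclass on a hit,
    // otherwise create a fresh singleton eclass.
    fn add(&mut self, node: ENode) -> Id {
        let node = self.canonicalize(&node);
        if let Some(&id) = self.hashcons.get(&node) {
            return self.find(id);
        }
        let id = self.parent.len();
        self.parent.push(id);
        self.hashcons.insert(node, id);
        id
    }

    // Union-find update only; no congruence or hashcons maintenance.
    fn union(&mut self, a: Id, b: Id) -> Id {
        let (a, b) = (self.find(a), self.find(b));
        self.parent[b] = a;
        a
    }
}
```

With `a` and `b` already unioned, adding `f(a)` and then `f(b)` returns the same eclass id, because both enodes canonicalize to the same key before the lookup.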
3.1.2. Upward Merging
Merging eclasses is more onerous than adding new enodes, as adding new equivalences risks violating both the deduplication and congruence invariants. In existing Egraph implementations used for equality saturation, maintaining these invariants while merging can take the vast majority of runtime.
First consider congruence. If f(a, b) and f(a, c) reside in two different eclasses x and y, merging b and c should also merge x and y. This can propagate up further; merging x and y could cause two other eclasses to become equivalent. To maintain congruence, traditional Egraphs follow Nelson’s original design and maintain a parent list for each eclass. The parent list contains pointers to each enode that has that eclass as a child. When merging two eclasses, one must inspect these parent lists to find parents that would now become equivalent, recursively merging them if necessary (Figure 4, lines 35-38).
The merge routine must also do some bookkeeping to preserve deduplication. The merged eclass’s node list and parent list must be deduplicated. The hashcons must also be carefully modified in a way that preserves its canonicalization invariant. This process, called hashcons surgery (Figure 4, line 31) by some Egraph implementations, complicates the hashcons implementation, as most hashmaps do not support the modification of keys.
3.2. Rebuilding in Detail
egg defers invariant maintenance to the rebuild procedure, which is invoked at the end of each equality saturation iteration (Figure 3(b), line 19). This allows for a much simpler merge procedure that need not worry about congruence or updating the hashcons. Figure 5 shows pseudocode for egg’s merge and rebuild.
The rebuild_once procedure walks the entire Egraph a single time, building up a new hashcons along the way. While building the new hashcons, any “unexpected” lookup hits are the result of congruence; the procedure simply records them and performs the merges later. This results in an Egraph (including hashcons) where a single “layer” of congruence has been taken into account. Note that this obviates the need for hashcons manipulation (it is simply totally rebuilt) and for maintaining parent lists. By eliminating these requirements, egg uses simpler, more efficient data structures to represent eclasses and the hashcons.
The rebuild procedure runs rebuild_once until it finds no new equivalences, i.e., congruence is restored. All that remains is to deduplicate the eclasses, which is done in a single, straightforward pass.
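The following is a minimal sketch of deferred merging and rebuilding under simplifying assumptions; it is not the pseudocode of Figure 5 and not egg’s implementation. The procedure names `merge`, `rebuild_once`, and `rebuild` follow the text; the types, the flat `nodes` list, and the omission of per-eclass node lists (and hence of the final deduplication pass) are assumptions of this sketch.

```rust
use std::collections::HashMap;

type Id = usize;

#[derive(Debug, Clone, PartialEq, Eq, Hash)]
struct ENode {
    op: String,
    children: Vec<Id>,
}

#[derive(Default)]
struct EGraph {
    parent: Vec<Id>,              // union-find over eclass ids
    nodes: Vec<(ENode, Id)>,      // every enode with the eclass it was added to
    hashcons: HashMap<ENode, Id>, // rebuilt from scratch by rebuild_once
}

impl EGraph {
    fn find(&self, mut id: Id) -> Id {
        while self.parent[id] != id {
            id = self.parent[id];
        }
        id
    }

    fn add(&mut self, op: &str, children: &[Id]) -> Id {
        let node = ENode {
            op: op.to_string(),
            children: children.iter().map(|&c| self.find(c)).collect(),
        };
        if let Some(&id) = self.hashcons.get(&node) {
            return self.find(id);
        }
        let id = self.parent.len();
        self.parent.push(id);
        self.hashcons.insert(node.clone(), id);
        self.nodes.push((node, id));
        id
    }

    // Deferred merge: only the union-find changes, so the hashcons and
    // congruence may be out of date until rebuild is called.
    fn merge(&mut self, a: Id, b: Id) {
        let (a, b) = (self.find(a), self.find(b));
        if a != b {
            self.parent[b] = a;
        }
    }

    // One pass over all enodes: build a fresh hashcons and record
    // "unexpected" hits, which are congruences. Returns the merge count.
    fn rebuild_once(&mut self) -> usize {
        let mut fresh: HashMap<ENode, Id> = HashMap::new();
        let mut to_merge: Vec<(Id, Id)> = Vec::new();
        for (node, class) in &self.nodes {
            let canon = ENode {
                op: node.op.clone(),
                children: node.children.iter().map(|&c| self.find(c)).collect(),
            };
            let class = self.find(*class);
            let existing = fresh.get(&canon).copied();
            match existing {
                Some(other) => {
                    if other != class {
                        to_merge.push((other, class));
                    }
                }
                None => {
                    fresh.insert(canon, class);
                }
            }
        }
        self.hashcons = fresh;
        let mut merged = 0;
        for (a, b) in to_merge {
            if self.find(a) != self.find(b) {
                self.merge(a, b);
                merged += 1;
            }
        }
        merged
    }

    // Repeat until a pass finds no new equivalences.
    fn rebuild(&mut self) {
        while self.rebuild_once() > 0 {}
    }
}
```

For example, with `a`, `b`, `f(a)`, `f(b)`, `g(f(a))`, and `g(f(b))` added, merging `a` and `b` leaves the two `f` enodes in different eclasses until `rebuild` runs. The first pass merges the `f` eclasses, the second merges the `g` eclasses, and the third finds nothing new and stops.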
3.3. Evaluating Rebuilding
To demonstrate that rebuilding provides faster congruence closure than upward merging, we implemented rebuilding in Regraph, a traditionally structured Egraph implementation extracted from Herbie (Panchekha et al., 2015), which uses it to simplify floating-point expressions. (Herbie now uses egg to achieve a much greater speedup; see Section 5.1.) This provides a one-to-one comparison of rebuilding and upward merging, isolated from the many factors that make egg efficient: overall design differences, programming language performance, and even the data structure simplifications and optimizations that rebuilding enables.
Regraph is implemented in Racket, using the traditional Egraph invariant maintenance techniques mentioned in Section 3.1 (parent pointers, upward merging, hashcons surgery, etc.). We added an alternate mode for rebuilding that closely follows the algorithm in Figure 5, but changing the data structures as little as possible. We additionally instrument Regraph to separately track time spent ensuring congruence closure, allowing an isolated comparison of rebuild time versus upward merging time.
While implementing rebuilding in Regraph, we discovered a handful of bugs that store duplicate information in the various sets and tables used in upward merging. These bugs had eluded detection and repair over Regraph’s five years of maintenance, suggesting that upward merging is a difficult algorithm to implement correctly; by contrast, our rebuilding implementation takes up 28 lines of code and was written by one undergraduate student after a onehour walkthrough of the Regraph code base by one of its authors.
Figure 6 shows the result of running Herbie’s equality saturation benchmark suite on Regraph using upward merging and rebuilding. All experiments were run on an Intel i7-4790K CPU with 32 GB of memory. The upward merging and rebuilding versions were run with identical configurations and produced the same results. Rebuilding makes congruence closure faster and equality saturation overall faster on average. This speedup is greater on benchmarks that took Regraph longer, suggesting that rebuilding offers superlinear speedup over upward merging.
4. Egraph Extensions
egg offers many convenient ways to interact with the Egraph that are difficult or impossible in other implementations. These tools give the flexibility highlighted by the diverse case studies in Section 5.
Like the rest of egg, these extensions are generic over the language and rewrites that the user is working with. These are all practically novel, as egg is the first (to our knowledge) reusable Egraph implementation. Metadata (Section 4.3) appears to be conceptually novel as well.
4.1. Runners and Extraction
Equality saturation workflows, regardless of the application domain, typically have a similar structure: add some expressions to an empty Egraph, run the rewrites until saturation or timeout, and extract the best equivalent expressions according to some cost function. This “outer loop” of equality saturation involves a significant amount of error-prone boilerplate:

Checking for saturation, timeouts, and Egraph size limits.

Orchestrating the read-phase, write-phase, rebuild system (Figure 5) that makes egg fast.

Recording performance data at each iteration.

Potentially coordinating rule execution so that expansive rules like associativity do not dominate the Egraph.

Finally, extracting the best expression(s) according to one or more user-defined cost functions.
4.1.1. Runners
As shown in Figure 2, the Runner API provides a configurable implementation of the first four features. Runners automatically detect saturation, and can be configured to stop after a time, Egraph size, or iteration limit. The equality saturation loop provided by egg calls rebuild, so users need not even know about egg’s deferred invariant maintenance. Runners report various data about each iteration automatically, and the user can hook into this to report anything they like. For example, users commonly record the “best so far” expression by extracting in each iteration.
Runners also provide configurable rule scheduling. In typical equality saturation, each rewrite is searched for and applied in each iteration. This can cause certain rewrites to dominate others, making the search space less productive. The most common examples are rewrites like associativity or distributivity. Applied in moderation, these rewrites can trigger other rewrites and find greatly improved expressions. However, they can explode the Egraph exponentially in size, causing search to slow and the size or time limit to be hit.
egg’s Runners can be configured with a user-defined rewrite scheduler, so users can choose as complex a solution to this as they wish. By default, egg uses the built-in backoff scheduler. This scheduler identifies rewrites that are matching in an exponentially growing number of locations and temporarily bans them. We have observed that this greatly reduces runtime (while producing the same results) in many settings. egg also provides a conventional every-rule-every-time scheduler.
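One way such a backoff policy could work is sketched below. Everything here is an assumption made for illustration: the type names (`Backoff`, `RuleStats`), the parameters `match_limit` and `ban_length`, and the doubling policy are hypothetical and should not be read as egg’s scheduler API. A rule whose match count exceeds its current threshold is banned for some iterations, and both the threshold and the ban length double with each ban.

```rust
// Per-rule bookkeeping for a backoff-style scheduler (hypothetical names).
struct RuleStats {
    times_banned: u32,
    banned_until: usize, // first iteration at which the rule may fire again
}

struct Backoff {
    match_limit: usize, // initial match threshold per rule
    ban_length: usize,  // initial ban duration in iterations
    stats: Vec<RuleStats>,
}

impl Backoff {
    fn new(num_rules: usize, match_limit: usize, ban_length: usize) -> Self {
        let stats = (0..num_rules)
            .map(|_| RuleStats { times_banned: 0, banned_until: 0 })
            .collect();
        Backoff { match_limit, ban_length, stats }
    }

    // Decide whether `rule`, which found `num_matches` matches in this
    // iteration, may apply them. Over-eager rules are banned, and each
    // ban doubles both the threshold and the ban length for next time.
    fn allow(&mut self, iteration: usize, rule: usize, num_matches: usize) -> bool {
        let s = &mut self.stats[rule];
        if iteration < s.banned_until {
            return false;
        }
        let threshold = self.match_limit << s.times_banned;
        if num_matches > threshold {
            let ban = self.ban_length << s.times_banned;
            s.times_banned += 1;
            s.banned_until = iteration + ban;
            return false;
        }
        true
    }
}
```

With `Backoff::new(1, 10, 2)`, a rule that reports 11 matches in iteration 1 is banned until iteration 3. After that its threshold is 20, so 15 matches are allowed, while 21 matches would trigger a ban of 4 iterations.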
4.1.2. Extraction
Extraction is the process of choosing the optimal expression represented by an eclass according to some cost function. egg users can define their own cost functions or use the provided AST size or depth functions. From there, egg’s Extractor can perform a greedy search for the optimal expressions. The greedy approach is guaranteed to yield the optimal expression for simple, monotonic cost functions. In practice (Panchekha et al., 2015; Wang et al., 2020), greedy extraction seems to suffice for some complex cost functions as well, even though a more sophisticated extraction procedure would be necessary to guarantee optimality.
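A greedy, bottom-up extraction with an AST-size cost can be sketched as a fixed-point computation over the eclasses. The representation below (a slice of eclasses, each a vector of enodes) and the names `best_costs`, `extract`, and `example` are assumptions of this sketch, not egg’s Extractor API. The example Egraph is a hand-built guess at what the running example (a × 2) / 2 might look like after a few rewrites, including a cycle through the root eclass.

```rust
// An enode with a static operator name and child eclass ids.
struct ENode {
    op: &'static str,
    children: Vec<usize>,
}

fn n(op: &'static str, children: &[usize]) -> ENode {
    ENode { op, children: children.to_vec() }
}

// For each eclass, compute (best cost, index of the chosen enode), where
// the cost of an enode is 1 plus the best costs of its child eclasses.
// Iterates until no eclass improves.
fn best_costs(classes: &[Vec<ENode>]) -> Vec<Option<(usize, usize)>> {
    let mut best: Vec<Option<(usize, usize)>> = vec![None; classes.len()];
    let mut changed = true;
    while changed {
        changed = false;
        for (id, class) in classes.iter().enumerate() {
            for (i, node) in class.iter().enumerate() {
                let mut cost = 1;
                let mut known = true;
                for &c in &node.children {
                    match best[c] {
                        Some((child_cost, _)) => cost += child_cost,
                        None => {
                            known = false;
                            break;
                        }
                    }
                }
                if known && best[id].map_or(true, |(old, _)| cost < old) {
                    best[id] = Some((cost, i));
                    changed = true;
                }
            }
        }
    }
    best
}

// Read the chosen term back out as an s-expression.
fn extract(classes: &[Vec<ENode>], best: &[Option<(usize, usize)>], root: usize) -> String {
    let (_, i) = best[root].expect("eclass represents no finite term");
    let node = &classes[root][i];
    if node.children.is_empty() {
        node.op.to_string()
    } else {
        let args: Vec<String> = node
            .children
            .iter()
            .map(|&c| extract(classes, best, c))
            .collect();
        format!("({} {})", node.op, args.join(" "))
    }
}

// Hand-built example: eclass 0 holds a, a * 1, and (a * 2) / 2.
fn example() -> Vec<Vec<ENode>> {
    vec![
        vec![n("a", &[]), n("*", &[0, 3]), n("/", &[2, 1])], // eclass 0
        vec![n("2", &[])],                                   // eclass 1
        vec![n("*", &[0, 1]), n("<<", &[0, 3])],             // eclass 2
        vec![n("1", &[]), n("/", &[1, 1])],                  // eclass 3
    ]
}
```

Extracting from eclass 0 of `example()` selects the leaf `a`, even though that eclass also contains the original division and a self-referential multiplication. Ties are broken in favor of the enode seen first, since an entry is replaced only by a strictly smaller cost.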
4.2. Conditional and Dynamic Rewrites
Given that egg takes care of almost everything else, much of a user’s time is typically spent defining rewrites. Rewrites consist of a left-hand side and a right-hand side, or in egg parlance, a Searcher and an Applier. Both Searcher and Applier are interfaces (traits in Rust) implemented by provided structures.
First and foremost, egg’s syntactic patterns can serve as both Searchers and Appliers. As shown in Figure 2, egg can parse these from strings to quickly and easily create purely syntactic rewrites. In our experience, most rewrites are purely syntactic.
egg also supports conditional rewrites. These are created by attaching one or more predicates onto an Applier. The predicates take the matched substitution from the Searcher and determine whether or not to run the underlying Applier.
Most powerfully, the user can implement Searcher or Applier however they like. The most common combination is a syntactic pattern as the Searcher combined with a custom applier that dynamically computes what to add to the Egraph based on the matched substitution. This allows users to build on top of equality saturation without having to reimplement anything or mess with the library’s internals. Our realworld case studies (in particular, the one presented in Section 5.3) make extensive use of custom appliers to easily add the “secret sauce” of their research contribution.
4.3. Metadata
Frequently in program analysis, one may wish to associate some data with expressions: perhaps the type of the expression, its constant value (if any), or some other lattice-like data. The same goes for equality saturation, but it is not immediately clear how to propagate that data over the Egraph.
As an example, consider equality saturation over the language of real arithmetic. Many rewrites would be conditional not on something syntactic, but on a semantic property of something on the left-hand side, such as “x is rational”. Conditional rewrites alone do not suffice; there must be a way to insert and propagate this additional information through the Egraph.
Metadata is a novel concept in egg that allows users to attach arbitrary lattice-like data to eclasses. Metadata can be used by conditional and dynamic rewrites, and it can modify the Egraph itself if the user wishes. To use metadata, the user provides the type of the desired metadata when creating the Egraph. The given type must implement the Metadata interface, which consists of methods to:

make: Given a new enode n to be added to the Egraph and the metadata of n’s children, return the metadata to be associated with n’s new eclass.

merge: Merge the metadata of two eclasses being merged.

modify: Optionally, modify the eclass that this metadata is associated with.
Merging eclasses, rebuilding, and congruence all behave in the expected way, calling make and merge on the metadata at the correct times.
This interface allows the straightforward translation of many program analysis techniques into the world of Egraphs. In particular, metadata is general enough to implement two common features that previous equality saturation tools have implemented as custom passes over the Egraph: constant propagation and pruning.
Constant propagation mirrors its namesake from compilers, but instead of rewriting an operator with constant children to a constant, we just add the computed result to the eclass. Implementing constant folding is straightforward with a metadata of type Option<Constant>: make evaluates the operation if there are constants for the children, merge combines the two options, keeping the constant if either eclass has one, and modify adds the constant (if there is one) to the eclass.
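The make and merge halves of such a constant-folding metadata can be sketched as follows. The trait below is a simplified, hypothetical stand-in for the Metadata interface described above, not egg’s actual trait: it takes the operator as a string, fixes the constant type to `i64`, and omits modify, which needs access to the Egraph. In this sketch, merge keeps whichever constant is present (`Option::or`), which is an assumption about the intended semantics.

```rust
// Simplified stand-in for the Metadata interface (hypothetical signature).
trait Metadata: Sized {
    // Metadata for a new enode, given the metadata of its children.
    fn make(op: &str, children: &[Self]) -> Self;
    // Metadata for the eclass that results from merging two eclasses.
    fn merge(&self, other: &Self) -> Self;
}

// Constant-folding metadata: the known integer value of an eclass, if any.
#[derive(Debug, Clone, Copy, PartialEq)]
struct ConstFold(Option<i64>);

impl Metadata for ConstFold {
    fn make(op: &str, children: &[Self]) -> Self {
        // Leaves: numerals carry a value, anything else (a variable) does not.
        if children.is_empty() {
            return ConstFold(op.parse().ok());
        }
        // Fold only when every child has a known constant.
        let vals: Option<Vec<i64>> = children.iter().map(|c| c.0).collect();
        let folded = match (op, vals.as_deref()) {
            ("+", Some([a, b])) => a.checked_add(*b),
            ("*", Some([a, b])) => a.checked_mul(*b),
            // Only fold exact divisions; checked_rem is None on division by zero.
            ("/", Some([a, b])) => match a.checked_rem(*b) {
                Some(0) => a.checked_div(*b),
                _ => None,
            },
            _ => None,
        };
        ConstFold(folded)
    }

    fn merge(&self, other: &Self) -> Self {
        // Equivalent eclasses share a value, so keep whichever one is known.
        ConstFold(self.0.or(other.0))
    }
}
```

Here `make("/", ...)` over two eclasses known to be `2` yields `Some(1)`, while any operation with a non-constant child yields `None`. A modify step would then add the constant as an enode to the eclass.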
Some equality saturation implementations use a technique called pruning, sacrificing completeness to keep the size of the Egraph low for performance. If an eclass contains a constant enode (with no children), pruning removes all other enodes from the eclass. This can prevent other rewrites from firing, but some implementations find the tradeoff worth it. Implementing pruning in egg is a simple extension to constant folding: the modify method now replaces the contents of the eclass instead of just adding to it.
Metadata further highlights egg’s choice of Rust as an implementation language. Egraphs in egg are generic over the metadata as well as the user-defined language, so the metadata methods can be inlined so that their cost is no higher than the actual computation they perform.
The case study presented in Section 5.2 demonstrates how metadata allows the user to easily add and propagate powerful facts over the Egraph.
5. Case Studies
This section relates three independently developed, published projects which incorporated egg as an easy-to-use, high-performance Egraph implementation. In all three cases, the developers had first rolled their own Egraph implementations. egg allowed them to delete code, gain performance, and in some cases dramatically broaden the project’s scope thanks to egg’s speed and flexibility.
5.1. Herbie
Herbie automatically improves accuracy for floating-point expressions, using random sampling to measure error, a set of rewrite rules for generating program variants, and algorithms that prune and combine program variants to achieve minimal error. Herbie received PLDI 2015’s Distinguished Paper award (Panchekha et al., 2015), has been continuously developed since then, and has hundreds of GitHub stars, hundreds of downloads, and thousands of users on its limited, online version. Herbie uses Egraphs for algebraic simplification of mathematical expressions, which is especially important for avoiding floating-point errors introduced by cancellation, function inverses, and redundant computation.
Until our case study, Herbie used a custom Egraph implementation, Regraph, which is written in Racket (Herbie’s implementation language) and closely follows traditional Egraph implementations. Due to its centrality, Egraph-based simplification consumed roughly half of Herbie’s runtime. As a fix, Herbie sharply limits the Egraphs used for simplification, with a limit on the number of equivalence nodes and an algorithm for unsoundly pruning nodes unlikely to lead to simpler expressions. Furthermore, the Herbie authors knew of several features that they believed would improve Herbie’s output but could not be implemented because they required more calls to simplification and would thus introduce unacceptable slowdowns.
5.1.1. Implementation
We implemented an egg simplification backend for Herbie, with the backend selected at runtime to allow for easy comparison. Herbie is implemented in Racket while egg is in Rust; furthermore, egg is deeply extensible through its use of Rust interfaces, which are not accessible through standard foreign-function interfaces (FFI). We thus implemented a Rust driver to instantiate various interfaces and provide a C-level API for Herbie to access via FFI. In this driver, we defined the Herbie expression grammar (with named constants, numeric constants, variables, and operations) and implemented the quirks of Herbie’s use of E-graphs.
First, Herbie’s set of rewrite rules is not fixed; users can select which rewrites to use with command-line flags. We implemented a simple Racket-side serialization of rewrites to strings, and parse and instantiate those rules Rust-side.
Second, Herbie separates exact and inexact program constants: exact operations on exact constants (such as the addition of two rational numbers) are evaluated and added to the E-graph, while operations on inexact constants or that yield inexact outputs are not. We thus split numeric constants in the Rust-side grammar between exact rational numbers and inexact constants, which are described by an opaque identifier, and transformed Racket-side expressions into this form before serializing them and passing them to the Rust driver. To evaluate operations on exact constants, we added metadata to track the “exact value” of each e-class. (Herbie’s rewrite rules guarantee that different exact values can never be rewritten to be equal; we added a Rust-side check for this invariant.) Every time an operation e-node is added to the E-graph, we check whether all arguments to that operation have exact-value metadata, and if so do rational number arithmetic to evaluate it.
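The exact-value check can be illustrated with a small stand-alone sketch. Everything here is an assumption made for illustration rather than the driver’s actual code: the Rational, Num, and Op types, the choice of operators, and the use of i64 numerators and denominators (overflow is not handled).

```rust
/// A rational in lowest terms with a positive denominator (illustrative only).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct Rational {
    pub num: i64,
    pub den: i64,
}

fn gcd(a: i64, b: i64) -> i64 {
    if b == 0 { a.abs() } else { gcd(b, a % b) }
}

impl Rational {
    /// Normalizes sign and common factors; returns None for a zero denominator.
    pub fn new(num: i64, den: i64) -> Option<Rational> {
        if den == 0 {
            return None;
        }
        let g = gcd(num, den);
        let s = if den < 0 { -1 } else { 1 };
        Some(Rational { num: s * num / g, den: s * den / g })
    }
}

/// Numeric constants are split into exact rationals and opaque inexact ids.
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum Num {
    Exact(Rational),
    Inexact(u32),
}

/// Exact-value metadata for a constant leaf.
pub fn leaf_value(n: Num) -> Option<Rational> {
    match n {
        Num::Exact(r) => Some(r),
        Num::Inexact(_) => None,
    }
}

#[derive(Clone, Copy, Debug)]
pub enum Op {
    Add,
    Mul,
    Div,
    Sqrt,
}

/// Exact-value metadata for an operator node: defined only when every
/// argument is exact and the operation itself stays within the rationals.
pub fn eval_exact(op: Op, args: &[Option<Rational>]) -> Option<Rational> {
    let vals: Option<Vec<Rational>> = args.iter().copied().collect();
    let vals = vals?;
    match (op, vals.as_slice()) {
        (Op::Add, [a, b]) => Rational::new(a.num * b.den + b.num * a.den, a.den * b.den),
        (Op::Mul, [a, b]) => Rational::new(a.num * b.num, a.den * b.den),
        // Division by zero falls out as None via Rational::new.
        (Op::Div, [a, b]) => Rational::new(a.num * b.den, a.den * b.num),
        // sqrt can produce an irrational result, so this sketch never folds it.
        _ => None,
    }
}
```

With these definitions, adding 1/2 and 2/6 evaluates to 5/6, while any operation with an inexact argument produces no exact value and is left unevaluated.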
Third, we implemented Herbie’s metric for simplest expression as metadata that tracks the simplest representative of every e-class.
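One way to picture such metadata is as a (size, expression) pair per e-class, where merging keeps the smaller of the two. The sketch below assumes node count as the simplicity metric, which may differ from Herbie’s actual metric, and the Best, make, and merge names are hypothetical. It also leaves out the re-propagation to parent e-classes that is needed after a merge improves a class’s representative.

```rust
/// Smallest known representative of an e-class (illustrative metric: node count).
#[derive(Clone, Debug, PartialEq)]
pub struct Best {
    pub size: usize,
    pub expr: String,
}

/// make: a node's representative is built from its children's representatives.
pub fn make(op: &str, children: &[&Best]) -> Best {
    let size = 1 + children.iter().map(|c| c.size).sum::<usize>();
    let expr = if children.is_empty() {
        op.to_string()
    } else {
        let args: Vec<&str> = children.iter().map(|c| c.expr.as_str()).collect();
        format!("({} {})", op, args.join(" "))
    };
    Best { size, expr }
}

/// merge: when two e-classes are unified, keep the smaller representative.
pub fn merge(a: Best, b: Best) -> Best {
    if b.size < a.size { b } else { a }
}
```

For example, once a rewrite merges the class of (+ x 0) with the class of x, the merged class reports x as its simplest member.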
Finally, Herbie contains extensive logging to track the size of the E-graph and the simplest expression found after every iteration. We thus developed Rust-side functions to compute those metrics and to execute one egg iteration at a time; this allowed the Racket-side code to execute a single operation, then log the metrics it was interested in. The end result is a 473-line Rust driver with 273 lines of Racket interface code (including tests, comments, and whitespace).
5.1.2. Results
Our egg simplification backend is a drop-in replacement for the existing Herbie simplifier, making it easy to compare speed and results. To ensure representative inputs, we compare the two on Herbie’s standard test suite of roughly 500 benchmarks using Herbie’s default parameters and settings. Overall, egg simplification makes Herbie faster, with simplification specifically roughly faster, reducing simplification from to of Herbie’s run time. Furthermore, the overall speedup cannot be attributed solely to faster simplification; in fact, other components of Herbie also became 24.2% faster on average, reflecting the fact that egg simplification produced simpler output, reducing work for other phases.
egg simplification produces equally accurate output: across the whole benchmark suite, the difference in output accuracy was below 1%. Herbie’s benchmark suite is broken into nine sections, each with a different mix of operators and difficulty; across those nine sections, the speedup from egg simplification ranges from 32% to faster, suggesting that these results are not due to outliers. However, we did find that one benchmark suite ( libraries) did not see speedups to non-simplification phases in Herbie; our preliminary investigations suggest that this is due to the egg backend missing some exact computation rules (such as and ). The Herbie developers plan to add such rules and ship the egg backend in the next release of Herbie.
5.2. Spores
Spores (Wang et al., 2020) is an optimizer for machine learning programs. It translates linear algebra (LA) expressions to relational algebra (RA), performs rewrites, and finally translates the result back to linear algebra. Each rewrite is built up from simple identities in relational algebra like the associativity of join. These relational identities express more fine-grained equality than textbook linear algebra identities, allowing Spores to discover novel optimizations not found by traditional optimizers based on LA identities. Spores performs holistic optimization, taking into account the complex interactions among factors like sparsity, common subexpressions, and fusible operators and their impact on execution time.
5.2.1. Implementation
Spores is implemented entirely in Rust using egg. egg empowers Spores to orchestrate the complex interactions described above elegantly and effortlessly. Spores works in three steps: first, it translates the input LA expression to RA; second, it optimizes the RA expression by equality saturation; finally, it translates the optimal RA expression back to LA. Since the translation between LA and RA is straightforward, we focus the discussion on the equality saturation step in RA. Spores represents a relation as a function from tuples to real numbers: . This is similar to the index notation in linear algebra, where a matrix A can be viewed as a function . We identify a tuple with a named record, e.g. , so that order in a tuple doesn’t matter. There are just three operations on relations: join, union and aggregate. Join () takes two relations and returns their natural join, multiplying the associated real number for joined tuples:
Here is the set of field names for the records in . In RA terminology, is the schema of . Union () is a join in disguise: it also performs natural join on its two arguments, but adds the associated real instead of multiplying it:
Finally, aggregate () sums its argument along a given dimension. It coincides precisely with the “sigma notation” in mathematics:
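The three operations can be made concrete with a small executable model. This is a sketch under stated assumptions, not Spores code: relations are stored sparsely as maps from named records to f64 (absent tuples count as 0), the Tuple, Relation, tup, join, union_add, and aggregate names are all hypothetical, and union_add is restricted to two relations over the same schema.

```rust
use std::collections::BTreeMap;

/// A tuple is a named record (field name -> index), so field order is irrelevant.
pub type Tuple = BTreeMap<&'static str, usize>;
/// A relation maps tuples to reals; tuples that are absent map to 0.
pub type Relation = BTreeMap<Tuple, f64>;

/// Convenience constructor for a tuple.
pub fn tup(fields: &[(&'static str, usize)]) -> Tuple {
    fields.iter().copied().collect()
}

/// Natural join: pair up tuples that agree on shared fields, multiplying values.
pub fn join(a: &Relation, b: &Relation) -> Relation {
    let mut out = Relation::new();
    for (ta, va) in a {
        for (tb, vb) in b {
            let agree = ta.iter().all(|(k, v)| tb.get(k).map_or(true, |w| w == v));
            if agree {
                let mut t = ta.clone();
                t.extend(tb.iter().map(|(k, v)| (*k, *v)));
                *out.entry(t).or_insert(0.0) += va * vb;
            }
        }
    }
    out
}

/// Union with addition; this sketch assumes both arguments share one schema.
pub fn union_add(a: &Relation, b: &Relation) -> Relation {
    let mut out = a.clone();
    for (t, v) in b {
        *out.entry(t.clone()).or_insert(0.0) += *v;
    }
    out
}

/// Aggregate: sum the relation along one field, removing it from the schema.
pub fn aggregate(field: &str, a: &Relation) -> Relation {
    let mut out = Relation::new();
    for (t, v) in a {
        let mut t = t.clone();
        t.remove(field);
        *out.entry(t).or_insert(0.0) += *v;
    }
    out
}
```

In this model, a matrix-vector product is the aggregate over j of the join of a matrix with fields i and j and a vector with field j, which is how a linear algebra expression decomposes into the relational operations.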
The RA identities, presented in Figure 7, are also simple and intuitive. The notation means is not in the schema of , and is the size of dimension (e.g. length of rows in a matrix). In Rule 3, when , we first rename every to a fresh variable in , which gives us: . In addition to these equalities, Spores also supports replacing expressions with fused operators. For example, can be replaced by which streams values from and computes the result without creating intermediate matrices. Each of these fused operators is encoded with a simple identity in egg.
Note that Rule 3 requires a way to store the schema of every expression during optimization. In equality saturation, this means each e-class must be annotated with the schema. Spores directly stores the schema in egg’s metadata, making it available to all rules during saturation. Spores also leverages the metadata abstraction for cost estimation. Spores has a conservative cost model that over-approximates. As a result, equivalent expressions may have different cost estimates. However, when two e-classes merge, egg picks the lower cost, thereby automatically improving the cost estimate. Spores also imports the metadata from Herbie for tracking constants with virtually no change and therefore gets constant folding for free. As a whole, Spores’s metadata is a composition of 3 smaller “modules”, demonstrating metadata as a composable and reusable abstraction for equality saturation akin to the abstraction of lattices in abstract interpretation.
egg’s extensible interface goes further: the decoupling of saturation and extraction allows us to experiment with different extraction algorithms. Initially Spores implemented an ILP-based extraction (Tate et al., 2009), but we found it to be a bottleneck in compilation time. We then experimented with two alternatives: (1) a greedy approach, and (2) another based on binary decision diagrams. In our experiments, the greedy algorithm always retained performance improvements while taking much less compilation time. Since egg does not have a hardcoded extraction algorithm, Spores is able to perform impactful optimizations without the penalty of long compilation time.
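To give a flavor of what a greedy extraction can look like, the following sketch computes the cheapest e-node per e-class by iterating to a fixed point and then reads off a term top-down. The text does not describe Spores’s greedy algorithm or cost model in detail, so this is only one plausible instance: the ENode and EGraph types and the best_costs and extract functions are hypothetical, and the cost is simply tree size (one per node, ignoring sharing).

```rust
/// An e-node: an operator applied to child e-class ids (illustrative types).
pub struct ENode {
    pub op: &'static str,
    pub children: Vec<usize>,
}

/// A toy E-graph: index = e-class id, value = the e-nodes in that class.
pub type EGraph = Vec<Vec<ENode>>;

pub fn node(op: &'static str, children: &[usize]) -> ENode {
    ENode { op, children: children.to_vec() }
}

/// For each e-class, find (cost, index of the cheapest e-node), iterating
/// until no class improves. A node has a cost only once all of its
/// children do, which also handles cycles in the E-graph.
pub fn best_costs(g: &EGraph) -> Vec<Option<(u64, usize)>> {
    let mut best: Vec<Option<(u64, usize)>> = vec![None; g.len()];
    let mut changed = true;
    while changed {
        changed = false;
        for (id, class) in g.iter().enumerate() {
            for (i, n) in class.iter().enumerate() {
                let kids: Option<u64> =
                    n.children.iter().map(|&c| best[c].map(|(k, _)| k)).sum();
                if let Some(k) = kids {
                    let cost = 1 + k;
                    if best[id].map_or(true, |(old, _)| cost < old) {
                        best[id] = Some((cost, i));
                        changed = true;
                    }
                }
            }
        }
    }
    best
}

/// Read off the chosen term for `root` as an s-expression.
pub fn extract(g: &EGraph, best: &[Option<(u64, usize)>], root: usize) -> Option<String> {
    let (_, i) = best[root]?;
    let n = &g[root][i];
    if n.children.is_empty() {
        return Some(n.op.to_string());
    }
    let args: Option<Vec<String>> =
        n.children.iter().map(|&c| extract(g, best, c)).collect();
    Some(format!("({} {})", n.op, args?.join(" ")))
}
```

Because each e-class makes its choice independently, this runs in time polynomial in the size of the E-graph, at the price of not accounting for subexpressions shared between different parts of the extracted term.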
5.2.2. Results
We integrated Spores into Apache SystemML (Boehm, 2019), showing that equality saturation can derive all of 84 handwritten rules and heuristics for sum-product optimization. Spores also discovered novel rewrites that contribute to to speedup in end-to-end experiments. With greedy extraction, all compilations completed within a second.
5.3. Szalinski
Several tools have emerged that reverse engineer high-level Computer Aided Design (CAD) models from polygon meshes and voxels (Nandi et al., 2018; Du et al., 2018; Tian et al., 2019; Sharma et al., 2017; Ellis et al., 2018). The output of these tools are Constructive Solid Geometry (CSG) programs. A CSG program is composed of 3D solids like cubes, spheres, and cylinders; affine transformations like scale, translate, and rotate, which take a 3D vector and a CSG expression as arguments; and binary operators like union, intersection, and difference that combine CSG expressions. For repetitive models like a gear, CSG programs can be too long and therefore difficult to comprehend. A recent tool, Szalinski (Nandi et al., 2020), exposes the inherent structure in the CSG outputs of mesh decompilation tools by automatically inferring maps and folds. Szalinski uses an E-graph-based rewriting system and its core algorithm is based on equality saturation (Tate et al., 2009). The three main features of Szalinski are:
Discovering structure using loop rerolling rules. This allows Szalinski to infer Folds, Map2s, Repeats and Tabulate from flat CSG inputs.

Identifying equivalence among CAD terms that are expressed as different expressions by mesh decompilers. Szalinski accomplishes this by using CAD identities. An example of one such CAD identity in Szalinski is . This implies that any CAD expression is equivalent to a CAD expression that applies a rotation by zero degrees about x, y, and z axes to .

Allowing external solvers to speculatively add potentially profitable expressions to the E-graph. Mesh decompilers often generate CSG expressions that order and/or group list elements in non-intuitive ways. To recover structure from such expressions, a tool like Szalinski must be able to reorder and regroup lists to expose any latent structure.
5.3.1. Implementation
Szalinski uses egg’s E-graph library. Even though CSG is different from “traditional” languages that are targets of compiler optimizations, the language-agnostic design of egg made it easy to implement Szalinski. Szalinski uses purely syntactic rewrites to express CAD identities and some loop rerolling rules (like inferring a Fold from a list of CAD expressions). Critically, however, Szalinski relies on egg’s dynamic rewrites to infer functions for lists.
Consider the flat CSG program in Figure 8. A structure finding rewrite first rewrites the flat list of Unions to:
(Fold Union (Map2 Translate [(2 0 0) (4 0 0) ...] (Repeat Cube 5))) 
Then, a dynamic rewrite uses an arithmetic solver to rewrite the concrete list of 3D vectors to (Tabulate (i 5) (* 2 (+ i 1))). A final set of syntactic rewrites can hoist the Tabulate, yielding the result on the right of Figure 8.
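The arithmetic solver behind such a dynamic rewrite can be as simple as a linear fit over one coordinate of the vectors. The following sketch is hypothetical and much simpler than a real solver: fit_linear and tabulate are illustrative names, only functions of the form a * i + b are considered, and the result is printed as (+ (* a i) b), which for the example list is equivalent to the (* 2 (+ i 1)) form shown above.

```rust
/// Try to express xs[i] as a * i + b for every index i; returns (a, b).
pub fn fit_linear(xs: &[i64]) -> Option<(i64, i64)> {
    if xs.len() < 2 {
        return None; // too few points to justify a closed form
    }
    let b = xs[0];
    let a = xs[1] - xs[0];
    xs.iter()
        .enumerate()
        .all(|(i, &x)| x == a * i as i64 + b)
        .then_some((a, b))
}

/// Render the closed form as a Tabulate s-expression over index variable i.
pub fn tabulate(xs: &[i64]) -> Option<String> {
    let (a, b) = fit_linear(xs)?;
    Some(format!("(Tabulate (i {}) (+ (* {} i) {}))", xs.len(), a, b))
}
```

When the fit fails, the rewrite simply does not fire and the concrete list stays in the E-graph unchanged.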
In many cases, the input CSG expression to Szalinski contains subexpressions appearing in arbitrary order. For these inputs, the arithmetic solvers must first reorder the expressions to find a closed form like a Tabulate as shown in Figure 8. However, reordering a list does not preserve equivalence, so adding it to the e-class of the concrete list would be unsound. Szalinski therefore uses inverse transformations, a novel technique that allows solvers to reorder and regroup list elements to find a closed form. In return, the solvers annotate the expression with the permutation or grouping that led to the successful discovery of the closed form. To do this, we extended the language used in Szalinski to support these inverse transformations. egg supported this novel technique without modification.
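The core idea, reordering a list while recording enough information to undo the reordering, can be sketched as follows. The sort_with_perm and unsort functions are hypothetical and operate on plain integers; Szalinski’s actual representation of these annotations in its term language is not shown here.

```rust
/// Sort `xs`, returning the sorted list together with the permutation
/// `perm` such that sorted[k] == xs[perm[k]].
pub fn sort_with_perm(xs: &[i64]) -> (Vec<i64>, Vec<usize>) {
    let mut perm: Vec<usize> = (0..xs.len()).collect();
    perm.sort_by_key(|&i| xs[i]);
    let sorted = perm.iter().map(|&i| xs[i]).collect();
    (sorted, perm)
}

/// Inverse transformation: rebuild the original order from the annotation.
pub fn unsort(sorted: &[i64], perm: &[usize]) -> Vec<i64> {
    let mut out = vec![0; sorted.len()];
    for (k, &i) in perm.iter().enumerate() {
        out[i] = sorted[k];
    }
    out
}
```

The sorted list alone is not equal to the original, but the sorted list paired with its permutation denotes exactly the original list, so that annotated term can soundly share an e-class with it while a solver looks for a closed form over the sorted elements.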
5.3.2. Results
An initial prototype of Szalinski used a custom E-graph written in OCaml. Anecdotally, switching to egg eliminated many bugs, facilitated the key contribution of inverse transformations, and made the tool about faster. egg’s performance allowed us to shift from running on small, hand-picked examples to a comprehensive evaluation on over 2000 real-world models from a popular online 3D model sharing forum (Nandi et al., 2020).
6. Related Work
Term Rewriting
Term rewriting (Dershowitz, 1993) has been used widely to facilitate equational reasoning for program optimizations (Tate et al., 2009), theorem proving (Detlefs et al., 2005; De Moura and Bjørner, 2008), and program transformations (Andries et al., 1999). A term rewriting system applies a database of semantics-preserving rewrites or axioms to an input expression to get a new expression, which may, according to some cost function, be more profitable than the input. Rewrites are typically symbolic and have a left-hand side and a right-hand side. To apply a rewrite to an expression, a rewrite system implements pattern matching: if the left-hand side of a rewrite rule matches the input expression, the system computes a substitution, which is then applied to the right-hand side of the rewrite rule. Upon applying a rewrite rule, a rewrite system typically replaces the old expression with the new expression. This can lead to the phase ordering problem: it eliminates the possibility of later applying a rewrite to the old expression, which could have led to a better result.
E-graphs and E-matching
The E-graph data structure was first introduced by Greg Nelson (Nelson, 1980). In their work, Nelson et al. used E-graphs as an efficient data structure for maintaining congruence closure in the context of combining satisfiability theories by sharing equality information. E-graphs continued to be a critical component in successful SMT solvers (De Moura and Bjørner, 2008). A key difference between past implementations of E-graphs and egg’s E-graph is the novel rebuilding algorithm, which makes egg more efficient for the purpose of equality saturation by allowing it to maintain invariants only at certain critical points (Section 3). egg implements the pattern compilation strategy introduced by de Moura et al. (de Moura and Bjørner, 2007) that is used in state-of-the-art theorem provers (De Moura and Bjørner, 2008). Several theorem provers (De Moura and Bjørner, 2008; Detlefs et al., 2005) propose optimizations like mod-time, pattern-element, and inverted-path-index to find new terms and relevant patterns for matching, and to avoid redundant matches. So far, we have found egg to be faster than several prior E-graph implementations even without implementing these optimizations. They are, however, interesting optimizations that we plan to explore in the future.
Superoptimization and Equality Saturation
The Denali (Joshi et al., 2002) superoptimizer first showed how to use E-graphs for optimized code generation as an alternative to hand-optimized machine code and prior exhaustive approaches (Massalin, 1987), both of which were less scalable. The inputs to Denali are programs in a C-like language, from which it produces assembly programs. Denali supported three types of rewrites: arithmetic, architectural, and program-specific. After applying these rewrites until saturation, it used an architectural description of the hardware to generate constraints that were solved using a SAT solver to output a near-optimal program. While Denali’s approach was a significant improvement over prior work, it was intended to be used on straight-line code only and therefore did not apply to large real programs.
Equality saturation (Tate et al., 2009; Stepp et al., 2011) developed a compiler optimization phase that works for complex language constructs like loops and conditionals. The first equality saturation paper used an intermediate representation called Program Expression Graphs (PEGs) to encode loops and conditionals. PEGs have specialized nodes that can represent infinite sequences, which allows them to represent loops. The approach uses a global profitability heuristic for extraction, which is implemented using a pseudo-boolean solver. PEGs could be implemented in egg.
7. Conclusion
We presented egg, a reusable, extensible, and efficient E-graph library. egg has been used in several projects for deductive program synthesis guided by rewrite rules and as an optimizing compiler. This paper describes the key insights that make egg efficient and extensible. Specifically, we introduced rebuilding, a new and fast algorithm for maintaining congruence closure, and metadata, a technique in egg that allows complex optimizations like constant folding that cannot be expressed using purely syntactic rewrites. We discussed several case studies using egg and presented a comparison between egg’s rebuilding algorithm and a more traditional upward-merging-based algorithm for maintaining E-graph invariants.
References
 [1] (199904) Graph transformation for specification and programming. Sci. Comput. Program. 34 (1), pp. 1–54. External Links: ISSN 01676423, Link, Document Cited by: §2.1, §6.
 [2] (2019) Apache systemml. Encyclopedia of Big Data Technologies, pp. 81–86. External Links: Document, Link, ISBN 9783319775258 Cited by: §5.2.2.
 [3] (2007) Efficient ematching for smt solvers. In Automated Deduction – CADE21, F. Pfenning (Ed.), Berlin, Heidelberg, pp. 183–198. External Links: ISBN 9783540735953 Cited by: §2.4, §6.
 [4] (2008) Z3: an efficient smt solver. In Proceedings of the Theory and Practice of Software, 14th International Conference on Tools and Algorithms for the Construction and Analysis of Systems, TACAS’08/ETAPS’08, Berlin, Heidelberg, pp. 337–340. External Links: ISBN 3540787992, 9783540787990, Link Cited by: §2.1, §2.2, §6, §6.

 [5] (1993) A taste of rewrite systems. In Functional Programming, Concurrency, Simulation and Automated Reasoning: International Lecture Series 1991–1992 McMaster University, Hamilton, Ontario, Canada, P. E. Lauer (Ed.), pp. 199–228. External Links: ISBN 9783540477761, Document, Link Cited by: §2.1, §6.
 [6] (200505) Simplify: a theorem prover for program checking. J. ACM 52 (3), pp. 365–473. External Links: ISSN 00045411, Link, Document Cited by: §2.1, §2.2, §2.4, §6, §6.
 [7] (201812) InverseCSG: automatic conversion of 3d models to csg trees. pp. 1–16. External Links: Document Cited by: §5.3.
 [8] (2018) Learning to infer graphics programs from handdrawn images. In Neural Information Processing Systems (NIPS), Cited by: §5.3.
 [9] (196405) An improved equivalence algorithm. Commun. ACM 7 (5), pp. 301–303. External Links: ISSN 00010782, Link, Document Cited by: §2.2.
 [10] (200205) Denali: a goaldirected superoptimizer. SIGPLAN Not. 37 (5), pp. 304–314. External Links: ISSN 03621340, Link, Document Cited by: §1, §6.
 [11] (1987) Superoptimizer: a look at the smallest program. In Proceedings of the Second International Conference on Architectual Support for Programming Languages and Operating Systems, ASPLOS II, Washington, DC, USA, pp. 122–126. External Links: ISBN 0818608056, Link, Document Cited by: §6.
 [12] (201807) Functional programming for compiling and decompiling computeraided design. Proc. ACM Program. Lang. 2 (ICFP), pp. 99:1–99:31. External Links: ISSN 24751421, Link, Document Cited by: §5.3.
 [13] (2020) Synthesizing structured cad models with equality saturation and inverse transformations. In PLDI ’20, PLDI ’20. Cited by: §5.3.2, §5.3.
 [14] (1980) Techniques for program verification. Ph.D. Thesis, Stanford University, Stanford, CA, USA. Note: AAI8011683 Cited by: §1, §2.2, §2.4, §6.
 [15] (2005) Proofproducing congruence closure. In Proceedings of the 16th International Conference on Term Rewriting and Applications, RTA’05, Berlin, Heidelberg, pp. 453–468. External Links: ISBN 3540255966, Link, Document Cited by: §1.
 [16] (201506) Automatically improving accuracy for floating point expressions. SIGPLAN Not. 50 (6), pp. 1–11. External Links: ISSN 03621340, Link, Document Cited by: §2.4, §3.3, §4.1.2, §5.1.
 [17] (Website) Note: https://www.rustlang.org/ External Links: Link Cited by: §2.4.
 [18] (2017) CSGNet: neural shape parser for constructive solid geometry. CoRR abs/1712.08290. External Links: Link, 1712.08290 Cited by: §5.3.
 [19] (2011) Equalitybased translation validator for llvm. In Computer Aided Verification, G. Gopalakrishnan and S. Qadeer (Eds.), Berlin, Heidelberg, pp. 737–742. External Links: ISBN 9783642221101 Cited by: §1, §2.2, §6.
 [20] (2009) Equality saturation: a new approach to optimization. In Proceedings of the 36th Annual ACM SIGPLANSIGACT Symposium on Principles of Programming Languages, POPL ’09, New York, NY, USA, pp. 264–276. External Links: ISBN 9781605583792, Link, Document Cited by: §1, §2.1, §2.2, §5.2.1, §5.3, §6, §6.
 [21] (2019) Learning to infer and execute 3d shape programs. In International Conference on Learning Representations, External Links: Link Cited by: §5.3.
 [22] (2020) SPORES: sumproduct optimization via relational equality saturation for large scale linear algebra. External Links: 2002.07951 Cited by: §2.2, §2.3, §4.1.2, §5.2.
 [23] (2019) Carpentry compiler. ACM Transactions on Graphics 38 (6), pp. Article No. 195. Note: presented at SIGGRAPH Asia 2019 Cited by: §2.2, §2.3.