1. Introduction
Equality graphs (egraphs) were originally developed to efficiently represent congruence relations in automated theorem provers (ATPs). At a high level, egraphs (Nelson, 1980; Nieuwenhuis and Oliveras, 2005) extend union-find (Tarjan, 1975) to compactly represent equivalence classes of expressions while maintaining a key invariant: the equivalence relation is closed under congruence. (Footnote 1: Intuitively, congruence simply means that $a \equiv b$ implies $f(a) \equiv f(b)$.)
Over the past decade, several projects have repurposed egraphs to implement state-of-the-art, rewrite-driven compiler optimizations and program synthesizers using a technique known as equality saturation (Joshi et al., 2002; Tate et al., 2009; Stepp et al., 2011; Nandi et al., 2020; Premtoon et al., 2020; Wang et al., 2020; Panchekha et al., 2015). Given an input program $p$, equality saturation constructs an egraph $E$ that represents a large set of programs equivalent to $p$, and then extracts the “best” program from $E$. The egraph is grown by repeatedly applying pattern-based rewrites. Critically, these rewrites only add information to the egraph, eliminating the need for careful ordering. Upon reaching a fixed point (saturation), $E$ will represent all equivalent ways to express $p$ with respect to the given rewrites. After saturation (or timeout), a final extraction procedure analyzes $E$ and selects the optimal program according to a user-provided cost function.
Ideally, a user could simply provide a language grammar and rewrites, and equality saturation would produce an effective optimizer. Two challenges block this ideal. First, maintaining congruence can become expensive as $E$ grows. In part, this is because egraphs from the conventional ATP setting remain unspecialized to the distinct equality saturation workload. Second, many applications critically depend on domain-specific analyses, but integrating them requires ad hoc extensions to the egraph. The lack of a general extension mechanism has forced researchers to reimplement equality saturation from scratch several times. These challenges limit equality saturation’s practicality.
Equality Saturation Workload. ATPs frequently query and modify egraphs and additionally require backtracking to undo modifications (e.g., in DPLL(T) (Davis and Putnam, 1960)). These requirements force conventional egraph designs to maintain the congruence invariant after every operation. In contrast, the equality saturation workload does not require backtracking and can be factored into distinct phases of (1) querying the egraph to simultaneously find all rewrite matches and (2) modifying the egraph to merge in equivalences for all matched terms.
We present a new amortized algorithm called rebuilding that defers egraph invariant maintenance to equality saturation phase boundaries without compromising soundness. Empirically, rebuilding provides asymptotic speedups over conventional approaches.
Domain-specific Analyses. Equality saturation is primarily driven by syntactic rewriting, but many applications require additional interpreted reasoning to bring domain knowledge into the egraph. Past implementations have resorted to ad hoc egraph manipulations to integrate what would otherwise be simple program analyses like constant folding.
To flexibly incorporate such reasoning, we introduce a new, general mechanism called eclass analyses. An eclass analysis annotates each eclass (an equivalence class of terms) with facts drawn from a semilattice domain. As the egraph grows, facts are introduced, propagated, and joined to satisfy the eclass analysis invariant, which relates analysis facts to the terms represented in the egraph. Rewrites cooperate with eclass analyses by depending on analysis facts and adding equivalences that in turn establish additional facts. Our case studies and examples (Sections 5 and 6) demonstrate eclass analyses like constant folding and free variable analysis which required bespoke customization in previous equality saturation implementations.
egg. We implement rebuilding and eclass analyses in an open-source library called egg (egraphs good), available at https://github.com/mwillsey/egg. egg specifically targets equality saturation, taking advantage of its workload characteristics and supporting easy extension mechanisms to provide egraphs specialized for program synthesis and optimization. egg also addresses more prosaic challenges, e.g., parameterizing over user-defined languages, rewrites, and cost functions while still providing an optimized implementation. Our case studies demonstrate how egg’s features constitute a general, reusable egraph library that can support equality saturation across diverse domains.
In summary, the contributions of this paper include:

Rebuilding (Section 3), a technique that restores key correctness and performance invariants only at select points in the equality saturation algorithm. Our evaluation demonstrates that rebuilding provides an asymptotic speedup over existing techniques in practice.

Eclass analysis (Section 4), a technique for integrating domainspecific analyses that cannot be expressed as purely syntactic rewrites. The eclass analysis invariant provides the guarantees that enable cooperation between rewrites and analyses.

A fast, extensible implementation of egraphs in a library dubbed egg (Section 5).

Case studies of real-world, published tools that use egg for deductive synthesis and program optimization across domains such as floating point accuracy, linear algebra optimization, and CAD program synthesis, while achieving speedups (Section 6).
2. Background
egg builds on egraphs and equality saturation. This section describes those techniques and presents the challenges that egg addresses.
2.1. Egraphs
An egraph is a data structure that stores a set of terms and a congruence relation over those terms. Originally developed for theorem provers and still used at their heart (Nelson, 1980; Detlefs et al., 2005; De Moura and Bjørner, 2008), egraphs have also been used to power a program optimization technique called equality saturation (Joshi et al., 2002; Tate et al., 2009; Stepp et al., 2011; Nandi et al., 2020; Premtoon et al., 2020; Wang et al., 2020; Panchekha et al., 2015).
2.1.1. Definition
An egraph is a set of equivalence classes (eclasses), and each eclass is a set of equivalent enodes. An enode is a function symbol paired with a list of children, each of which is a reference to an eclass. These references, typically implemented as pointers or integer ids, are stored in a union-find data structure (Tarjan, 1975). Two eclass references $a$ and $b$ may refer to the same eclass even though they are distinct at the implementation level, $a \neq b$. If $a$ and $b$ do refer to the same eclass, we say they are equivalent, or $a \equiv b$. We say two enodes $n_1$ and $n_2$ are equivalent, or $n_1 \equiv n_2$, if they are in the same eclass.
An egraph, eclass, or enode is said to represent a term $t$ if $t$ can be “found” within it. Two terms are equivalent if they are represented in the same eclass. More precisely:

An egraph represents a term if any of its eclasses does.

An eclass $c$ represents a term if any enode $n \in c$ does.

An enode $f(c_1, c_2, \ldots)$ represents a term $f(t_1, t_2, \ldots)$ if they have the same function symbol $f$ and each eclass $c_i$ represents the term $t_i$.
When each eclass is a singleton (containing only one enode), an egraph is essentially a syntax tree with sharing (sometimes called a term graph). Figure 1(a) shows an egraph that represents the expression $(a \times 2)/2$.
An egraph’s congruence invariant states that its equivalence relation over terms must also be a congruence relation. Two enodes $n_1$ and $n_2$ are congruent, or $n_1 \cong n_2$, if they have the same function symbol and their child eclasses are equivalent; in other words, $f(a_1, a_2, \ldots) \cong f(b_1, b_2, \ldots)$ iff $a_i \equiv b_i$ for all $i$. To maintain the congruence invariant, the egraph must ensure that congruent enodes are in the same eclass, i.e., $n_1 \cong n_2 \Rightarrow n_1 \equiv n_2$.
2.1.2. Interface and Rewriting
Egraphs bear many similarities to the classic union-find data structure that they employ internally, and they inherit much of the terminology. Egraphs provide two main low-level mutating operations:

add takes an enode $n$ and:

if there exists some $n'$ in an eclass $c$ such that $n \cong n'$, returns a reference to $c$;

otherwise, it creates a new singleton eclass containing $n$ and returns a reference to it.


merge (sometimes called assert or union) takes two eclass references $a$ and $b$, unions them in the underlying union-find, and combines the actual eclasses if they were not already equivalent.
Both of these operations must take additional steps to maintain the congruence invariant. Invariant maintenance is discussed in Section 3.
Egraphs also offer operations for querying the data structure:

find takes an eclass reference $a$ and canonicalizes it using the underlying union-find such that find($a$) = find($b$) iff $a \equiv b$.

ematch performs the ematching (Detlefs et al., 2005; de Moura and Bjørner, 2007) procedure for finding patterns in the egraph. ematch takes a pattern term $p$ with variable placeholders and returns a list of tuples $(\sigma, c)$ where $\sigma$ is a substitution of variables to eclasses such that $p[\sigma]$ is represented in eclass $c$.
These can be composed to perform rewriting over the egraph. To apply a rewrite $\ell \to r$ to an egraph, ematch($\ell$) finds tuples $(\sigma, c)$ where eclass $c$ represents $\ell[\sigma]$. Then, for each tuple, merge($c$, add($r[\sigma]$)) adds $r[\sigma]$ to the egraph and unifies it with the matching eclass $c$.
Figure 1 shows an egraph undergoing a series of rewrites. Note how the process is only additive; the initial term is still represented in the egraph. Rewriting in an egraph can also saturate, meaning the egraph has learned every possible equivalence derivable from the given rewrites. If the user tried to apply a rewrite such as commutativity, $x \times y \to y \times x$, to an egraph twice, the second time would add no additional enodes and perform no new merges; the egraph can detect this and stop applying that rule.
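The composition of ematch, add, and merge described above can be sketched in a few dozen lines of Python. This is a minimal illustration under stated assumptions, not egg's implementation: the `EGraph` class, the tuple encoding of patterns (strings such as `"?x"` are pattern variables), and the helper names are all hypothetical, and congruence maintenance is deliberately left out because this small example never needs it.

```python
class EGraph:
    """Minimal sketch: union-find plus hashcons, with no congruence maintenance."""
    def __init__(self):
        self.parent = []      # union-find over eclass ids
        self.hashcons = {}    # enode (symbol, child ids) -> eclass id

    def find(self, a):
        while self.parent[a] != a:
            a = self.parent[a]
        return a

    def add(self, symbol, *children):
        enode = (symbol, tuple(self.find(c) for c in children))
        if enode not in self.hashcons:
            self.hashcons[enode] = len(self.parent)
            self.parent.append(len(self.parent))
        return self.find(self.hashcons[enode])

    def merge(self, a, b):
        a, b = self.find(a), self.find(b)
        if a != b:
            self.parent[b] = a
        return a

    def classes(self):
        """Snapshot: canonical eclass id -> set of canonical enodes."""
        out = {}
        for (symbol, children), cid in self.hashcons.items():
            node = (symbol, tuple(self.find(c) for c in children))
            out.setdefault(self.find(cid), set()).add(node)
        return out

def ematch(eg, classes, pattern, cid, subst):
    """Yield substitutions under which `pattern` is represented in eclass `cid`."""
    if isinstance(pattern, str):                       # pattern variable
        if pattern not in subst:
            yield {**subst, pattern: cid}
        elif eg.find(subst[pattern]) == eg.find(cid):
            yield subst
        return
    symbol, *subpatterns = pattern
    for node_symbol, children in classes[eg.find(cid)]:
        if node_symbol == symbol and len(children) == len(subpatterns):
            yield from match_children(eg, classes, subpatterns, children, subst)

def match_children(eg, classes, subpatterns, children, subst):
    if not subpatterns:
        yield subst
        return
    for s in ematch(eg, classes, subpatterns[0], children[0], subst):
        yield from match_children(eg, classes, subpatterns[1:], children[1:], s)

def instantiate(eg, pattern, subst):
    """Add pattern[subst] to the egraph and return its eclass."""
    if isinstance(pattern, str):
        return subst[pattern]
    return eg.add(pattern[0], *(instantiate(eg, p, subst) for p in pattern[1:]))

def apply_rewrite(eg, lhs, rhs):
    classes = eg.classes()
    matches = [(s, cid) for cid in classes for s in ematch(eg, classes, lhs, cid, {})]
    for s, cid in matches:
        eg.merge(cid, instantiate(eg, rhs, s))         # merge(c, add(r[sigma]))
    return len(matches)

eg = EGraph()
a, two = eg.add("a"), eg.add("2")
mul = eg.add("*", a, two)
root = eg.add("/", mul, two)
lhs, rhs = ("*", "?x", ("2",)), ("<<", "?x", ("1",))   # x * 2 -> x << 1
first = apply_rewrite(eg, lhs, rhs)
size_after_first = len(eg.hashcons)
second = apply_rewrite(eg, lhs, rhs)
print(first, second, size_after_first, len(eg.hashcons))
```

The second application still finds the same match, but it adds no enodes and performs no new merges, which is the condition an egraph can use to detect that a rule has nothing left to contribute.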
2.2. Equality Saturation
Term rewriting (Dershowitz, 1993) is a time-tested approach for equational reasoning in program optimization (Tate et al., 2009; Joshi et al., 2002), theorem proving (Detlefs et al., 2005; De Moura and Bjørner, 2008), and program transformation (Andries et al., 1999). In this setting, a tool repeatedly chooses one of a set of axiomatic rewrites, searches for matches of the left-hand pattern in the given expression, and replaces matching instances with the substituted right-hand side.
Term rewriting is typically destructive and “forgets” the matched left-hand side. Consider applying a simple strength reduction rewrite: $(a \times 2)/2 \to (a \ll 1)/2$. The new term carries no information about the initial term. Applying strength reduction at this point prevents us from canceling out $2/2$. In the compilers community, this classically tricky question of when to apply which rewrite is called the phase ordering problem.
One solution to the phase ordering problem would simply apply all rewrites simultaneously, keeping track of every expression seen. This eliminates the problem of choosing the right rule, but a naive implementation would require space exponential in the number of given rewrites. Equality saturation (Tate et al., 2009; Stepp et al., 2011) is a technique to do this rewriting efficiently using an egraph.
Figure 2 shows the equality saturation workflow. First, an initial egraph is created from the input term. The core of the algorithm runs a set of rewrite rules until the egraph is saturated (or a timeout is reached). Finally, a procedure called extraction selects the optimal represented term according to some cost function. For simple cost functions, a bottom-up, greedy traversal of the egraph suffices to find the best term. Other extraction procedures have been explored for more complex cost functions (Wang et al., 2020; Wu et al., 2019).
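As a rough illustration of the greedy extraction mentioned above, the Python sketch below computes the cheapest represented term per eclass by iterating to a fixed point. It assumes a cost function that sums a positive per-symbol cost over the term; the dictionary encoding of the egraph, the function name `extract`, and the sample egraph are assumptions made for this example and do not reflect egg's API.

```python
def extract(classes, root, node_cost=lambda symbol: 1):
    """Return (cost, term) for the cheapest term represented by eclass `root`.

    `classes` maps an eclass id to a list of enodes (symbol, child eclass ids).
    Costs must be positive so that the fixed-point loop terminates.
    """
    best = {}                                   # eclass id -> (cost, term)
    changed = True
    while changed:
        changed = False
        for cid, enodes in classes.items():
            for symbol, children in enodes:
                if not all(c in best for c in children):
                    continue                    # some child has no known term yet
                cost = node_cost(symbol) + sum(best[c][0] for c in children)
                if cid not in best or cost < best[cid][0]:
                    best[cid] = (cost, (symbol, *(best[c][1] for c in children)))
                    changed = True
    return best[root]

# A hand-written egraph in which a, a * 1, and (a * 2) / 2 share one eclass.
classes = {
    "a":   [("a", ()), ("*", ("a", "one")), ("/", ("mul", "two"))],
    "two": [("2", ())],
    "one": [("1", ()), ("/", ("two", "two"))],
    "mul": [("*", ("a", "two")), ("<<", ("a", "one"))],
}
print(extract(classes, "a"))                    # (1, ('a',))

# A different cost function changes the choice inside the "mul" eclass.
cheap_shift = lambda symbol: 0.5 if symbol == "<<" else 1
print(extract(classes, "mul", cheap_shift))     # (2.5, ('<<', ('a',), ('1',)))
```

Because an eclass may contain enodes whose children refer back to the same eclass (as `a * 1` does here), a single bottom-up pass is not enough in general; repeating until no eclass improves handles such cycles.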
Equality saturation eliminates the tedious and often error-prone task of choosing when to apply which rewrites, promising an appealingly simple workflow: state the relevant rewrites for the language, create an initial egraph from a given expression, fire the rules until saturation, and finally extract the cheapest equivalent expression. Unfortunately, the technique remains ad hoc; prospective equality saturation users must implement their own egraphs customized to their language, avoid performance pitfalls, and hack in the ability to do interpreted reasoning that is not supported by purely syntactic rewrites. egg aims to address each aspect of these difficulties.
2.3. Equality Saturation and Theorem Proving
An equality saturation engine and a theorem prover each have capabilities that would be impractical to replicate in the other. Automated theorem provers like satisfiability modulo theory (SMT) solvers are general tools that, in addition to supporting satisfiability queries, incorporate sophisticated, domainspecific solvers to allow interpreted reasoning within the supported theories. On the other hand, equality saturation is specialized for optimization, and its extraction procedure directly produces an optimal term with respect to a given cost function.
While SMT solvers are indeed the more general tool, equality saturation is not superseded by SMT; the specialized approach can be much faster when the full generality of SMT is not needed. To demonstrate this, we replicated a portion of the recent TASO paper (Jia et al., 2019), which optimizes deep learning models. As part of the work, TASO must verify a set of synthesized equalities with respect to a trusted set of universally quantified axioms. TASO uses Z3 (De Moura and Bjørner, 2008) to perform the verification even though most of Z3’s features (disjunctions, backtracking, theories, etc.) were not required. An equality saturation engine can also be used for verifying these equalities by adding the left and right sides of each equality to an egraph, running the axioms as rewrites, and then checking if both sides end up in the same eclass. Z3 takes 24.65 seconds to perform the verification; egg performs the same task in 1.56 seconds (over 15× faster), or only 0.52 seconds (over 47× faster) when using egg’s batched evaluation (Section 5.3).
3. Rebuilding: A New Take on Egraph Invariant Maintenance
Traditionally (Nelson, 1980; Detlefs et al., 2005), egraphs maintain their data structure invariants after each operation. We separate this invariant restoration into a procedure called rebuilding. This separation allows the client to choose when to enforce the egraph invariants. Performing a rebuild immediately after every operation replicates the traditional approach to invariant maintenance. In contrast, rebuilding less frequently can amortize the cost of invariant maintenance, significantly improving performance.
In this section, we first give a detailed description of the egraph invariants and how they are traditionally maintained (Section 3.1). We then describe the rebuilding framework and how it captures a spectrum of invariant maintenance approaches, including the traditional one (Section 3.2). Using this flexibility, we then give a modified algorithm for equality saturation that enforces the egraph invariants at only select points (Section 3.3). We finally demonstrate that this new approach offers an asymptotic speedup over traditional equality saturation (Section 3.4).
3.1. Hashconsing and Upward Merging
Both mutating operations on the egraph, add and merge, can break the congruence invariant if not done carefully. Egraphs have traditionally used hashconsing and upward merging to maintain the congruence invariant.
3.1.1. Hashconsing
Adding two congruent enodes should always return equivalent eclasses: add($n_1$) $\equiv$ add($n_2$) if $n_1 \cong n_2$. To efficiently check for congruent enodes, egraphs use a technique called hashconsing. The hashcons data structure maps an enode to the eclass in which it resides.
The hashcons is typically implemented with a hash map or similar data structure. But for add to be correct, it must be able to perform lookups up to congruence, not just structural equality. The hashcons therefore canonicalizes enodes before querying (Figure 3 line 3):

An enode is canonical when its child eclass references are canonical. Note that childless enodes are always canonical.

An eclass reference $a$ is canonical when find($a$) = $a$ in the underlying union-find.
In addition to canonicalizing the queried enode, the hashcons must maintain the hashcons invariant: enode keys in the map must be canonical. This allows the add procedure to quickly detect whether the given enode is congruent to one already in the egraph. Upward merging maintains this invariant, updating the hashcons when an enode’s canonical identity changes.
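The following Python sketch illustrates why both halves matter: canonicalizing the queried enode, and keeping the hashcons keys canonical. It is a toy setup under stated assumptions (three hard-coded eclass ids, a plain list as the union-find, and hypothetical helper names `canonicalize` and `lookup`), not the code of Figure 3.

```python
# Union-find over eclass ids: a = 0, b = 1, and the eclass holding f(a) = 2.
parent = [0, 1, 2]

def find(x):
    while parent[x] != x:
        x = parent[x]
    return x

def canonicalize(enode):
    """Replace every child eclass reference by its canonical representative."""
    symbol, children = enode
    return (symbol, tuple(find(c) for c in children))

def lookup(hashcons, enode):
    """Look an enode up modulo congruence; returns an eclass id or None."""
    cid = hashcons.get(canonicalize(enode))
    return None if cid is None else find(cid)

hashcons = {("f", (0,)): 2}                 # f(a) resides in eclass 2

before = lookup(hashcons, ("f", (1,)))      # f(b) is unknown while a and b differ

parent[0] = 1                               # merge a into b: id 1 is now canonical
stale = lookup(hashcons, ("f", (0,)))       # the stored key ("f", (0,)) is stale

# Restore the hashcons invariant by re-keying every entry canonically.
hashcons = {canonicalize(n): find(c) for n, c in hashcons.items()}
after_a = lookup(hashcons, ("f", (0,)))
after_b = lookup(hashcons, ("f", (1,)))
print(before, stale, after_a, after_b)      # None None 2 2
```

Between the merge and the re-keying step, even the original enode `f(a)` cannot be found, because the query is canonicalized while the stored key is not. Re-keying the whole map is the simplest possible repair; it is used here only to keep the example short.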
3.1.2. Upward Merging
Merging eclasses risks wider-reaching invariant violations. If $f(a, b)$ and $f(a, c)$ reside in two different eclasses $x$ and $y$, merging $b$ and $c$ should also merge $x$ and $y$ to maintain congruence. This can propagate further, requiring additional merges.
Egraphs maintain a parent list for each eclass to maintain congruence. The parent list for eclass $c$ holds all enodes that reference $c$ as a child. When merging two eclasses, egraphs inspect these parent lists to find parents that are now congruent, recursively merging them if necessary.
The merge routine also performs bookkeeping to preserve the hashcons invariant. In particular, merging two eclasses may change how parent enodes of those eclasses are canonicalized. The merge operation must therefore remove, recanonicalize, and replace those enodes in the hashcons. In existing egraph implementations (Panchekha et al., 2015) used for equality saturation, maintaining the invariants while merging can take the vast majority of run time.
3.2. Rebuilding in Detail
Traditionally, invariant restoration is part of the merge operation itself. Rebuilding separates these concerns, reducing merge’s obligations and allowing for amortized invariant maintenance. In the rebuilding paradigm, merge maintains a worklist of eclasses that need to be “upward merged”, i.e., eclasses whose parents are possibly congruent but not yet in the same eclass. The rebuild operation processes this worklist, restoring the invariants of deduplication and congruence.
Figure 3 shows pseudocode for the main egraph operations and rebuilding. Note that add and canonicalize are given for completeness, but they are unchanged from the traditional egraph implementation. The merge operation is similar, but it only adds the new eclass to the worklist instead of immediately starting upward merging. Adding a call to rebuild right after the addition to the worklist (Figure 3 line 3) would yield the traditional behavior of restoring the invariants immediately.
The rebuild method essentially calls repair on the eclasses from the worklist until the worklist is empty. Instead of directly manipulating the worklist, egg’s rebuild method first moves it into a local variable and deduplicates eclasses up to equivalence. Processing the worklist may merge eclasses, so breaking the worklist into chunks ensures that eclass references made equivalent in the previous chunk are deduplicated in the subsequent chunk.
The actual work of rebuild occurs in the repair method. repair examines an eclass $c$ and first canonicalizes enodes in the hashcons that have $c$ as a child. Then it performs what is essentially one “layer” of upward merging: if any of the parent enodes have become congruent, then their eclasses are merged and the result is added to the worklist.
Deduplicating the worklist, and thus reducing calls to repair, is at the heart of why deferring rebuilding improves performance. Intuitively, the upward merging process of rebuilding traces out a “path” of congruence through the egraph. When rebuilding happens immediately after merge (and therefore frequently), these paths can substantially overlap. By deferring rebuilding, the chunk-and-deduplicate approach can coalesce the overlapping parts of these paths, saving what would have been redundant work. In our modified equality saturation algorithm (Section 3.3), deferred rebuilding is responsible for a significant, asymptotic speedup (Section 3.4).
3.2.1. Examples of Rebuilding
Consider the following terms in an egraph, each nested under $d$ function symbols:
$$f_1(f_2(\ldots f_d(x_1))),\ \ldots,\ f_1(f_2(\ldots f_d(x_w)))$$
Note that $w$ corresponds to the width of this group of terms, and $d$ to the depth. Let the workload be $w-1$ merges that merge all the $x$s together: merge($x_1$, $x_i$) for $i \in [2, w]$.
In the traditional upward merging paradigm where rebuild is called after every merge, each merge($x_1$, $x_i$) will require $O(d)$ calls to repair to maintain congruence, one for each layer of $f_i$s. Over the whole workload, this requires $O(wd)$ calls to repair.
With deferred rebuilding, however, the $w-1$ merges can all take place before congruence must be restored. Suppose the $x$s are all merged into an eclass $c_x$. When rebuild finally is called, the only element in the deduplicated worklist is $c_x$. Calling repair on $c_x$ will merge the eclasses of the $f_d$ enodes into an eclass $c_{f_d}$, adding the eclasses that contained those enodes back to the worklist. When the worklist is again deduplicated, $c_{f_d}$ will be the only element, and the process repeats. Thus, the whole workload only incurs $O(d)$ calls to repair, eliminating the factor corresponding to the width of this group of terms. Figure 7 shows that the number of calls to repair is correlated with time spent doing congruence maintenance.
Deferred rebuilding also speeds up congruence maintenance by amortizing the work of maintaining the hashcons invariant. Consider the following terms in an egraph: $f_1(x), \ldots, f_n(x), y_1, \ldots, y_n$. Let the workload be merge($x$, $y_1$), $\ldots$, merge($x$, $y_n$). Each merge may change the canonical representation of the $f_i(x)$s, so the traditional invariant maintenance strategy could require $O(n^2)$ hashcons updates. With deferred rebuilding the merges happen before the hashcons invariant is restored, requiring no more than $O(n)$ hashcons updates.
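The first example above (a group of terms with a given width and depth) can be replayed with a small Python model of the worklist-based operations. The sketch follows the shape of the operations described in Section 3.2, but the class layout, the `repair_calls` counter, and the `run` helper are assumptions made for illustration rather than egg's actual code.

```python
class EGraph:
    def __init__(self):
        self.parent = []        # union-find over eclass ids
        self.hashcons = {}      # canonical enode -> eclass id
        self.parents = {}       # canonical eclass id -> [(parent enode, its eclass id)]
        self.worklist = []
        self.repair_calls = 0

    def find(self, a):
        while self.parent[a] != a:
            a = self.parent[a]
        return a

    def canonicalize(self, enode):
        symbol, children = enode
        return (symbol, tuple(self.find(c) for c in children))

    def add(self, symbol, *children):
        enode = self.canonicalize((symbol, children))
        if enode in self.hashcons:
            return self.find(self.hashcons[enode])
        cid = len(self.parent)
        self.parent.append(cid)
        self.parents[cid] = []
        for child in enode[1]:
            self.parents[child].append((enode, cid))
        self.hashcons[enode] = cid
        return cid

    def merge(self, a, b):
        a, b = self.find(a), self.find(b)
        if a == b:
            return a
        self.parent[b] = a
        self.parents[a].extend(self.parents.pop(b))
        self.worklist.append(a)          # defer upward merging to rebuild()
        return a

    def rebuild(self):
        while self.worklist:
            todo = {self.find(c) for c in self.worklist}   # deduplicate up to equivalence
            self.worklist = []
            for c in todo:
                self.repair(c)

    def repair(self, c):
        self.repair_calls += 1
        c = self.find(c)
        old_parents, self.parents[c] = self.parents[c], []
        for enode, _ in old_parents:     # drop possibly stale hashcons keys
            self.hashcons.pop(enode, None)
        new_parents = {}
        for enode, pclass in old_parents:
            enode = self.canonicalize(enode)
            if enode in new_parents:     # congruent parents: merge their eclasses
                self.merge(pclass, new_parents[enode])
            new_parents[enode] = self.find(pclass)
            self.hashcons[enode] = new_parents[enode]
        self.parents[self.find(c)].extend(new_parents.items())

def run(width, depth, deferred):
    eg = EGraph()
    xs = [eg.add(f"x{i}") for i in range(width)]
    tops = []
    for x in xs:
        node = x
        for level in range(depth, 0, -1):    # build f1(f2(...f_depth(x)))
            node = eg.add(f"f{level}", node)
        tops.append(node)
    for x in xs[1:]:
        eg.merge(xs[0], x)
        if not deferred:
            eg.rebuild()                     # traditional: restore invariants immediately
    eg.rebuild()
    assert len({eg.find(t) for t in tops}) == 1   # congruence holds either way
    return eg.repair_calls

eager = run(width=5, depth=3, deferred=False)
lazy = run(width=5, depth=3, deferred=True)
print(eager, lazy)                           # 16 4
```

In this model each round of upward merging also repairs the topmost eclass, which has no parents, so the counts are (width - 1) * (depth + 1) and depth + 1 respectively. The width factor disappears under deferred rebuilding, matching the argument in the text.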
3.2.2. Proof of Congruence
Intuitively, rebuilding is a delay of the upward merging process, allowing the user to choose when to restore the egraph invariants. The two are substantially similar in structure, with a critical difference in when the code is run. Below we offer a proof demonstrating that rebuilding restores the egraph congruence invariant.
Theorem 3.1.
Rebuilding restores congruence and terminates.
Proof.
Let $\equiv$ be the equivalence relation over enodes in the egraph, so $n_1 \equiv n_2$ iff $n_1$ and $n_2$ are in the same eclass. Let $\cong^*$ be the congruence closure of $\equiv$, i.e., $\cong^*$ is the smallest superset of $\equiv$ that is an equivalence relation closed under congruence.
Since rebuilding only merges congruent nodes, $\cong^*$ is fixed even though $\equiv$ changes. When $(\equiv) = (\cong^*)$, congruence is restored. Note that both $\equiv$ and $\cong^*$ are finite. We therefore show that rebuilding causes $\equiv$ to approach $\cong^*$. We define the set of incongruent enode pairs as $I = (\cong^*) \setminus (\equiv)$; in other words, $(n_1, n_2) \in I$ if $n_1 \cong^* n_2$ but $n_1 \not\equiv n_2$.
Due to the additive nature of equality saturation, $\equiv$ only increases and therefore $I$ is non-increasing. However, a call to repair inside the loop of rebuild does not necessarily shrink $I$. Some calls instead remove an element from the worklist but do not modify the egraph at all.
Let the set $W$ be the worklist of eclasses to be processed by repair; in Figure 3, $W$ corresponds to self.worklist plus the unprocessed portion of the todo local variable. We show that each call to repair decreases the tuple $(|I|, |W|)$ lexicographically until $(|I|, |W|) = (0, 0)$, and thus rebuilding terminates with $(\equiv) = (\cong^*)$.
Given an eclass $c$ from $W$, repair examines $c$’s parents for congruent enodes that are not yet in the same eclass:

If at least one pair of $c$’s parents are congruent, rebuilding merges each such pair $(p_1, p_2)$, which adds to $W$ but makes $I$ smaller by definition.

If no such congruent pairs are found, do nothing. Then, $|W|$ is decreased by 1 since $c$ came from the worklist and repair did not add anything back.
Since $(|I|, |W|)$ decreases lexicographically, $|W|$ eventually reaches $0$, so rebuild terminates. Note that $W$ contains precisely those eclasses that need to be “upward merged” to check for congruent parents. So, when $W$ is empty, rebuild has effectively performed upward merging. By Nelson (1980, Chapter 7), $|I| = 0$. Therefore, when rebuilding terminates, congruence is restored.
∎
3.3. Rebuilding and Equality Saturation
Rebuilding offers the choice of when to enforce the egraph invariants, potentially saving work if deferred thanks to the deduplication of the worklist. The client is responsible for rebuilding at a time that maximizes performance without limiting the application.
egg provides a modified equality saturation algorithm to take advantage of rebuilding. Figure 4 shows pseudocode for both traditional equality saturation and egg’s variant, which exhibits two key differences:

Each iteration is split into a read phase, which searches for all the rewrite matches, and a write phase that applies those matches. (Footnote 3: Although the original equality saturation paper (Tate et al., 2009) does not have separate reading and writing phases, some egraph implementations (like the one inside Z3 (De Moura and Bjørner, 2008)) do separate these phases as an implementation detail. Ours is the first algorithm to take advantage of this by deferring invariant maintenance.)

Rebuilding occurs only once per iteration, at the end.
egg’s separation of the read and write phases means that rewrites are truly unordered. In traditional equality saturation, later rewrites in the given rewrite list are favored in the sense that they can “see” the results of earlier rewrites in the same iteration. Therefore, the results depend on the order of the rewrite list if saturation is not reached (which is common on large rewrite lists or input expressions). egg’s equality saturation algorithm is invariant to the order of the rewrite list.
Separating the read and write phases also allows egg to safely defer rebuilding. If rebuilding were deferred in the traditional equality saturation algorithm, rules later in the rewrite list would be searched against an egraph with broken invariants. Since congruence may not hold, there may be missing equivalences, resulting in missing matches. These matches will be seen after the rebuild during the next iteration (if another iteration occurs), but the false reporting could impact metrics collection, rule scheduling (an optimization introduced in Section 5.2 that relies on an accurate count of how many times a rewrite was matched), or saturation detection.
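The phase-split loop described in this section can be sketched end to end in Python. Everything below is a simplified model under stated assumptions: the `EGraph` class and helper names are hypothetical, patterns are nested tuples with `"?x"`-style variables, the three rewrites are illustrative, and `rebuild` is a naive fixed-point congruence closure standing in for the worklist-based procedure of Section 3.2. Saturation is detected with a simple change counter.

```python
class EGraph:
    """Sketch only: rebuild() is a naive fixed-point congruence closure."""
    def __init__(self):
        self.parent, self.hashcons, self.changes = [], {}, 0

    def find(self, a):
        while self.parent[a] != a:
            a = self.parent[a]
        return a

    def canon(self, enode):
        return (enode[0], tuple(self.find(c) for c in enode[1]))

    def add(self, symbol, *children):
        enode = self.canon((symbol, children))
        if enode not in self.hashcons:
            self.hashcons[enode] = len(self.parent)
            self.parent.append(len(self.parent))
            self.changes += 1
        return self.find(self.hashcons[enode])

    def merge(self, a, b):
        a, b = self.find(a), self.find(b)
        if a != b:
            self.parent[b] = a
            self.changes += 1
        return a

    def rebuild(self):
        changed = True
        while changed:
            changed, fresh = False, {}
            for enode, cid in self.hashcons.items():
                enode = self.canon(enode)
                if enode in fresh and self.find(fresh[enode]) != self.find(cid):
                    self.merge(fresh[enode], cid)     # congruent enodes, different eclasses
                    changed = True
                fresh[enode] = self.find(cid)
            self.hashcons = fresh

    def classes(self):
        out = {}
        for enode, cid in self.hashcons.items():
            out.setdefault(self.find(cid), set()).add(self.canon(enode))
        return out

def ematch(eg, classes, pattern, cid, subst):
    if isinstance(pattern, str):                      # pattern variable
        if pattern not in subst:
            yield {**subst, pattern: cid}
        elif eg.find(subst[pattern]) == eg.find(cid):
            yield subst
        return
    for symbol, children in classes[eg.find(cid)]:
        if symbol == pattern[0] and len(children) == len(pattern) - 1:
            yield from match_children(eg, classes, pattern[1:], children, subst)

def match_children(eg, classes, subpatterns, children, subst):
    if not subpatterns:
        yield subst
        return
    for s in ematch(eg, classes, subpatterns[0], children[0], subst):
        yield from match_children(eg, classes, subpatterns[1:], children[1:], s)

def instantiate(eg, pattern, subst):
    if isinstance(pattern, str):
        return subst[pattern]
    return eg.add(pattern[0], *(instantiate(eg, p, subst) for p in pattern[1:]))

def equality_saturation(eg, rewrites, max_iters=20):
    for iteration in range(1, max_iters + 1):
        before = eg.changes
        classes = eg.classes()
        # Read phase: collect every match of every rewrite before changing anything.
        matches = [(rhs, s, cid) for lhs, rhs in rewrites for cid in classes
                   for s in ematch(eg, classes, lhs, cid, {})]
        # Write phase: apply all matches, then restore the invariants once.
        for rhs, s, cid in matches:
            eg.merge(cid, instantiate(eg, rhs, s))
        eg.rebuild()
        if eg.changes == before:
            return iteration                          # saturated
    return max_iters

eg = EGraph()
a, two = eg.add("a"), eg.add("2")
root = eg.add("/", eg.add("*", a, two), two)
rewrites = [
    (("/", ("*", "?x", "?y"), "?z"), ("*", "?x", ("/", "?y", "?z"))),
    (("/", "?x", "?x"), ("1",)),
    (("*", "?x", ("1",)), "?x"),
]
iterations = equality_saturation(eg, rewrites)
print(eg.find(root) == eg.find(a), iterations)
```

Because all matches are collected before any of them is applied, reordering the `rewrites` list does not change what each iteration adds to the egraph.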
3.4. Evaluating Rebuilding
To demonstrate that deferred rebuilding provides faster congruence closure than traditional upward merging, we modified egg to call rebuild immediately after every merge. This provides a one-to-one comparison of deferred rebuilding against the traditional approach, isolated from the many other factors that make egg efficient: overall design and algorithmic differences, programming language performance, and other orthogonal performance improvements.
[Figure: deferred rebuilding compared with rebuilding after every merge, shown on linear and log scales. Points below the line mean deferring rebuilding is faster. In aggregate over all tests (using geometric mean), both congruence maintenance and equality saturation are faster with deferred rebuilding. The linear scale plot shows that deferred rebuilding is significantly faster. The log scale plot suggests the speedup is greater than some constant multiple; Figure 7 demonstrates this in greater detail.]
We ran egg’s test suite using both rebuild strategies, measuring the time spent on congruence maintenance. Each test consists of one run of egg’s equality saturation algorithm to optimize a given expression. Of the 32 total tests, 8 hit the iteration limit of 100 and the remainder saturated. Note that both rebuilding strategies use egg’s phase-split equality saturation algorithm, and the resulting egraphs are identical in all cases. These experiments were performed on a 2020 MacBook Pro with a 2 GHz quad-core Intel Core i5 processor and 16GB of memory.
Figure 7 shows how rebuilding speeds up congruence maintenance. Overall, our experiments show an aggregate speedup both on congruence closure and over the entire equality saturation algorithm. Figure 7 shows this speedup is asymptotic; the multiplicative speedup increases as the problem gets larger.
egg’s test suite consists of two main applications: math, a small computer algebra system capable of symbolic differentiation and integration; and lambda, a partial evaluator for the untyped lambda calculus using explicit substitution to handle variable binding (shown in Section 5). Both are typical applications primarily driven by syntactic rewrites, with a few key uses of egg’s more complex features like eclass analyses and dynamic/conditional rewrites.
egg can be configured to capture various metrics about equality saturation as it runs, including the time spent in the read phase (searching for matches), the write phase (applying matches), and rebuilding. In Figure 7, congruence time is measured as the time spent applying matches plus rebuilding. Other parts of the equality saturation algorithm (creating the initial egraph, extracting the final term) take negligible time compared to the equality saturation iterations.
Deferred rebuilding amortizes the examination of eclasses for congruence maintenance; deduplicating the worklist reduces the number of calls to repair. Figure 7 shows that time spent in congruence is correlated with the number of calls to the repair method.
The case study in Section 6.1 provides a further evaluation of rebuilding. Rebuilding (and other egg features) have also been implemented in a Racket-based egraph, demonstrating that rebuilding is a conceptual advance that need not be tied to the egg implementation.
4. Extending Egraphs with Eclass Analyses
As discussed so far, egraphs and equality saturation provide an efficient way to implement a term rewriting system. Rebuilding enhances that efficiency, but the approach remains designed for purely syntactic rewrites. However, program analysis and optimization typically require more than just syntactic information. Instead, transformations are computed based on the input terms and also semantic facts about that input term, e.g., constant value, free variables, nullability, numerical sign, size in memory, and so on. The “purely syntactic” restriction has forced existing equality saturation applications (Tate et al., 2009; Stepp et al., 2011; Panchekha et al., 2015) to resort to ad hoc passes over the egraph to implement analyses like constant folding. These ad hoc passes require manually manipulating the egraph, the complexity of which could prevent the implementation of more sophisticated analyses.
We present a new technique called eclass analysis, which allows the concise expression of a program analysis over the egraph. An eclass analysis resembles abstract interpretation lifted to the egraph level, attaching analysis data from a semilattice to each eclass. The egraph maintains and propagates this data as eclasses get merged and new enodes are added. Analysis data can be used directly to modify the egraph, to inform how or if rewrites apply their right-hand sides, or to determine the cost of terms during the extraction process.
Eclass analyses provide a general mechanism to replace what previously required ad hoc extensions that manually manipulate the egraph. Eclass analyses also fit within the equality saturation workflow, so they can naturally cooperate with the equational reasoning provided by rewrites. Moreover, an analysis lifted to the egraph level automatically benefits from a sort of “partial-order reduction” for free: large numbers of similar programs may be analyzed for little additional cost thanks to the egraph’s compact representation.
This section provides a conceptual explanation of eclass analyses as well as dynamic and conditional rewrites that can use the analysis data. The following sections will provide concrete examples: Section 5 discusses the egg implementation and a complete example of a partial evaluator for the lambda calculus; Section 6 discusses how three published projects have used egg and its unique features (like eclass analyses).
4.1. Eclass Analyses
An eclass analysis defines a domain $D$ and associates a value $d_c \in D$ to each eclass $c$. The eclass $c$ contains the associated data $d_c$, i.e., given an eclass $c$, one can get $d_c$ easily, but not vice versa.
The interface of an eclass analysis is as follows, where $G$ refers to the egraph, and $n$ and $c$ refer to enodes and eclasses within $G$:
$\mathit{make}(n) \to d_c$: When a new enode $n$ is added to $G$ into a new, singleton eclass $c$, construct a new value $d_c \in D$ to be associated with $n$'s new eclass, typically by accessing the associated data of $n$'s children.
$\mathit{join}(d_{c_1}, d_{c_2}) \to d_c$: When eclasses $c_1, c_2$ are being merged into $c$, join $d_{c_1}, d_{c_2}$ into a new value $d_c$ to be associated with the new eclass $c$.
$\mathit{modify}(c) \to c'$: Optionally modify the eclass $c$ based on $d_c$, typically by adding an enode to $c$. Modify should be idempotent if no other changes occur to the eclass, i.e., $\mathit{modify}(\mathit{modify}(c)) = \mathit{modify}(c)$.
The domain $D$ together with the join operation should form a join-semilattice. The semilattice perspective is useful for defining the analysis invariant (where $\wedge$ is the join operation):
$$\forall c \in G.\quad d_c = \bigwedge_{n \in c} \mathit{make}(n) \quad\text{and}\quad \mathit{modify}(c) = c$$
The first part of the analysis invariant states that the data associated with each eclass must be the join of the make for every enode in that eclass. Since $D$ is a join-semilattice, this means that $\forall c, \forall n \in c,\ d_c \geq \mathit{make}(n)$. The motivation for the second part is more subtle. Since the analysis can modify an eclass through the modify method, the analysis invariant asserts that these modifications are driven to a fixed point. When the analysis invariant holds, a client looking at the analysis data can be assured that the analysis is “stable” in the sense that recomputing make, join, and modify will not modify the egraph or any analysis data.
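The first part of the invariant can be made concrete with a small sketch. The code below is illustrative only: the `ENode` struct, the choice of domain (the set of operator symbols at the root of an eclass), and the helper names `class_data` and `invariant_holds` are assumptions made for this example, not part of egg's API. It shows a join that is commutative, associative, and idempotent, and checks that stored data equals the join of make over all enodes in a class.

```rust
use std::collections::BTreeSet;

/// Toy e-node (hypothetical representation): an operator symbol plus child e-class ids.
#[derive(Clone, Debug, PartialEq)]
struct ENode {
    op: char,
    children: Vec<usize>,
}

/// Analysis domain: the set of root operators that appear in an e-class.
type Data = BTreeSet<char>;

/// make(n): the data for a singleton e-class holding `n`.
fn make(n: &ENode) -> Data {
    let mut d = Data::new();
    d.insert(n.op);
    d
}

/// join(d1, d2): set union, which is commutative, associative, and idempotent.
fn join(a: &Data, b: &Data) -> Data {
    a.union(b).cloned().collect()
}

/// The data an e-class should carry: the join of make(n) over every e-node n in it.
/// The fold starts from the empty set, the bottom element of this semilattice.
fn class_data(nodes: &[ENode]) -> Data {
    nodes.iter().fold(Data::new(), |acc, n| join(&acc, &make(n)))
}

/// Checks the semilattice part of the analysis invariant for one e-class.
fn invariant_holds(nodes: &[ENode], stored: &Data) -> bool {
    *stored == class_data(nodes)
}
```

Because union is order-insensitive, the same data results no matter in which order enodes arrive in the class, which is the property the rebuilding procedure relies on.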
4.1.1. Maintaining the Analysis Invariant
We extend the rebuilding procedure from Section 3 to restore the analysis invariant as well as the congruence invariant. Figure 8 shows the necessary modifications to the rebuilding code from Figure 3.
Adding enodes and merging eclasses risk breaking the analysis invariant in different ways. Adding enodes is the simpler case; lines 8–8 restore the invariant for the newly created, singleton eclass that holds the new enode. When merging eclasses, the first concern is maintaining the semilattice portion of the analysis invariant. Since join forms a semilattice over the domain of the analysis data, the order in which the joins occur does not matter. Therefore, line 8 suffices to update the analysis data of the merged eclass.
Since $\mathit{make}(n)$ creates analysis data by looking at the data of $n$'s children, merging eclasses can violate the analysis invariant in the same way it can violate the congruence invariant. The solution is to use the same worklist mechanism introduced in Section 3. Lines 8–8 of the repair method (which rebuild calls on each element of the worklist) remake and merge the analysis data of the parents of any recently merged eclasses. The new repair method also calls modify once, which suffices due to its idempotence. In the pseudocode, modify is reframed as a mutating method for clarity.
egg’s implementation of eclass analyses assumes that the analysis domain is indeed a semilattice and that modify is idempotent. Without these properties, egg may fail to restore the analysis invariant on rebuild, or it may not terminate.
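The upward propagation performed during repair can be sketched as follows. This is a deliberately reduced model, not egg's code: it omits the union-find and congruence closure, models a merge as adding an enode to an existing class, and uses a hypothetical boolean analysis ("some term in this class is variable-free") whose join is logical OR. The names `Graph`, `new_class`, `add_node`, and `propagate` are invented for this example.

```rust
#[derive(Clone, Debug)]
enum Op {
    Num(i64),
    Var(&'static str),
    Add,
}

struct ENode {
    op: Op,
    children: Vec<usize>, // child e-class ids
    class: usize,         // the e-class that owns this e-node
}

struct Graph {
    nodes: Vec<ENode>,
    data: Vec<bool>,          // per e-class analysis data; `false` is bottom
    parents: Vec<Vec<usize>>, // per e-class: indices of e-nodes that use it as a child
}

impl Graph {
    fn new_class(&mut self) -> usize {
        self.data.push(false);
        self.parents.push(Vec::new());
        self.data.len() - 1
    }

    /// make(n): a node is ground if it is a number, or an operator with ground children.
    fn make(&self, n: &ENode) -> bool {
        match &n.op {
            Op::Num(_) => true,
            Op::Var(_) => false,
            Op::Add => n.children.iter().all(|&c| self.data[c]),
        }
    }

    /// Adds an e-node to `class` and joins its data into the class data.
    fn add_node(&mut self, op: Op, children: Vec<usize>, class: usize) {
        let idx = self.nodes.len();
        for &c in &children {
            self.parents[c].push(idx);
        }
        let node = ENode { op, children, class };
        let d = self.make(&node);
        self.nodes.push(node);
        self.data[class] = self.data[class] || d; // join is logical OR
    }

    /// Worklist loop: remake each parent of a changed class and join the result upward.
    fn propagate(&mut self, mut worklist: Vec<usize>) {
        while let Some(c) = worklist.pop() {
            for p in self.parents[c].clone() {
                let parent_class = self.nodes[p].class;
                let joined = self.data[parent_class] || self.make(&self.nodes[p]);
                if joined != self.data[parent_class] {
                    self.data[parent_class] = joined;
                    worklist.push(parent_class); // its own parents may now change too
                }
            }
        }
    }
}
```

Termination here follows from the data only ever moving upward in a finite lattice, which mirrors why the semilattice assumption matters for rebuilding.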
4.1.2. Example: Constant Folding
The data produced by eclass analyses can be usefully consumed by other components of an equality saturation system (see Section 4.2), but eclass analyses can be useful on their own thanks to the modify hook. Typical modify hooks will either do nothing, check some invariant about the eclasses being merged, or add an enode to that eclass (using the regular add and merge methods of the egraph).
As mentioned above, other equality saturation implementations have implemented constant folding as custom, ad hoc passes over the egraph. We can formulate constant folding as an eclass analysis that highlights the parallels with abstract interpretation. Let the domain $D$ = Option<Constant>, and let the join operation be the “or” operation of the Option type:
match (a, b) {
(None, None ) => None,
(Some(x), None ) => Some(x),
(None, Some(y)) => Some(y),
(Some(x), Some(y)) => { assert!(x == y); Some(x) }
}
Note how join can also aid in debugging by checking properties about values that are unified in the egraph; in this case we assert that all terms represented in an eclass should have the same constant value. The make operation serves as the abstraction function, returning the constant value of an enode if it can be computed from the constant values associated with its child eclasses. The modify operation serves as a concretization function in this setting. If $d_c$ is a constant value, then $\mathit{modify}(c)$ would add $\gamma(d_c) = n$ to $c$, where $\gamma$ concretizes the constant value into a childless enode.
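A minimal, self-contained sketch of the three operations for constant folding might look like the following. The `ENode` and `EClass` types are toy stand-ins invented for this example (integer constants only, classes stored in a slice indexed by id); egg's real data structures and signatures differ.

```rust
#[derive(Clone, Debug, PartialEq)]
enum ENode {
    Num(i64),
    Var(String),
    Add(usize, usize), // children are e-class ids
    Mul(usize, usize),
}

struct EClass {
    nodes: Vec<ENode>,
    data: Option<i64>, // Some(k) if every term in the class evaluates to k
}

/// make: the abstraction function. Computes a constant from the children's constants.
fn make(n: &ENode, classes: &[EClass]) -> Option<i64> {
    let get = |id: &usize| classes[*id].data;
    match n {
        ENode::Num(k) => Some(*k),
        ENode::Var(_) => None,
        // Overflow is treated as "not a known constant" in this sketch.
        ENode::Add(a, b) => get(a)?.checked_add(get(b)?),
        ENode::Mul(a, b) => get(a)?.checked_mul(get(b)?),
    }
}

/// join: the "or" of the Option type, asserting that merged constants agree.
fn join(a: Option<i64>, b: Option<i64>) -> Option<i64> {
    match (a, b) {
        (Some(x), Some(y)) => {
            assert_eq!(x, y, "merged e-classes disagree on their constant value");
            Some(x)
        }
        (x, y) => x.or(y),
    }
}

/// modify: the concretization step. Adds the constant as a childless e-node.
/// The membership check makes a second call a no-op (idempotence).
fn modify(class: &mut EClass) {
    if let Some(k) = class.data {
        let constant = ENode::Num(k);
        if !class.nodes.contains(&constant) {
            class.nodes.push(constant);
        }
    }
}
```

In a full egraph the added constant enode would go through the regular add and merge methods so that it is shared with any existing class for the same constant.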
4.2. Conditional and Dynamic Rewrites
In equality saturation applications, most of the rewrites are purely syntactic. In some cases, additional data may be needed to determine if or how to perform the rewrite. For example, the rewrite $x / x \to 1$ is only valid if $x \neq 0$. A more complex rewrite may need to compute the right-hand side dynamically based on an analysis fact from the left-hand side.
The righthand side of a rewrite can be generalized to a function apply that takes a substitution and an eclass generated from ematching the lefthand side, and produces a term to be added to the egraph and unified with the matched eclass. For a purely syntactic rewrite, the apply function need not inspect the matched eclass in any way; it would simply apply the substitution to the righthand pattern to produce a new term.
Eclass analyses greatly increase the utility of this generalized form of rewriting. The apply function can look at the analysis data for the matched eclass or any of the eclasses in the substitution to determine if or how to construct the right-hand side term. These kinds of rewrites can be broken down further into two categories:

Conditional rewrites like $x / x \to 1$ that are purely syntactic but whose validity depends on checking some analysis data;

Dynamic rewrites that compute the righthand side based on analysis data.
Conditional rewrites are a subset of the more general dynamic rewrites. Our implementation supports both. The example in Section 5 and case studies in Section 6 heavily use generalized rewrites, as they are typically the most convenient way to incorporate domain knowledge into the equality saturation framework.
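The two categories can be illustrated with a pair of hypothetical apply functions. Everything here is an assumption made for the sketch: the analysis data is a table of optional integer constants indexed by eclass id, a substitution maps pattern variable names to eclass ids, and the `Term` type and the power-of-two rule are invented for illustration (the rule is only meaningful for wrapping integer arithmetic). egg's actual rewrite interface is different.

```rust
use std::collections::HashMap;

type Id = usize;
/// Substitution produced by e-matching: pattern variable name -> e-class id.
type Subst = HashMap<&'static str, Id>;

/// Right-hand side term to be added to the e-graph and unified with the match.
#[derive(Debug, PartialEq)]
enum Term {
    Num(i64),
    Class(Id), // reference to an existing e-class
    Shl(Box<Term>, Box<Term>),
}

/// Conditional rewrite: (/ ?x ?x) => 1, applied only when the analysis
/// proves that ?x is a nonzero constant. Returns None to decline the rewrite.
fn apply_div_self(subst: &Subst, data: &[Option<i64>]) -> Option<Term> {
    let x = *subst.get("x")?;
    match data[x] {
        Some(k) if k != 0 => Some(Term::Num(1)),
        _ => None,
    }
}

/// Dynamic rewrite: (* ?x ?c) => (<< ?x log2(c)) when ?c is known to be a
/// positive power of two. The right-hand side is computed from analysis data.
fn apply_mul_pow2(subst: &Subst, data: &[Option<i64>]) -> Option<Term> {
    let x = *subst.get("x")?;
    let c = data[*subst.get("c")?]?;
    if c > 0 && c.count_ones() == 1 {
        let shift = c.trailing_zeros() as i64;
        Some(Term::Shl(Box::new(Term::Class(x)), Box::new(Term::Num(shift))))
    } else {
        None
    }
}
```

The first function only decides whether a fixed right-hand side may be used, while the second builds a right-hand side that no purely syntactic pattern could express.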
4.3. Extraction
Equality saturation typically ends with an extraction phase that selects an optimal represented term from an eclass according to some cost function. In many domains (Panchekha et al., 2015; Nandi et al., 2020), AST size (sometimes weighted differently for different operators) suffices as a simple, local cost function. We say a cost function is local when the cost of a term can be computed from the function symbol and the costs of the children. With such cost functions, extracting an optimal term can be efficiently done with a fixedpoint traversal over the egraph that selects the minimum cost enode from each eclass (Panchekha et al., 2015).
Extraction can be formulated as an eclass analysis when the cost function is local. The analysis data is a tuple $(n, \mathit{cost}(n))$ where $n$ is the cheapest enode in that eclass and $\mathit{cost}(n)$ its cost. The $\mathit{make}(n)$ operation calculates the cost based on the analysis data (which contain the minimum costs) of $n$'s children. The merge operation simply takes the tuple with lower cost. The semilattice portion of the analysis invariant then guarantees that the analysis data will contain the lowest-cost enode in each class. Extract can then proceed recursively; if the analysis data for eclass $c$ gives $f(c_1, c_2, \ldots)$ as the optimal enode, the optimal term represented in $c$ is $\mathit{extract}(c) = f(\mathit{extract}(c_1), \mathit{extract}(c_2), \ldots)$. This not only further demonstrates the generality of eclass analyses, but also provides the ability to do extraction “on the fly”; conditional and dynamic rewrites can determine their behavior based on the cheapest term in an eclass.
Extraction (whether done as a separate pass or an eclass analysis) can also benefit from the analysis data. Typically, a local cost function can only look at the function symbol of the enode $n$ and the costs of $n$'s children. When an eclass analysis is attached to the egraph, however, a cost function may observe the data associated with $n$'s eclass, as well as the data associated with $n$'s children. This allows a cost function to depend on computed facts rather than just purely syntactic information. In other words, the cost of an operator may differ based on its inputs. Section 6.2 provides a motivating case study wherein an eclass analysis computes the size and shape of tensors, and this size information informs the cost function.
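As a rough illustration of extraction with a local cost function, the sketch below computes the cheapest enode per eclass by fixed-point iteration and then rebuilds the term recursively. It assumes a toy egraph given as a slice of eclasses, each a list of enodes, with AST size as the cost; the names `best_costs` and `extract` are hypothetical and this is not egg's Extractor.

```rust
#[derive(Clone, Debug, PartialEq)]
struct ENode {
    op: &'static str,
    children: Vec<usize>, // child e-class ids
}

/// For each e-class, the lowest known (cost, index of the cheapest e-node),
/// computed by iterating to a fixed point. Cost is AST size.
fn best_costs(classes: &[Vec<ENode>]) -> Vec<Option<(u64, usize)>> {
    let mut best: Vec<Option<(u64, usize)>> = vec![None; classes.len()];
    let mut changed = true;
    while changed {
        changed = false;
        for (id, nodes) in classes.iter().enumerate() {
            for (i, n) in nodes.iter().enumerate() {
                // Local cost: 1 for the symbol plus the best cost of each child,
                // or None if some child has no known cost yet.
                let cost = n
                    .children
                    .iter()
                    .try_fold(1u64, |acc, &c| best[c].map(|(k, _)| acc + k));
                if let Some(cost) = cost {
                    if best[id].map_or(true, |(old, _)| cost < old) {
                        best[id] = Some((cost, i));
                        changed = true;
                    }
                }
            }
        }
    }
    best
}

/// Recursively builds the optimal term for e-class `id` as an s-expression.
fn extract(classes: &[Vec<ENode>], best: &[Option<(u64, usize)>], id: usize) -> String {
    let (_, i) = best[id].expect("e-class has no finite-cost term");
    let n = &classes[id][i];
    if n.children.is_empty() {
        n.op.to_string()
    } else {
        let args: Vec<String> = n
            .children
            .iter()
            .map(|&c| extract(classes, best, c))
            .collect();
        format!("({} {})", n.op, args.join(" "))
    }
}
```

The strict `cost < old` comparison means costs only decrease, so the loop terminates even when the egraph contains cycles, such as a class holding both `a` and `(/ (* a 2) 2)`.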
5. egg: Easy, Extensible, and Efficient
We implemented the techniques of rebuilding and eclass analysis in egg, an easy-to-use, extensible, and efficient egraph library. To the best of our knowledge, egg is the first general-purpose, reusable egraph implementation. This has allowed focused effort on ease of use and optimization, knowing that any benefits will be seen across use cases as opposed to a single, ad hoc instance.
This section details egg’s implementation and some of the various optimizations and tools it provides to the user. We use an extended example of a partial evaluator for the lambda calculus, for which we provide the complete source code (with few changes for readability) in LABEL:fig:lambdalang and LABEL:fig:lambdaanalysis. While contrived, this example is compact and familiar, and it highlights (1) how egg is used and (2) some of its novel features like eclass analyses and dynamic rewrites. It demonstrates how egg can tackle binding, a perennially tough problem for egraphs, with a simple explicit substitution approach powered by egg’s extensibility. Section 6 goes further, providing real-world case studies of published projects that have depended on egg.
egg is implemented in ~5000 lines of Rust,^{5} including code, tests, and documentation. egg is open-source, well-documented, and distributed via Rust’s package management system.^{6} All of egg’s components are generic over the user-provided language, analysis, and cost functions.
^{5} Rust is a high-level systems programming language. egg has been integrated into applications written in other programming languages using both C FFI and serialization approaches.
^{6} Source: https://github.com/mwillsey/egg. Documentation: https://docs.rs/egg. Package: https://crates.io/crates/egg.
5.1. Ease of Use
egg’s ease of use comes primarily from its design as a library. By defining only a language and some rewrite rules, a user can quickly start developing a synthesis or optimization tool. Using egg as a Rust library, the user defines the language using the define_language! macro shown in LABEL:fig:lambdalang, lines 1–22. Childless variants in the language may contain data of user-defined types, and eclass analyses or dynamic rewrites may inspect this data.
The user provides rewrites as shown in LABEL:fig:lambdalang, lines 51–100. Each rewrite has a name, a left-hand side, and a right-hand side. For purely syntactic rewrites, the right-hand side is simply a pattern. More complex rewrites can incorporate conditions or even dynamic right-hand sides, both explained in Section 5.2 and LABEL:fig:lambdaapplier.
Equality saturation workflows, regardless of the application domain, typically have a similar structure: add expressions to an empty egraph, run rewrites until saturation or timeout, and extract the best equivalent expressions according to some cost function. This “outer loop” of equality saturation involves a significant amount of errorprone boilerplate:

Checking for saturation, timeouts, and egraph size limits.

Orchestrating the read-phase, write-phase, rebuild system (Figure 3) that makes egg fast.

Recording performance data at each iteration.

Potentially coordinating rule execution so that expansive rules like associativity do not dominate the egraph.

Finally, extracting the best expression(s) according to a userdefined cost function.
egg provides these functionalities through its Runner and Extractor interfaces. Runners automatically detect saturation, and can be configured to stop after a time, egraph size, or iterations limit. The equality saturation loop provided by egg calls rebuild, so users need not even know about egg’s deferred invariant maintenance. Runners record various metrics about each iteration automatically, and the user can hook into this to report relevant data. Extractors select the optimal term from an egraph given a user-defined, local cost function.^{7} The two can be combined as well; users commonly record the “best so far” expression by extracting in each iteration.
^{7} As mentioned in Section 4.3, extraction can be implemented as part of an eclass analysis. The separate Extractor feature is still useful for ergonomic and performance reasons.
LABEL:fig:lambdalang also shows egg’s test_fn! macro for easily creating tests (lines 27–50). These tests create an egraph with the given expression, run equality saturation using a Runner, and check to make sure the right-hand pattern can be found in the same eclass as the initial expression.
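The shape of the outer loop that a Runner takes care of can be sketched without any egraph at all. The code below is an assumption-laden illustration, not egg's Runner: `Limits`, `StopReason`, and `run` are invented names, one iteration is abstracted as a closure that returns the egraph size afterwards, and saturation is approximated as "the size did not change" (a real implementation would check whether any rewrite changed the egraph).

```rust
use std::time::{Duration, Instant};

#[derive(Debug, PartialEq)]
enum StopReason {
    Saturated,
    IterationLimit,
    NodeLimit,
    TimeLimit,
}

struct Limits {
    max_iters: usize,
    max_nodes: usize,
    max_time: Duration,
}

/// Runs `step` (one search / apply / rebuild iteration, returning the e-graph
/// size afterwards) until saturation or a limit. Also returns the per-iteration
/// sizes as a simple recorded metric.
fn run(
    limits: &Limits,
    initial_size: usize,
    mut step: impl FnMut(usize) -> usize,
) -> (StopReason, Vec<usize>) {
    let start = Instant::now();
    let mut sizes = vec![initial_size];
    for i in 0..limits.max_iters {
        let before = *sizes.last().unwrap();
        let after = step(i);
        sizes.push(after);
        if after == before {
            return (StopReason::Saturated, sizes); // nothing new was learned
        }
        if after > limits.max_nodes {
            return (StopReason::NodeLimit, sizes);
        }
        if start.elapsed() > limits.max_time {
            return (StopReason::TimeLimit, sizes);
        }
    }
    (StopReason::IterationLimit, sizes)
}
```

Centralizing these checks in one place is the point: each client would otherwise reimplement the same stopping logic and bookkeeping around its own rewrite loop.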
5.2. Extensibility
For simple domains, defining a language and purely syntactic rewrites will suffice. However, our partial evaluator requires interpreted reasoning, so we use some of egg’s more advanced features like eclass analyses and dynamic rewrites. Importantly, egg supports these extensibility features as a library: the user need not modify the egraph or egg’s internals.
5.3. Efficiency
egg’s novel rebuilding algorithm (Section 3) combined with systems programming best practices makes egraphs, and the equality saturation use case in particular, more efficient than prior tools. egg is implemented in Rust, giving the compiler freedom to specialize and inline user-written code. This is especially important as egg’s generic nature leads to tight interaction between library code (e.g., searching for rewrites) and user code (e.g., comparing operators). egg is designed from the ground up to use cache-friendly, flat buffers with minimal indirection for most internal data structures. This is in sharp contrast to traditional representations of egraphs (Nelson, 1980; Detlefs et al., 2005) that contain many tree and linked-list-like data structures. egg additionally compiles patterns to be executed by a small virtual machine (de Moura and Bjørner, 2007), as opposed to recursively walking the tree-like representation of patterns.
Aside from deferred rebuilding, egg’s equality saturation algorithm leads to implementation-level performance enhancements. Searching for rewrite matches, which is the bulk of running time, can be parallelized thanks to the phase separation. Either the rules or eclasses could be searched in parallel. Furthermore, the once-per-iteration frequency of rebuilding allows egg to establish other performance-enhancing invariants that hold during the read-only search phase. For example, egg sorts enodes within each eclass to enable binary search, and also maintains a cache mapping function symbols to eclasses that contain enodes with that function symbol.
Many of egg’s extensibility features can also be used to improve performance. As mentioned above, rule scheduling can lead to great performance improvement in the face of “expansive” rules that would otherwise dominate the search space. The Runner interface also supports user hooks that can stop the equality saturation after some arbitrary condition. This can be useful when using equality saturation to prove terms equal; once they are unified, there is no point in continuing. egg’s Runners also support batch simplification, where multiple terms can be added to the initial egraph before running equality saturation. If the terms are substantially similar, both rewriting and any eclass analyses will benefit from the egraph’s inherent structural deduplication. The case study in Section 6.1 uses batch simplification to achieve a large speedup when simplifying similar expressions.
6. Case Studies
This section relates three independently developed, published projects from diverse domains that incorporated egg as an easy-to-use, high-performance egraph implementation. In all three cases, the developers had first rolled their own egraph implementations. egg allowed them to delete code, gain performance, and in some cases dramatically broaden the project’s scope thanks to egg’s speed and flexibility. In addition to gaining performance, all three projects use egg’s novel extensibility features like eclass analyses and dynamic/conditional rewrites.
6.1. Herbie: Improving Floating Point Accuracy
Herbie automatically improves accuracy for floating-point expressions, using random sampling to measure error, a set of rewrite rules for generating program variants, and algorithms that prune and combine program variants to achieve minimal error. Herbie received PLDI 2015’s Distinguished Paper award (Panchekha et al., 2015) and has been continuously developed since then, sporting hundreds of GitHub stars, hundreds of downloads, and thousands of users on its online version.
Herbie uses egraphs for algebraic simplification of mathematical expressions, which is especially important for avoiding floating-point errors introduced by cancellation, function inverses, and redundant computation. Until our case study, Herbie used a custom egraph implementation written in Racket (Herbie’s implementation language) that closely followed traditional egraph implementations. With timeouts disabled, egraph-based simplification consumed the vast majority of Herbie’s run time. As a fix, Herbie sharply limits the simplification process, placing a size limit on the egraph itself and a time limit on the whole procedure. When the timeout is exceeded, simplification fails altogether. Furthermore, the Herbie authors knew of several features that they believed would improve Herbie’s output but could not be implemented because they required more calls to simplification and would thus introduce unacceptable slowdowns. Taken together, slow simplification reduced Herbie’s performance, completeness, and efficacy.
We implemented an egg simplification backend for Herbie. The egg backend is over faster than Herbie’s initial simplifier and is now used by default as of Herbie 1.4. Herbie has also backported some of egg’s features like batch simplification and rebuilding to its egraph implementation (which is still usable, just not the default), demonstrating the portability of egg’s conceptual improvements.
6.1.1. Implementation
Herbie is implemented in Racket while egg is in Rust; the egg simplification backend is thus implemented as a Rust library that provides a C-level API for Herbie to access via foreign-function interface (FFI). The Rust library defines the Herbie expression grammar (with named constants, numeric constants, variables, and operations) as well as the eclass analysis necessary to do constant folding. The library is implemented in under 500 lines of Rust.
Herbie’s set of rewrite rules is not fixed; users can select which rewrites to use using command-line flags. Herbie serializes the rewrites to strings, and the egg backend parses and instantiates them on the Rust side.
Herbie separates exact and inexact program constants: exact operations on exact constants (such as the addition of two rational numbers) are evaluated and added to the egraph, while operations on inexact constants or that yield inexact outputs are not. We thus split numeric constants in the Rust-side grammar between exact rational numbers and inexact constants, which are described by an opaque identifier, and transformed Racket-side expressions into this form before serializing them and passing them to the Rust driver.
To evaluate operations on exact constants, we used the constant folding eclass analysis to track the “exact value” of each eclass.^{8} Every time an operation enode is added to the egraph, we check whether all arguments to that operation have exact values (using the analysis data), and if so do rational number arithmetic to evaluate it. The eclass analysis is cleaner than the corresponding code in Herbie’s implementation, which is a built-in pass over the entire egraph.
^{8} Herbie’s rewrite rules guarantee that different exact values can never become equal; the semilattice join checks this invariant on the Rust side.
6.1.2. Results
Our egg simplification backend is a drop-in replacement for the existing Herbie simplifier, making it easy to compare speed and results. We compare using Herbie’s standard test suite of roughly 500 benchmarks, with timeouts disabled. Figure 9 shows the results. The egg simplification backend is over faster than Herbie’s initial simplifier. This speedup eliminated Herbie’s largest bottleneck: the initial implementation dominated Herbie’s total run time at , backporting egg improvements into Herbie cuts that to about half the total run time, and egg simplification takes under of the total run time. Practically, the run time of Herbie’s initial implementation was smaller, since timeouts cause test failures when simplification takes too long. Therefore, the speedup also improved Herbie’s completeness, as simplification now never times out.
Since incorporating egg into Herbie, the Herbie developers have backported some of egg’s key performance improvements into the Racket egraph implementation. First, batch simplification gives a large speedup because Herbie simplifies many similar expressions. When done simultaneously in one equality saturation, the egraph’s structural sharing can massively deduplicate work. Second, deferring rebuilding (as discussed in Section 3) gives a further speedup. As demonstrated in Figure 7, rebuilding offers an asymptotic speedup, so Herbie’s improved implementation (and the egg backend as well) will scale better as the search size grows.
6.2. Spores: Optimizing Linear Algebra
Spores (Wang et al., 2020) is an optimizer for machine learning programs. It translates linear algebra (LA) expressions to relational algebra (RA), performs rewrites, and finally translates the result back to linear algebra. Each rewrite is built up from simple identities in relational algebra like the associativity of join. These relational identities express more fine-grained equality than textbook linear algebra identities, allowing Spores to discover novel optimizations not found by traditional optimizers based on LA identities. Spores performs holistic optimization, taking into account the complex interactions among factors like sparsity, common subexpressions, and fusible operators and their impact on execution time.
6.2.1. Implementation
Spores is implemented entirely in Rust using egg. egg empowers Spores to orchestrate the complex interactions described above elegantly and effortlessly. Spores works in three steps: first, it translates the input LA expression to RA; second, it optimizes the RA expression by equality saturation; finally, it translates the optimized RA expression back to LA. Since the translation between LA and RA is straightforward, we focus the discussion on the equality saturation step in RA. Spores represents a relation as a function from tuples to real numbers: . This is similar to the index notation in linear algebra, where a matrix A can be viewed as a function . A tuple is identified with a named record, e.g. , so that order in a tuple doesn’t matter. There are just three operations on relations: join, union and aggregate. Join () takes two relations and returns their natural join, multiplying the associated real number for joined tuples:requires a way to store the schema of every expression during optimization. Spores uses an eclass analysis to annotate eclasses with the appropriate schema. It also leverages the eclass analysis for cost estimation, using a conservative cost model that overapproximates. As a result, equivalent expressions may have different cost estimates. The
merge operation on the analysis data takes the lower cost, incrementally improving the cost estimate. Finally, Spores’ eclass analysis also performs constant folding. As a whole, the eclass analysis is a composition of three smaller analyses in a similar style to the composition of lattices in abstract interpretation.
6.2.2. Results
Spores is integrated into Apache SystemML (Boehm, 2019) in a prototype, where it is able to derive all of 84 handwritten rules and heuristics for sum-product optimization. It also discovered novel rewrites that contribute to
to speedup in end-to-end experiments. With greedy extraction, all compilations completed within a second.
6.3. Szalinski: Decompiling CAD into Structured Programs
Several tools have emerged that reverse engineer high-level Computer Aided Design (CAD) models from polygon meshes and voxels (Nandi et al., 2018; Du et al., 2018; Tian et al., 2019; Sharma et al., 2017; Ellis et al., 2018). The outputs of these tools are constructive solid geometry (CSG) programs. A CSG program is composed of 3D solids like cubes, spheres, and cylinders, affine transformations like scale, translate, and rotate (which take a 3D vector and a CSG expression as arguments), and binary operators like union, intersection, and difference that combine CSG expressions. For repetitive models like a gear, CSG programs can be too long and therefore difficult to comprehend. A recent tool, Szalinski (Nandi et al., 2020), extracts the inherent structure in the CSG outputs of mesh decompilation tools by automatically inferring maps and folds (Figure 11). Szalinski accomplished this using egg’s extensible equality saturation system, allowing it to:

Discover structure using loop rerolling rules. This allows Szalinski to infer functional patterns like Fold, Map2, Repeat and Tabulate from flat CSG inputs.

Identify equivalence among CAD terms that are expressed as different expressions by mesh decompilers. Szalinski accomplishes this by using CAD identities. An example of one such CAD identity in Szalinski is $e \leftrightarrow \mathit{rotate}\ [0\ 0\ 0]\ e$. This implies that any CAD expression $e$ is equivalent to a CAD expression that applies a rotation by zero degrees about x, y, and z axes to $e$.

Use external solvers to speculatively add potentially profitable expressions to the egraph. Mesh decompilers often generate CSG expressions that order and/or group list elements in non-intuitive ways. To recover structure from such expressions, a tool like Szalinski must be able to reorder and regroup lists that expose any latent structure.
6.3.1. Implementation
Even though CAD is different from traditional languages targeted by programming language techniques, egg supports Szalinski’s CAD language in a straightforward manner. Szalinski uses purely syntactic rewrites to express CAD identities and some loop rerolling rules (like inferring a Fold from a list of CAD expressions). Critically, however, Szalinski relies on egg’s dynamic rewrites and eclass analysis to infer functions for lists. Consider the flat CSG program in (b). A structure finding rewrite first rewrites the flat list of Unions to:
Listings from the accompanying figure. Flat CSG input:
(Union (Translate (0 0 0) Cube) (Translate (2 0 0) Cube) (Translate (4 0 0) Cube) (Translate (6 0 0) Cube) (Translate (8 0 0) Cube))
Structured program with an inferred Fold and Tabulate:
(Fold Union (Tabulate (i 5) (Translate ((* 2 i) 0 0) Cube)))
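To make the relationship between the flat and structured programs concrete, the sketch below unrolls a Tabulate of Translates back into the flat Union and, in the other direction, fits a linear function to a list of coordinates. It is a toy under stated assumptions: the function names are invented, programs are plain strings, and the closed-form fit handles only integer sequences of the form a*i + b. Szalinski's actual solver-backed rewrites are more general.

```rust
/// Unrolls (Tabulate (i n) (Translate ((* step i) 0 0) Cube)) into its n elements.
fn tabulate_translate(n: u32, step: u32) -> Vec<String> {
    (0..n)
        .map(|i| format!("(Translate ({} 0 0) Cube)", step * i))
        .collect()
}

/// Folds a list of CSG expressions with Union, written as one n-ary Union.
fn fold_union(items: &[String]) -> String {
    format!("(Union {})", items.join(" "))
}

/// Tries to explain a list of coordinates as x_i = a * i + b.
/// Returns Some((a, b)) on success, None if the list is too short or not linear.
fn infer_linear(xs: &[i64]) -> Option<(i64, i64)> {
    let (&b, rest) = xs.split_first()?;
    let a = *rest.first()? - b;
    xs.iter()
        .enumerate()
        .all(|(i, &x)| x == a * i as i64 + b)
        .then_some((a, b))
}
```

For the x-coordinates 0, 2, 4, 6, 8 of the flat program, the fit yields a = 2 and b = 0, which corresponds to the `(* 2 i)` index expression in the structured program.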
6.3.2. Results
Szalinski’s initial prototype used a custom egraph written in OCaml. Anecdotally, switching to egg removed most of the code, eliminated bugs, facilitated the key contributions of solver-backed rewrites and inverse transformations, and made the tool about faster. egg’s performance allowed a shift from running on small, hand-picked examples to a comprehensive evaluation on over 2000 real-world models from a 3D model sharing forum (Nandi et al., 2020).
7. Related Work
Term Rewriting
Term rewriting (Dershowitz and Jouannaud, 1990) has been used widely to facilitate equational reasoning for program optimizations (Boyle et al., 1996; van den Brand et al., 2002; Visser et al., 1998). A term rewriting system applies a database of semantics-preserving rewrites or axioms to an input expression to get a new expression, which may, according to some cost function, be more profitable compared to the input. Rewrites are typically symbolic and have a left-hand side and a right-hand side. To apply a rewrite to an expression, a rewrite system implements pattern matching: if the left-hand side of a rewrite rule matches the input expression, the system computes a substitution which is then applied to the right-hand side of the rewrite rule. Upon applying a rewrite rule, a rewrite system typically replaces the old expression by the new expression. This can lead to the phase ordering problem: replacing the old expression makes it impossible to later apply a different rewrite to it, one which could have led to a better result.