babble: Learning Better Abstractions with E-Graphs and Anti-Unification

12/08/2022
by David Cao, et al.

Library learning compresses a given corpus of programs by extracting common structure from the corpus into reusable library functions. Prior work on library learning suffers from two limitations that prevent it from scaling to larger, more complex inputs. First, it explores too many candidate library functions that are not useful for compression. Second, it is not robust to syntactic variation in the input. We propose library learning modulo theory (LLMT), a new library learning algorithm that additionally takes as input an equational theory for a given problem domain. LLMT uses e-graphs and equality saturation to compactly represent the space of programs equivalent modulo the theory, and uses a novel e-graph anti-unification technique to find common patterns in the corpus more directly and efficiently. We implemented LLMT in a tool named BABBLE. Our evaluation shows that BABBLE achieves better compression orders of magnitude faster than the state of the art. We also provide a qualitative evaluation showing that BABBLE learns reusable functions on inputs previously out of reach for library learning.
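To make the idea of anti-unification concrete, here is a minimal sketch of classic syntactic anti-unification over two plain terms. It is not BABBLE's e-graph algorithm, which per the abstract operates over e-graphs and works modulo an equational theory; it only illustrates how a shared pattern with holes can be read off two similar programs. The term encoding (nested tuples), the function name `anti_unify`, and the hole names `?x0`, `?x1`, ... are assumptions made for this illustration.

```python
from typing import Dict, Tuple, Union

# Assumed encoding: a term is a leaf (str) or a tuple (operator, child, ...).
Term = Union[str, tuple]


def anti_unify(a: Term, b: Term, holes: Dict[Tuple[Term, Term], str]) -> Term:
    """Return the most specific pattern that generalizes both terms."""
    if a == b:
        # Identical subterms are kept as-is in the pattern.
        return a
    if (isinstance(a, tuple) and isinstance(b, tuple)
            and a[0] == b[0] and len(a) == len(b)):
        # Same operator and arity: keep the operator, recurse on the children.
        children = tuple(anti_unify(x, y, holes) for x, y in zip(a[1:], b[1:]))
        return (a[0],) + children
    # Mismatch: introduce a hole, reusing it if this pair was seen before.
    if (a, b) not in holes:
        holes[(a, b)] = f"?x{len(holes)}"
    return holes[(a, b)]


if __name__ == "__main__":
    p1 = ("+", ("*", "2", "a"), "1")   # 2*a + 1
    p2 = ("+", ("*", "2", "b"), "1")   # 2*b + 1
    print(anti_unify(p1, p2, {}))      # ('+', ('*', '2', '?x0'), '1')
```

The resulting pattern, with its hole turned into a parameter, is the kind of candidate a library learner could extract as a reusable function. Purely syntactic matching like this would miss the shared structure if one program were written as `a*2 + 1`, which is the kind of syntactic variation the abstract says an equational theory is meant to handle.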

