Synthesis of Data Completion Scripts using Finite Tree Automata

07/05/2017
by Xinyu Wang, et al.

In application domains that store data in a tabular format, a common task is to fill in the values of some cells using values stored in other cells. Such data completion tasks arise, for instance, in missing value imputation in data science and in derived data computation in spreadsheets and relational databases. Unfortunately, end-users and data scientists typically struggle with data completion tasks that require non-trivial programming expertise. This paper presents a synthesis technique that automates data completion tasks by combining programming-by-example (PBE) with a lightweight form of sketching. Given a formula sketch (e.g., AVG(?_1, ?_2)) and a few input-output examples for each hole, our technique synthesizes a program that automates the desired data completion task. To this end, we propose a domain-specific language (DSL) that combines spatial and relational reasoning over tabular data, together with a novel synthesis algorithm that generates DSL programs consistent with the input-output examples. The key technical novelty of our approach is a new version space learning algorithm based on finite tree automata (FTA). Using FTAs in the learning algorithm yields a more compact representation that allows more sharing between programs that are consistent with the examples. We have implemented the proposed approach in a tool called DACE and evaluated it on 84 benchmarks taken from online help forums. We also illustrate the advantages of our approach by comparing it against two existing synthesizers, namely PROSE and SKETCH.
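
To make the idea of an FTA-based version space more concrete, the following is a minimal sketch and not DACE's actual algorithm or DSL. It assumes a hypothetical toy language (`x`, `1`, `+`, `*`) and assumes that each automaton state stands for the tuple of values a sub-program produces on the example inputs. The names `build_fta` and `smallest_program` are illustrative only. Because all sub-programs that compute the same values collapse into one state, many consistent programs share structure.

```python
from itertools import product

# Hypothetical toy DSL (an assumption, not DACE's DSL): e ::= x | 1 | e + e | e * e
OPS = {"+": lambda a, b: a + b, "*": lambda a, b: a * b}

def build_fta(inputs, outputs, depth):
    """Build a tree automaton whose states are value tuples, one value per example."""
    # Leaf transitions: each terminal maps to the values it yields on every example.
    leaves = {"x": tuple(inputs), "1": tuple(1 for _ in inputs)}
    states = set(leaves.values())
    transitions = {(name, (), q) for name, q in leaves.items()}
    for _ in range(depth):
        new_states = set(states)
        for (op, fn), (q1, q2) in product(OPS.items(), product(states, repeat=2)):
            # Applying op to the child states pointwise gives the target state.
            q = tuple(fn(a, b) for a, b in zip(q1, q2))
            transitions.add((op, (q1, q2), q))
            new_states.add(q)
        states = new_states
    # A state is final if it matches the expected output on every example.
    final = {tuple(outputs)} & states
    return states, transitions, final

def smallest_program(transitions, final):
    """Return a smallest accepted term as a string, or None if nothing is accepted."""
    best = {}  # state -> (size, term)
    changed = True
    while changed:  # relax until no state gets a smaller term
        changed = False
        for op, children, q in transitions:
            if any(c not in best for c in children):
                continue
            size = 1 + sum(best[c][0] for c in children)
            if q not in best or size < best[q][0]:
                if children:
                    term = f"({best[children[0]][1]} {op} {best[children[1]][1]})"
                else:
                    term = op
                best[q] = (size, term)
                changed = True
    candidates = [best[q] for q in final if q in best]
    return min(candidates)[1] if candidates else None

# Two examples for one hole: input 2 -> output 5, input 3 -> output 10.
states, transitions, final = build_fta([2, 3], [5, 10], depth=2)
print(smallest_program(transitions, final))
```

For these two examples the smallest accepted term is `x * x + 1`, up to operand order. The automaton stays small because, for example, `x * 1` and `x` map to the same state and are therefore represented once.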

research
11/15/2017

WebRelate: Integrating Web Data with Spreadsheets using Examples

Data integration between web sources and relational data is a key challe...
research
09/07/2018

Relational Program Synthesis

This paper proposes relational program synthesis, a new problem that con...
research
11/10/2017

Automated Migration of Hierarchical Data to Relational Tables using Programming-by-Example

While many applications export data in hierarchical formats like XML and...
research
03/03/2020

Data Migration using Datalog Program Synthesis

This paper presents a new technique for migrating data between different...
research
07/16/2023

Programming by Example Made Easy

Programming by example (PBE) is an emerging programming paradigm that au...
research
11/01/2019

Program Sketching with Live Bidirectional Evaluation

We present Sketch-n-Myth, a technique for completing program sketches wh...
research
03/18/2022

WebRobot: Web Robotic Process Automation using Interactive Programming-by-Demonstration

It is imperative to democratize robotic process automation (RPA), as RPA...
