Synthesizing Analytical SQL Queries from Computation Demonstration

04/14/2022
by   Xiangyu Zhou, et al.
Analytical SQL is widely used in modern database applications and data analysis. However, its partitioning and grouping operators are challenging for novice users. Unfortunately, programming by example, which has been shown to be effective on standard SQL, is less attractive here because examples for analytical queries are more laborious to work out by hand. To make demonstrations easier to create, we designed a new end-user specification, programming by computation demonstration, that allows the user to demonstrate the task using a (possibly incomplete) cell-level computation trace. This specification is exploited in a new abstraction-based synthesis algorithm to prove that a partially formed query cannot be completed to satisfy the specification, allowing us to prune the search space. We implemented our approach in a tool named Sickle and tested it on 80 real-world analytical SQL tasks. Results show that, even from small demonstrations, Sickle can solve 76 tasks in 12.8 seconds on average, whereas prior approaches can solve only 60 tasks and are on average 22.5x slower. Our user study with 13 participants reveals that our specification increases user efficiency and confidence on challenging tasks.
