Mining Precision Interfaces From Query Logs

by   Haoci Zhang, et al.
Columbia University

Interactive tools make data analysis both more efficient and more accessible to a broad population. Simple interfaces such as Google Finance as well as complex visual exploration interfaces such as Tableau are effective because they are tailored to the desired user tasks. Yet, designing interactive interfaces requires technical expertise and domain knowledge. Experts are scarce and expensive, and therefore it is currently infeasible to provide tailored (or precise) interfaces for every user and every task. We envision a data-driven approach to generate tailored interactive interfaces. We observe that interactive interfaces are designed to express sets of programs; thus, samples of programs-increasingly collected by data systems-may help us build interactive interfaces. Based on this idea, Precision Interfaces is a language-agnostic system that examines an input query log, identifies how the queries structurally change, and generates interactive web interfaces to express these changes. The focus of this paper is on applying this idea towards logs of structured queries. Our experiments show that Precision Interfaces can support multiple query languages (SQL and SPARQL), derive Tableau's salient interaction components from OLAP queries, analyze <75k queries in <12 minutes, and generate interaction designs that improve upon existing interfaces and are comparable to human-crafted interfaces.


page 10

page 13


PI2: Generating Visual Analysis Interfaces From Queries

Interactive visual analysis interfaces are critical in nearly every data...

Monte Carlo Tree Search for Generating Interactive Data Analysis Interfaces

Interactive tools like user interfaces help democratize data access for ...

Interactive Data Analysis with Next-step Natural Language Query Recommendation

Natural language interfaces (NLIs) provide users with a convenient way t...

An Adaptive Benchmark for Modeling User Exploration of Large Datasets

Interactive analysis systems provide efficient and accessible means by w...

Querying Linked Data: An Experimental Evaluation of State-of-the-Art Interfaces

The adoption of Semantic Web technologies, and in particular the Open Da...

Towards Plug-and-Play Visual Graph Query Interfaces: Data-driven Canned Pattern Selection for Large Networks

Canned patterns (i.e. small subgraph patterns) in visual graph query int...

A Generic Framework and Library for Exploration of Small Multiples through Interactive Piling

Small multiples are miniature representations of visual information used...

1 Introduction

Data analysis is a fundamental driver of modern decision making, and interactive interfaces are a powerful way for users to express their analyses. A well-designed interface provides interaction components for the users to easily accomplish their tasks and hides the technical complexity of the underlying system. For instance, the Google Finance stock trend visualization incorporates a time-range filter so that users from a broad audience can explore prices over time. The same interface would not satisfy market analysts who want to aggregate sales information and perform roll-ups and drill-downs; for them, a rich interface such as Tableau would be preferable. In a more extreme scenario, a single text box that lets users type a full program would be effective for engineers who want—and have the technical skills—to write all possible programs. This leads to the following observation: an interactive interface describes a set of programs, and its effectiveness depends on whether this set matches the operations that the users want to express.

Designing interfaces poses two challenges traditionally addressed by experts: specifying of a universe of programs that is relevant for a given task, and developing an interactive interface that can express it. For example, consider Tableau [42], a popular visual data exploration tool. Tableau’s designers carefully identified a common, and valuable, set of analytical data operations (OLAP) and mapped those to widgets in an interactive interface. Thus, they designed shelves to select measures or dimensions, and contextual menus to pick aggregates. Users did not need to learn SQL, and could easily manipulate queries by clicking, dragging and dropping interface components. Although this process was hugely successful, the cost—years of research and development—is not available for every task.

We believe that there is opportunity to drastically reduce the costs of interface construction through data-driven approaches. Given a trace of programs—perhaps automatically logged by data processing systems—we may infer an interactive interface that can express them. Although imperfect, this process can be largely automated, enabling us to scale interface construction and serve a long tail of users for whom it may not be feasible to manually build custom interfaces. Our long term vision is to generate tailored interfaces for every user and every task based on their past analyses.

This paper is a first step towards this vision. We propose query logs as the API for interactive interface generation because, increasingly, such logs are automatically collected by data processing systems by way of provenance capture subsystems [17, 37], as part of recovery and auditing mechanisms such as DBMS query logs [28], or by user-facing applications such as Jupyter. Our discussions with several businesses have identified several compelling use cases for generating interfaces from those logs:

Tailored dashboards: An IOT startup (name anonymized) regularly performs tailored analyses for its customers. For simple cases, the engineers create custom front-ends with a dashboard builder. But the tool does not support complex statements (e.g., nested queries), and therefore the employees spend considerable time writing queries, including the CTO of the firm. For each case, they check out a text file that contains past queries, identify the statements that they need, customize them, copy-paste them and possibly update the document and check it in. A tool to build interfaces from queries would allow them to quickly set up expressive front-ends for each case and each customer.

Auditing: The employees of a large consulting firm perform financial audits by running previous programs (e.g., queries and macros) and performing what-if style analyses. In most cases, this task involves changing parameters, in order to test the robustness of financial indicators. But the auditors are not interested in programming. Precision Interfaces can be viewed as a way to summarize query logs into more accessible interactive visual interfaces.

To this end, we present Precision Interfaces, an automatic tool to generate task-specific interactive interfaces from query logs. Precision Interfaces focuses on supporting different query languages111Our methods are based on abstract syntax trees, which we expect will generalize to different query languages. However, our evaluation focuses on SQL and SPARQL queries. We use “program” and “query” interchangeably in the text. and different types of interface components (e.g., dropdowns, selections, panning, etc). It takes a log as input (say, collected by IT) and generates a set of interactive web applications to express the queries in this log. To do so, it parses the queries into canonicalized parse trees, compares pairs of trees to identify structural changes, and maps common types of changes to interface components. Our focus is on identifying the salient interactions from query logs, and not necessarily the interface design per se. Further, we are currently database agnostic and do not leverage information such as the schema, query plans, and the data.

Building Precision Interfaces requires solving several key challenges. First, we need to develop a unified mathematical model for queries and interfaces, which should be rich enough to express a wide range of real-life scenarios and incorporate user preferences but simple enough to remain tractable. Second, not all structural changes are meaningful and mapping all possible changes to interactions would lead to unusable interfaces—we need to devise a method to filter and identify the most important ones. Third, the space of all the interfaces that we can generate for a given set of query transformations grows exponentially with the size of this set. We need to define constraints and heuristics to quickly find good solutions and scale to logs that contain tens, or hundreds, of thousands of query. To tackle those challenges, we make the following contributions:

  • [leftmargin=*, topsep=0mm, itemsep=0mm]

  • We formalize the problem of mining structural changes in query logs and mapping them to interactive interfaces. The problem definition is general to any language with a well-defined grammar, and a wide range of interaction components (including non-mouse interactions).

  • We introduce a DSL called PILang to specify interesting structural changes. This helps generate an interaction graph where each node is a query and each labeled edge is a structural change identified by a PILang statement.

  • We map the interface construction problem to edge cover over the interaction graph and use a contraction-based heuristic to quickly solve the problem.

  • We evaluate Precision Interfaces on four query logs that span two query languages (SQL and SPARQL) and contains both synthetic and real-world statements. Our optimizations are 2OOM faster than the baseline, and lets us generate interfaces for logs of queries within 12 minutes. The interfaces improve upon an existing interface that generated the queries, and are comparable with human crafted interfaces.

Figure 1: Precision Interfaces parses queries into canonicalized parse trees, performs tree alignment to generate an interaction graph that is filtered using the PILang domain specific language and whose edges are mapped to interface widgets.

2 Overview

We give an overview of the precision interfaces setup and solution pipeline, as well as the technical challenges that we address in the subsequent sections.

Query Logs as API Interfaces are traditionally created by programmers or through a WYSIWYG application, so why mine interfaces from query logs? The primary reason is that query logs encode the analyses that analysts actually perform, and therefore can be used to suggest candidate interfaces. As an API, logs are a flexible abstracton that can be generated from a variety of sources. Modern program execution engines (e.g., DBMSs, Spark, Jupyter, RStudio) already track program logs for recovery and debugging purposes, while explicit provenance metadata systems are increasingly ubiquitous in industry [17, 29] and research [20, 30, 7]. Any analysis that directly uses these systems, or uses them as a backend (e.g., Tableau) will naturally collect query logs. Our hope is that if Precision Interfaces is successful, then cleaning and finding query logs will be an interesting problem in its own right.

2.1 Pipeline Overview

We decompose the general problem of generating interfaces based on query logs in two sub-tasks: finding structural changes between queries and mapping those changes to interactions. The complexity and precision of the resulting interfaces depends on the types of structural changes that we can identify and the quality of the user interactions that we can map to arbitrary changes.

The problem is difficult because the scope of what a user interaction may express is theoretically unlimited—a button press could replace the current query with an arbitrary query string—and can easily lead to unusable interfaces. We must bound the complexity of structural changes, and provide simple mechanisms to specify the types of changes that are meaningful. Also, the system should easily adapt to new programming languages, as well as new types of interaction widgets (e.g., new modalities such as touch).

Based on these observations, we decompose the Precision Interfaces process into three logical steps (Figure 1). The Representation Canonicalizer transforms the input sequence of query strings into a canonicalized parse tree structure (an AST) that makes query comparisons easier.

The Interaction Miner and Distiller logically identifies structural changes between the ASTs based on an ordered tree matching algorithm. These changes form an interaction graph where each node represents a query/AST and each directed edge is labeled by a corresponding structural transformation; it is a multi-graph because there may be multiple labeled edges between any two nodes. To reduce the set of erroneous changes and preserve interesting changes, the developer can distill the graph by using a simple change specification language called PILang for specifying interesting structural changes. We provide an interactive log analysis tool for suggesting potentially useful PILang statements, and use this tool in our experiment setup.

The Interaction Mapper maps sets of edges in the interaction graph to interaction components in interfaces. Because this problem is NP-hard, we use a graph contraction heuristic to compute a best effort solution. We then compile the resulting interfaces into an interactive web application.

Figure 2: An example interface.

2.2 Challenges and Assumptions

Real-life query logs may contain much variability, and it is not obvious how to map arbitrary AST tree differences to widgets automatically. This leads to three major technical challenges. The first is to develop a unified model of queries, interactions, interfaces and interface components (widgets) that is restricted enough for analysis (Section 3). But even then, the number of structural changes in a query log is quadratic in the log size and the majority of those changes are irrelevant. The second challenge is to develop mechanisms to identify the subset of changes that are meaningful to translate to interfaces (Section 4). Finally, the third challenge is to map these changes to components in one or more interfaces. We show that this problem is NP-hard and present an efficient heuristics to generate the interfaces (Section 5).

Inputs and Assumptions: We assume that we have access to the grammar for each language, a parser to map program source code to a parse tree, an unparser to translate parser trees into source code [2], and annotations of AST node types to understand which nodes are literals, or collections.

A core assumption is that most syntactic changes in the query log are incremental, such that the changes can be mapped to interactive interface components. To test this property, we evaluated Precision Interfaces on query logs generated by students analyzing a dataset using SQL. We found that a majority of the queries changes conform the the assumptions, as we will show in Section 7.1. In addition, we do not assume deep semantic understanding about the queries beyond near-universal features such as primitive data types—the whole analysis is performed syntactically.

Another assumption on which Precision Interfaces is built is that there exists no logical dependency between the entries in the log – for instance, no query uses a view or a temporary table defined in another statement. One way to remove this limitation would be to detect clusters of queries using, e.g., pattern mining or source code analysis. We leave this line of study for future work.

Finally, we assume two functions and for a given language that executes a query AST and renders the output, respectively. is called on an interface’s current query state and for SQL query logs, either generates a simple visualization [27, 26] or renders a table.

3 Modeling Interactions

Symbol Description
, Interface, its closure (expressible queries)
Query in query log
, Path in an AST, a subtree
Interaction that replaces subtree rooted at
, Widget type and widget
Cost functions for widget type and widget
, Domain for widget type and widget
Template function for widget
Cost function weight
PILang statement
Table 1: Summary of notations.

This section describes the formal model of query logs and interactions on which we built Precision Interfaces. The key idea is to model interactions as tree transformations, which serve to bridge the structural differences that we mine from the query logs with the interface components that users directly manipulate.

3.1 Modeling Queries

We assume that the input query log can be modeled as a table queries that contains the query id , parsed query , along with any relevant metadata about the queries (e.g., the user that executed the program, the timestamp, the analysis session, etc):

queries(pid, p, tstamp, user, ...);

In order to support a variety of programming languages, we rely on the language’s grammar and parse structure rather than query semantics.

Figure 3: Example ASTs for two SQL queries that differ in the second project clause (blue) and the constant in the equality predicate (red).

We model a query as a canonicalized abstract syntax tree (AST) generated by a parser. Figure 3 shows two (simplified) examples of ASTs for two SQL queries. We assume that each AST node consists of its node type, a set of attributes, and an ordered list of child nodes. For instance the binary expression cty = USA is represented by the BiExpr node type, its attribute op:=, and two children for the left and right sub-expressions. Its second child is a string literal StrExpr with value USA.

In addition, we assume the existence of a table that indicate how terminal nodes map to primitive data types (e.g., string literals map to StrExpr, integers map to IntExpr), as well as the node types that represent lists of sub-expressions (e.g., Project consists of a list of ProjectClause nodes). Primitive data types are straightforward to identify as literal expressions in the language grammar. Similarly, list nodes can either be identified by a language expert, or determined automatically by searching the language’s grammar file for common idioms. For instance, the SQLite grammar defines the list of output expressions sel_core as a project clause (represented by the sel_result non-terminal) followed by zero or more additional project clauses:

  sel_core = (sel_result (whitespace comma sel_result)*)

Explicitly modeling list nodes lets Precision Interfaces map multi-selection widgets, which can specify sets of values, to collection-based tree modifications that insert, reorder, or delete multiple AST subtrees. For instance, a checkbox list of table attributes could be used to specify the list of attributes to return in the project clause.

Finally, we assume that logical expressions are canonicalized into conjunctive normal form. This allows us to model logical expressions as a list of lists (e.g., ANDs of ORs) rather than a complex binary tree structure of AND and OR operators. This reduces tree mis-alignment issues that can be caused when adding a logical expression to e.g., the WHERE clause of a query restructures the expression subtree (Figure 4).

Figure 4: Examples of non/canonicalized versions of ASTs for a SQL WHERE clause with two and three predicates. Canonicalization flattens the tree.

Although this process must be done for each language (and dialect), it only needs to be setup once (e.g., by an admin). In this paper, the authors have provided drivers to model SQL and SPARQL queries.

3.2 Interactions as Query Differences

Given a list of ASTs, we would like to identify all structural differences between pairs of trees. To do so, we use a fast ordered tree matching algorithm [5, 13] that preserves ancestor and left-to-right sibling relationships when matching nodes between the two trees. The algorithm first computes the preorder traversal of both trees. It goes on to the next node if the current pair of nodes matches. When the algorithm finds a pair of nodes that cannot be mapped, it uses backtracking to return to the last pair of nodes that has already been mapped and tries to map them to some other candidate. The algorithm has complexity where are respectively the size, number of leaves, and the tree depth of the tree.

We model all pairwise AST differences in a set of queries as a logical table of differences diffs that contains the sub-tree differences as output by a tree-alignment algorithm. Specifically, diffs contains foreign key references , to the queries and , the unique path to the sub-tree differences, and the sub-trees and that differ.

diffs(, , , , , )

Additions and deletions in the ASTs can be represented by setting or to null, respectively. Let diffs be shorthand for diffs.

Example 1

The pair of ASTs in Figure 3 differ in the project clause and the equality predicate. These differences would be modeled as two records in diffs:

0 0 1 0/1/0 ColExpr(sales) ColExpr(costs)
1 0 1 2/0/0/1 StrExpr(USA) StrExpr(EUR)

The paths specify the index of each child along the path. For instance, the first row’s path follows the root node to PROJECT (0/), to the second ProjClause (0/1/), to its only child (0/1/0). The transformation is to replace the sales column expression with costs. The second row replaces the string expression USA with the string EUR.

Note that diffs is a logical representation used to mine for interactions. Our experiments show it is too costly to fully materialize diffs for all but the smallest query logs. In addition, not all structural differences are meaningful for generating interfaces. Section 4 introduces a simple filtering language to identify meaningful subsets of diffs.

Interactions: Interactions are the abstraction that connects records in the diffs, which represent the locations and examples of structural changes in queries, with interface components, which translate user interactions into query transformations. To this end, we model an interaction as a tree transformation function:

Definition 1

An interaction maps an AST into another AST by substituting the subtree rooted at by a new subtree :

Example 2

Continuing the example in Figure 3, the following interactions changes ’s project clause and then the equality predicate and outputs :

Interaction Graph: The table diffs can be modeled as an interaction graph, where each query is a node, and a directed edge is labeled with an interaction such that . There can be multiple labeled edges between two nodes. These two representations are interchangable, however the graph formulation is useful for generating interfaces in Section 5.

3.3 Interfaces

A given interactive interface represents the AST of an initial query along with a set of interaction components such as buttons, sliders, selection, dragging, panning, and other manipulations that can be interactively expressed. In a slight abuse of terminology, we term these components widgets222Although we use the term widgets for simplicity, it also represents user manipulations such as panning that do not have a visual representation.. Each widget incrementally transforms the current query into the next query , whose output is rendered in the interface. In effect, represents the set of queries expressible by applying all possible sequences of its widgets to its initial query , which we term the closure of the interface. and runs and renders the query AST.

We model a widget as an instance of a widget type . Consider the dropdowns in Figure 2. A dropdown is a type of widget that renders a list of possible options that the user can select from, and its state stores the currently selected option; it is suitable for choosing from a small set of string options, and is more challenging to use when there are more than a dozen options.

More generally, a widget type consists of a domain that restricts the allowable values , along with a generic cost function that quantifies how “good” the widget type is for a given domain . For instance, the general domain of a dropdown is the set of all possible strings, while for a range slider it is a pair of numbers . Although these general domains are broadly defined, they are important for identifying candidate widgets that can express a given structural change. We describe the cost function in the next subsection.

A widget is an instance of a widget type that is instantiated with a specific domain , as well as specifications of how the state of the widget should be used to modify a program. The latter is specified by an interaction along with a template function that maps an element to a subtree that can be passed as an argument to the interaction. Let be the current state of the widget; then applying the widget to the current query is equivalent to:

Example 3

Consider the interface in Figure 2: it contains three dropdown widgets and its current query is the following SQL query whose output is rendered as a line chart:

SELECT date as x, sales as y FROM sales WHERE cty = ’US’

The top Column X widget uses the selected value to modify the column expression date in the first projection clause; its domain is the set of attribute names in the table , its function returns a column expression populated with the specified attribute name, and its interaction replaces the subtree rooted at the first project clause with the output of . Similarly, the middle widget sets the column expression of the second project clause, while the bottom widget modifies the string literal in the equality expression of the WHERE clause.

Our definition of widgets simply specifies a domain and widget state, and is not bound any specific form element. This allows Precision Interfaces to be easily extended to new interaction components or even different modalities such as voice or touch gestures.

Invalid Queries: Since Precision Interfaces operates at the syntactic level, certain combinations of AST transformations might lead to non-executable queries. Although this is unlikely for common transformations such as adding expression clauses or tuning parameters, it is still possible. One solution is to speculatively parse and execute queries in the interface’s closure, and visually disallow interactions that lead to these ASTs. If the space of queries is small, this can be a way to both verify and pre-compute results for performance purposes.

Ranking Interfaces: It is clear that there are many possible interfaces that could be used to express the same query log. For instance, given a query log , an interface may generate buttons, where each button widget represents query and shows its result when pressed. However, if all of the queries were identical except for a numerical constant that represents a threshold, then a single numerical slider would succinctly express the same set of queries. Thus it is desirable to define a scoring function in order to rank and select the “best” interfaces.

The literature on assessing interactive interfaces is continuously evolving and has found a variety of characteristics that affect interface usability. The GOMS family of interface analysis techniques assign each user operation a cost and measure interface efficiency based on the cost to complete higher level goals [23, 8]. Similarly, the amount of visual clutter [36] or even number of pixels needed to render the interface [33, 24] can affect readability. To flexibly support this range of interface measurements, we allow developers to specify multiple cost functions for the widget types. As introduced in the previous subsection, the cost function for a widget type is defined as:

where is the cost function defined for the widget type by the developer. We assume that each widget type has cost functions, whose outputs are between —a button may return if its domain contains more than one element, and otherwise: . Similarly, a checkbox list may increase linearly with the size of the domain: . The terms are user-controllable knobs to specify which cost functions matter more to the user. For instance, in a setting with small screen resolution, the user may prioritize simpler interfaces that are easier to use than more complex and efficient interfaces that would necessitate scrolling or repeatedly zooming in [33].

For an interface

, we estimate the interface complexity as weighed sum of its widgets:

Multiple Interfaces: In many cases, having one interface that expresses all the queries in the log is not the optimal solution. Suppose for instance that our log contains only a pair of queries and that those queries are very different from each other. One approach is to create an interface where is a widget that expresses the complex transformation between and . Another approach is to create two interfaces and , such that each interface expresses exactly one program. To model this flexibility, we support sets of interfaces . We estimate that the complexity as the sum of its interfaces:, where is a constant cost for each new interface. Similarly, we define the closure as the union of its interface’s closures: .

3.4 Interface Generation Problem

We can now define the main problem statement:

Problem 1 (Interface Generation)

Given a query log , a threshold for the percentage of the query log to cover, and the weights for the cost functions, generate the optimal set of interfaces such that:

  • [topsep=-1mm, itemsep=0mm]

  • is minimal

Our aim is to find a set the minimal interface which closure includes a given proportion of the queries in the log.

Solution Overview: Our solution decomposes this problem in two steps. The first is to efficiently mine the query log to identify meaningful structural changes that can be mapped to interactions; directly using the table diffs can lead to overly complex and incoherent interfaces because it can contain both irrelevant differences as well as differences that syntactically appear similar but are semantically different. To address this issue, Section 4 introduces a domain specific language called PILang to filter diffs. An added benefit is that the filtering operations defined by PILang statements can be pushed into the query parsing and tree alignment steps of the system to improve the end-to-end runtime.

The second step is to map these changes to the appropriate widget types, and instantiate the widgets by generating each widget ’s domain , interaction and template function . To do so, Section 5 describes how interface generation is modeled as a subset cover problem, and how to instantiate each widget from the changes output from the PILang statements.

4 Interaction Mining

The Interface Generation problem states that the output interface should be capable of expressing the queries in the log. However naively mining the log for all query differences leads to irrelevant differences (say, between two unrelated queries), as well as semantically similar but syntactically different changes (e.g., table aliases), that result in overly complex or semantically meaningless interactive interfaces. In other words, not all possible interactions that can be mined from the query log are meaningful. In this section, we present PILang, a domain specific language for specifying types of meaningful changes, as well as a tool to help developers write PILang statements.

4.1 Why PILang?

Why not use diffs to analyze all differences in the log? There are three types of issues that arise:

Irrelevant Changes: Changes to queries such as function renaming, changing the alias of a project clause, reordering tables in the FROM clause do not have any impact on the semantics of the query.

Misleading Differences: Consider the following two example queries:

      SELECT a FROM T WHERE 1 = 1

the tree alignment algorithm would identify that replacing the first query’s WHERE clause with the GROUPBY clause produces the second query. Even if this pair of queries are found in the query log, it is unlikely to be a meaningful interaction to map to a widget. For instance, it is possible that they were created by adding the WHERE and GROUPBY clauses to a base query SELECT a FROM T.

Special Cases: Consider changes affecting the constants and in the following two queries:

      SELECT b FROM T WHERE a > 5 AND a < 10
      SELECT b FROM T WHERE c > 5 AND a=10 

The first query involves filtering the values based on an interval, a natural fit for a range slider widget. The same widget would not apply for the second query. In order to distinguish between these two cases, Precision Interfaces needs additional semantics to specify that the constants that change should be part of and inequality expressions that share the same attribute.

For these reasons, it is desirable to filter diffs to a subset of query changes that are meaningful, as defined by the application and developer needs. Note that Precision Interfaces can be bootstrapped with a set of PILang statements for a given language so that, by default, it generates reasonable interfaces without any user intervention, and additional PILang statements can be added if desired. It is an open area of investigation whether learning-based approaches can replace the need for manual PILang statements.

4.2 PILang

PILang is a domain specific language for users to easily specify where and how queries change. A PILang statement is evaluated over a pair of queries and returns an output table, or if it does not match. It is equivalent to filtering diffs by its attribute and transforming its subtrees , . Although we will introduce simple tree traversal syntax that serves our purposes, more powerful nested SQL syntax such as SQL++ [32] could be adopted in the future.

PILang statement comprises a From clause, a Where clause and a Match clause, organized as follows:

    Ψ  FROM <path expression> AS <table name>, ...
    Ψ[WHERE <boolean expression>]
    Ψ MATCH <stmt name>[(<table name>)]

From Clause: The FROM clause is used to both define the query scope within which Precision Interfaces searches for structural differences, and to transform the subtrees and in diffs. The path expression is composed of operators to specify ancestor // and child / relationships between node types; * denotes a wildcard node.

For instance *//* matches all possible paths, a//* matches any path containing node type a, /a//* matches paths whose root node is a, while a/b matches paths that contain a with direct child b that is also a leaf node. If there are multiple matching nodes, [i] can be used to specify a specific child: a/*[1] selects the second child of a, while a/b[1] selects the second b child of a.

Since is always matched against , it will always match an ancestor of the subtrees and ; it is also used to specify the ancestor subtree to return. The FROM clause returns the deepest subtree that matches and contains the subtree; the trailing //* specifies that the subtrees in diffs should not be transformed. In short, the FROM clause is equivalent to binding a range variable to the following SQL statement:

  SELECT id, pid1, pid2, extract(, path),
          ancestor(, path), ancestor(, path)
    FROM diffs
   WHERE matches(, path)

Where Clause: The WHERE clause is a boolean expression over the variables defined in the FROM clause. In addition to classic SQL expressions, path operators can be used to manipulate the subtrees . Ellipsis notation (..) denotes the parent node, and the single dot (.) is used to access node attributes. For instance, the following identifies tree differences within equality predicates:

    FROM Where/BiExpr AS T
   WHERE T..op = ’=’ AND T..op = ’=’ AND
          T./*[0].name = cty AND T./*[0].name = cty

The attribute can be used as shorthand for expressions over both subtrees. Thus, the following is equivalent to the above example:

    FROM Where/BiExpr AS T
   WHERE T..op = ’=’ AND T./*[0].name = cty

The following example checks for insertions in the query’s project clause:

    FROM Project/ProjClause AS T
   WHERE T. is null AND T. is not null

Match Clause: This clause is used to name the PILang statement so that successful matches can be used as labeled edges in the interaction graph. In addition, the statement returns one of the range variables so that it is accessible in for the interface generation step. The returned range variable is augmented with a name attribute containing the statement’s name.

4.3 Executing and Writing PILang

PILang statements are translated into SQL queries and executed over partitions of diffs defined by a pair . The results over the partitions are unioned into a single table that we call diffs_pil. Section 6 describes techniques that use PILang to reduce the tree matching and interface generation costs.

PILang is intended for developers, and crafting PILang statements manually may be difficult for users with no experience with abstract syntax trees. To facilitate this process, we created a tool to detect the most common differences between pairs of trees and let the users chose the ones that interest them. The tool operates as follows. The users specify a range of possible transformations, by setting a number of allowable pairwise differences and a number of nodes that may differ. The tool takes a sample from the log, compares all pairs of trees in that sample and reports those that match the specified criteria. To report the differences, it unparses the trees and highlights the substrings that vary. For any transformation, the tool can create a PILang statement that checks for differences in the subtrees where it detected variations. To help users refine their search, the tool excludes all the changes that match an existing PILang statement from subsequent searches.

5 Interface Mapper

The output of the interaction mining step is a table diffs_pil representing subtree differences between pairs of queries in the query log. We map this table into an interaction graph where each query is a vertex and each record is a directed edge that is labeled with a description of the interaction. The graph is a multi-graph because each pair of queries can be connected by multiple transformations. Our goal is to generate a set of interfaces that can express the queries in this graph. Doing so involves three challenges: 1) identifying candidate widgets for each edge in the graph, 2) extracting domain and template functions to instantiate those widgets, and 3) mapping subsets of the interaction graph to widgets in an interface. The remainder of this section describes how we tackle those problems.

5.1 Preprocessing the Interaction Graph

Recall that a widget is instantiated with an interaction , a domain , and a template function that maps elements in the domain to a subtree. We now describe how to extract template functions and domains from diffs_pil. At a high level, we extract parameterized tree templates from the subtrees in the diffs_pil table, and use the parameter values to construct the domains. We will also use these reults to label the edges in the interaction graph.

Template Functions: We use the subtrees in diffs_pil to extract template functions. To do so, we replace the primitive values in a subtree with parameters to create a parameterized subtree template ; the table is modeled as its template along with its parameter values . We then group the trees by their templates, and for each group , collect the parameter values for all trees in the group. We then keep the indices of the parameters that have at least one change, and result in a per-group table . This table represents the parameter values that vary across the same parameterized subtrees.

Each widget has a -dimensional domain. For instance a range slider has domain . A parameterized subtree can be mapped to a widget if there exists a bijection between attributes in its values in table and dimensions in ’s domain, such that each binding is within . If so, the template function is simply the inverse bijection from the widget’s tuple state to the attributes in the values table that are then bound to the parameters in the template subtree.

Labeled Edges: Each edge in the interaction graph represents a record in diffs_pil. We label the edge with its path and subtree template . We also annotate the edge with its parameter value . Finally, an edge could be mapped to multiple widgets—for instance, selecting from a set of options can be expressed by a dropdown and textbox—and we call these the edge’s candidate widgets .

5.2 Widget Mapping

The next step is to build a set of interfaces that can express all the queries in the log. To do so, we will first define the closure of an interface , and its cost in terms of the interaction graph, and show that the interface generation problem is NP-hard by a reduction from the set cover problem. We then describe a graph contraction heuristic along with optimizations to speed up the process. To simplify the description, we will first assume that each edge has a single candidate widget, each transformation is a scalar, and each PILang statement generates at most one edge between any pair of queries333A PILang statement can generate an output table with multiple records, say due to multiple numbers changing, and each record corresponds to a transformation.. We will then relax these restrictions.

An interface consists of an initial query and a set of widgets. We define its closure as the set of reachable queries; a query is reachable by if there exists some path from to that consists of edges expressible by some widget in . An edge is expressible by a widget if its path and templated subtree are the same as the widget’s.

The domain of widget is defined as the union of the annotated values of the edges that it expresses in . This domain is used to compute its cost , and subsequently the cost of the set of interface .

NP-Hardness: We now sketch the reduction from set cover to the interface generation problem. Given a universe of items and a set of subsets that covers , , set cover identifies the minimal set of subsets such that .

We can construct an interaction graph where each subset forms a clique , whose edges are labeled with the subset’s id . The edges in ’s clique are expressible by a unique candidate widget , and the cost of a widget is if its domain is empty, and otherwise. Adding a widget to an interface adds all elements in into the closure, and the set of widgets in the resulting set of interfaces forms the subset sum solution.

Simple Heuristic Solution: We present a greedy heuristic to solve the interface generation problem. We initialize the solution by assigning an interface for each query . We then greedily merge pairs of interfaces until the total cost of the interfaces does not further decrease.

A pair of interfaces are merge candidates if there exists zero or more edges that connect a query to a query . Let edge be used to merge the interfaces. The resulting merged interface uses the initial query from , and combines the widgets from both interfaces, along with the (single) candidate widget for the edge : . can then be reduced by merging widgets that represent the same transformations: two widgets with the same paths and feature functions can be merged into a single widget with domain .

For each iteration , we identify the pair of interfaces that, if merged, will most reduce the total cost:

Example 4

Figure 5 illustrates an example merge. The top two interfaces are initialized with their respective queries. The queries differ in the constant in the equality predicate and there is a corresponding edge between the two queries. The interfaces are merged by mapping the edge to a toggle widget that picks between and . Note that the cost function for the toggle widget will be high if the domain does not have exactly two values; if there are more queries with different values in the predicate, then other widgets such as a dropdown will have a lower cost and be chosen.

Figure 5: Merging two interfaces.

Multiple Candidate Widgets: In practice, a given edge can have multiple candidate widgets . Rather than binding an edge to a specific widget in the initial interface set, we propagate all candidates throughout the heuristic solution. To do so, we need to define how two candidate widgets are merged, how their costs are estimated. Once we have finished the interface merging process, we then select the lowest cost interfaces from each candidate set.

Merging candidate sets and means computing the union of the two sets and performing the domain merging procedure described above. The cost of a candidate set is defined by the minimum cost widget in the set.

Multiple Edges: A given diffs_pil table for queries can contain records, each representing an edge with a different path between the pair of queries in the graph. However, in order to fully transform into , we must apply the transformations for all the edges. To account for this, we model these edges as a single “super-edge” whose candidate widget set is the cross product of each edge’s candidate set .

Collection-based Changes Most languages support expressions that represent collections; for instance the SQL FROM clause is a set of range variable definitions, and the GROUPBY clause is a list of grouping expressions. Yet, the model described in Section 3.2 does not account from interactions that manipulate sets. For instance, the output of a PILang statement that looks for numerical differences over the program fragments [1,2,3] and [1,4] would output a table with the following pairs: . Naively, each pair would be modeled as a separate widget. In many cases, it would be preferable to map the pairs to a multi-selection widget such as a checkbox list that can express collections.

To automatically translate such diffs_pil tables into a single collection-based interaction, we use a procedure similar to extracting template functions. We first collect the set of all subtrees in diffs_pil: . For each , we look for the subtree rooted at its closest strict ancestor node that is annotated as a list node type. If no such subtree exists one or more subtrees, we stop. Otherwise, we then replace in with a parameter variable to create a templated ancestor subtree . If all templated ancestors are identical, then that suggests that the subtrees are elements of a collection, and we map the entire diffs_pil table to a single collection-interaction edge whose candidate widgets are collection-based widgets such as checkbox or multi-select.

5.3 Generating Interfaces

Once we have identified the optimal set of interfaces , we select the lowest cost widget from each of the candidate sets based on the final widget domains. At this point it is possible to run a standard interface layout algorithm [39], and then render each interface as a tab in a web application. We render the query output using the developer provided function. In our implementation for SQL query logs, we position the widgets manually and we use a simple visualization generator similar to ShowMe [27] or APT [26] if the number of attributes in the query output is small, and otherwise render a table.

6 Optimization

A naive implementation of Precision Interfaces first materializes diffs by computing tree alignments between all pairs of ASTs in the query log, filtering diffs using the PILang statements, transforming the results into an interaction graph, and performing the graph contraction procedure to derive the interfaces. However, the cost of these steps, in particular tree alignment and graph contraction, can be considerable for even thousands of queries in the log. Fortunately, we may exploit three properties of the problem to reduce the number of pairwise comparisons: the transitivity of transformations,the existence of templates and the fact that all transformations may not be relevant.

6.1 Transitive PILang Cliques

A PILang statement is transitive if matches between and implies a match between :

If is transitive, then the set of programs that it matches forms a clique, and a new program need only compare with an arbitrary program to check if matches with all members of the set. Algorithm 1 uses this observation to efficiently evaluate transitive PILang statements directly on the program log.

Data: PILang statement: s, Programs: P
Result: C
initialize C = ;
while  do
        matched = false;
        while  do
               if  then
                      matched = true;
               end if
        end while
       if matched then
        end if
end while
Algorithm 1 Clique detection for transitive PILang statements.

We use a simple heuristic that identifies transitive statements in the PILang statements we have developed. We check that the WHERE clause only contains transitive logical expressions (e.g., ). We leave richer analysis techniques as future work.

A welcome side effect of this method is that it allows us to compress the interaction graph. Indeed, this graph is often unpractical because it is very dense. Typically, it contains edges for queries, a consequence of the PILang’s statements transitivity. Thanks to Algorithm 1, we can store it as a set of cliques rather than a set of edges, and lower the storage cost by an order of magnitude. We will show in Section 7.2 that this strategy let us process large query logs in main memory.

6.2 Program Templates

We observe that query logs often contain cliques due to queries that have identical parse structures (i.e., templates) but different values in the literals. For instance, queries emitted by varying a distance threshold or function parameter are identical everywhere except for a single value. We term these cliques templated cliques. To detect such cliques, our procedure is similar to finding template functions in Section 5: we replace the literals in the query ASTs (e.g., all node attributes are primitive values) with unnamed variables, hash the resulting templates and group by the hash values. Thus, we represent a group of similar ASTs by a template followed by a list of literals. The rest of the system performs tree alignment and PILang evaluation over templated cliques rather than individual queries. For each templated clique, we index the paths to each variable and probe each clique with the path expressions in each PILang statement. A matching statement can use the index to quickly identify the ASTs that change and evaluate those; a statement that does not match can skip the clique altogether.

Since these operations do not rely on user inputs (e.g., PILang statements), we can perform them offline, during a preprocessing step. Furthermore, the output can be reused as the developer adds new PILang statements and tunes the interface generation parameters.

6.3 Restricting Program Pairs

Since the interfaces generated by Precision Interfaces are derived from differences between pairs of programs, we can restrict the pairs to compare for performance and personalization purposes. For instance, we might only compare program pairs in sequence (e.g., with ), or filter the program log table progs (Section 3.1) by the user, timestamp or other metadata. Modeling the program log as a relation provides the system with considerable flexibility in choosing the program pairs.

7 Experiments

Figure 6: Interface generated from a log of random OLAP queries over the On-Time database. The widgets were created by Precision Interfaces, we edited the layout and titles manually. Users choose dimensions and measures by dragging and dropping in the leftmost boxes. They create filters with the dropdown lists and text boxes at the bottom of the screen.
Figure 7: Interfaces generated for the random OLAP queries. Values simplicity.
Figure 8: Interfaces generated for the random OLAP queries. Values directness.
(a) First interface.
(b) Second interface.
(c) Third interface.
Figure 9: Top 3 interfaces generated by Precision Interfaces from a set of queries over the On-Time database written by students. The first window lets users compose simple group by queries, by selecting lines in the list boxes. The second one presents aggregates for each combination of origin and desintation airport. The last interface shows statistics for each carrier. Collectively, those interfaces cover 59% of all the queries that we collected.

We evaluate Precision Interfaces using 5 query logs—4 SQL and 1 SPARQL log. We used simulated query logs, logs from existing data systems, and logs generated through manual visual exploration, in order to study the system on clean, real-world, and ad-hoc types of logs types, respectively. We seek to answer four questions: 1) can Precision Interfaces’s interfaces express query logs? 2) is the runtime acceptable? 3) can the system support multiple languages? and 4) how do users prefer the generated interfaces as compared with existing and user-designed interfaces?

Parsing and interaction mining are implemented in Java, and the widget mapping and rendering are implemented in Python, which generates HTML+Javascript interfaces. We defined 12 widget types (e.g., dropdown, checkbox list, slider, range slider, textbox, multi-select) and manually created and functions for SQL and SPARQL. After generating the interfaces, we named and positioned the widgets for presentation purposes. We used a MacBook Pro with Intel Core i7 2.5 GHz CPU and 8GB RAM.

7.1 Expressing Query Logs

We first showcase the generated interface designs and their ability to express queries from two logs—synthetically generated OLAP queries, and those generated through manual exploration—over the On-Time Database444521,000 rows, 91 cols.

Synthetic OLAP queries: The aim of this experiment is to show that Precision Interfaces can generate a simple Tableau-like interface from a standard OLAP query workload. To simulate the exploration process, our generator explores the OLAP space by starting with a group by-aggregate query, and iteratively modifying a random clause (GROUPBY, WHERE, SELECT) at each step. To seed the process, we generate a random group by-aggregate query that follows the following format:

SELECT dim1, …, dimM,
       agg1(meas1), …, aggN(measN)
FROM Ontime
WHERE var1=val1 AND …AND varP=valP
GROUP BY dim1, …, dimM

The number of dimension, filters and measures is a sampled from a uniform distribution. We then perform random edits, one for each step. We present the possible edits in Table 

2. We wrote 7 PILang statements that correspond to structural and value changes that our query generator expressed.

Type Actions
Dimensions Add, Remove, Change
Measures Add, Remove, Change col., Change agg.
Filters Add, Remove, Change col., Change val.
Table 2: Possible modifications from the query generator.

Figure 6 presents the generated interface. The two drag-n-drop boxes on the left let users choose measures and dimensions to visualize. The bottom section of the interface provides three filters, each consisting of a drop-down list to select a column and a text field to specify a value. This interface can express 100% of the queries in the log (i.e., its closure contains all the queries in the interaction graph). As Tableau, it lets users produce OLAP queries by dragging columns onto “shelves”, however further work is needed to generate complex logic such as small multiples.

Figure 7 presents an alternative UI, obtained from the same set of queries but with different parameters. In this case, we tuned the weight associated to the cost functions to obtain the most simple interface possible. We assigned a high cost to visual complexity and a low cost to user effort. As a result, the UI contains only three text boxes—one for the dimensions, one for the measures and one for the filters. The user must type the queries manually.

To generate Figure 8, we reversed the weighing scheme. We assigned a high cost to user efforts (e.g., number of keyboard interactions and clicks) and ignored visual complexity. In the resulting window, all the options are explicit: there is one tick box for each possible dimension, measure of filter column, and a dropdown lists for the filter values.

Manual log: We created an ad-hoc exploration query log by asking 12 students to perform 3 random (out of 12) tasks using the On Time dataset (e.g., “how delayed are the flights to from AA?”), answer one free form question (“tell us something you found surprising”) and report their findings. We logged all queries that were executed. There are a total of statements, unique. We did no clean the log (e.g., dead-end analyses, erroneous queries) and simply report the top interfaces. We used PILang statements writtend using the tool described in Section 4.3.

Figure 10: Coverage as more interfaces are added to for manual log. Shows long tail of queries.

This query log contains far more variability than the synthetic dataset, and we consider it a “hard” case. Figure 10 shows the total number of queries covered as the number of interfaces in the output increases. We observe that the first interface covers () queries, and subsequent interfaces cover queries each. This suggests that the interaction graph is sparse, which is reflected in our post-hoc analysis of the logs.

Figure 9 shows the top three interfaces. The left interface is the primary one that resembles a simplified Tableau: most students incrementally vary the select, where, groupby and orderby clauses. The middle interface covers () and computes aggregate statistics for each flight origin; the right interface is representative of the long tail (covers queries). Although those three interfaces do not cover the whole log, they express the primary exploration structure using only 6 interaction components.

7.2 Performance and Languages

In this experiment, we evaluate Precision Interfaces’s language support and scalability. To evaluate the first aspect, we run the pipeline on logs written in two different query languages. To test the second, we measure its runtime for different optimizations. We show that Precision Interfaces spends more than 90% of its time in the interaction mining stage, and therefore we focus on this step.

We use two programs logs. The first one is the SDSS log [3], which contains SQL queries ( unique). The second is a sample from the British Museum’s Semantic Web Collection [1], which contains SPARQL queries ( unique). We respectively used and PILang statements for SQL and SPARQL, which describe the more frequent transformations, detected both by manual inspection and by using the tool described in Section 4.3. We compare four settings: no optimization, the clique-based optimization of Section 6.1 (Clique), the program templates of Section 6.2 (Template), and both optimizations. By default, the latter setting is enabled. Our main finding is that using both optimizations allows Precision Interfaces to scale to logs that are two orders of magnitude larger than without any optimization, and thus it can process logs with 10,000s of queries in minutes.

Figure 11: Breakdown of execution time for each dataset.

Cost Breakdown: Figure 11 shows the overall cost breakdown. The Interaction Mining phase is by far the most time consuming. Because this cost largely dominates Precision Interfaces’s runtime, the rest of this section focuses on it.

Figure 12: Runtime (secs) of interaction mining vs log size (log scale). The points for Template and Both overlap. We cap maximum runtime to hour.

Scalability with the size of the program logs: Figure 12 shows that the runtime increase quadratically for both logs, even when the optimizations are enabled. This comes from the fact that Precision Interfaces must align and compare pairs of programs to build the interaction graph, where is the number of programs. We ran a micro-benchmark and found that the cost of the comparisons is almost constant—they take in average for the SDSS log and ms for British Museum log ( are 95% CIs). The cost comes from the high number of comparisons.

The optimizations do not reduce the worst-case complexity of the algorithm, but they allow the system to skip comparisons. In particular, the Template optimization incurs a runtime improvement of about two orders of magnitude compared to No Optimization—up to 347x for the SDSS data set and 71x for the British Museum data.

Figure 13: Runtime of interaction mining vs. number of PILang statements, for samples of each dataset. In both cases, the plots for Cliques and Both overlap.

Scalability with the number of PILang statements: Figure 13 presents how the interaction miner’s runtime changes when we vary the number of PILang statements. The number of queries is fixed; we used small sample sizes to enforce that all the versions of the algorithm reach completion within one hour. In both cases, we find that increasing the number of statements linearly increases the runtime. Here again, the Template optimization yields improvements of about two order of magnitudes.

Figure 14: Templated clique extraction costs for each log.

Preprocessing: Figure 14 shows the time required to extract the query templates (Section 6.2). The runtime increases quadratically with respect to the log size, but it runs within s for both program logs.

Figure 15: Runtime of interaction mining for a log of 50,000 queries, varying the number of templates.

Impact of the Structure on the Optimizations: In this set of experiments, we show how the optimizations’ efficiency varies with the structure of the programs in the log. We generate random queries using templates; we fix the number of queries, vary the number of templates and measure how Precision Interfaces’s runtime varies. We expect that the runtime decreases as the variability of the queries (i.e. the number of templates) decreases.

Figure 15 presents the results of the experiments. We observe that the runtime of interaction mining varies linearly with the number of templates, for a fixed number of queries. The effect is similar for both Clique and Template, though Clique is two orders magnitude slower than Template. This illustrates that the optimizations successfully exploit the structure in the log. The more structure the log contains, the faster interaction mining runs.

(a) Original SDSS Interface.
(b) Precision Interface
(c) Custom Interface 1
(d) Custom Interface 2
Figure 16: The original SDSS interface, the interface generated by Precision Interfaces, and two manually designed interfaces.

7.3 User Study

We conducted users studie based on the SDSS query log, using the original Sky Server interface555 for reference. We studied whether 1) the generated interfaces reduce the reponse time and analysis accuracy as compared to the existing interface and 2) the generated interfaces are competitive with the original one and handcrafted alternatives in terms of user preference. We recruited 40 CS graduate students for the study.

Comparison With Existing Interface Users were given 5 minutes to read the describing the 4 tasks supported by the existing SDSS interface (Figure 15(a)), and interact with the interface. To avoid learning effects, we randomly split the users into two groups which were asked to complete the 4 tasks using different interfaces. The first group used the existing interface, while the second used Precision Interfaces (Figure 15(b)). We recorded the analysis time and result accuracy for each task.

Figure 17: Comparison of response time and accuracy using the original SDSS interface and the automatically generated interface.

Figure  17 depicts the average accuracy and time needed for each task for both groups of users. For reference, the average time users needed to perform task 2 (filter objects using the sky coordinates RA (Right Ascension) and dec (Declination) is 34 seconds while it takes only 12 seconds using our generated interface. We explain this by the fact that the original interface does not have default widgets for this task, and users have to choose a combination of options for the widgets to appear and then filter using the widgets, which involves multiple interactions. On the other hand, performing this analysis with our interface requires a single interaction. The case for task 4 (filter using spectrum and redshift) is similar and therefore generated similar results. The response time and accuracy differ the most for task 1 (filtering objects by id) as the original interface does not have a widget for this task and users have to write their own query. Our generated interface led to faster and more accurate analysis for all tasks except task 3 (filter objects by their colors), where both interfaces provided straightforward widgets. Overall, the generated interface created higher quality widgets than the original SDSS interface, which led to an increase in accuracy and a decrease in response time.

Interface Preferences After users performed the above tasks, we presented them with four interfaces—the SDSS original, Precision Interfaces, and two manually crafted interfaces—and asked them to choose their preferred interface based on their design. The aim is to understand the extent to which Precision Interfaces’s interactions are congruent with user expectations. The two manual interfaces (Figures 15(c)15(d)) were implemented by two software engineers that read the SDSS task manual and implemented applications to support the described analyses.

Figure 18: Percentage of user who prefers each interface.

Figure  18 shows that both Precision Interfaces and the manually crafted interfaces are preferable to the original. Over 70 percent of the users chose the forth interface while over 20 percent of the users chose Precision Interfaces. Out of 40 users, only 1 user chose the first manually designed interface while none chose the original SDSS interface. These results suggest that manually crafted interfaces can vary considerably in perceived quality, and that Precision Interfaces can generate interfaces competitive with manual implementations. It also suggests that Precision Interfaces, without domain-specific knowledge—of the SDSS manuals, the tasks, or the underlaying data—can summarize the salient analysis operations in an interactive interface from query logs that can be simpler to obtain than expert developers.

7.4 Experiments with Tableau

Figure 19: Interface generated for the Tableau log.

In this experiments, we run Precision Interfaces on queries generated directly by Tableau, and check if (1) our system can detect the underlying interactions and (2) can generate a simpler, Tableau-like interface. We asked 7 students to use Tableau on the Ontime dataset, using the same setup as that discussed in Section 7.1. We logged the queries generated by Tableau, and obtained SQL statements ( unique).

We present the resulting interface in Figure 19. This UI can express 100% of the queries in the log, with only 4 components. The select clause widget lets users select the set of attributes and expressions to return (they can drag a column from the bottom box and drop it in the top one). The where clause widget specifies the four predicates that were used; the order by widget shows the three attributes students combined to sort the output. The having clause widget simply adds a no-op expression to the query — it is a side-effect of how Tableau automatically generates queries, which would likely be removed by the user.

This use case is “easy” for Precision Interfaces because Tableau generates highly structured transformations. In fact, more than 99% of the edges in the interaction graph express changes of columns in the GROUP BY and the SELECT clause. Those actions correspond to drag and drops in the leftmost component in our interface.

8 Related Work

User Interface generation: Jayapandian et al. automate form-based record search and creation interfaces by analyzing the content of the database [21]. In contrast, we use example queries to synthesize analysis interfaces. In future work, we plan to borrow these ideas and take data and query semantics into account. The UI literature offers a large body of work on model-based interface design [34, 43, 31], which rely on the developer to provide high level specifications and focus on layout. The above works do not explicitly leverage query logs.

Development Libraries: Tools such as Sikuli [45] or Microsoft Access let non-technical users build their own interfaces. They improve upon lower-level libraries (e.g., Bootstrap) but still require programming and debugging. Similarly, reactive languages (e.g., [d3express], Shiny [chang2015shiny], EVE [eve], etc) still require programming and are limited to value changes rather than structural program changes.

Log Mining: Historically query log mining has been used in the database literature to detect representative workloads for performance tuning [9, 18]. More recently, it has been used to support data exploration. QueRIE [15] and SnipSuggest [25] produce context-sensitive suggestions from existing queries at the string level. Query steering [14]

uses a Markov model to produce new statements. Log mining is also extremely common in web search 

[41], e.g., to augment search results [16], make suggestions [6] or enable exploration [11]. Precision Interfaces exploits and summarizes the structural changes found in query logs into interactive interfaces.

Visualization Recommendation: Visualization recommendation tools such as Panoramic Data [46], Zenvisage [40] and Voyager [44] constitute a recent and complementary research direction. Those tools help recommend similar data to a given view, while Precision Interfaces seeks to generate the exploration interface itself.

Interface Redesign: Interface redesign includes responsive designs that adjust the presentation or selects alternative widgets based on the display size or modality [38]. Similar interface redesign techniques have been used to reduce data entry errors in survey design [10]. Those techniques are complementary to ours, which focuses on identifying and selecting task-specific interactions.

Programming Languages: The motivations behind Precision Interfaces are close to those of domain specific languages (DSLs) [19], with the major difference that Precision Interfaces targets the interaction domain. A subset of the DSL literature discusses how to build task-specific compilers [12]; we will incorporate those ideas in future work. A related domain of research is program synthesis, which seeks to construct programs that satisfy a high level logical description. For instance, Potter’s Wheel [35] and Foofah [22] build data transformation programs based on input and output examples. We target a different problem—Precision Interfaces analyzes query logs, not input-output pairs.

9 Conclusion and Discussion

This paper introduced the use of query logs as the API for interface generation, formalized the problem of extracting and generating interactive interfaces from these logs, and presented Precision Interfaces as a language-agnostic solution. To do so, we introduced a unified model over queries, query changes, interfaces, and interactions; and presented algorithms and optimizations to solve the interface generation problem for SQL and SPARQL. Visual interactive interfaces are increasingly relied upon in analysis, and we believe Precision Interfaces presents an exciting research direction towards improving accessibility for the long-tail of analyses. From another perspective, we view Precision Interfaces as a compact visual summary of a query log.

We are extending this work in several directions. First is to go beyond syntactic changes and incorporate metadata, language semantics, database content and other information can lead to richer output interfaces. Second, we will investigate how grammar induction methods [4] can help us learn or recommend PILang statements directly. Finally, Precision Interfaces currently assumes that all queries in the log are relevant to the user’s task and in the same language; a third direction is to support complex, multi-language logs.
Acknowledgements: We thank Yifan Wu for the initial inspiration, Anant Bhardwaj for data collection, Laura Rettig on early formulations of the problem, and the support of NSF 1527765 and 1564049.


  • [1] British museum semantic web collection online, 2017.
  • [2] pypeg: a peg parser-interpreter in python, 2013.
  • [3] Sloan digital sky survey, 2017.
  • [4] R. C. Berwick and S. Pilato. Learning syntax by automata induction. Machine learning, 2(1):9–38, 1987.
  • [5] P. Bille. A survey on tree edit distance and related problems. Theoretical computer science, 337(1):217–239, 2005.
  • [6] F. Cai, M. de Rijke, et al. A survey of query auto completion in information retrieval. Foundations and Trends® in Information Retrieval, 10(4):273–363, 2016.
  • [7] S. P. Callahan, J. Freire, E. Santos, C. E. Scheidegger, C. T. Silva, and H. T. Vo. Vistrails: visualization meets data management. In SIGMOD, 2006.
  • [8] S. K. Card, A. Newell, and T. P. Moran. The psychology of human-computer interaction. 1983.
  • [9] S. Chaudhuri and V. Narasayya. Autoadmin “what-if” index analysis utility. ACM SIGMOD Record, 1998.
  • [10] K. Chen, H. Chen, N. Conway, J. M. Hellerstein, and T. S. Parikh. Usher: Improving data quality with dynamic forms. TKDE, 2011.
  • [11] F. Chirigati, J. Liu, F. Korn, Y. W. Wu, C. Yu, and H. Zhang. Knowledge exploration using tables on the web. Proceedings of the VLDB Endowment, 2016.
  • [12] J. C. Cleaveland. Building application generators. IEEE Software, 5(4):25–33, 1988.
  • [13] T. H. Cormen, C. E. Leiserson, R. L. Rivest, and C. Stein. Introduction to algorithms, volume 6. MIT press Cambridge, 2001.
  • [14] K. Dimitriadou, O. Papaemmanouil, and Y. Diao. Explore-by-example: An automatic query steering framework for interactive data exploration. In Proc. SIGMOD, pages 517–528. ACM, 2014.
  • [15] M. Eirinaki, S. Abraham, N. Polyzotis, and N. Shaikh. Querie: Collaborative database exploration. IEEE Transactions on knowledge and data engineering, 26(7):1778–1790, 2014.
  • [16] M. Hearst. Search user interfaces. Cambridge University Press, 2009.
  • [17] J. M. Hellerstein, V. Sreekanti, J. E. Gonzalez, J. Dalton, A. Dey, S. Nag, K. Ramachandran, S. Arora, A. Bhattacharyya, S. Das, M. Donsky, G. Fierro, C. She, C. Steinbach, V. Subramanian, and E. Sun. Ground: A data context service. In CIDR, 2017.
  • [18] J. M. Hellerstein, M. Stonebraker, J. Hamilton, et al. Architecture of a database system. Foundations and Trends® in Databases, 2007.
  • [19] P. Hudak. Domain-specific languages. Handbook of Programming Languages, 3(39-60):21, 1997.
  • [20] Z. G. Ives, T. J. Green, G. Karvounarakis, N. E. Taylor, V. Tannen, P. P. Talukdar, M. Jacob, and F. Pereira. The orchestra collaborative data sharing system. SIGMOD Record, 2008.
  • [21] M. Jayapandian and H. Jagadish. Automated creation of a forms-based database query interface. Proc. VLDB, 1(1):695–709, 2008.
  • [22] Z. Jin, M. R. Anderson, M. Cafarella, and H. Jagadish. Foofah: Transforming data by example. In Proceedings of the 2017 ACM International Conference on Management of Data, pages 683–698. ACM, 2017.
  • [23] B. E. John and D. E. Kieras. The goms family of user interface analysis techniques: Comparison and contrast. TOCHI, 1996.
  • [24] M. Jones, G. Marsden, N. Mohd-Nasir, K. Boone, and G. Buchanan. Improving web interaction on small displays. Computer Networks, 1999.
  • [25] N. Khoussainova, Y. Kwon, M. Balazinska, and D. Suciu. Snipsuggest: context-aware autocompletion for sql. Proc. VLDB, 2010.
  • [26] J. Mackinlay. Automating the design of graphical presentations of relational information. Transactions On Graphics, 1986.
  • [27] J. Mackinlay, P. Hanrahan, and C. Stolte. Show me: Automatic presentation for visual analysis. TVCG, 2007.
  • [28] N. Malviya, A. Weisberg, S. Madden, and M. Stonebraker. Rethinking main memory oltp recovery. In Data Engineering (ICDE), 2014 IEEE 30th International Conference on, pages 604–615. IEEE, 2014.
  • [29] R. Mavlyutov, C. Curino, B. Asipov, and P. Cudre-Mauroux. Dependency-driven analytics: a compass for uncharted data oceans. CIDR, 2017.
  • [30] K.-K. Muniswamy-Reddy, D. A. Holland, U. Braun, and M. I. Seltzer. Provenance-aware storage systems. In USENIX, 2006.
  • [31] J. Nichols, B. A. Myers, and K. Litwack. Improving automatic interface generation with smart templates. In Proc. Intelligent User Interfaces, pages 286–288. ACM, 2004.
  • [32] K. W. Ong, Y. Papakonstantinou, and R. Vernoux. The sql++ unifying semi-structured query language, and an expressiveness benchmark of sql-on-hadoop, nosql and newsql databases. CoRR, 2014.
  • [33] A. Peytchev, M. P. Couper, S. E. McCabe, and S. D. Crawford. Web survey design paging versus scrolling. Public opinion quarterly, 2006.
  • [34] A. R. Puerta, H. Eriksson, J. H. Gennari, and M. A. Musen. Model-based automated generation of user interfaces. In AAAI, pages 471–477, 1994.
  • [35] V. Raman and J. M. Hellerstein. Potter’s wheel: An interactive data cleaning system. In VLDB, volume 1, pages 381–390, 2001.
  • [36] R. Rosenholtz, Y. Li, and L. Nakano. Measuring visual clutter. Journal of vision, 2007.
  • [37] C. Scheidegger, D. Koop, E. Santos, H. Vo, S. Callahan, J. Freire, and C. Silva. Tackling the provenance challenge one layer at a time. Concurrency and Computation: Practice and Experience, 20(5):473–483, 2008.
  • [38] M. Schneider-Hufschmidt, U. Malinowski, and T. Kuhme, editors. Adaptive User Interfaces: Principles and Practice. Elsevier Science Inc., New York, NY, USA, 1993.
  • [39] A. Sears. Layout appropriateness: A metric for evaluating user interface widget layout. IEEE Transactions on Software Engineering, 1993.
  • [40] T. Siddiqui, A. Kim, J. Lee, K. Karahalios, and A. Parameswaran. Effortless data exploration with zenvisage: an expressive and interactive visual analytics system. VLDB, 2016.
  • [41] F. Silvestri et al. Mining query logs: Turning search usage data into knowledge. Foundations and Trends in Information Retrieval, 4(1–2):1–174, 2009.
  • [42] C. Stolte, D. Tang, and P. Hanrahan. Polaris: A system for query, analysis, and visualization of multidimensional relational databases. IEEE Transactions on Visualization and Computer Graphics, 8(1):52–65, 2002.
  • [43] J. Vanderdonckt. Automatic generation of a user interface for highly interactive business-oriented applications. In Conference Companion on Human Factors in Computing Systems, pages 123–124. ACM, 1994.
  • [44] K. Wongsuphasawat, D. Moritz, A. Anand, J. Mackinlay, B. Howe, and J. Heer. Voyager: Exploratory analysis via faceted browsing of visualization recommendations. IEEE TVCG, pages 649–658, 2016.
  • [45] T. Yeh, T.-H. Chang, and R. C. Miller. Sikuli: using gui screenshots for search and automation. In Proceedings of the 22nd annual ACM symposium on User interface software and technology, pages 183–192, 2009.
  • [46] E. Zgraggen, R. Zeleznik, and S. M. Drucker. Panoramicdata: Data analysis through pen & touch. TVCG, 2014.