Towards a Theory of Data-Diff: Optimal Synthesis of Succinct Data Modification Scripts

01/19/2018
by Tana Wattanawaroon, et al.

This paper addresses the Data-Diff problem: given a dataset and a subsequent version of that dataset, find the shortest sequence of operations that transforms the dataset into the subsequent version, under a restricted family of operations. We consider operations similar to SQL UPDATE, each with a condition (WHERE) that matches a subset of tuples and a modifier (SET) that changes the matched tuples. We characterize the problem under different constraints on the attributes and on the allowed conditions and modifiers, providing a complexity classification and algorithms for each case.
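To make the problem statement concrete, here is a minimal Python sketch of one very restricted instance. Everything in it is an illustrative assumption rather than the paper's formulation: conditions are equality tests on a single attribute, modifiers assign a constant to a single attribute, rows keep their position between versions, and the names `Update`, `apply_update`, `apply_script`, and `shortest_script` are hypothetical. The search is plain brute force over scripts of increasing length, not one of the paper's algorithms.

```python
from itertools import product
from typing import NamedTuple


class Update(NamedTuple):
    """Hypothetical operation: UPDATE ... SET set_attr = set_val WHERE where_attr = where_val."""
    where_attr: str
    where_val: object
    set_attr: str
    set_val: object


def apply_update(rows, op):
    """Return new rows with op applied to every row matching its condition."""
    return [
        {**row, op.set_attr: op.set_val}
        if row[op.where_attr] == op.where_val
        else dict(row)
        for row in rows
    ]


def apply_script(rows, script):
    """Apply a sequence of updates in order."""
    for op in script:
        rows = apply_update(rows, op)
    return rows


def shortest_script(source, target, max_len=3):
    """Brute force: try every script of length 0, 1, ..., max_len (assumed bound)."""
    attrs = list(source[0])
    # Candidate constants: values seen in either version, per attribute.
    values = {a: sorted({r[a] for r in source + target}) for a in attrs}
    candidates = [
        Update(wa, wv, sa, sv)
        for wa in attrs for wv in values[wa]
        for sa in attrs for sv in values[sa]
    ]
    for length in range(max_len + 1):
        for script in product(candidates, repeat=length):
            if apply_script(source, script) == target:
                return list(script)
    return None  # no script within the length bound


# Example data (made up for illustration).
source = [
    {"dept": "A", "grade": 1},
    {"dept": "A", "grade": 2},
    {"dept": "B", "grade": 1},
]
target = [
    {"dept": "A", "grade": 3},
    {"dept": "A", "grade": 3},
    {"dept": "B", "grade": 1},
]
script = shortest_script(source, target)
```

In this example a single operation, roughly `UPDATE ... SET grade = 3 WHERE dept = 'A'`, is enough, whereas editing the two changed rows one by one would take two operations. The exhaustive search grows exponentially with script length, which is why the complexity classification in the abstract matters.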


