COBRA: Compression via Abstraction of Provenance for Hypothetical Reasoning

07/10/2020
by Daniel Deutch, et al.

Data analytics often involves hypothetical reasoning: repeatedly modifying the data and observing the induced effect on the computation result of a data-centric application. Recent work has proposed leveraging ideas from data provenance tracking to support efficient hypothetical reasoning: instead of a costly re-execution of the underlying application, one may assign values to a pre-computed provenance expression. A prime challenge in applying this approach to large-scale data and complex applications lies in the size of the provenance. To this end, we present a framework that reduces provenance size. Our approach is based on reducing the provenance granularity using abstraction. We propose a demonstration of COBRA, a system that allows users to examine the effect of provenance compression on the anticipated analysis results. We will demonstrate the usefulness of COBRA in the context of business data analysis.
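The idea of assigning values to a pre-computed provenance expression, and of shrinking that expression by abstraction, can be illustrated with a minimal sketch. It is not COBRA's implementation: the polynomial-as-dictionary representation, the function names `evaluate` and `abstract`, and the product and category names are all assumptions made for illustration.

```python
from collections import defaultdict

# Assumed representation: a provenance polynomial is a dict that maps a
# monomial (a sorted tuple of variable names) to a numeric coefficient.

def evaluate(poly, valuation):
    """Answer a hypothetical scenario by assigning a value to each variable."""
    total = 0.0
    for monomial, coeff in poly.items():
        term = coeff
        for var in monomial:
            term *= valuation[var]
        total += term
    return total

def abstract(poly, grouping):
    """Replace each variable by its abstract group and merge equal monomials."""
    merged = defaultdict(float)
    for monomial, coeff in poly.items():
        renamed = tuple(sorted(grouping.get(var, var) for var in monomial))
        merged[renamed] += coeff
    return dict(merged)

# Hypothetical business data: total revenue, where each product's
# contribution is annotated with a variable that scales its price.
provenance = {
    ("p_laptop",): 1200.0,
    ("p_tablet",): 450.0,
    ("p_phone",): 800.0,
    ("p_desk",): 300.0,
}

# Hypothetical abstraction: group product-level variables by category.
grouping = {
    "p_laptop": "electronics",
    "p_tablet": "electronics",
    "p_phone": "electronics",
    "p_desk": "furniture",
}
compressed = abstract(provenance, grouping)

# Scenario: a 10% price cut on every electronics product.
fine = evaluate(
    provenance,
    {"p_laptop": 0.9, "p_tablet": 0.9, "p_phone": 0.9, "p_desk": 1.0},
)
coarse = evaluate(compressed, {"electronics": 0.9, "furniture": 1.0})

print(len(provenance), len(compressed))
print(fine, coarse)
```

In this sketch the compressed expression has fewer monomials and still gives the same answer for scenarios that treat all variables in a group uniformly. It can no longer express a scenario that changes only one product in a group, which is the kind of trade-off between size and granularity that the abstract describes.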

research
07/10/2020

Hypothetical Reasoning via Provenance Abstraction

Data analytics often involves hypothetical reasoning: repeatedly modifyi...
research
09/18/2018

Towards Abstraction in ASP with an Application on Reasoning about Agent Policies

ASP programs are a convenient tool for problem solving, whereas with lar...
research
07/16/2021

Explainable AI Enabled Inspection of Business Process Prediction Models

Modern data analytics underpinned by machine learning techniques has bec...
research
07/26/2021

HySec-Flow: Privacy-Preserving Genomic Computing with SGX-based Big-Data Analytics Framework

Trusted execution environments (TEE) such as Intel's Software Guard Exte...
research
02/19/2021

Abstracting data in distributed ledger systems for higher level analytics and visualizations

By design, distributed ledger technologies persist low-level data which ...
research
01/29/2023

Data accounting and error counting

Can we infer sources of errors from outputs of the complex data analytic...
research
03/08/2021

Efficient Fuzz Testing for Apache Spark Using Framework Abstraction

The emerging data-intensive applications are increasingly dependent on d...