Why so? or Why no? Functional Causality for Explaining Query Answers

by Alexandra Meliou, et al.

In this paper, we propose causality as a unified framework to explain query answers and non-answers, thus generalizing and extending several previously proposed approaches to provenance and missing query result explanations. We develop our framework starting from the well-studied definition of actual causes by Halpern and Pearl. After identifying some undesirable characteristics of the original definition, we propose functional causes as a refined definition of causality with several desirable properties. These properties allow us to apply our notion of causality in a database context, where it uniformly defines the causes of query results and their individual contributions in several ways: (i) we can model both provenance and non-answers, (ii) we can define explanations as either data in the input relations or relational operations in a query plan, and (iii) we can give graded degrees of responsibility to individual causes, thus allowing us to rank causes. In particular, our approach allows us to explain contributions to relational aggregate functions and to rank causes according to their respective responsibilities. We give complexity results and describe polynomial algorithms for evaluating causality in tractable cases. Throughout the paper, we illustrate the applicability of our framework with several examples. Overall, this paper develops the theoretical foundations of causality in a database context.
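
To make the idea of graded responsibility concrete, here is a minimal brute-force sketch in Python. It assumes the contingency-set notion of responsibility common in the causality literature: a tuple t is a cause if some set Γ of other tuples can be removed while the answer still holds, but removing t as well loses it, and its responsibility is 1 / (1 + |Γ|) for the smallest such Γ. This is not the paper's refined functional-cause definition, and the names `responsibility`, `q`, and the toy tuples are hypothetical, chosen only for illustration.

```python
from itertools import combinations

def responsibility(tuples, query, t):
    """Return 1 / (1 + |gamma|) for the smallest contingency set gamma,
    or 0.0 if t is not a cause of the query answer."""
    full = frozenset(tuples)
    others = [u for u in tuples if u != t]
    # Try contingency sets in order of increasing size, so the first hit is minimal.
    for k in range(len(others) + 1):
        for gamma in combinations(others, k):
            remaining = full - set(gamma)
            # Removing gamma alone must keep the answer;
            # removing t as well must lose it.
            if query(remaining) and not query(remaining - {t}):
                return 1.0 / (1 + k)
    return 0.0

# Hypothetical Boolean query over tuple identifiers: r1 AND (s1 OR s2).
def q(db):
    return "r1" in db and ("s1" in db or "s2" in db)

db = ["r1", "s1", "s2"]
scores = {t: responsibility(db, q, t) for t in db}
ranking = sorted(db, key=scores.get, reverse=True)
```

In this toy instance `r1` is a counterfactual cause (removing it alone loses the answer), so it ranks above `s1` and `s2`, each of which needs the other removed first. The search is exponential in the number of tuples and is only meant to illustrate the definition, not the polynomial algorithms the abstract refers to.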


