Computing the Shapley Value of Facts in Query Answering

12/16/2021
by   Daniel Deutch, et al.

The Shapley value is a game-theoretic notion for wealth distribution that is now widely used to explain complex data-intensive computation, for instance in network analysis or machine learning. Recent theoretical works show that query evaluation over relational databases fits well in this explanation paradigm. Yet these works fall short of providing practical solutions to the computational challenge inherent in Shapley computation. In this paper we present two practically effective solutions for computing Shapley values in query answering. We start by establishing a tight theoretical connection to the extensively studied problem of query evaluation over probabilistic databases, which allows us to obtain a polynomial-time algorithm for the class of queries for which probability computation is tractable. We then propose a first practical solution for computing Shapley values that adopts tools from probabilistic query evaluation. In particular, we capture the dependence of query answers on input database facts using Boolean expressions (data provenance), and then transform these expressions, via Knowledge Compilation, into a particular circuit form for which we devise an algorithm for computing the Shapley values. Our second practical solution is a faster yet inexact approach that transforms the provenance into Conjunctive Normal Form and uses a heuristic to compute the Shapley values. Our experiments on TPC-H and IMDB demonstrate the practical effectiveness of our solutions.
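To make the notion concrete, the sketch below computes the Shapley value of each fact directly from its textbook definition, treating a Boolean provenance expression as the game's value function. It is a brute-force illustration that enumerates all subsets of facts, so it takes exponential time; it is not the circuit-based algorithm or the CNF heuristic described in the abstract. The function names and the example provenance (a AND b) OR c are hypothetical and chosen only for illustration.

```python
from fractions import Fraction
from itertools import combinations
from math import factorial


def shapley_values(facts, query_holds):
    """Brute-force Shapley value of each fact (exponential in len(facts)).

    `query_holds` is a hypothetical callback: it takes a set of facts and
    returns True if the query answer is derivable from those facts alone.
    """
    n = len(facts)
    values = {}
    for f in facts:
        others = [g for g in facts if g != f]
        total = Fraction(0)
        for k in range(n):
            # Weight of a coalition of size k in the Shapley formula:
            # k! * (n - k - 1)! / n!
            weight = Fraction(factorial(k) * factorial(n - k - 1), factorial(n))
            for subset in combinations(others, k):
                s = frozenset(subset)
                # Marginal contribution of f to coalition s (0 or 1 here;
                # can be -1 only for non-monotone queries).
                gain = int(query_holds(s | {f})) - int(query_holds(s))
                total += weight * gain
        values[f] = total
    return values


def provenance(present):
    # Assumed example provenance of one query answer: (a AND b) OR c.
    return ("a" in present and "b" in present) or "c" in present


result = shapley_values(["a", "b", "c"], provenance)
for fact, value in result.items():
    print(fact, value)
```

In this example fact c alone suffices for the answer, so it receives the largest share (2/3), while a and b, which are only useful together, receive 1/6 each. The shares sum to 1, the value of the full set of facts.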


research
08/10/2023

Banzhaf Values for Facts in Query Answering

Quantifying the contribution of database facts to query answers has been...
research
06/25/2023

From Shapley Value to Model Counting and Back

In this paper we investigate the problem of quantifying the contribution...
research
01/27/2022

Probabilistic Query Evaluation with Bag Semantics

We initiate the study of probabilistic query evaluation under bag semant...
research
01/20/2023

A Simple Algorithm for Consistent Query Answering under Primary Keys

We consider the dichotomy conjecture for consistent query answering unde...
research
03/30/2020

Consistency and Certain Answers in Relational to RDF Data Exchange with Shape Constraints

We investigate the data exchange from relational databases to RDF graphs...
research
08/17/2021

Computing and Maintaining Provenance of Query Result Probabilities in Uncertain Knowledge Graphs

Knowledge graphs (KG) that model the relationships between entities as l...
research
08/10/2022

Saturation-based Boolean conjunctive query answering and rewriting for the guarded quantification fragments

Answering Boolean conjunctive query over logical constraints is an essen...
