Explaining Inference Queries with Bayesian Optimization

02/10/2021
by Brandon Lockhart, et al.

Obtaining an explanation for an SQL query result can enrich the analysis experience, reveal data errors, and provide deeper insight into the data. Inference query explanation seeks to explain unexpected aggregate query results on inference data; such queries are challenging to explain because an explanation may need to be derived from the source, training, or inference data in an ML pipeline. In this paper, we model the objective function as a black-box function and propose BOExplain, a novel framework for explaining inference queries using Bayesian optimization (BO). An explanation is a predicate defining the input tuples that should be removed so that the query result of interest is significantly affected. BO, a technique for finding the global optimum of a black-box function, is used to find the best predicate. We develop two new techniques (individual contribution encoding and warm start) to handle categorical variables. Our experiments show that the predicates found by BOExplain have a higher degree of explanation than those found by state-of-the-art query explanation engines. We also show that BOExplain is effective at deriving explanations for inference queries from source and training data on a variety of real-world datasets. BOExplain is open-sourced as a Python package at https://github.com/sfu-db/BOExplain.
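
To make the idea of a predicate-based explanation concrete, the following is a minimal sketch and not BOExplain's actual API. It assumes a hypothetical toy table of (age, predicted label) rows, a COUNT-style aggregate as the query of interest, and range predicates over a single numeric attribute. Plain random search stands in for Bayesian optimization here, so the sketch only illustrates the black-box objective, namely the query result after the tuples matching a predicate are removed. All names (ROWS, query, objective, random_search) are illustrative assumptions.

```python
import random

# Hypothetical toy data: (age, predicted_label) pairs standing in for
# the output of an inference step. Ages 30..39 are predicted positive.
ROWS = [(age, 1 if 30 <= age < 40 else 0) for age in range(20, 60)]


def query(rows):
    """Aggregate query of interest: number of rows predicted positive."""
    return sum(label for _, label in rows)


def objective(low, high, rows=ROWS):
    """Black-box objective: the query result after removing every tuple
    that satisfies the range predicate low <= age < high."""
    kept = [row for row in rows if not (low <= row[0] < high)]
    return query(kept)


def random_search(n_iter=200, seed=0):
    """Stand-in for BO (assumption): sample range predicates at random and
    keep the one that minimizes the objective. Predicate width is capped
    at 10 so that removing the whole table is not a valid explanation."""
    rng = random.Random(seed)
    best = None
    for _ in range(n_iter):
        low = rng.randint(20, 59)
        high = rng.randint(low + 1, min(low + 10, 60))
        score = objective(low, high)
        if best is None or score < best[0]:
            best = (score, low, high)
    return best  # (objective value, low, high)


if __name__ == "__main__":
    print("original query result:", query(ROWS))
    print("best predicate found:", random_search())
```

In this toy setting the predicate 30 <= age < 40 removes every positively predicted row and drives the count to zero. A BO-based search would replace the random sampling with a surrogate model that proposes the next predicate to evaluate.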

research
02/08/2020

Projective Preferential Bayesian Optimization

Bayesian optimization is an effective method for finding extrema of a bl...
research
12/29/2018

Explaining Aggregates for Exploratory Analytics

Analysts wishing to explore multivariate data spaces, typically pose que...
research
04/12/2020

Complaint-driven Training Data Debugging for Query 2.0

As the need for machine learning (ML) increases rapidly across all indus...
research
03/21/2019

Explain3D: Explaining Disagreements in Disjoint Datasets

Data plays an important role in applications, analytic processes, and ma...
research
09/13/2022

FEDEX: An Explainability Framework for Data Exploration Steps

When exploring a new dataset, Data Scientists often apply analysis queri...
research
11/02/2021

Explaining Documents' Relevance to Search Queries

We present GenEx, a generative model to explain search results to users ...
research
05/27/2023

Query-Efficient Black-Box Red Teaming via Bayesian Optimization

The deployment of large-scale generative models is often restricted by t...
