Selectivity Estimation with Attribute Value Dependencies using Linked Bayesian Networks

09/21/2020
by Max Halford, et al.

Relational query optimisers rely on cost models to choose between different query execution plans. Selectivity estimates are known to be a crucial input to the cost model. In practice, standard selectivity estimation procedures are prone to large errors, mostly because they rely on the so-called attribute value independence and join uniformity assumptions. Multidimensional methods have therefore been proposed to capture dependencies between two or more attributes, both within and across relations. However, these methods incur a large computational cost, which makes them unusable in practice. We propose a method based on Bayesian networks that is able to capture cross-relation attribute value dependencies with little overhead. Our proposal is based on the assumption that dependencies between attributes are preserved when joins are involved. Furthermore, we introduce a parameter for trading off estimation accuracy against computational cost. We validate our work by comparing it with other relevant methods on a large workload derived from the JOB and TPC-DS benchmarks. Our results show that our method is an order of magnitude more efficient than existing methods, whilst maintaining a high level of accuracy.
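
To make the attribute value independence assumption concrete, the following minimal sketch compares three selectivity estimates for a conjunctive predicate on a single toy relation. Everything in it is an assumption made for illustration: the relation, the attribute names `city` and `country`, and the helper functions are hypothetical and do not come from the paper. The two-attribute factorisation P(country) · P(city | country) is only the simplest possible Bayesian network, not the linked Bayesian networks proposed here.

```python
from collections import Counter

# Hypothetical toy relation of (city, country) tuples.
# The two attributes are strongly dependent: a city determines its country.
rows = (
    [("Paris", "France")] * 40
    + [("Lyon", "France")] * 10
    + [("Berlin", "Germany")] * 30
    + [("Munich", "Germany")] * 20
)
n = len(rows)

# One-dimensional and two-dimensional frequency counts.
city_counts = Counter(city for city, _ in rows)
country_counts = Counter(country for _, country in rows)
pair_counts = Counter(rows)


def true_selectivity(city, country):
    """Exact fraction of tuples matching city = ... AND country = ..."""
    return pair_counts[(city, country)] / n


def avi_selectivity(city, country):
    """Estimate under attribute value independence: P(city) * P(country)."""
    return (city_counts[city] / n) * (country_counts[country] / n)


def bn_selectivity(city, country):
    """Estimate from a two-node network country -> city:
    P(country) * P(city | country)."""
    if country_counts[country] == 0:
        return 0.0
    p_country = country_counts[country] / n
    p_city_given_country = pair_counts[(city, country)] / country_counts[country]
    return p_country * p_city_given_country


if __name__ == "__main__":
    for estimator in (true_selectivity, avi_selectivity, bn_selectivity):
        print(estimator.__name__, estimator("Paris", "France"))
```

On this toy data the independence-based estimate for `city = 'Paris' AND country = 'France'` is half of the true selectivity, while the conditional factorisation recovers it. With only two attributes the factorisation is exact by construction; the interesting trade-offs arise with many attributes and across joins, which this sketch does not attempt to cover.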
