Predicate Transfer: Efficient Pre-Filtering on Multi-Join Queries

07/28/2023
by Yifei Yang, et al.

This paper presents predicate transfer, a novel method that optimizes join performance by pre-filtering tables to reduce join input sizes. Predicate transfer generalizes Bloom join, which pre-filters within a single join operation, to multi-table joins, so that the filtering benefit is significantly increased. Predicate transfer is inspired by the seminal theoretical results of Yannakakis, which use semi-joins to pre-filter acyclic queries. Predicate transfer generalizes these theoretical results to arbitrary join graphs and uses Bloom filters in place of semi-joins, leading to significant speedup. Evaluation shows that predicate transfer can outperform Bloom join by 3.1x on average on the TPC-H benchmark.
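
To make the idea concrete, the following is a minimal Python sketch of Bloom-filter pre-filtering across a chain of joins. It is not the authors' implementation: the `BloomFilter` class, the `transfer` and `predicate_transfer` helpers, the toy tables, and the forward-then-backward schedule (modeled on Yannakakis-style semi-join reduction) are all assumptions made for illustration, since the abstract does not specify these details.

```python
import hashlib


class BloomFilter:
    """Tiny illustrative Bloom filter: no false negatives, some false positives."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = 0  # a Python int used as a bit array

    def _positions(self, key):
        # Derive num_hashes bit positions from seeded SHA-256 digests.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits |= 1 << pos

    def __contains__(self, key):
        return all((self.bits >> pos) & 1 for pos in self._positions(key))


def transfer(source_rows, source_col, target_rows, target_col):
    """Keep only target rows whose join key may appear in the source rows."""
    bloom = BloomFilter()
    for row in source_rows:
        bloom.add(row[source_col])
    return [row for row in target_rows if row[target_col] in bloom]


def predicate_transfer(tables, edges):
    """Pre-filter tables along join edges (hypothetical schedule).

    edges: list of (left_table, left_col, right_table, right_col).
    A forward pass pushes filters left to right, then a backward pass
    pushes them right to left. The join itself is not executed here.
    """
    tables = dict(tables)  # shallow copy; input lists are never mutated
    for left, lcol, right, rcol in edges:
        tables[right] = transfer(tables[left], lcol, tables[right], rcol)
    for left, lcol, right, rcol in reversed(edges):
        tables[left] = transfer(tables[right], rcol, tables[left], lcol)
    return tables


# Toy data: assume a local predicate already reduced "customer" to one row.
tables = {
    "customer": [{"c_id": 1}],
    "orders": [
        {"o_id": 10, "c_id": 1},
        {"o_id": 11, "c_id": 2},
        {"o_id": 12, "c_id": 3},
    ],
    "lineitem": [
        {"l_id": 100, "o_id": 10},
        {"l_id": 101, "o_id": 11},
        {"l_id": 102, "o_id": 12},
        {"l_id": 103, "o_id": 99},
    ],
}
edges = [
    ("customer", "c_id", "orders", "c_id"),
    ("orders", "o_id", "lineitem", "o_id"),
]
result = predicate_transfer(tables, edges)
```

In this sketch the selective predicate on `customer` reaches `lineitem` even though the two tables are never joined directly, which a single-join Bloom join would not achieve. Because Bloom filters admit false positives, the pre-filtered tables may retain a few rows that the real join later discards, but rows that do contribute to the join result are never dropped.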

research
04/25/2019

GPU-based Efficient Join Algorithms on Hadoop

The growing data has brought tremendous pressure for query processing an...
research
10/01/2021

MATE: Multi-Attribute Table Extraction

A core operation in data discovery is to find joinable tables for a give...
research
05/15/2018

Approximate Distributed Joins in Apache Spark

The join operation is a fundamental building block of parallel data proc...
research
06/21/2019

Learning to Sample: Counting with Complex Queries

In this paper we present a suite of methods to efficiently estimate coun...
research
10/22/2018

Selection of BJI configuration: Approach based on minimal transversals

Decision systems deal with a large volume of data stored in new database...
research
07/01/2023

Aggregation Consistency Errors in Semantic Layers and How to Avoid Them

Analysts often struggle with analyzing data from multiple tables in a da...
research
11/16/2021

The Case for Learned In-Memory Joins

In-memory join is an essential operator in any database engine. It has b...
