Offset-value coding in database query processing

09/30/2022
by Goetz Graefe, et al.
Google

Recent work shows how offset-value coding speeds up database query execution: not only sorting but also duplicate removal and grouping (aggregation) in sorted streams, order-preserving exchange (shuffle), merge join, and more. It already saves thousands of CPUs in Google's Napa and F1 Query systems, e.g., in grouping algorithms and in log-structured merge-forests. To realize the full benefit of interesting orderings, however, query execution algorithms must not only consume and exploit offset-value codes but also produce offset-value codes for the next operator in the pipeline. Our research has investigated ways to produce offset-value codes without comparing successive output rows column-by-column. This short paper introduces a new theorem and, based on its proof and a simple corollary, describes in detail how order-preserving algorithms (from filter to merge join and even shuffle) can compute offset-value codes for their outputs. These calculations are surprisingly simple and very efficient.
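
The abstract does not spell out the theorem, so the following is only a minimal sketch of the idea for the simplest order-preserving operator, a filter. It assumes rows are integer tuples sorted in ascending order, and it assumes an ascending encoding in which a code is the pair `(arity - offset, value)`, so that a larger code means the row differs from its predecessor in an earlier column. Under these assumptions, the code of a surviving row relative to the previous surviving row is the maximum of its own code and the codes of the rows dropped in between. The names `ovc`, `encode_sorted`, and `filter_with_codes` are hypothetical and do not come from the paper or from any Google system.

```python
from typing import Callable, List, Optional, Tuple

Row = Tuple[int, ...]
# Assumed encoding: (arity - offset, value). A larger code means the row
# differs from its predecessor in an earlier column. (0, 0) marks a duplicate.
Code = Tuple[int, int]
DUPLICATE: Code = (0, 0)


def ovc(prev: Row, row: Row) -> Code:
    """Reference computation: compare two rows column by column."""
    arity = len(row)
    for offset in range(arity):
        if prev[offset] != row[offset]:
            return (arity - offset, row[offset])
    return DUPLICATE


def encode_sorted(rows: List[Row]) -> List[Tuple[Row, Code]]:
    """Attach to each row of a sorted input its code relative to the prior row."""
    out: List[Tuple[Row, Code]] = []
    prev: Optional[Row] = None
    for row in rows:
        # The first row is coded as if it differed from a virtual
        # predecessor in column 0.
        code = (len(row), row[0]) if prev is None else ovc(prev, row)
        out.append((row, code))
        prev = row
    return out


def filter_with_codes(
    coded: List[Tuple[Row, Code]], keep: Callable[[Row], bool]
) -> List[Tuple[Row, Code]]:
    """Filter a coded, sorted stream without any column comparisons.

    The output code of a surviving row is the maximum of its input code
    and the input codes of all rows dropped since the last output row.
    """
    out: List[Tuple[Row, Code]] = []
    carry = DUPLICATE  # neutral element: every non-duplicate code is larger
    for row, code in coded:
        carry = max(carry, code)
        if keep(row):
            out.append((row, carry))
            carry = DUPLICATE
    return out


if __name__ == "__main__":
    rows = [(1, 1, 1), (1, 1, 2), (1, 2, 0), (1, 2, 0), (2, 0, 5), (2, 3, 1)]
    keep = lambda r: sum(r) % 2 == 0
    fast = filter_with_codes(encode_sorted(rows), keep)
    slow = encode_sorted([r for r in rows if keep(r)])
    assert fast == slow  # same codes as re-deriving them from scratch
```

The check at the end compares the carried-forward codes against codes recomputed by full row comparison on the filtered output. The filter itself touches only one code per input row, which is the kind of saving the abstract describes.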

