Offset-value coding in database query processing

09/30/2022
by Goetz Graefe, et al.

Recent work shows how offset-value coding speeds up database query execution, not only sorting but also duplicate removal and grouping (aggregation) in sorted streams, order-preserving exchange (shuffle), merge join, and more. It already saves thousands of CPUs in Google's Napa and F1 Query systems, e.g., in grouping algorithms and in log-structured merge-forests. In order to realize the full benefit of interesting orderings, however, query execution algorithms must not only consume and exploit offset-value codes but also produce offset-value codes for the next operator in the pipeline. Our research has sought ways to produce offset-value codes without comparing successive output rows one-by-one, column-by-column. This short paper introduces a new theorem and, based on its proof and a simple corollary, describes in detail how order-preserving algorithms (from filter to merge join and even shuffle) can compute offset-value codes for their outputs. These computations are surprisingly simple and very efficient.
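To make the idea of producing codes without column-by-column comparisons concrete, here is a minimal sketch of a filter over a sorted stream. It is not taken from the paper. It assumes ascending offset-value codes represented as pairs `(arity - offset, value)`, with `(0, 0)` marking a duplicate of the preceding row, and it relies on a property that follows from the definition of such codes: for rows A ≤ B ≤ C, the code of C relative to A equals the larger of the code of B relative to A and the code of C relative to B. The names `ovc`, `encode_sorted`, and `filter_with_codes` are hypothetical and chosen for illustration only.

```python
from typing import Callable, List, Sequence, Tuple

Row = Tuple[int, ...]
Code = Tuple[int, int]  # assumed encoding: (arity - offset, value at offset)

DUPLICATE: Code = (0, 0)  # assumed marker for "identical to the preceding row"


def ovc(prev: Row, cur: Row) -> Code:
    """Code of cur relative to prev by direct column comparison (prev <= cur)."""
    arity = len(cur)
    for offset in range(arity):
        if prev[offset] != cur[offset]:
            return (arity - offset, cur[offset])
    return DUPLICATE


def encode_sorted(rows: Sequence[Row]) -> List[Code]:
    """Codes for a sorted stream; the first row is coded at offset 0."""
    codes: List[Code] = []
    for i, row in enumerate(rows):
        if i == 0:
            codes.append((len(row), row[0]))
        else:
            codes.append(ovc(rows[i - 1], row))
    return codes


def filter_with_codes(
    rows: Sequence[Row], codes: Sequence[Code], keep: Callable[[Row], bool]
) -> List[Tuple[Row, Code]]:
    """Filter a coded, sorted stream without comparing any column values.

    The output code of a surviving row is the maximum of its own input code
    and the input codes of all rows dropped since the previous survivor.
    """
    out: List[Tuple[Row, Code]] = []
    carry = DUPLICATE  # largest code seen since the last surviving row
    for row, code in zip(rows, codes):
        carry = max(carry, code)
        if keep(row):
            out.append((row, carry))
            carry = DUPLICATE
    return out


rows = [(1, 1, 1), (1, 1, 2), (1, 2, 0), (1, 2, 0), (2, 0, 0), (2, 0, 5)]
codes = encode_sorted(rows)
filtered = filter_with_codes(rows, codes, lambda r: r[2] != 0)
```

In this sketch the filter touches only the codes, never the key columns, so its cost per row is one code comparison regardless of how many key columns the rows have. The direct comparison in `ovc` is used only to build the input codes and to check the result.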


research
09/17/2022

Robust and Efficient Sorting with Offset-Value Coding

Sorting and searching are large parts of database query processing, e.g....
research
10/01/2020

Sort-based grouping and aggregation

Database query processing requires algorithms for duplicate removal, gro...
research
01/16/2019

SkinnerDB: Regret-Bounded Query Evaluation via Reinforcement Learning

SkinnerDB is designed from the ground up for reliable join ordering. It ...
research
03/22/2022

Non-recursive Approach for Sort-Merge Join Operation

Several algorithms have been developed over the years to perform join op...
research
02/25/2022

Break Up the Pipeline Structure to Reach a Nearly Optimal End-to-End Latency

Query optimization is still problematic in the commercial database syste...
research
10/05/2021

Scalable Relational Query Processing on Big Matrix Data

The use of large-scale machine learning methods is becoming ubiquitous i...
research
11/27/2018

Efficiently Charting RDF

We propose a visual query language for interactively exploring large-sca...
