To pipeline or not to pipeline, that is the question

02/03/2020
by Harshad Deshmukh, et al.

In designing query processing primitives, a crucial design choice is the method for data transfer between two operators in a query plan. As we were considering this design mechanism for an in-memory database system that we are building, we quickly realized that, surprisingly, there isn't a clear definition of this concept. Papers are full of ad hoc uses of terms like pipelining and blocking, but because these terms are not crisply defined, it is hard to fully understand the results attributed to these concepts. To address this limitation, we introduce a clear terminology for how to think about data transfer between operators in a query plan. We show that pipelining and blocking are not two sharply separated alternatives; rather, there is a full spectrum of techniques based on a simple concept called unit-of-transfer. Next, we develop an analytical model for inter-operator communication and highlight the key parameters that impact performance in in-memory database settings. Armed with this model, we then apply it to the system we are designing and highlight the insights we gathered from this exercise. We find that the gap between pipelining and non-pipelining query execution, w.r.t. key factors such as performance and memory footprint, is quite narrow, and thus system designers should likely rethink the notion of pipelining vs. blocking for in-memory database systems.
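
To make the idea of a spectrum concrete, here is a minimal Python sketch. It rests on an assumption: the abstract does not define unit-of-transfer, so the sketch takes it to mean the number of rows a producer operator hands to its consumer at a time. The names `transfer` and `run_plan`, and the toy select-then-sum plan, are hypothetical illustrations and not taken from the paper or its system.

```python
from typing import Iterable, Iterator, List, Tuple


def transfer(rows: Iterable[int], unit_of_transfer: int) -> Iterator[List[int]]:
    """Hand rows to the consumer in batches of at most `unit_of_transfer` rows.

    Assumption: "unit-of-transfer" is modeled here as a row count per hand-off.
    """
    if unit_of_transfer < 1:
        raise ValueError("unit_of_transfer must be at least 1")
    batch: List[int] = []
    for row in rows:
        batch.append(row)
        if len(batch) == unit_of_transfer:
            yield batch
            batch = []
    if batch:  # flush the final, possibly smaller, batch
        yield batch


def run_plan(rows: Iterable[int], unit_of_transfer: int) -> Tuple[int, int, int]:
    """Toy two-operator plan: select even values, then sum them.

    Returns (result, number_of_transfers, peak_rows_buffered_between_operators).
    """
    selected = (r for r in rows if r % 2 == 0)  # producer: selection
    total = 0
    transfers = 0
    peak_buffered = 0
    for batch in transfer(selected, unit_of_transfer):
        transfers += 1
        peak_buffered = max(peak_buffered, len(batch))
        total += sum(batch)  # consumer: aggregation
    return total, transfers, peak_buffered


if __name__ == "__main__":
    for unit in (1, 2, 1_000_000):
        print(unit, run_plan(range(10), unit))
```

Under this reading, a unit of one row behaves like tuple-at-a-time pipelining, a unit at least as large as the producer's whole output behaves like blocking, and every value in between is a point on the spectrum. The query result is the same in all cases; only the number of hand-offs and the amount of buffered data change.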

