A Study of the Fundamental Performance Characteristics of GPUs and CPUs for Database Analytics (Extended Version)

03/02/2020
by Anil Shanbhag, et al.

There has been a significant amount of excitement about, and recent work on, GPU-based database systems. Previous work has claimed that these systems can perform orders of magnitude better than CPU-based database systems on analytical workloads such as those found in decision support and business intelligence applications. A hardware expert would view these claims with suspicion. Given the general notion that database operators are memory-bandwidth bound, one would expect the maximum gain to be roughly equal to the ratio of the memory bandwidth of the GPU to that of the CPU. In this paper, we adopt a model-based approach to understand when and why the performance gains of running queries on GPUs versus CPUs vary from the bandwidth ratio (which is roughly 16x on modern hardware). We propose Crystal, a library of parallel routines that can be combined to run full SQL queries on a GPU with minimal materialization overhead. We implement individual query operators to show that while the speedups for selections, projections, and sorts are near the bandwidth ratio, joins achieve a smaller speedup due to differences in hardware capabilities. Interestingly, we show on a popular analytical workload that the full-query performance gain from running on a GPU exceeds the bandwidth ratio, even though the individual operators have speedups below the bandwidth ratio. This is a result of the limitations of vectorizing chained operators on CPUs, and it yields a 25x speedup for GPUs over CPUs on the benchmark.
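
The bandwidth-ratio argument in the abstract can be illustrated with a few lines of arithmetic. The sketch below is only a minimal illustration of that reasoning, not the paper's model: it assumes an operator whose runtime is entirely determined by the bytes it reads divided by memory bandwidth. The bandwidth figures, the column size, and the function name `scan_time_seconds` are hypothetical values chosen so that the ratio comes out to the roughly 16x mentioned above.

```python
# Minimal sketch of a purely bandwidth-bound cost model.
# All constants below are illustrative assumptions, not measurements from the paper.

def scan_time_seconds(bytes_read: float, bandwidth_bytes_per_s: float) -> float:
    """Runtime of an operator limited only by memory bandwidth."""
    return bytes_read / bandwidth_bytes_per_s

CPU_BANDWIDTH = 55e9    # hypothetical CPU memory bandwidth, bytes per second
GPU_BANDWIDTH = 880e9   # hypothetical GPU memory bandwidth, bytes per second

# Hypothetical input: one column of one billion 4-byte integers.
column_bytes = 4 * 1_000_000_000

cpu_time = scan_time_seconds(column_bytes, CPU_BANDWIDTH)
gpu_time = scan_time_seconds(column_bytes, GPU_BANDWIDTH)

# Under this model the data volume cancels out, so the expected speedup
# is just the ratio of the two bandwidths.
expected_speedup = cpu_time / gpu_time
bandwidth_ratio = GPU_BANDWIDTH / CPU_BANDWIDTH

print(f"CPU scan: {cpu_time * 1e3:.1f} ms, GPU scan: {gpu_time * 1e3:.1f} ms")
print(f"expected speedup: {expected_speedup:.1f}x")
```

Under these assumptions the predicted speedup does not depend on the input size, which is why the abstract treats the bandwidth ratio as the natural baseline and focuses on the cases (joins, full queries) where measured speedups fall below or above it.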

research
03/28/2022

Algorithmic Improvement and GPU Acceleration of the GenASM Algorithm

We improve on GenASM, a recent algorithm for genomic sequence alignment,...
research
02/01/2023

Revisiting Query Performance in GPU Database Systems

GPUs offer massive compute parallelism and high-bandwidth memory accesse...
research
04/07/2020

A GPU-friendly Geometric Data Model and Algebra for Spatial Queries: Extended Version

The availability of low cost sensors has led to an unprecedented growth ...
research
03/03/2022

Query Processing on Tensor Computation Runtimes

The huge demand for computation in artificial intelligence (AI) is drivi...
research
07/02/2023

Accelerating Relational Database Analytical Processing with Bulk-Bitwise Processing-in-Memory

Online Analytical Processing (OLAP) for relational databases is a busine...
research
08/18/2019

The Maximum Common Subgraph Problem: A Portfolio Approach

The Maximum Common Subgraph is a computationally challenging problem wit...
research
03/27/2022

GPU-Powered Spatial Database Engine for Commodity Hardware: Extended Version

Given the massive growth in the volume of spatial data, there is a great...
