Generating Custom Code for Efficient Query Execution on Heterogeneous Processors

09/03/2017
by Sebastian Breß, et al.

Processor manufacturers build increasingly specialized processors to mitigate the effects of the power wall and deliver improved performance. Currently, database engines are manually optimized for each processor, a costly and error-prone process. In this paper, we propose concepts that enable the database engine to perform per-processor optimization automatically. Our core idea is to create variants of generated code and to learn a fast variant for each processor. We create variants by modifying parallelization strategies, specializing data structures, and applying different code transformations. Our experimental results show that the performance of variants may diverge by up to two orders of magnitude. Therefore, we need to generate custom code for each processor to achieve peak performance. We show that our approach finds a fast custom variant for multi-core CPUs, GPUs, and MICs.
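
The core idea of creating several variants of the same query code and keeping whichever runs fastest on the target processor can be illustrated with a minimal sketch. Everything in it is hypothetical: the function names, the toy grouped-sum query, and the exhaustive timing loop are stand-ins chosen for illustration, not the paper's generated code or its learning strategy, which the abstract does not detail. The two variants differ only in the data structure used for aggregation, loosely mirroring the "specializing data structures" dimension.

```python
import time
from collections import defaultdict


def agg_dict(keys, values):
    """Variant A (hypothetical): grouped sum using a generic hash table."""
    sums = defaultdict(int)
    for k, v in zip(keys, values):
        sums[k] += v
    return dict(sums)


def agg_dense_array(keys, values):
    """Variant B (hypothetical): grouped sum using a dense array.

    Assumes keys are small non-negative integers and the input is non-empty.
    """
    sums = [0] * (max(keys) + 1)
    seen = [False] * len(sums)
    for k, v in zip(keys, values):
        sums[k] += v
        seen[k] = True
    # Report only the groups that actually occurred in the input.
    return {k: s for k, s in enumerate(sums) if seen[k]}


def pick_fastest(variants, keys, values, repeats=3):
    """Time every variant on this machine and return the fastest one.

    This exhaustive search is an assumption made for brevity; it keeps the
    best of `repeats` runs per variant to reduce measurement noise.
    """
    timings = {}
    for name, fn in variants.items():
        best = float("inf")
        for _ in range(repeats):
            start = time.perf_counter()
            fn(keys, values)
            best = min(best, time.perf_counter() - start)
        timings[name] = best
    fastest = min(timings, key=timings.get)
    return fastest, timings


VARIANTS = {"dict": agg_dict, "dense_array": agg_dense_array}

keys = [i % 16 for i in range(10_000)]
values = [1] * 10_000
fastest, timings = pick_fastest(VARIANTS, keys, values)
print(f"fastest variant on this processor: {fastest}")
```

Which variant wins depends on the machine and the data, which is the point of selecting per processor rather than fixing one implementation in advance.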


