Control Flow Duplication for Columnar Arrays in a Dynamic Compiler

02/20/2023
by Sebastian Kloibhofer, et al.

Columnar databases are an established way to speed up online analytical processing (OLAP) queries. Nowadays, data processing (e.g., storage, visualization, and analytics) is often performed at the programming language level, so it is desirable to also adopt columnar data structures for common language runtimes. While there are frameworks, libraries, and APIs to enable columnar data stores in programming languages, their integration into applications typically requires manual intervention by developers. In prior work, researchers implemented an approach for *automated* transformation of arrays into columnar arrays in the GraalVM JavaScript runtime. However, this approach suffers from performance issues on smaller workloads as well as on more complex nested data structures. We find that the key to optimizing accesses to columnar arrays is to identify queries and apply specific optimizations to them. In this paper, we describe novel compiler optimizations in the GraalVM Compiler that optimize queries on columnar arrays. At JIT compile time, we identify loops that access potentially columnar arrays and duplicate them in order to specifically optimize accesses to columnar arrays. Additionally, we describe a new approach for creating columnar arrays from arrays consisting of complex objects by performing **multi-level storage transformation**. We demonstrate our approach via an implementation for JavaScript `Date` objects. [ full abstract at https://doi.org/10.22152/programming-journal.org/2023/7/9 ]
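The loop duplication idea from the abstract can be illustrated by hand at the source level. The following is only a minimal sketch under stated assumptions: it is not GraalVM code, the compiler performs this kind of transformation on its intermediate representation at JIT compile time rather than on source code, and every name here (`Row`, `MaybeColumnar`, `fromRows`, `toColumnar`, `sumPriceGeneric`, `sumPriceDuplicated`) is hypothetical. It shows a loop that checks the storage layout on every iteration, and a duplicated version in which the check is hoisted so that one loop copy only ever touches the dense column.

```typescript
// Hypothetical illustration only; none of these names come from GraalVM.
interface Row { price: number; qty: number; }

// Row-wise layout: one object per array element.
type RowArray = { kind: "rows"; rows: Row[] };

// Columnar layout: one dense typed array per property.
type ColumnarArray = {
  kind: "columnar";
  length: number;
  price: Float64Array;
  qty: Float64Array;
};

type MaybeColumnar = RowArray | ColumnarArray;

function fromRows(rows: Row[]): RowArray {
  return { kind: "rows", rows };
}

// Storage transformation: copy each property into its own column.
function toColumnar(rows: Row[]): ColumnarArray {
  const price = new Float64Array(rows.length);
  const qty = new Float64Array(rows.length);
  rows.forEach((r, i) => {
    price[i] = r.price;
    qty[i] = r.qty;
  });
  return { kind: "columnar", length: rows.length, price, qty };
}

// Single loop: the layout is re-checked on every element access.
function sumPriceGeneric(a: MaybeColumnar): number {
  const n = a.kind === "columnar" ? a.length : a.rows.length;
  let sum = 0;
  for (let i = 0; i < n; i++) {
    sum += a.kind === "columnar" ? a.price[i] : a.rows[i].price;
  }
  return sum;
}

// Duplicated loop: the layout check runs once, and each copy of the
// loop is specialized for exactly one storage layout.
function sumPriceDuplicated(a: MaybeColumnar): number {
  let sum = 0;
  if (a.kind === "columnar") {
    const col = a.price; // contiguous column, no per-element object access
    for (let i = 0; i < a.length; i++) sum += col[i];
  } else {
    const rows = a.rows;
    for (let i = 0; i < rows.length; i++) sum += rows[i].price;
  }
  return sum;
}
```

Both functions compute the same result for either layout. The duplicated form simply gives the columnar path a loop body with no layout branch, which is the kind of loop that is easier to optimize further.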

