Functional Collection Programming with Semi-Ring Dictionaries

03/10/2021
by Amir Shaikhha, et al.

This paper introduces semi-ring dictionaries, a powerful class of compositional and purely functional collections that subsume other collection types such as sets, multisets, arrays, vectors, and matrices. We develop SDQL, a statically typed language centered around semi-ring dictionaries, which can encode expressions in relational algebra with aggregations, functional collections, and linear algebra. Furthermore, thanks to the semi-ring algebraic structures behind these dictionaries, SDQL unifies a wide range of optimizations commonly used in databases and linear algebra. As a result, SDQL enables efficient processing of hybrid database and linear algebra workloads by combining optimizations that are otherwise confined to either database systems or linear algebra frameworks. Through experimental results, we show that a handful of relational and linear algebra workloads can take advantage of the SDQL language and optimizations. Overall, we observe that SDQL achieves performance competitive with Typer and Tectorwise, which are state-of-the-art in-memory systems for (flat, not nested) relational data, and achieves an average 2x speedup over SciPy for linear algebra workloads. Finally, for hybrid workloads involving linear algebra processing over nested biomedical data, SDQL can give up to one order of magnitude speedup over Trance, a state-of-the-art nested relational engine.
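
To make the idea of one collection type covering multisets, vectors, and matrices more concrete, the following is a minimal Python sketch, not SDQL itself and not the authors' implementation. It assumes dictionaries whose values are either numbers (the ordinary sum-product semi-ring) or nested dictionaries, with missing keys read as zero. The helper names `is_zero`, `sd_add`, and `sd_sum` are hypothetical and chosen only for illustration.

```python
def is_zero(v):
    """Zero is the number 0 or the empty dictionary (assumed convention)."""
    return v == 0 or v == {}


def sd_add(a, b):
    """Add two values: numbers add as usual, dictionaries add pointwise."""
    if is_zero(a):
        return b
    if is_zero(b):
        return a
    if isinstance(a, dict) and isinstance(b, dict):
        out = dict(a)
        for k, v in b.items():
            out[k] = sd_add(out.get(k, 0), v)
        # Entries that cancel to zero are dropped, so absence means zero.
        return {k: v for k, v in out.items() if not is_zero(v)}
    return a + b


def sd_sum(d, f):
    """Fold f(key, value) over a dictionary using sd_add."""
    acc = 0
    for k, v in d.items():
        acc = sd_add(acc, f(k, v))
    return acc


# A multiset of (group, amount) rows: each row maps to its multiplicity.
bag = {("a", 2): 1, ("b", 5): 1, ("a", 3): 1}
# Group-by with a sum aggregate is a sum of single-entry dictionaries.
grouped = sd_sum(bag, lambda row, mult: {row[0]: row[1] * mult})

# A sparse matrix as a nested dictionary, and a sparse vector.
M = {0: {0: 1, 1: 2}, 1: {1: 3}}
v = {0: 10, 1: 100}
# Matrix-vector product written with the same two helpers.
matvec = sd_sum(M, lambda i, row: {i: sd_sum(row, lambda j, m: m * v.get(j, 0))})

print(grouped)  # {'a': 5, 'b': 5}
print(matvec)   # {0: 210, 1: 300}
```

In this sketch the relational aggregate and the linear algebra product share the same summation construct, which is the kind of uniformity the abstract attributes to semi-ring dictionaries. The typing, other semi-rings, and the optimizations described in the paper are outside its scope.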

