A Layered Aggregate Engine for Analytics Workloads

06/20/2019
by Maximilian Schleich, et al.

This paper introduces LMFAO (Layered Multiple Functional Aggregate Optimization), an in-memory optimization and execution engine for batches of aggregates over the input database. The primary motivation for this work is the observation that, for a variety of analytics over databases, the data-intensive tasks can be decomposed into group-by aggregates over the join of the input database relations. We exemplify the versatility and competitiveness of LMFAO for a handful of widely used analytics: learning ridge linear regression models, classification trees, regression trees, and the structure of Bayesian networks using Chow-Liu trees; and computing data cubes used for exploration in data warehousing. LMFAO consists of several layers of logical and code optimizations that systematically exploit sharing of computation, parallelism, and code specialization. We conducted two types of performance benchmarks. In experiments with four datasets, LMFAO outperforms by several orders of magnitude, on the one hand, a commercial database system and MonetDB for computing batches of aggregates and, on the other hand, TensorFlow, Scikit, R, and AC/DC for learning a variety of models over databases.
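
To make the decomposition claim concrete, the following is a minimal Python sketch, not LMFAO's implementation. It assumes two hypothetical toy relations (`sales` and `stores`, joined on `store`) and computes, in one pass over their join, the batch of aggregates `SUM(a * b)` for every pair of columns, including a constant column `"1"`. Such sums of pairwise products are the data-dependent quantities needed to fit a ridge linear regression model. The relation names, column names, and helper functions are all illustrative assumptions. The sketch materializes the join for simplicity and applies none of the sharing, parallelism, or code specialization that the abstract describes.

```python
from itertools import combinations_with_replacement

# Hypothetical toy relations; names and values are illustrative only.
sales = [
    {"store": 1, "units": 3.0, "price": 2.0},
    {"store": 1, "units": 5.0, "price": 1.5},
    {"store": 2, "units": 2.0, "price": 4.0},
]
stores = [
    {"store": 1, "area": 10.0},
    {"store": 2, "area": 20.0},
]


def natural_join(left, right, key):
    """Hash join of two lists of dict rows on a single shared key."""
    index = {}
    for row in right:
        index.setdefault(row[key], []).append(row)
    return [{**l, **r} for l in left for r in index.get(l[key], [])]


def aggregate_batch(rows, columns):
    """Compute SUM(a * b) for every unordered pair of columns.

    A constant column "1" is added, so the batch also contains
    the count SUM(1 * 1) and the plain sums SUM(1 * a).
    """
    cols = ["1"] + list(columns)
    batch = {pair: 0.0 for pair in combinations_with_replacement(cols, 2)}
    for row in rows:
        values = {"1": 1.0}
        values.update({c: row[c] for c in columns})
        for a, b in batch:
            batch[(a, b)] += values[a] * values[b]
    return batch


joined = natural_join(sales, stores, "store")
batch = aggregate_batch(joined, ["price", "area", "units"])

for pair, total in batch.items():
    print(pair, total)
```

With three numeric columns plus the constant, the batch holds ten aggregates. Treating `units` as the label, the entries over `price`, `area`, and `"1"` form the matrix of feature products, and the entries paired with `units` form the feature-label products, so no further pass over the data is needed once the batch is computed.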

research
08/19/2020

LMFAO: An Engine for Batches of Group-By Aggregates

LMFAO is an in-memory optimization and execution engine for large batche...
research
03/20/2018

AC/DC: In-Database Learning Thunderstruck

We report on the design and implementation of the AC/DC gradient descent...
research
01/10/2020

Multi-layer Optimizations for End-to-End Data Analytics

We consider the problem of training machine learning models over multi-r...
research
06/01/2020

F-IVM: Learning over Fast-Evolving Relational Data

F-IVM is a system for real-time analytics such as machine learning appli...
research
02/23/2023

Simultaneous Drawing of Layered Trees

We study the crossing minimization problem in a layered graph drawing of...
research
02/06/2013

Learning Bayesian Networks from Incomplete Databases

Bayesian approaches to learn the graphical structure of Bayesian Belief ...
research
03/15/2023

F-IVM: Analytics over Relational Databases under Updates

This article describes F-IVM, a unified approach for maintaining analyti...
