Givens QR Decomposition over Relational Databases

04/01/2022
by Dan Olteanu, et al.

This paper introduces Figaro, an algorithm for computing the upper-triangular matrix in the QR decomposition of the matrix defined by the natural join over a relational database. The QR decomposition lies at the core of many linear algebra techniques and their machine learning applications, including matrix inversion, least squares, the singular value decomposition, eigenvalue problems, and principal component analysis. Figaro's main novelty is that it pushes the QR decomposition past the join. This leads to several desirable properties. For acyclic joins, it takes time linear in the database size and independent of the join size. Its execution is equivalent to applying a sequence of Givens rotations whose length is proportional to the join size. Its number of rounding errors relative to classical QR decomposition algorithms is on par with the input size relative to the join size. In experiments with real-world and synthetic databases, Figaro outperforms the LAPACK implementations OpenBLAS and Intel MKL in both runtime and accuracy, by a factor proportional to the gap between the join output and input sizes.
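To make the baseline concrete, here is a minimal sketch of the classical approach that the abstract contrasts Figaro with: materialize the natural join, then reduce the resulting matrix to upper-triangular form with Givens rotations. This is not Figaro itself, which avoids materializing the join. The relations `S` and `T`, their schemas, and the helpers `natural_join` and `givens_r` are hypothetical names chosen for illustration.

```python
import math

def natural_join(s_rows, t_rows):
    """Join S(key, x) with T(key, y) on key; return the numeric rows (x, y)."""
    return [[x, y] for (ks, x) in s_rows for (kt, y) in t_rows if ks == kt]

def givens_r(rows):
    """Return the upper-triangular factor R of a dense matrix via Givens rotations."""
    a = [[float(v) for v in row] for row in rows]
    m, n = len(a), len(a[0])
    for j in range(n):
        # Zero the entries below the diagonal of column j, bottom-up.
        for i in range(m - 1, j, -1):
            if a[i][j] == 0.0:
                continue
            r = math.hypot(a[i - 1][j], a[i][j])
            c, s = a[i - 1][j] / r, a[i][j] / r
            # Rotate rows i-1 and i; columns left of j are already zero there.
            for k in range(j, n):
                upper, lower = a[i - 1][k], a[i][k]
                a[i - 1][k] = c * upper + s * lower
                a[i][k] = -s * upper + c * lower
            a[i][j] = 0.0  # remove rounding residue in the eliminated entry
    return a[:n]

# Toy relations (assumed data): the join has 4 rows from 3 + 3 input tuples.
S = [(1, 2.0), (1, 3.0), (2, 4.0)]
T = [(1, 5.0), (2, 6.0), (2, 7.0)]
A = natural_join(S, T)
R = givens_r(A)
```

The work in this sketch grows with the number of join rows, since every joined row takes part in rotations. On joins whose output is much larger than the input relations, that is the cost the abstract says Figaro sidesteps.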

research
12/15/2020

Instance Optimal Join Size Estimation

We consider the problem of efficiently estimating the size of the inner ...
research
08/20/2022

Safe Subjoins in Acyclic Joins

It is expensive to compute joins, often due to large intermediate relati...
research
10/30/2017

The equational theory of the natural join and inner union is decidable

The natural join and the inner union operations combine relations of a d...
research
11/16/2013

The Optimization of Running Queries in Relational Databases Using ANT-Colony Algorithm

The issue of optimizing queries is a cost-sensitive process and with res...
research
11/10/2017

Size bounds and query plans for relational joins

Relational joins are at the core of relational algebra, which in turn is...
research
03/22/2019

Instance and Output Optimal Parallel Algorithms for Acyclic Joins

Massively parallel join algorithms have received much attention in recen...
research
04/03/2023

Guaranteeing the Õ(AGM/OUT) Runtime for Uniform Sampling and OUT Size Estimation over Joins

We propose a new method for estimating the number of answers OUT of a sm...
