Optimal On The Fly Index Selection in Polynomial Time

09/12/2017
by Herbert Jordan, et al.

The index selection problem (ISP) is central to accelerating the execution of relational queries, and it has received considerable attention in the past as a combinatorial knapsack problem. Various solutions to this very hard problem have been proposed. In contrast to the existing literature, we change the underlying assumptions of the problem definition: we adapt the problem to systems that store relations in memory and use complex specification languages, e.g., Datalog. In our framework, we decompose complex queries into primitive searches that select the tuples in a relation for which an equality predicate holds. A primitive search can be accelerated by an index exhibiting a worst-case run-time complexity that is log-linear in the size of the output of the primitive search. However, the overheads associated with maintaining indexes are very costly in terms of memory and computing time. In this work, we present an optimal polynomial-time algorithm that finds the minimal set of indexes of a relation for a given set of primitive searches. An index may cover more than one primitive search due to the algebraic properties of the search predicate, which is a conjunction of equalities over the attributes of a relation. The index search space grows exponentially in the number of attributes of a relation; hence, brute-force algorithms that search for solutions in the index domain are infeasible. As scaffolding for designing a polynomial-time algorithm, we build a partial order on search operations and use a constructive version of Dilworth's theorem. We show a strong relationship between chains of primitive searches (forming a partial order) and indexes. We demonstrate the effectiveness and efficiency of our algorithm in an in-memory Datalog compiler that is able to process relations with billions of entries in memory.
