Learning from Data to Speed-up Sorted Table Search Procedures: Methodology and Practical Guidelines

07/20/2020
by Domenico Amato, et al.

Sorted Table Search Procedures are the quintessential query-answering tool, with widespread usage that now also includes Web Applications, e.g., Search Engines (Google Chrome) and Ad Bidding Systems (AppNexus). Speeding them up, at very little cost in space, is still a significant achievement. Here we study to what extent Machine Learning Techniques can contribute to such a speed-up, via a systematic experimental comparison of known efficient implementations of Sorted Table Search procedures, with different Data Layouts, and their Learned counterparts developed here. We characterize the scenarios in which the latter can be profitably used in place of the former, accounting for both CPU and GPU computing. Our approach also contributes to the study of Learned Data Structures, a recent proposal to improve the time/space performance of fundamental Data Structures, e.g., B-trees, Hash Tables, Bloom Filters. Indeed, we also formalize an Algorithmic Paradigm of Learned Dichotomic Sorted Table Search procedures that naturally complements the Learned one proposed here and that characterizes most of the known Sorted Table Search Procedures as having a "learning phase" that approximates Simple Linear Regression.

