PaCHash: Packed and Compressed Hash Tables

05/10/2022
by Florian Kurpicz, et al.

We introduce PaCHash, a hash table that stores its objects contiguously in an array without intervening space, even if the objects have variable size. In particular, each object can be compressed using standard compression techniques. A small search data structure locates the objects in constant expected time. PaCHash is most naturally described as a static external hash table that needs only a constant number of bits of internal memory per block of external memory. However, PaCHash can be dynamized and is also useful in internal memory, where it consumes less space than all previous approaches, even when only objects of identical size are considered. For example, in some sense it beats a lower bound on the space consumption of k-perfect hashing. An implementation for fast SSDs needs about 5 bits of internal memory per block of external memory, requires only one disk access (of variable length) per search operation, and has an internal search overhead that is small compared to the disk access cost.
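
The idea of packing variable-size, compressed objects back to back and locating them through a small per-block index can be illustrated with a toy sketch. This is not the authors' implementation; it only assumes what the abstract states. Everything else here is a hypothetical choice made for the illustration: the names `PackedTable`, `bin_of`, and `BLOCK_SIZE`, the record layout, the use of `zlib` and BLAKE2, and the plain Python lists standing in for the search structure. In particular, the sketch keeps a byte offset per block in memory for simplicity, which a space-efficient design would presumably avoid.

```python
import bisect
import hashlib
import struct
import zlib

BLOCK_SIZE = 64  # bytes per simulated external block (hypothetical; tiny for the demo)
HEADER = struct.Struct("<HI")  # key length, compressed value length (assumed layout)


def bin_of(key: bytes, num_bins: int) -> int:
    """Map a key to a bin with a deterministic hash."""
    digest = hashlib.blake2b(key, digest_size=8).digest()
    return int.from_bytes(digest, "big") % num_bins


class PackedTable:
    """Static table: records sorted by bin and stored with no gaps between them."""

    def __init__(self, items: dict, num_bins: int = 64):
        self.num_bins = num_bins
        ordered = sorted(items.items(), key=lambda kv: (bin_of(kv[0], num_bins), kv[0]))
        data = bytearray()
        starts = []  # (byte offset, bin) of every record, in storage order
        for key, value in ordered:
            packed = zlib.compress(value)  # each object is compressed on its own
            starts.append((len(data), bin_of(key, num_bins)))
            data += HEADER.pack(len(key), len(packed)) + key + packed
        self.data = bytes(data)  # stands in for external memory

        # Per-block index: bin and offset of the first record that starts at or
        # after the block boundary. Sentinel values mark "no such record".
        num_blocks = max(1, -(-len(data) // BLOCK_SIZE))
        self.first_bin, self.first_start = [], []
        j = 0
        for i in range(num_blocks):
            while j < len(starts) and starts[j][0] < i * BLOCK_SIZE:
                j += 1
            off, b = starts[j] if j < len(starts) else (len(data), num_bins)
            self.first_start.append(off)
            self.first_bin.append(b)

    def get(self, key: bytes):
        b = bin_of(key, self.num_bins)
        # Last block whose first record has a smaller bin (or block 0).
        lo = max(bisect.bisect_left(self.first_bin, b) - 1, 0)
        # First block whose first record has a larger bin.
        nxt = bisect.bisect_right(self.first_bin, b)
        start = self.first_start[lo]
        end = self.first_start[nxt] if nxt < len(self.first_start) else len(self.data)
        chunk = self.data[start:end]  # one contiguous read of variable length
        pos = 0
        while pos < len(chunk):
            klen, vlen = HEADER.unpack_from(chunk, pos)
            pos += HEADER.size
            found = chunk[pos:pos + klen] == key
            pos += klen
            if found:
                return zlib.decompress(chunk[pos:pos + vlen])
            pos += vlen
        return None


if __name__ == "__main__":
    table = PackedTable({b"alpha": b"a" * 100, b"beta": b"short", b"gamma": b""})
    print(table.get(b"beta"), table.get(b"missing"))
```

Because records are sorted by bin, every bin occupies one contiguous byte range, so the two binary searches over `first_bin` are enough to bound the single range that has to be read. Only the two index lists live in internal memory in this sketch; the packed array `data` plays the role of external memory.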

research
05/23/2018

Optimal Hashing in External Memory

Hash tables are a ubiquitous class of dictionary data structures. Howeve...
research
12/19/2022

High Performance Construction of RecSplit Based Minimal Perfect Hash Functions

A minimal perfect hash function (MPHF) is a bijection from a set of obje...
research
07/27/2022

Balanced Encoding of Near-Zero Correlation for an AES Implementation

Power consumption of a circuit can be exploited to recover the secret ke...
research
10/04/2022

SicHash – Small Irregular Cuckoo Tables for Perfect Hashing

A Perfect Hash Function (PHF) is a hash function that has no collisions ...
research
07/02/2021

Linear Probing Revisited: Tombstones Mark the Death of Primary Clustering

First introduced in 1954, linear probing is one of the oldest data struc...
research
11/24/2021

Tiny Pointers

This paper introduces a new data-structural object that we call the tiny...
research
07/16/2020

A Genetic Algorithm for Obtaining Memory Constrained Near-Perfect Hashing

The problem of fast items retrieval from a fixed collection is often enc...
