The Power of Hashing with Mersenne Primes

08/19/2020
by Thomas Dybdahl Ahle, et al.

The classic way of computing a k-universal hash function is to use a random degree-(k-1) polynomial over a prime field ℤ_p. To evaluate the polynomial quickly, the prime p is often chosen as a Mersenne prime p=2^b-1. In this paper, we show that using Mersenne primes has other advantages as well. Our view is that the output of the hash function is a b-bit integer that is uniformly distributed in [2^b], except that p (the all-1s value) is missing. Uniform bit strings have many useful properties; for example, they can be split into substrings, which gives us two or more hash functions for the cost of one while preserving strong theoretical guarantees. We call this trick "Two for one" hashing, and we demonstrate it on 4-universal hashing in the classic Count Sketch algorithm for second moment estimation. We also provide new, fast, branch-free code for division and modulus with Mersenne primes. In contrast to our analytic work, this code generalizes to Pseudo-Mersenne primes p=2^b-c for small c, improving upon a classical algorithm of Crandall.
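The ideas in the abstract can be illustrated with a minimal Python sketch. It is not the paper's code: the reduction below is the textbook shift-and-mask fold, which ends with a comparison rather than the branch-free routine the authors describe. The choice b = 61, the function names, and the particular way the output is split into a bucket index and a sign bit are all assumptions made for illustration.

```python
import random

B = 61                  # assumed exponent; 2**61 - 1 is a Mersenne prime
P = (1 << B) - 1        # p = 2^b - 1, the all-1s b-bit value


def mod_mersenne(x: int) -> int:
    """Reduce 0 <= x < 2**(2*B) modulo P with shifts and masks.

    Because 2^B is congruent to 1 mod P, the high bits can be added
    onto the low bits. Two folds bring the value into [0, P].
    """
    x = (x & P) + (x >> B)
    x = (x & P) + (x >> B)
    return 0 if x == P else x   # P itself is congruent to 0


def poly_hash(coeffs: list[int], x: int) -> int:
    """Evaluate sum(coeffs[i] * x**i) mod P by Horner's rule.

    coeffs holds k values in [0, P); x must also lie in [0, P).
    With uniformly random coeffs this is the classic k-universal scheme.
    """
    h = 0
    for a in reversed(coeffs):
        h = mod_mersenne(h * x + a)   # h * x + a < 2**(2*B), so the fold applies
    return h


def random_coeffs(k: int, rng: random.Random) -> list[int]:
    """Draw k independent coefficients uniformly from [0, P)."""
    return [rng.randrange(P) for _ in range(k)]


def split_two_for_one(h: int, bucket_bits: int) -> tuple[int, int]:
    """Illustrative split of one hash value into two substrings.

    The low bucket_bits bits select a Count Sketch bucket and the next
    bit selects a sign in {-1, +1}. This layout is an assumption.
    """
    bucket = h & ((1 << bucket_bits) - 1)
    sign = 1 - 2 * ((h >> bucket_bits) & 1)
    return bucket, sign
```

For example, `split_two_for_one(poly_hash(random_coeffs(4, random.Random(0)), key), 10)` would yield a bucket in a table of 1024 counters together with a sign, using a single 4-universal polynomial evaluation per key.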

research
04/18/2021

HalftimeHash: Modern Hashing without 64-bit Multipliers or Finite Fields

HalftimeHash is a new algorithm for hashing long strings. The goals are ...
research
10/12/2020

MMH* with arbitrary modulus is always almost-universal

Universal hash functions, discovered by Carter and Wegman in 1979, are o...
research
08/27/2023

Locally Uniform Hashing

Hashing is a common technique used in data processing, with a strong imp...
research
11/23/2017

Practical Hash Functions for Similarity Estimation and Dimensionality Reduction

Hashing is a basic tool for dimensionality reduction employed in several...
research
03/14/2019

Keyed hash function from large girth expander graphs

In this paper we present an algorithm to compute keyed hash function (me...
research
08/21/2018

Composite Hashing for Data Stream Sketches

In rapid and massive data streams, it is often not possible to estimate ...
research
09/23/2022

Analysis of the new standard hash function

On 2^nd October 2012 the NIST (National Institute of Standards and Techn...
