How to Store a Random Walk

07/25/2019
by Emanuele Viola, et al.

Motivated by storage applications, we study the following data structure problem: An encoder wishes to store a collection of jointly-distributed files X := (X_1, X_2, ..., X_n) ∼ μ which are correlated (H_μ(X) ≪ ∑_i H_μ(X_i)), using as little (expected) memory as possible, such that each individual file X_i can be recovered quickly with few (ideally constant) memory accesses. In the case of independent random files, a dramatic result by Pătraşcu (FOCS'08) and subsequently by Dodis, Pătraşcu and Thorup (STOC'10) shows that it is possible to store X using just a constant number of extra bits beyond the information-theoretic minimum space, while at the same time decoding each X_i in constant time. However, in the (realistic) case where the files are correlated, much weaker results are known, requiring at least Ω(n/poly lg n) extra bits for constant decoding time, even for "simple" joint distributions μ. We focus on the natural case of compressing Markov chains, i.e., storing a length-n random walk on any (possibly directed) graph G. Denoting by κ(G,n) the number of length-n walks on G, we show that there is a succinct data structure storing a random walk using lg_2 κ(G,n) + O(lg n) bits of space, such that any vertex along the walk can be decoded in O(1) time on a word-RAM. For the harder task of matching the point-wise optimal space of the walk, i.e., the empirical entropy ∑_{i=1}^{n-1} lg(deg(v_i)), we present a data structure with O(1) extra bits at the price of O(lg n) decoding time, and show that any improvement on this would lead to an improved solution on the long-standing Dictionary problem. All of our data structures support the online version of the problem with constant update and query time.
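To make the two space benchmarks in the abstract concrete, the following minimal Python sketch computes κ(G,n) by dynamic programming and evaluates the empirical entropy of one walk. It is not the data structure from the paper and offers no fast decoding; it only evaluates the quantities being compared. The example graph `adj`, the helper names, and the convention that a length-n walk has n vertices and n-1 steps (suggested by the sum running to n-1) are assumptions made for illustration.

```python
from math import ceil, log2


def count_walks(adj, n):
    """Return kappa(G, n): the number of walks v_1..v_n on G (n vertices, n-1 steps)."""
    # suffix[v] = number of walks with the current vertex count that start at v.
    suffix = {v: 1 for v in adj}
    for _ in range(n - 1):
        suffix = {v: sum(suffix[u] for u in adj[v]) for v in adj}
    return sum(suffix.values())


def empirical_entropy(adj, walk):
    """Sum of lg(deg(v_i)) for i = 1..n-1, using out-degrees of the visited vertices."""
    return sum(log2(len(adj[v])) for v in walk[:-1])


# Hypothetical directed graph: 'a' has out-degree 2, 'b' and 'c' have out-degree 1.
adj = {"a": ["b", "c"], "b": ["a"], "c": ["a"]}
n = 5
walk = ["a", "b", "a", "c", "a"]  # one particular walk with n vertices

kappa = count_walks(adj, n)
naive_bits = n * ceil(log2(len(adj)))      # store every vertex label separately
optimal_bits = log2(kappa)                 # lg kappa(G, n), the worst-case benchmark
pointwise_bits = empirical_entropy(adj, walk)

print(kappa, naive_bits, round(optimal_bits, 3), pointwise_bits)
```

On this toy graph the naive encoding spends 10 bits, while lg κ(G,n) is under 4 bits and the empirical entropy of this particular walk is 2 bits (not counting the starting vertex). The gap between the last two numbers is what separates the two results stated in the abstract.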


research
03/06/2019

Entropy Trees and Range-Minimum Queries In Optimal Average-Case Space

The range-minimum query (RMQ) problem is a fundamental data structuring ...
research
06/25/2018

Fast entropy-bounded string dictionary look-up with mismatches

We revisit the fundamental problem of dictionary look-up with mismatches...
research
11/02/2017

An Optimal Choice Dictionary

A choice dictionary is a data structure that can be initialized with a p...
research
07/13/2018

Pairwise Independent Random Walks can be Slightly Unbounded

A family of problems that have been studied in the context of various st...
research
02/26/2018

Random Walks on Polytopes of Constant Corank

We show that the pivoting process associated with one line and n points ...
research
09/20/2018

Small Uncolored and Colored Choice Dictionaries

A choice dictionary can be initialized with a parameter n∈N and subseque...
research
01/23/2020

O(loglog n) Worst-Case Local Decoding and Update Efficiency for Data Compression

This paper addresses the problem of data compression with local decoding...
