Efficient Representation of Large-Alphabet Probability Distributions

05/08/2022
by Aviv Adler, et al.

A number of engineering and scientific problems require representing and manipulating probability distributions over large alphabets, which we may think of as long vectors of reals summing to 1. In some cases it is required to represent such a vector with only b bits per entry. A natural choice is to partition the interval [0,1] into 2^b uniform bins and quantize each entry to its bin independently. We show that a minor modification of this procedure, applying an entrywise non-linear function (compander) f(x) prior to quantization, yields an extremely effective quantization method. For example, for b=8 (16) and 10^5-sized alphabets, the quality of representation improves from a loss (under KL divergence) of 0.5 (0.1) bits/entry to 10^-4 (10^-9) bits/entry. Compared to floating-point representations, our compander method improves the loss from 10^-1 (10^-6) to 10^-4 (10^-9) bits/entry. These numbers hold both for real-world data (word frequencies in books and DNA k-mer counts) and for synthetic randomly generated distributions. Theoretically, we set up a minimax optimality criterion and show that the compander f(x) ∝ ArcSinh(√((1/2) (K log K) x)) achieves near-optimal performance, attaining a KL-quantization loss of ≍ 2^-2b log^2 K for a K-letter alphabet as b → ∞. Interestingly, a similar minimax criterion for the quadratic loss on the hypercube shows optimality of the standard uniform quantizer. This suggests that the ArcSinh quantizer is as fundamental for KL distortion as the uniform quantizer is for quadratic distortion.
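
The pipeline described in the abstract (compand each entry, quantize uniformly in the companded domain, invert the compander, renormalize) is concrete enough to sketch. Below is a minimal NumPy illustration, assuming the ArcSinh compander is scaled so that f(1) = 1, that bins are reconstructed at their midpoints, and that the dequantized vector is renormalized to sum to 1; the paper's exact reconstruction rule may differ, and all function names are illustrative.

```python
# Minimal sketch of compander-based quantization of a probability vector.
# Assumptions (not specified in the abstract): the compander is scaled so
# f(1) = 1, bins are reconstructed at their midpoints, and the dequantized
# vector is renormalized to sum to 1.
import numpy as np

def compand(x, K):
    """ArcSinh compander f(x) ∝ arcsinh(sqrt((1/2) K log(K) x)), scaled to [0, 1]."""
    c = 0.5 * K * np.log(K)
    return np.arcsinh(np.sqrt(c * x)) / np.arcsinh(np.sqrt(c))

def expand(y, K):
    """Inverse of compand."""
    c = 0.5 * K * np.log(K)
    return np.sinh(y * np.arcsinh(np.sqrt(c))) ** 2 / c

def quantize_compander(p, b, K):
    """Quantize each entry to one of 2^b uniform bins in the companded domain."""
    n_bins = 2 ** b
    bins = np.clip(np.floor(compand(p, K) * n_bins), 0, n_bins - 1)
    q = expand((bins + 0.5) / n_bins, K)   # midpoint reconstruction
    return q / q.sum()                     # renormalize to a probability vector

def quantize_uniform(p, b):
    """Baseline from the abstract: 2^b uniform bins directly on [0, 1]."""
    n_bins = 2 ** b
    bins = np.clip(np.floor(p * n_bins), 0, n_bins - 1)
    q = (bins + 0.5) / n_bins
    return q / q.sum()

def kl_bits(p, q):
    """KL divergence D(p || q) in bits."""
    nz = p > 0
    return float(np.sum(p[nz] * np.log2(p[nz] / q[nz])))

# Example: a random distribution over a K = 10^5 letter alphabet, b = 8 bits/entry.
rng = np.random.default_rng(0)
K, b = 10 ** 5, 8
p = rng.dirichlet(np.ones(K))
print(f"uniform loss:   {kl_bits(p, quantize_uniform(p, b)):.3e} bits")
print(f"compander loss: {kl_bits(p, quantize_compander(p, b, K)):.3e} bits")
```

Running both quantizers on the same synthetic distribution makes the abstract's comparison tangible: the uniform baseline collapses the many tiny entries into the lowest bins, while the compander spends its resolution near 0, where most of the probability mass of a large-alphabet distribution lives.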
