Estimating Entropy of Distributions in Constant Space

by Jayadev Acharya, et al.

We consider the task of estimating the entropy of k-ary distributions from samples in the streaming model, where space is limited. Our main contribution is an algorithm that requires O(k log^2(1/ε) / ε^3) samples, uses only a constant number (O(1)) of memory words, and outputs an additive ±ε estimate of the entropy H(p). Without space limitations, the sample complexity of this problem is known to be S(k,ε) = Θ(k/(ε log k) + log^2(k)/ε^2), which is sub-linear in the domain size k, but the existing algorithms that achieve this optimal sample complexity also require nearly linear space in k. Our algorithm partitions [0,1] into intervals and estimates the entropy contribution of probability values in each interval; the intervals are designed to trade off the bias and variance of these estimates.
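
To make the high-level idea concrete, here is a minimal Python sketch of a "tag a sample, estimate its probability from a subsequent window, average -log p̂" estimator, which illustrates how entropy can be approximated in constant space since H(p) = E[-log p_X] for X ~ p. This is not the paper's algorithm: the function names (stream_entropy_sketch, sampler) and the parameters num_trials and window are illustrative assumptions, and this naive version is biased exactly in the way the paper's interval partition of [0,1] is designed to control.

```python
import math
import random

def stream_entropy_sketch(sample_stream, num_trials=2000, window=500):
    """Illustrative constant-space entropy estimator (simplified sketch).

    For each trial, draw one 'tagged' sample X from the stream, then count
    how often X reappears among the next `window` samples. The fraction
    count/window is a crude estimate of p_X, and averaging -log2(p_hat)
    over trials approximates H(p) = E[-log2 p_X]. Only O(1) words are
    stored at any time. Skipping trials with count == 0 and plugging in
    p_hat directly both introduce bias; the paper instead buckets the
    estimated probability into carefully chosen intervals of [0,1] and
    balances bias against variance per interval.
    """
    total, used = 0.0, 0
    for _ in range(num_trials):
        x = next(sample_stream)  # tagged sample
        count = sum(1 for _ in range(window) if next(sample_stream) == x)
        if count == 0:
            continue  # p_X too small to resolve at this window length
        p_hat = count / window
        total += -math.log2(p_hat)
        used += 1
    return total / used if used else float("nan")

def sampler(probs):
    """Infinite stream of i.i.d. samples from a distribution over k symbols."""
    symbols = list(range(len(probs)))
    while True:
        yield random.choices(symbols, weights=probs)[0]

# Example usage: true entropy of this distribution is 1.75 bits.
p = [0.5, 0.25, 0.125, 0.125]
print(stream_entropy_sketch(sampler(p)))
```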

