Computing Accurate Probabilistic Estimates of One-D Entropy from Equiprobable Random Samples

02/25/2021
by Hoshin V Gupta, et al.

We develop a simple Quantile Spacing (QS) method for accurate probabilistic estimation of one-dimensional entropy from equiprobable random samples, and compare it with the popular Bin-Counting (BC) method. In contrast to BC, which uses equal-width bins with varying probability mass, the QS method uses estimates of the quantiles that divide the support of the data-generating probability density function (pdf) into equal-probability-mass intervals. Whereas BC requires optimal tuning of a bin-width hyperparameter whose value varies with sample size and shape of the pdf, QS requires only specification of the number of quantiles to be used. Results indicate, for the class of distributions tested, that the optimal number of quantile spacings is a fixed fraction of the sample size (empirically determined to be 0.25-0.35), and that this fraction is relatively insensitive to distributional form or sample size. This provides a clear advantage over BC, since hyperparameter tuning is not required. Bootstrapping is used to approximate the sampling variability distribution of the resulting entropy estimate, and is shown to accurately reflect the true uncertainty. For the four distributional forms studied (Gaussian, Log-Normal, Exponential and Bimodal Gaussian Mixture), expected estimation bias is less than 1% even for small sample sizes. We speculate that estimating quantile locations, rather than bin probabilities, results in more efficient use of the information in the data to approximate the underlying shape of an unknown data-generating pdf.
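To make the contrast between the two partitioning schemes concrete, here is a minimal Python sketch. It is not the authors' procedure: it assumes a piecewise-constant density on each interval, takes plain sample quantiles from `numpy.quantile` as the interval edges, and uses the sample minimum and maximum as the support bounds. The function names `qs_entropy` and `bc_entropy` and the parameters `frac` and `bins` are hypothetical. With K equal-mass intervals of widths Δ_k, the piecewise-constant assumption gives H ≈ (1/K) Σ ln(K Δ_k); with equal-width bins of width w and empirical masses p_i, it gives H ≈ -Σ p_i ln(p_i / w).

```python
import numpy as np


def qs_entropy(x, frac=0.3):
    """Equal-probability-mass (quantile spacing) entropy sketch, in nats.

    Assumption: raw sample quantiles are used as interval edges; the
    paper's own quantile estimation may differ from this.
    """
    x = np.asarray(x, dtype=float)
    k = max(2, int(round(frac * x.size)))        # number of equal-mass intervals
    edges = np.quantile(x, np.linspace(0.0, 1.0, k + 1))
    widths = np.diff(edges)
    # Guard against zero widths caused by tied values (continuous data assumed).
    widths = np.maximum(widths, np.finfo(float).tiny)
    # Each interval holds mass 1/k, so the density there is 1 / (k * width).
    return float(np.mean(np.log(k * widths)))


def bc_entropy(x, bins=30):
    """Equal-width (bin counting) entropy sketch, in nats."""
    x = np.asarray(x, dtype=float)
    counts, edges = np.histogram(x, bins=bins)
    p = counts / counts.sum()                    # probability mass per bin
    w = edges[1] - edges[0]                      # common bin width
    nz = p > 0                                   # empty bins contribute nothing
    return float(-np.sum(p[nz] * np.log(p[nz] / w)))


# Usage: compare both sketches with the analytic entropy of a standard Gaussian.
rng = np.random.default_rng(0)
sample = rng.standard_normal(5000)
true_h = 0.5 * np.log(2.0 * np.pi * np.e)
print("true:", true_h, "QS:", qs_entropy(sample), "BC:", bc_entropy(sample))
```

Because this sketch uses raw sample quantiles, each spacing is noisy and the estimate tends to be biased low; the bias figures quoted in the abstract refer to the authors' method, not to this simplified version. Note that `bins` must still be chosen by hand for `bc_entropy`, which is the tuning burden the abstract contrasts with specifying a number of quantiles.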


