Missing Mass Estimation from Sticky Channels

by Prafulla Chandra, et al.

Distribution estimation under error-prone or non-ideal sampling, modelled as "sticky" channels, has been studied recently, motivated by applications such as DNA computing. Missing mass, the sum of the probabilities of the letters that do not appear in the sample, is an important quantity that plays a crucial role in distribution estimation, particularly in the large-alphabet regime. In this work, we consider the problem of estimating missing mass, which has been well studied under independent and identically distributed (i.i.d.) sampling, in the case when sampling is "sticky". Precisely, we consider the scenario where each sample from an unknown distribution is repeated a geometrically distributed number of times. We characterise the minimax rate of the Mean Squared Error (MSE) of estimating missing mass from such sticky sampling channels. An upper bound on the minimax rate is obtained by bounding the risk of a modified Good-Turing estimator. We derive a matching lower bound on the minimax rate by extending the Le Cam method.
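To make the sampling model concrete, the following is a minimal simulation sketch, not the authors' code. It assumes the repetition count is geometric on {1, 2, ...}, with a hypothetical parameter `stay` giving the probability of one more repeat; the abstract does not fix this parametrisation. The function `naive_good_turing` is the classical Good-Turing estimate (fraction of singletons) applied directly to the sticky output. It is included only as a baseline for comparison and is not the modified estimator analysed in the paper, whose form is not given here.

```python
import random
from collections import Counter

def sticky_sample(probs, n, stay, rng):
    """Draw n i.i.d. letters from probs; emit each one 1 + K times,
    where K counts successive 'repeat' coin flips with probability stay.
    (Assumed parametrisation: total repeats are geometric on 1, 2, ...)"""
    letters = rng.choices(range(len(probs)), weights=probs, k=n)
    out = []
    for x in letters:
        out.append(x)                 # every drawn letter appears at least once
        while rng.random() < stay:    # each extra copy occurs with probability stay
            out.append(x)
    return out

def missing_mass(probs, observed):
    """Sum of the probabilities of letters that never appear in observed."""
    seen = set(observed)
    return sum(p for i, p in enumerate(probs) if i not in seen)

def naive_good_turing(observed):
    """Classical Good-Turing estimate: (# letters seen exactly once) / length.
    Baseline only; it ignores the stickiness of the channel."""
    counts = Counter(observed)
    singletons = sum(1 for c in counts.values() if c == 1)
    return singletons / len(observed)

if __name__ == "__main__":
    rng = random.Random(0)
    k = 1000                          # hypothetical alphabet size
    probs = [1.0 / k] * k             # uniform distribution, for illustration
    y = sticky_sample(probs, n=500, stay=0.5, rng=rng)
    print("output length:", len(y))
    print("true missing mass:", missing_mass(probs, y))
    print("naive Good-Turing:", naive_good_turing(y))
```

Because repeats inflate counts, few letters appear exactly once in the sticky output, so the naive estimate tends to fall well below the true missing mass. This is one way to see why a modification of Good-Turing is needed in this setting.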



