Missing Mass Estimation from Sticky Channels

02/06/2022
by Prafulla Chandra, et al.

Distribution estimation under error-prone or non-ideal sampling, modelled as "sticky" channels, has been studied recently, motivated by applications such as DNA computing. Missing mass, the sum of the probabilities of the letters that do not appear in the sample, plays a crucial role in distribution estimation, particularly in the large-alphabet regime. In this work, we consider the problem of estimating missing mass, which has been well studied under independent and identically distributed (i.i.d.) sampling, in the case where sampling is "sticky". Precisely, we consider the scenario where each sample from an unknown distribution is repeated a geometrically distributed number of times. We characterise the minimax rate of the mean squared error (MSE) of estimating missing mass from such sticky sampling channels. An upper bound on the minimax rate is obtained by bounding the risk of a modified Good-Turing estimator. We derive a matching lower bound on the minimax rate by extending the Le Cam method.
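
As a rough illustration of the setting, the following Python sketch simulates a sticky channel and computes the true missing mass of the output. For comparison it also computes the classical Good-Turing estimate (the fraction of letters seen exactly once), which is not the modified estimator analysed in the paper. The function names, the `stay` parameter, and the choice of a geometric repeat count supported on 1, 2, ... are all assumptions made for this example.

```python
import random
from collections import Counter


def sticky_samples(probs, n, stay, rng):
    """Draw n i.i.d. letters and pass them through a sticky channel.

    Assumption: each letter is emitted once and then repeated again with
    probability `stay`, so the run length is Geometric on {1, 2, ...}.
    """
    letters = rng.choices(range(len(probs)), weights=probs, k=n)
    out = []
    for x in letters:
        out.append(x)
        while rng.random() < stay:
            out.append(x)
    return out


def missing_mass(probs, observed):
    """Sum of the probabilities of letters that never appear in `observed`."""
    seen = set(observed)
    return sum(p for i, p in enumerate(probs) if i not in seen)


def good_turing(observed):
    """Classical Good-Turing estimate: (# letters seen exactly once) / (sample length)."""
    if not observed:
        return 1.0  # nothing observed, so all mass is missing
    counts = Counter(observed)
    singletons = sum(1 for c in counts.values() if c == 1)
    return singletons / len(observed)


if __name__ == "__main__":
    rng = random.Random(0)
    k = 1000
    probs = [1.0 / k] * k  # uniform distribution over a large alphabet
    y = sticky_samples(probs, n=500, stay=0.5, rng=rng)
    print("true missing mass:   ", missing_mass(probs, y))
    print("classical Good-Turing:", good_turing(y))
```

Because repeats turn would-be singletons into longer runs, the classical estimate computed directly on the sticky output can be expected to drift away from the true missing mass as `stay` grows, which gives some intuition for why a modified estimator is needed.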

research
06/25/2018

On consistent estimation of the missing mass

Given n samples from a population of individuals belonging to different ...
research
02/03/2021

Missing Mass of Rank-2 Markov Chains

Estimation of missing mass with the popular Good-Turing (GT) estimator i...
research
06/04/2022

Concentration of the missing mass in metric spaces

We study the estimation of the probability to observe data further than ...
research
06/26/2023

Optimal estimation of high-order missing masses, and the rare-type match problem

Consider a random sample (X_1,…,X_n) from an unknown discrete distributi...
research
03/31/2022

Adaptive Estimation of Random Vectors with Bandit Feedback

We consider the problem of sequentially learning to estimate, in the mea...
research
01/10/2019

Mean Estimation from One-Bit Measurements

We consider the problem of estimating the mean of a symmetric log-concav...
research
02/27/2019

A Good-Turing estimator for feature allocation models

Feature allocation models generalize species sampling models by allowing...