Data Amplification: A Unified and Competitive Approach to Property Estimation

03/29/2019
by Yi Hao, et al.

Estimating properties of discrete distributions is a fundamental problem in statistical learning. We design the first unified, linear-time, competitive property estimator that, for a wide class of properties and for all underlying distributions, uses just 2n samples to achieve the performance attained by the empirical estimator with n√(log n) samples. This provides off-the-shelf, distribution-independent "amplification" of the amount of data available relative to common-practice estimators. We illustrate the estimator's practical advantages by comparing it to existing estimators for a wide variety of properties and distributions. In most cases, its performance with n samples is even as good as that of the empirical estimator with n log n samples, and for essentially all properties, its performance is comparable to that of the best existing estimator designed specifically for that property.
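For readers unfamiliar with the baseline mentioned in the abstract, the following is a minimal sketch of an empirical (plug-in) estimator: it computes a property of the observed sample frequencies as if they were the true distribution. Shannon entropy and support size are used here only as illustrative properties, and the function names are hypothetical. This is not the estimator proposed in the paper, only the common-practice baseline it is compared against.

```python
import math
from collections import Counter


def empirical_entropy(samples):
    """Plug-in estimate of Shannon entropy (in nats).

    Illustrative baseline only: evaluates the entropy formula on the
    empirical frequencies of the sample.
    """
    n = len(samples)
    if n == 0:
        raise ValueError("need at least one sample")
    counts = Counter(samples)
    # Sum -p * log(p) over the observed symbols, with p = count / n.
    return -sum((c / n) * math.log(c / n) for c in counts.values())


def empirical_support_size(samples):
    """Plug-in estimate of support size: the number of distinct symbols seen."""
    return len(set(samples))
```

Symbols that never appear in the sample contribute nothing to either estimate, which is one reason plug-in estimators can need many samples when the alphabet is large.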

research
03/04/2019

Data Amplification: Instance-Optimal Property Estimation

The best-known and most commonly used distribution-property estimation t...
research
07/03/2020

Monotonicity preservation properties of kernel regression estimators

Three common classes of kernel regression estimators are considered: the...
research
03/23/2018

Determinantal Point Processes for Coresets

When one is faced with a dataset too large to be used all at once, an ob...
research
08/27/2020

On the High Accuracy Limitation of Adaptive Property Estimation

Recent years have witnessed the success of adaptive (or unified) approac...
research
02/26/2020

Profile Entropy: A Fundamental Measure for the Learnability and Compressibility of Discrete Distributions

The profile of a sample is the multiset of its symbol frequencies. We sh...
research
07/09/2023

Bayesian estimation of the Kullback-Leibler divergence for categorical sytems using mixtures of Dirichlet priors

In many applications in biology, engineering and economics, identifying ...
research
06/25/2019

Distribution-robust mean estimation via smoothed random perturbations

We consider the problem of mean estimation assuming only finite variance...
