Unified Sample-Optimal Property Estimation in Near-Linear Time

by   Yi Hao, et al.

We consider the fundamental learning problem of estimating properties of distributions over large domains. Using a novel piecewise-polynomial approximation technique, we derive the first unified methodology for constructing sample- and time-efficient estimators for all sufficiently smooth, symmetric and non-symmetric, additive properties. This technique yields near-linear-time computable estimators whose approximation values are asymptotically optimal and highly-concentrated, resulting in the first: 1) estimators achieving the O(k/(ε^2log k)) min-max ε-error sample complexity for all k-symbol Lipschitz properties; 2) unified near-optimal differentially private estimators for a variety of properties; 3) unified estimator achieving optimal bias and near-optimal variance for five important properties; 4) near-optimal sample-complexity estimators for several important symmetric properties over both domain sizes and confidence levels. In addition, we establish a McDiarmid's inequality under Poisson sampling, which is of independent interest.


page 1

page 2

page 3

page 4


The Broad Optimality of Profile Maximum Likelihood

We study three fundamental statistical-learning problems: distribution e...

The Optimality of Profile Maximum Likelihood in Estimating Sorted Discrete Distributions

A striking result of [Acharya et al. 2017] showed that to estimate symme...

Local moment matching: A unified methodology for symmetric functional estimation and distribution estimation under Wasserstein distance

We present Local Moment Matching (LMM), a unified methodology for symmet...

Data Amplification: A Unified and Competitive Approach to Property Estimation

Estimating properties of discrete distributions is a fundamental problem...

Data Amplification: Instance-Optimal Property Estimation

The best-known and most commonly used distribution-property estimation t...

On the High Accuracy Limitation of Adaptive Property Estimation

Recent years have witnessed the success of adaptive (or unified) approac...

Asymptotic normality of a linear threshold estimator in fixed dimension with near-optimal rate

Linear thresholding models postulate that the conditional distribution o...

Code Repositories