The Broad Optimality of Profile Maximum Likelihood

06/10/2019
by   Yi Hao, et al.

We study three fundamental statistical-learning problems: distribution estimation, property estimation, and property testing. We establish the profile maximum likelihood (PML) estimator as the first unified sample-optimal approach to a wide range of learning tasks. In particular, for every alphabet size k and desired accuracy ε:

Distribution estimation: Under ℓ_1 distance, PML yields the optimal Θ(k/(ε^2 log k)) sample complexity for sorted-distribution estimation, and a PML-based estimator empirically outperforms the Good-Turing estimator on the actual distribution.

Additive property estimation: For a broad class of additive properties, the PML plug-in estimator uses just four times the sample size required by the best estimator to achieve roughly twice its error, with exponentially higher confidence.

α-Rényi entropy estimation: For integer α>1, the PML plug-in estimator has the optimal k^(1-1/α) sample complexity; for non-integer α>3/4, the PML plug-in estimator has sample complexity lower than the state of the art.

Identity testing: In testing whether an unknown distribution is equal to, or at least ε-far in ℓ_1 distance from, a given distribution, a PML-based tester achieves the optimal sample complexity up to logarithmic factors of k.

With minor modifications, most of these results also hold for a near-linear-time computable variant of PML.
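To make the objects in the abstract concrete, here is a minimal Python sketch of three of them: the profile of a sample (how many distinct symbols appear once, twice, and so on), the sorted ℓ_1 distance used in sorted-distribution estimation, and the α-Rényi entropy of a known distribution. The function names are hypothetical and chosen for illustration. The sketch does not compute the PML distribution itself, which is the distribution maximizing the probability of the observed profile; a plug-in estimator would evaluate a property such as `renyi_entropy` on that maximizer.

```python
from collections import Counter
import math


def profile(sample):
    """Map each multiplicity to the number of distinct symbols seen that many times."""
    multiplicities = Counter(sample)           # symbol -> number of occurrences
    return dict(Counter(multiplicities.values()))


def sorted_l1_distance(p, q):
    """l_1 distance between two probability vectors after sorting both.

    Sorting discards symbol labels, so only the multiset of probabilities matters.
    The shorter vector is padded with zeros (an assumption of this sketch).
    """
    n = max(len(p), len(q))
    ps = sorted(list(p) + [0.0] * (n - len(p)), reverse=True)
    qs = sorted(list(q) + [0.0] * (n - len(q)), reverse=True)
    return sum(abs(a - b) for a, b in zip(ps, qs))


def renyi_entropy(p, alpha):
    """alpha-Renyi entropy (natural log) of a probability vector, for alpha != 1."""
    if alpha == 1:
        raise ValueError("alpha = 1 is the Shannon limit and is not handled here")
    power_sum = sum(x ** alpha for x in p if x > 0)
    return math.log(power_sum) / (1 - alpha)
```

For example, `profile("abracadabra")` records that one symbol appears five times, two symbols appear twice, and two symbols appear once. Two samples with the same profile are indistinguishable to any estimator of a symmetric property, which is why the profile is a natural statistic to maximize likelihood over.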


research
08/27/2020

On the High Accuracy Limitation of Adaptive Property Estimation

Recent years have witnessed the success of adaptive (or unified) approac...
research
10/13/2022

On the Efficient Implementation of High Accuracy Optimality of Profile Maximum Likelihood

We provide an efficient unified plug-in approach for estimating symmetri...
research
03/04/2019

Data Amplification: Instance-Optimal Property Estimation

The best-known and most commonly used distribution-property estimation t...
research
11/08/2019

Unified Sample-Optimal Property Estimation in Near-Linear Time

We consider the fundamental learning problem of estimating properties of...
research
08/02/2022

Bias Reduction for Sum Estimation

In classical statistics and distribution testing, it is often assumed th...
research
02/23/2018

Local moment matching: A unified methodology for symmetric functional estimation and distribution estimation under Wasserstein distance

We present Local Moment Matching (LMM), a unified methodology for symmet...
research
02/21/2020

Practical Estimation of Renyi Entropy

Entropy Estimation is an important problem with many applications in cry...
