Instance Based Approximations to Profile Maximum Likelihood

by Nima Anari, et al.

We provide a new efficient algorithm for approximately computing the profile maximum likelihood (PML) distribution, a prominent quantity in symmetric property estimation. The algorithm matches the best previously known efficient algorithms for computing approximate PML distributions, and improves on them when the number of distinct observed frequencies in the given instance is small. We achieve this result by exploiting a new sparsity structure in approximate PML distributions and by providing a new matrix rounding algorithm, which is of independent interest. Leveraging this result, we obtain the first provably computationally efficient implementation of PseudoPML, a general framework for estimating a broad class of symmetric properties. Additionally, we obtain efficient PML-based estimators for distributions with small profile entropy, a natural instance-based complexity measure. Finally, we provide a simpler and more practical PseudoPML implementation that matches the best-known theoretical guarantees of such an estimator, and we evaluate this method empirically.
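To make the instance parameter concrete, the following is a minimal illustrative sketch of how the profile of a sample and its number of distinct observed frequencies could be computed. It is not the paper's algorithm. The function names `profile` and `num_distinct_frequencies` are hypothetical, and the profile is taken to be the multiset of symbol frequencies in the sample.

```python
from collections import Counter


def profile(sample):
    """Return the profile of a sample as a Counter mapping
    frequency -> number of distinct symbols observed with that frequency.

    Hypothetical helper for illustration only.
    """
    symbol_counts = Counter(sample)          # symbol -> how often it appears
    return Counter(symbol_counts.values())   # frequency -> how many symbols have it


def num_distinct_frequencies(sample):
    """Number of distinct observed frequencies in the sample."""
    return len(profile(sample))


if __name__ == "__main__":
    sample = "abracadabra"
    # a appears 5 times, b and r twice each, c and d once each
    print(dict(profile(sample)))
    print(num_distinct_frequencies(sample))
```

Because the profile discards symbol labels, relabeling the symbols of a sample leaves it unchanged, which is why it is the natural statistic for symmetric properties.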

