On Modeling Profiles instead of Values

07/11/2012
by Alon Orlitsky, et al.

We consider the problem of estimating the distribution underlying an observed sample of data. Instead of maximum likelihood, which maximizes the probability of the observed values, we propose a different estimate, the high-profile distribution, which maximizes the probability of the observed profile, that is, the number of symbols appearing any given number of times. We determine the high-profile distribution of several data samples, establish some of its general properties, and show that when the number of distinct symbols observed is small compared to the data size, the high-profile and maximum-likelihood distributions are roughly the same, but when the number of symbols is large, the distributions differ, and high-profile better explains the data.
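To make the notion of a profile concrete, the following Python sketch computes the profile of a sample and, for very small cases, the probability that a given distribution produces that profile. The function names `profile` and `profile_probability` are illustrative assumptions, not taken from the paper, and the brute-force enumeration is only a stand-in for illustration, not the authors' method for finding the high-profile distribution.

```python
from collections import Counter
from itertools import product
from math import prod


def profile(sample):
    """Return {m: number of distinct symbols appearing exactly m times}."""
    multiplicities = Counter(sample)  # symbol -> number of occurrences
    return dict(sorted(Counter(multiplicities.values()).items()))


def profile_probability(dist, target, n):
    """Probability that n i.i.d. draws from `dist` have profile `target`.

    Brute force over all len(dist)**n sequences, so only usable for
    tiny alphabets and sample sizes (hypothetical helper for illustration).
    """
    symbols = list(dist)
    total = 0.0
    for seq in product(symbols, repeat=n):
        if profile(seq) == target:
            total += prod(dist[s] for s in seq)
    return total


# Profile of a sample: two symbols appear once, two appear twice, one appears five times.
print(profile("abracadabra"))

# A sample of three distinct symbols has profile {1: 3}.
observed = profile("abc")
empirical = {s: 1 / 3 for s in "abc"}    # maximum-likelihood (empirical) distribution
larger = {s: 1 / 6 for s in "abcdef"}    # uniform over a larger, assumed alphabet
print(profile_probability(empirical, observed, 3))
print(profile_probability(larger, observed, 3))
```

In this toy case the uniform distribution over six symbols assigns the all-distinct profile a higher probability than the empirical distribution over the three observed symbols, which is the kind of gap between the two estimates that the abstract describes when many distinct symbols are observed.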


research
04/07/2020

The Optimality of Profile Maximum Likelihood in Estimating Sorted Discrete Distributions

A striking result of [Acharya et al. 2017] showed that to estimate symme...
research
12/19/2017

Approximate Profile Maximum Likelihood

We propose an efficient algorithm for approximate computation of the pro...
research
04/06/2020

The Bethe and Sinkhorn Permanents of Low Rank Matrices and Implications for Profile Maximum Likelihood

In this paper we consider the problem of computing the likelihood of the...
research
11/05/2020

Instance Based Approximations to Profile Maximum Likelihood

In this paper we provide a new efficient algorithm for approximately com...
research
02/26/2020

Profile Entropy: A Fundamental Measure for the Learnability and Compressibility of Discrete Distributions

The profile of a sample is the multiset of its symbol frequencies. We sh...
research
01/13/2018

Is profile likelihood a true likelihood? An argument in favor

Profile likelihood is the key tool for dealing with nuisance parameters ...
