Maximum Entropy competes with Maximum Likelihood

by A. E. Allahverdyan et al.

The maximum entropy (MAXENT) method has a large number of applications in theoretical and applied machine learning, since it provides a convenient non-parametric tool for estimating unknown probabilities. The method is a major contribution of statistical physics to probabilistic inference. However, a systematic approach to its validity limits is currently missing. Here we study MAXENT in a Bayesian decision-theory setup, i.e. assuming that there exists a well-defined prior Dirichlet density for the unknown probabilities, and that the average Kullback-Leibler (KL) distance can be employed for deciding on the quality and applicability of various estimators. This setup allows us to evaluate the relevance of various MAXENT constraints, check the method's general applicability, and compare MAXENT with estimators having various degrees of dependence on the prior, viz. the regularized maximum likelihood (ML) estimator and the Bayesian estimator. We show that MAXENT applies in sparse-data regimes, but needs specific types of prior information. In particular, MAXENT can outperform the optimally regularized ML provided that there are prior rank correlations between the estimated random quantity and its probabilities.
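The comparison described in the abstract can be sketched numerically. The snippet below is a minimal illustration, not the paper's actual protocol: the Dirichlet parameters, the add-constant smoothing level, and the choice of a single mean constraint for MAXENT are all assumptions made here for the sketch. It samples a true probability vector from a Dirichlet prior, draws a sparse sample, forms three estimators (regularized ML, Bayesian posterior mean, and MAXENT matching the empirical mean), and compares their average KL distance to the truth.

```python
import numpy as np

rng = np.random.default_rng(0)

def kl(p, q):
    # KL divergence D(p || q); both arguments are strictly positive here
    return float(np.sum(p * np.log(p / q)))

def maxent_mean_constraint(values, target_mean):
    # MAXENT distribution p_i ∝ exp(lam * x_i) whose mean is closest to
    # target_mean; lam is found by a crude grid search (illustration only)
    best_err, best_p = np.inf, None
    for lam in np.linspace(-5.0, 5.0, 2001):
        w = np.exp(lam * (values - values.max()))  # shift for stability
        p = w / w.sum()
        err = abs(p @ values - target_mean)
        if err < best_err:
            best_err, best_p = err, p
    return best_p

m, n, trials = 8, 10, 200       # alphabet size, sample size (sparse regime), trials
x = np.arange(m, dtype=float)   # values of the estimated random quantity
alpha = np.ones(m)              # assumed Dirichlet prior parameters (uniform)

kl_ml, kl_bayes, kl_me = [], [], []
for _ in range(trials):
    p_true = rng.dirichlet(alpha)
    counts = rng.multinomial(n, p_true)
    # regularized ML: add-constant smoothing with constant 1/2 (assumed)
    p_ml = (counts + 0.5) / (n + 0.5 * m)
    # Bayesian estimator: posterior mean under the Dirichlet prior
    p_bayes = (counts + alpha) / (n + alpha.sum())
    # MAXENT: constrain only the empirical mean of x
    p_me = maxent_mean_constraint(x, counts @ x / n)
    kl_ml.append(kl(p_true, p_ml))
    kl_bayes.append(kl(p_true, p_bayes))
    kl_me.append(kl(p_true, p_me))

print("avg KL  ML:", np.mean(kl_ml),
      " Bayes:", np.mean(kl_bayes),
      " MAXENT:", np.mean(kl_me))
```

Under this symmetric prior there is no rank correlation between x and its probabilities, so one should not expect MAXENT to win here; the point of the sketch is only to show how "average KL distance" serves as the decision criterion across estimators.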




