## What is the Principle of Maximum Entropy?

The principle of maximum entropy is a model-building rule: among all probability distributions that are consistent with what is actually known (for example, a known mean), select the one with the largest entropy, that is, the least predictable one. The goal is to keep the prior assumption as uniform, or as uncertain, as the known constraints allow, so that subjective bias is minimized in the model’s results.

For example, if only the mean of a certain quantity is known (the average outcome over long-term trials), then many different probability distributions are consistent with that information. A researcher might be tempted to choose a familiar form such as the normal distribution, but doing so quietly assumes more than is known, such as a specific variance and symmetry about the mean. Under the maximum entropy principle, the researcher should instead choose the distribution with the highest entropy among all those that match the known mean, since it adds the fewest assumptions beyond the available information. For a non-negative quantity with a known mean, for instance, that choice is the exponential distribution.
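As a concrete illustration of picking a distribution when only the mean is known, the sketch below works through a six-sided die whose long-run average is assumed to be 4.5 instead of the fair value of 3.5. It relies on the standard result that the maximum entropy distribution under a mean constraint has the form p_i ∝ exp(λ·x_i), and it finds λ by bisection. The function names, the target mean of 4.5, and the bisection bounds are illustrative assumptions, not part of any particular library.

```python
import math

FACES = [1, 2, 3, 4, 5, 6]

def maxent_distribution(target_mean, faces=FACES, tol=1e-12):
    """Maximum entropy probabilities over `faces` subject to a fixed mean.

    Assumes target_mean lies strictly between min(faces) and max(faces).
    """
    def probs(lam):
        # Solution form p_i proportional to exp(lam * x_i).
        # Subtract the largest exponent before exponentiating for stability.
        exps = [lam * x for x in faces]
        top = max(exps)
        weights = [math.exp(e - top) for e in exps]
        total = sum(weights)
        return [w / total for w in weights]

    def mean(lam):
        return sum(p * x for p, x in zip(probs(lam), faces))

    # The mean increases monotonically with lam, so bisection applies.
    lo, hi = -50.0, 50.0  # assumed search bounds for the multiplier
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if mean(mid) < target_mean:
            lo = mid
        else:
            hi = mid
    return probs((lo + hi) / 2)

def entropy(p):
    """Shannon entropy in nats."""
    return -sum(q * math.log(q) for q in p if q > 0)

p = maxent_distribution(4.5)
print([round(q, 4) for q in p])

# A more "opinionated" distribution with the same mean of 4.5:
# all mass split evenly between faces 4 and 5.
narrow = [0, 0, 0, 0.5, 0.5, 0]
print(entropy(p), entropy(narrow), math.log(6))
```

Both `p` and `narrow` have a mean of 4.5, but `p` spreads probability over every face and so has the higher entropy. With a target mean of 3.5 the same routine returns the uniform distribution, whose entropy is log 6.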

### Common Probability Distribution Parameterizations in Machine Learning:

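Whether a model is fit with Bayesian or frequentist inference, its results can differ widely depending on which distribution family is assumed and how that family is parameterized. The families below are common choices, listed with the number of parameters in their standard parameterization.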

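- Bernoulli distribution – one parameter (success probability p)
- Beta distribution – two parameters (shape parameters α and β)
- Binomial distribution – two parameters (number of trials n, success probability p)
- Exponential distribution – one parameter (rate λ)
- Gamma distribution – two parameters (shape and rate, or shape and scale)
- Geometric distribution – one parameter (success probability p)
- Gaussian (normal) distribution – two parameters (mean μ and variance σ²)
- Lognormal distribution – two parameters (mean μ and standard deviation σ of the logarithm)
- Negative binomial distribution – two parameters (number of successes r, success probability p)
- Poisson distribution – one parameter (rate λ)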
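
To tie this list back to the maximum entropy principle, the sketch below compares the differential entropy of three candidate distributions for a non-negative quantity, all forced to share the same mean. It uses the textbook closed-form entropies (in nats) for the exponential, the uniform on [0, 2·mean], and the lognormal. The helper names, the mean of 2.0, and the choice of candidates are assumptions made for illustration.

```python
import math

def exponential_entropy(mean):
    # Exponential distribution written in terms of its mean (rate = 1 / mean).
    return 1.0 + math.log(mean)

def uniform_entropy(mean):
    # Uniform on [0, 2 * mean], which has the requested mean.
    return math.log(2.0 * mean)

def lognormal_entropy(mean, sigma):
    # Pick the log-scale location mu so that exp(mu + sigma^2 / 2) equals `mean`.
    mu = math.log(mean) - sigma ** 2 / 2.0
    return mu + 0.5 * math.log(2.0 * math.pi * math.e * sigma ** 2)

mean = 2.0  # assumed known mean
candidates = {
    "exponential": exponential_entropy(mean),
    "uniform on [0, 2*mean]": uniform_entropy(mean),
    "lognormal (sigma=0.5)": lognormal_entropy(mean, 0.5),
    "lognormal (sigma=1.0)": lognormal_entropy(mean, 1.0),
}

# Print candidates from highest to lowest entropy.
for name, h in sorted(candidates.items(), key=lambda kv: -kv[1]):
    print(f"{name:24s} {h:.4f}")
```

Among these candidates the exponential has the largest entropy, which is consistent with it being the maximum entropy choice when only the mean of a non-negative quantity is known. Adding a second constraint, such as a known variance, would change which family wins.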