The von Mises Fisher (VMF) distribution (also known as the Langevin distribution [Watamori96]) is a probability distribution on the $(d-1)$-dimensional hypersphere in $\mathbb{R}^d$ [Fisher53]. If $d = 2$ the distribution reduces to the von Mises distribution on the circle, and if $d = 3$ it reduces to the Fisher distribution on a sphere. It was introduced by [Fisher53] and has been studied extensively by [Mardia14, Mardia75]. The first Bayesian analysis was in [Mardia76] and recently it has been used for clustering on a hypersphere by [Banerjee05].
We will use $\log(x)$ to denote the natural logarithm of $x$ throughout this article. Before continuing it will be useful to define the Gamma function $\Gamma(x)$,
$$\Gamma(x) = \int_0^\infty u^{x-1} e^{-u} \, du,$$
and its relation, the incomplete Gamma function ,
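and the modified Bessel function of the first kind $I_\nu(x)$,
$$I_\nu(x) = \sum_{m=0}^{\infty} \frac{1}{m! \, \Gamma(m + \nu + 1)} \left(\frac{x}{2}\right)^{2m + \nu},$$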
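The power series for the modified Bessel function of the first kind can be evaluated directly for moderate arguments. The following is a minimal sketch using only the Python standard library; the function name `bessel_i` and the truncation parameter `terms` are illustrative choices, and the code assumes $x > 0$ and order $\nu > -1$.

```python
import math


def bessel_i(nu, x, terms=200):
    """Modified Bessel function of the first kind, I_nu(x), by power series.

    Each term (x/2)^(2m+nu) / (m! * Gamma(m+nu+1)) is formed in log space
    to avoid overflow in the factorial and Gamma factors.
    Assumes x > 0, nu > -1 and a moderate x (roughly x < 100 for 200 terms).
    """
    if x <= 0:
        raise ValueError("x must be positive")
    log_half_x = math.log(x / 2.0)
    total = 0.0
    for m in range(terms):
        log_term = ((2 * m + nu) * log_half_x
                    - math.lgamma(m + 1)          # log(m!)
                    - math.lgamma(m + nu + 1))    # log(Gamma(m + nu + 1))
        total += math.exp(log_term)
    return total
```

For half-integer orders the result can be compared against elementary closed forms, for example $I_{1/2}(x) = \sqrt{2/(\pi x)}\,\sinh(x)$.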
which also has the following integral representations [Abramowitz72],
Also of interest is the logarithm of this quantity (using the second integral definition (6)),
Note that the second term does not depend on .
The Exponential Integral function is given by,
An identity that will be useful is,
2.2 The von Mises Fisher (VMF) distribution
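The probability density function (PDF) of the VMF distribution for a random $d$-dimensional unit vector $\mathbf{x}$ is given by:
$$f(\mathbf{x}; \boldsymbol{\mu}, \kappa) = C_d(\kappa) \exp\left(\kappa \boldsymbol{\mu}^\top \mathbf{x}\right),$$
where $\boldsymbol{\mu}$ is the unit-norm mean direction, $\kappa \geq 0$ is the concentration parameter, and the normalisation constant $C_d(\kappa)$ is given by,
$$C_d(\kappa) = \frac{\kappa^{d/2 - 1}}{(2\pi)^{d/2} \, I_{d/2 - 1}(\kappa)}.$$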
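To make the density concrete, the sketch below evaluates the log-PDF of a VMF distribution, assuming the standard form of the normalisation constant, $\kappa^{d/2-1} / ((2\pi)^{d/2} I_{d/2-1}(\kappa))$. The helper names (`log_bessel_i`, `vmf_log_normaliser`, `vmf_log_pdf`) are hypothetical, the Bessel function is computed from its power series, and $\kappa > 0$ of moderate size is assumed.

```python
import math


def log_bessel_i(nu, x, terms=500):
    """log I_nu(x) from the power series, combined with log-sum-exp."""
    log_half_x = math.log(x / 2.0)
    log_terms = [
        (2 * m + nu) * log_half_x - math.lgamma(m + 1) - math.lgamma(m + nu + 1)
        for m in range(terms)
    ]
    peak = max(log_terms)
    return peak + math.log(sum(math.exp(t - peak) for t in log_terms))


def vmf_log_normaliser(d, kappa):
    """Log of the VMF normalisation constant in d dimensions (kappa > 0)."""
    nu = d / 2.0 - 1.0
    return (nu * math.log(kappa)
            - (d / 2.0) * math.log(2.0 * math.pi)
            - log_bessel_i(nu, kappa))


def vmf_log_pdf(x, mu, kappa):
    """Log-density at unit vector x for mean direction mu, concentration kappa."""
    dot = sum(xi * mi for xi, mi in zip(x, mu))
    return vmf_log_normaliser(len(x), kappa) + kappa * dot
```

For $d = 3$ the normalisation constant simplifies to $\kappa / (4\pi \sinh\kappa)$, which gives a convenient sanity check on the implementation.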
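The (non-symmetric) Kullback Leibler (KL)-Divergence from one probability distribution $q(\mathbf{x})$ to another probability distribution $p(\mathbf{x})$ is defined as,
$$KL(q \,\|\, p) = \int q(\mathbf{x}) \log \frac{q(\mathbf{x})}{p(\mathbf{x})} \, d\mathbf{x}.$$
Although this is general to any two distributions, we will assume that $p(\mathbf{x})$ is the “prior” distribution and $q(\mathbf{x})$ is the “posterior” distribution, as commonly used in Bayesian analysis.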
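The asymmetry of the KL-Divergence is easiest to see in the discrete case. The sketch below assumes the convention $KL(q \,\|\, p) = \sum_i q_i \log(q_i / p_i)$, with $q$ playing the role of the posterior and $p$ the prior; the function name `kl_divergence` and the example distributions are illustrative only.

```python
import math


def kl_divergence(q, p):
    """KL(q || p) for two discrete distributions given as probability lists.

    Terms with q_i == 0 contribute nothing; if q_i > 0 where p_i == 0 the
    divergence is infinite.
    """
    if len(q) != len(p):
        raise ValueError("distributions must have the same length")
    total = 0.0
    for qi, pi in zip(q, p):
        if qi == 0.0:
            continue
        if pi == 0.0:
            return math.inf
        total += qi * math.log(qi / pi)
    return total


posterior = [0.5, 0.5]
prior = [0.9, 0.1]
forward = kl_divergence(posterior, prior)   # KL(q || p)
reverse = kl_divergence(prior, posterior)   # KL(p || q), generally different
```

Swapping the arguments changes the value, which is why the roles of prior and posterior need to be fixed before deriving a closed form.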
3 KL-Divergence for the VMF Distribution
3.1 General Case
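We will assume that we have a prior distribution $p$ and a posterior distribution $q$ defined over unit vectors $\mathbf{x} \in \mathbb{R}^d$ as follows,
$$p(\mathbf{x}) = \mathrm{VMF}(\mathbf{x}; \boldsymbol{\mu}_p, \kappa_p), \qquad q(\mathbf{x}) = \mathrm{VMF}(\mathbf{x}; \boldsymbol{\mu}_q, \kappa_q).$$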
We will now derive the KL-Divergence for two VMF distributions. The main problem in doing so will be the normalisation constants of the prior and the posterior. For prior and posterior distributions as defined above over vectors in $\mathbb{R}^d$ with $d$ odd (for even $d$ we can simply add a “null” dimension), we have
From (12), letting , , and , we have,
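As a numerical cross-check for any closed form obtained here, the KL-Divergence between two VMF distributions can also be computed from the standard mean-resultant-length expression. This is not the derivation pursued in this section: it assumes $KL(q \,\|\, p)$ with $q$ the posterior and $p$ the prior, uses the known identity $\mathbb{E}_q[\mathbf{x}] = A_d(\kappa_q)\boldsymbol{\mu}_q$ with $A_d(\kappa) = I_{d/2}(\kappa) / I_{d/2-1}(\kappa)$, and all function names are hypothetical.

```python
import math


def log_bessel_i(nu, x, terms=500):
    """log I_nu(x) from the power series, combined with log-sum-exp."""
    log_half_x = math.log(x / 2.0)
    log_terms = [
        (2 * m + nu) * log_half_x - math.lgamma(m + 1) - math.lgamma(m + nu + 1)
        for m in range(terms)
    ]
    peak = max(log_terms)
    return peak + math.log(sum(math.exp(t - peak) for t in log_terms))


def vmf_log_normaliser(d, kappa):
    """Log normalisation constant, assuming the standard VMF form."""
    nu = d / 2.0 - 1.0
    return (nu * math.log(kappa)
            - (d / 2.0) * math.log(2.0 * math.pi)
            - log_bessel_i(nu, kappa))


def mean_resultant_length(d, kappa):
    """A_d(kappa) = I_{d/2}(kappa) / I_{d/2-1}(kappa)."""
    return math.exp(log_bessel_i(d / 2.0, kappa) - log_bessel_i(d / 2.0 - 1.0, kappa))


def vmf_kl(mu_q, kappa_q, mu_p, kappa_p):
    """KL(q || p) between VMF(mu_q, kappa_q) and VMF(mu_p, kappa_p).

    Uses KL = log C_d(kappa_q) - log C_d(kappa_p)
              + A_d(kappa_q) * (kappa_q - kappa_p * mu_p . mu_q).
    Assumes unit-norm mean directions and kappa_q, kappa_p > 0.
    """
    d = len(mu_q)
    dot = sum(a * b for a, b in zip(mu_p, mu_q))
    return (vmf_log_normaliser(d, kappa_q)
            - vmf_log_normaliser(d, kappa_p)
            + mean_resultant_length(d, kappa_q) * (kappa_q - kappa_p * dot))
```

For $d = 3$ the pieces reduce to elementary functions, $C_3(\kappa) = \kappa / (4\pi\sinh\kappa)$ and $A_3(\kappa) = \coth\kappa - 1/\kappa$, which makes the sketch easy to verify by hand.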