Rapid and deterministic estimation of probability densities using scale-free field theories

by Justin B. Kinney
Cold Spring Harbor Laboratory

The question of how best to estimate a continuous probability density from finite data is an intriguing open problem at the interface of statistics and physics. Previous work has argued that this problem can be addressed in a natural way using methods from statistical field theory. Here I describe new results that allow this field-theoretic approach to be rapidly and deterministically computed in low dimensions, making it practical for use in day-to-day data analysis. Importantly, this approach does not impose a privileged length scale for smoothness of the inferred probability density, but rather learns a natural length scale from the data due to the tradeoff between goodness-of-fit and an Occam factor. Open source software implementing this method in one and two dimensions is provided.
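The key idea in the abstract is that no smoothness length scale is fixed in advance: the estimator trades goodness-of-fit against an Occam factor and thereby learns a natural scale from the data. The sketch below illustrates that tradeoff in a deliberately simplified form — it smooths a histogram at a range of candidate length scales and selects the scale by held-out likelihood, a crude stand-in for the paper's Bayesian Occam-factor computation. All names here (`smoothed_density`, `ell`, the grid bounds) are illustrative inventions, not the API of the paper's actual software.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: samples from a bimodal mixture of two Gaussians.
data = np.concatenate([rng.normal(-2, 0.5, 300), rng.normal(2, 1.0, 300)])
train, test = data[:400], data[400:]

grid = np.linspace(-6, 6, 400)   # evaluation grid for the density
dx = grid[1] - grid[0]

def smoothed_density(samples, ell):
    """Histogram the samples, smooth with a Gaussian kernel of width ell,
    and normalize so the result integrates to one on the grid."""
    hist, _ = np.histogram(samples, bins=len(grid), range=(grid[0], grid[-1]))
    kernel = np.exp(-0.5 * (grid - grid.mean()) ** 2 / ell ** 2)
    dens = np.convolve(hist, kernel, mode="same") + 1e-12  # floor avoids log(0)
    return dens / (dens.sum() * dx)

def held_out_log_likelihood(dens, samples):
    """Log-likelihood of held-out samples under a gridded density."""
    idx = np.clip(np.searchsorted(grid, samples), 0, len(grid) - 1)
    return np.log(dens[idx]).sum()

# Scan candidate length scales: too small overfits the histogram noise,
# too large oversmooths the two modes into one.
scales = np.geomspace(0.05, 5.0, 30)
scores = [held_out_log_likelihood(smoothed_density(train, l), test)
          for l in scales]
best = scales[int(np.argmax(scores))]
print(f"selected length scale: {best:.2f}")
```

In the paper's method this model selection is done deterministically from the full posterior rather than by data splitting, but the qualitative behavior is the same: the preferred length scale emerges from the data instead of being imposed by the analyst.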



