Parameter estimation for Gibbs distributions
We consider Gibbs distributions, which are families of probability distributions over a discrete space Ω with probability mass function of the form μ^Ω_β(ω) ∝ e^{β H(ω)} for β in an interval [β_min, β_max] and H(ω) ∈ {0} ∪ [1, n]. The partition function is the normalization factor Z(β) = ∑_{ω∈Ω} e^{β H(ω)}. Two important parameters of these distributions are the partition ratio q = log(Z(β_max)/Z(β_min)) and the counts c_x = |H^{-1}(x)|. These are correlated with system parameters in a number of physical applications and sampling algorithms. Our first main result is an algorithm to estimate the counts c_x using roughly Õ(q/ε²) samples for general Gibbs distributions and Õ(n²/ε²) samples for integer-valued distributions (ignoring some second-order terms and parameters), and we show this is optimal up to logarithmic factors. We illustrate this with improved algorithms for counting connected subgraphs and perfect matchings in a graph. A key subroutine we develop is to estimate the partition function Z; specifically, we generate a data structure capable of estimating Z(β) for all values β, without further samples. Constructing the data structure requires Õ(q/ε²) samples for general Gibbs distributions and Õ(n²/ε²) samples for integer-valued distributions. This improves over a prior algorithm of Kolmogorov (2018), which computes the single point estimate Z(β_max) using Õ(q/ε²) samples. We show matching lower bounds, demonstrating that this complexity is optimal as a function of n and q up to logarithmic terms.
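To make the estimated quantities concrete, here is a minimal brute-force sketch in Python of the partition function Z(β), the partition ratio q, and the counts c_x for a toy integer-valued Gibbs distribution (H(ω) = |ω| over subsets of a small ground set). All names (`ground_set`, `H`, `beta_min`, `beta_max`) are illustrative assumptions; this enumeration is not the paper's sampling-based estimation algorithm, only a depiction of the target quantities.

```python
import math
from collections import Counter
from itertools import combinations

# Toy integer-valued Gibbs distribution (assumption, for illustration only):
# Omega = all subsets of a small ground set, H(omega) = |omega|,
# so c_x = |H^{-1}(x)| is a binomial coefficient.
n = 6
ground_set = range(n)

def H(omega):
    return len(omega)  # Hamiltonian value of a configuration

omegas = [frozenset(c) for r in range(n + 1) for c in combinations(ground_set, r)]

def Z(beta):
    # Partition function Z(beta) = sum over omega of exp(beta * H(omega))
    return sum(math.exp(beta * H(w)) for w in omegas)

beta_min, beta_max = -2.0, 2.0
q = math.log(Z(beta_max) / Z(beta_min))   # partition ratio q = log(Z(beta_max)/Z(beta_min))

counts = Counter(H(w) for w in omegas)    # counts c_x = |H^{-1}(x)|

print(f"q = {q:.3f}")
print("counts c_x:", dict(sorted(counts.items())))
```

The paper's contribution is to approximate these quantities to relative error ε from samples of μ^Ω_β alone, without enumerating Ω as this sketch does.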