1 Background and setup
A binary pairwise Markov random field (MRF) over $n$ variables $x \in \{0,1\}^n$ models a probability distribution $p_{\tilde{A}}(x) \propto \exp(x^\top \tilde{A} x)$. The non-diagonal entries of the matrix $\tilde{A}$ encode pairwise potentials between variables, while its diagonal entries encode unary potentials. The exponentiated linear term $x^\top \tilde{A} x$ is the negative energy, or simply the score, of the MRF. A restricted Boltzmann machine (RBM) is a particular MRF whose variables are split into two classes, visible and hidden, and in which intra-class pairwise potentials are disallowed.
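As a concrete illustration of these definitions, the following minimal sketch evaluates the score of a tiny MRF and builds the block-structured matrix of an RBM. It assumes the score is the quadratic form $x^\top \tilde{A} x$ with $x \in \{0,1\}^n$. The function names (`score`, `distribution`, `rbm_matrix`) and the parameters `W`, `b_vis`, `b_hid` are hypothetical, introduced only for this example.

```python
import itertools
import numpy as np

def score(A_tilde, x):
    """Score (negative energy) x^T A_tilde x of an assignment x in {0,1}^n (assumed form)."""
    x = np.asarray(x, dtype=float)
    return float(x @ A_tilde @ x)

def distribution(A_tilde):
    """Brute-force p(x) proportional to exp(score); feasible only for tiny n."""
    n = A_tilde.shape[0]
    states = list(itertools.product([0, 1], repeat=n))
    scores = np.array([score(A_tilde, x) for x in states])
    weights = np.exp(scores - scores.max())  # subtract the max for numerical stability
    return states, weights / weights.sum()

def rbm_matrix(W, b_vis, b_hid):
    """Symmetric matrix of an RBM: zero intra-class blocks, W/2 in the cross blocks.

    With x = (v, h) in {0,1}^n this gives x^T A x = v^T W h + b_vis.v + b_hid.h.
    """
    n_v, n_h = W.shape
    A = np.zeros((n_v + n_h, n_v + n_h))
    A[:n_v, n_v:] = W / 2.0          # visible-hidden couplings
    A[n_v:, :n_v] = W.T / 2.0        # mirrored block keeps A symmetric
    A[np.arange(n_v), np.arange(n_v)] = b_vis                       # visible unary terms
    A[np.arange(n_v, n_v + n_h), np.arange(n_v, n_v + n_h)] = b_hid  # hidden unary terms
    return A
```

For instance, with one visible and two hidden units, `rbm_matrix(np.array([[1.0, -2.0]]), [0.5], [0.1, 0.2])` has a zero entry coupling the two hidden units, reflecting the absence of intra-class potentials.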
We write $\mathbb{S}^n$ for the set of symmetric $n \times n$ real matrices, and $S_k$ to denote the unit sphere $\{x \in \mathbb{R}^k : \|x\|_2 = 1\}$. All vectors are columns unless stated otherwise.
1.1 Integer quadratic programming
Finding the maximum a posteriori (MAP) value of a discrete pairwise MRF can be cast as an integer quadratic program (IQP) given by
$$\max_{x \in \{-1,1\}^n} \; x^\top A x.$$
Note that we have the domain constraint $x \in \{-1,1\}^n$ rather than $x \in \{0,1\}^n$. We relate the two in Section LABEL:sec:hypercubes.
Solving the IQP is NP-hard in general; in fact, the MAX-CUT problem is a special case. Even the cases where $A$ encodes an RBM are NP-hard in general (alon2006approximating). We can trade off exactness for efficiency and instead optimize a relaxed (indefinite) quadratic program (QP):
$$\max_{x \in [-1,1]^n} \; x^\top A x.$$
Such a relaxation is tight for positive semidefinite (PSD) $A$: global optima of the QP and the IQP have equal objective values. [Footnote 1: We can always ensure tightness when $A$ is not PSD, as in ravikumar2006quadratic.] Therefore the QP is just as hard in general as the IQP, even though it affords optimization by gradient-based methods in place of combinatorial search.
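The tightness claim can be checked numerically on small instances. The sketch below assumes the IQP maximizes $x^\top A x$ over $\{-1,1\}^n$ and the QP maximizes the same objective over the box $[-1,1]^n$. It uses a diagonal shift $A + cI$ as one standard way to obtain a PSD matrix without changing the IQP maximizer (on the vertices, $x_i^2 = 1$, so the shift adds the constant $cn$). Whether this matches the construction in the cited work is an assumption, and all function names are hypothetical.

```python
import itertools
import numpy as np

def iqp_brute_force(A):
    """Maximize x^T A x over x in {-1,1}^n by enumeration (tiny n only)."""
    n = A.shape[0]
    best_x, best_val = None, -np.inf
    for signs in itertools.product([-1.0, 1.0], repeat=n):
        x = np.array(signs)
        val = x @ A @ x
        if val > best_val:
            best_x, best_val = x, val
    return best_x, float(best_val)

def shift_to_psd(A):
    """Return (A + c*I, c) with c >= 0 chosen so that the shifted matrix is PSD.

    On {-1,1}^n the shift adds the constant c*n to every objective value,
    so the set of IQP maximizers is unchanged.
    """
    lam_min = np.linalg.eigvalsh(A).min()
    c = max(0.0, -lam_min)
    return A + c * np.eye(A.shape[0]), c

def round_to_vertex(A, x):
    """Coordinate-wise rounding of x in [-1,1]^n to a vertex.

    When diag(A) >= 0 (true for PSD A), the objective is convex in each
    coordinate, so moving a coordinate to its better endpoint never
    decreases x^T A x.
    """
    x = np.array(x, dtype=float)
    for i in range(len(x)):
        candidates = []
        for s in (-1.0, 1.0):
            y = x.copy()
            y[i] = s
            candidates.append((float(y @ A @ y), s))
        x[i] = max(candidates)[1]  # keep the endpoint with the larger objective
    return x
```

Under these assumptions, any feasible point of the QP with a PSD matrix can be rounded to a vertex with an objective at least as large, which is why the two optimal values coincide. Note that the rounding step does not by itself find the global optimum; it only shows that the relaxation loses nothing at the optimum.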
The following semidefinite program (SDP) is a looser relaxation of the IQP, obtained by extending the variables to a higher ambient dimension: