Statistical Analysis of Loopy Belief Propagation in Random Fields

03/16/2015 ∙ by Muneki Yasuda, et al.

Loopy belief propagation (LBP), which is equivalent to the Bethe approximation in statistical mechanics, is a message-passing-type inference method that is widely used to analyze systems based on Markov random fields (MRFs). In this paper, we propose a message-passing-type method to analytically evaluate the quenched average of LBP in random fields by using the replica cluster variation method. The proposed analytical method is applicable to general pair-wise MRFs with random fields whose distributions differ from each other and can give the quenched averages of the Bethe free energies over random fields, which are consistent with numerical results. The order of its computational cost is equivalent to that of standard LBP. In the latter part of this paper, we describe the application of the proposed method to Bayesian image restoration, in which we observed that our theoretical results are in good agreement with the numerical results for natural images.

I Introduction

Loopy belief propagation (LBP) Pearl (1988), which is a message-passing-type inference method, is widely used in various fields, including computer science, as a powerful tool for statistical procedures in systems based on Markov random fields (MRFs) Opper and Saad (Eds.) (2001); Mézard and Montanari (2009). LBP is equivalent to the Bethe approximation in statistical mechanics Kabashima and Saad (1998); Yedidia et al. (2005) and is also known as the cavity method. Analyzing the statistical behavior of LBP is important for understanding it. In this paper, we focus on LBP in pair-wise MRFs with random fields and present a statistical analysis of it, namely, an analysis of the quenched average of LBP over random fields. Pair-wise MRFs in random fields are an important research topic in statistical mechanics Schneider and Pytte (1977); Krzakala et al. (2010). As described below, a statistical analysis of LBP in random fields is also important for the field of Bayesian signal processing in computer science.

Bayesian image restoration Geman and Geman (1984), in which images degraded by noise are restored using the Bayesian framework, is an important generic technique for various types of signal processing. Suppose that there is an original image and that the original image is degraded through a specific noise process. We observe only the degraded image as the input, and we want to produce the restored image as the output. From the statistical mechanics point of view, the standard framework of Bayesian image restoration corresponds to the framework of a two-dimensional ferromagnetic spin model in random fields Nishimori (2001); Tanaka (2002). In this correspondence, the input image, namely, the degraded image, is regarded as the random fields in the Bayesian image restoration system.

Since the model used in the Bayesian image restoration system is an intractable pair-wise MRF, LBP is often applied to implement it. Hence, evaluating the statistical performance of the implemented image restoration system requires evaluating the quenched average of LBP over the random fields, namely, over the input images. For this purpose, the authors previously proposed an analytical evaluation method for Ising systems Kataoka et al. (2010); Tanaka et al. (2010). In that method, the evaluation of the quenched average of LBP is reduced to solving simultaneous integral equations with respect to the distributions of the messages. However, the method is not very practical, because the computational cost of solving the integral equations is considerable and its approximation accuracy is poor. Furthermore, the method cannot evaluate the quenched average of the free energy and is formulated only for Ising systems.

In this paper, we propose a new analytical method for evaluating the quenched average of LBP over random fields based on the idea of the replica cluster variation method (RCVM) Rizzo et al. (2010); Lage-Castellanos et al. (2013). The presented method allows the quenched average of the Bethe free energies over random fields in general pair-wise MRFs to be evaluated, unlike the previous method.

The remainder of this paper is organized as follows. A brief explanation of LBP is given in section II. Section III constitutes the main part of the paper. The proposed method is shown in section III.3, and some numerical results for checking its validity are shown in section III.4. In section III.5, we show a case that is exactly solvable by the present method. In section IV, we explain the framework of Bayesian image restoration and compute the statistical performance of the Bayesian image restoration system using the proposed method. Finally, section V closes the paper with concluding remarks.

II Loopy Belief Propagation in Random Fields

II.1 Model Definition

Let us consider an undirected graph consisting of vertices and some edges, where is the set of vertices, and is the set of edges between a pair of vertices, where denotes the undirected edge between vertices and . On the graph, with the discrete random variables , let us define the pair-wise MRF expressed by

(1)

where

is the Hamiltonian of the MRF. Here, is a specific function of the variable and the random field on vertex , and is a specific function on edge . The notations, and , are the partition function and inverse temperature, which takes a positive value, respectively. In this paper, only random fields are treated as the quenched parameters.

II.2 Loopy Belief Propagation

It is known that LBP is derived from the minimum condition of the variational Bethe free energy Yedidia et al. (2005). In this section, we give a brief explanation of the derivation of LBP according to the cluster variation method (CVM) Kikuchi (1951); Pelizzola (2005). The free energy of the MRF in equation (1) is defined by

(2)
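To make the definitions of the model and its free energy concrete, the following minimal sketch enumerates all configurations of a small pair-wise MRF. The specific energy used here (a field term h_i x_i, a uniform coupling J x_i x_j, and the overall sign convention), as well as all function names, are assumptions chosen for illustration and are not taken from the paper. The sketch checks the standard identity that the average energy plus the negative entropy divided by the inverse temperature, evaluated at the Gibbs distribution, equals -(1/beta) ln Z.

```python
import itertools
import math


def hamiltonian(x, h, edges, J):
    """Assumed energy: -sum_i h_i x_i - J * sum_{(i,j)} x_i x_j."""
    field = sum(h[i] * x[i] for i in range(len(x)))
    pair = sum(J * x[i] * x[j] for i, j in edges)
    return -field - pair


def exact_free_energy(h, edges, J, beta, states=(-1, 1)):
    """Return (F, Z) with F = -(1/beta) ln Z, by exhaustive enumeration."""
    n = len(h)
    Z = sum(math.exp(-beta * hamiltonian(x, h, edges, J))
            for x in itertools.product(states, repeat=n))
    return -math.log(Z) / beta, Z


def gibbs_functional(h, edges, J, beta, states=(-1, 1)):
    """Average energy plus (1/beta) * sum_x P ln P under the Gibbs distribution."""
    n = len(h)
    _, Z = exact_free_energy(h, edges, J, beta, states)
    total = 0.0
    for x in itertools.product(states, repeat=n):
        energy = hamiltonian(x, h, edges, J)
        p = math.exp(-beta * energy) / Z
        total += p * energy + p * math.log(p) / beta
    return total


# Square graph with four vertices, as in figure 1(a); field values are arbitrary.
edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
h = [0.3, -0.1, 0.5, 0.0]
F, Z = exact_free_energy(h, edges, J=0.5, beta=1.0)
G = gibbs_functional(h, edges, J=0.5, beta=1.0)
```

Exhaustive enumeration scales exponentially with the number of vertices, which is why approximations such as the Bethe approximation are needed for graphs of realistic size.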

In the Bethe approximation in the CVM, we approximate the MRF by

(3)

where and are the one-vertex and two-vertex marginal distributions (or the beliefs) of the MRF. This approximation corresponds to the cluster decomposition shown in figure 1. The right-hand side of equation (3) is the product of the marginal distributions of the clusters divided by the product of the marginal distributions of the double-counted clusters. By applying this approximation to in the logarithmic function of the last term in equation (2), we obtain the variational Bethe free energy expressed by

(4)

where

and

are the one-vertex and two-vertex negative entropies.

Figure 1: (a) Square graph with four vertices. (b) Cluster decomposition of the Bethe approximation of the CVM for (a). The vertices and two connected vertices are selected as the clusters.

LBP is obtained by the variational minimization of the variational Bethe free energy with respect to the beliefs. By minimizing the variational Bethe free energy under the normalizing constraints, , and the marginalizing constraints, and , we obtain the message-passing equation of LBP:

(5)

where is the message (or the effective field) from vertex to vertex . Using the messages satisfying the message-passing equation, we can compute the beliefs that minimize the variational Bethe free energy as

(6)
(7)

where is the set of vertices connected to vertex : . The Bethe free energy of the MRF is the minimum of the variational Bethe free energy,

and is obtained by substituting the beliefs obtained by equations (6) and (7) into the variational Bethe free energy in equation (4). In LBP, the beliefs obtained by equations (6) and (7) are regarded as the Bethe approximations of the true marginal distributions of the MRF. When an undirected graph , on which the MRF is defined, has no loops, the Bethe free energy and the beliefs are equivalent to the true free energy and the true marginal distributions of the MRF, respectively.
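The message-passing procedure described above can be sketched as follows. This is a minimal sum-product implementation under assumed conventions: `phi(i, xi)` and `psi(xi, xj)` are hypothetical names for the one-vertex and (symmetric) two-vertex terms, the weight of a configuration is taken to be exp(beta * (sum of phi + sum of psi)), and messages are updated in parallel. It is not the authors' code. The example runs on a loop-free chain, where the beliefs should coincide with the true marginals.

```python
import itertools
import math


def run_lbp(n, edges, phi, psi, states, beta=1.0, iters=200, tol=1e-12):
    """Sum-product LBP; returns the one-vertex beliefs as a list of dicts."""
    nbrs = {i: [] for i in range(n)}
    for i, j in edges:
        nbrs[i].append(j)
        nbrs[j].append(i)
    # msg[(j, i)][xi] is the message from vertex j to vertex i.
    msg = {(j, i): {s: 1.0 / len(states) for s in states}
           for i in range(n) for j in nbrs[i]}
    for _ in range(iters):
        new = {}
        for (j, i) in msg:
            out = {}
            for xi in states:
                total = 0.0
                for xj in states:
                    w = math.exp(beta * (phi(j, xj) + psi(xj, xi)))
                    for k in nbrs[j]:
                        if k != i:  # exclude the message coming from the target
                            w *= msg[(k, j)][xj]
                    total += w
                out[xi] = total
            z = sum(out.values())
            new[(j, i)] = {s: v / z for s, v in out.items()}
        diff = max(abs(new[key][s] - msg[key][s]) for key in msg for s in states)
        msg = new
        if diff < tol:
            break
    beliefs = []
    for i in range(n):
        b = {xi: math.exp(beta * phi(i, xi)) for xi in states}
        for j in nbrs[i]:
            for xi in states:
                b[xi] *= msg[(j, i)][xi]
        z = sum(b.values())
        beliefs.append({s: v / z for s, v in b.items()})
    return beliefs


def exact_marginals(n, edges, phi, psi, states, beta=1.0):
    """True one-vertex marginals by exhaustive enumeration (small n only)."""
    marg = [{s: 0.0 for s in states} for _ in range(n)]
    Z = 0.0
    for x in itertools.product(states, repeat=n):
        w = math.exp(beta * (sum(phi(i, x[i]) for i in range(n))
                             + sum(psi(x[i], x[j]) for i, j in edges)))
        Z += w
        for i in range(n):
            marg[i][x[i]] += w
    return [{s: v / Z for s, v in m.items()} for m in marg]


fields = [0.3, -0.2, 0.1, 0.4]  # arbitrary illustrative field values


def phi(i, xi):
    return fields[i] * xi


def psi(xi, xj):
    return 0.7 * xi * xj


chain = [(0, 1), (1, 2), (2, 3)]  # a graph with no loops
lbp_beliefs = run_lbp(4, chain, phi, psi, states=(-1, 1))
true_marginals = exact_marginals(4, chain, phi, psi, states=(-1, 1))
```

On a graph with loops the same routine can be run unchanged, but the beliefs are then only approximations and convergence is not guaranteed.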

The main proposal presented in this paper is a method for evaluating the quenched average of the Bethe free energy over the random fields:

(8)

where the notation represents the average value over the random fields, and is the distribution of the field on vertex , where these distributions can vary by vertex in general.

III Proposed Method

According to equation (8), in principle, we have to perform the averaging operation after constructing the Bethe free energy to obtain (see figure 2(a)). However, it is not straightforward to directly integrate the Bethe free energy. Thus, we adopt another strategy.

In this paper, we propose an approximate method based on the idea of the RCVM Rizzo et al. (2010); Lage-Castellanos et al. (2013). Figure 2(b) shows the procedure of our method.

Figure 2: Procedures of obtaining the quenched average of the Bethe free energy. (a) Principled procedure. (b) Procedure of the proposed method based on the RCVM.

In the method, we first take the average of the free energy using the replica method and CVM, namely, the RCVM (section III.1), and then, we apply the Bethe approximation to the resulting form of the RCVM (section III.2). If the exchange of the order of the two operations, the Bethe approximation and the quenched averaging operation, is allowed, we can expect to obtain a good approximation of the quenched average of the Bethe free energy.

III.1 Replica Cluster Variation Method

First, we obtain the quenched average of the true free energy of the MRF in equation (1), that is,

In the context of the replica method Mezard et al. (1987); Nishimori (2001), we have

(9)

where

By assuming that is a natural number, we obtain

where , , and the function is defined by

We regard as the partition function of the -replicated system and define the -replicated free energy as

(10)

where

is the Gibbs distribution of the -replicated system, and . The energy function is defined by

and is just the interaction term of the original system. The distributions, and , are the marginal distributions of the distribution . The factor graph representation of the -replicated system is shown in figure 3(a).
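The two ingredients of the replica argument, replicating the system a natural number of times and then continuing the replica number to zero, can be checked numerically on a toy example. The sketch below assumes a single ±1 variable in a Gaussian random field, so that the partition function is 2 cosh(beta h); this toy model and all names are illustrative assumptions. It uses one common form of the replica identity, (1/x) ln [Z^x] → [ln Z] as x → 0, which may be written differently from equation (9) in the paper.

```python
import itertools
import math
import random


def partition(h, beta):
    """Partition function of one +/-1 variable in a field h (toy model)."""
    return sum(math.exp(beta * h * s) for s in (-1, 1))


def replicated_partition(h, beta, n_rep):
    """Sum over n_rep replicas that share the same field h."""
    return sum(math.exp(beta * h * sum(config))
               for config in itertools.product((-1, 1), repeat=n_rep))


rng = random.Random(1)
fields = [rng.gauss(0.0, 1.0) for _ in range(5000)]
beta = 1.0

# Direct sample average of ln Z over the random fields.
avg_log_z = sum(math.log(partition(h, beta)) for h in fields) / len(fields)


def replica_estimate(x):
    """(1/x) * ln of the sample average of Z**x."""
    avg_zx = sum(partition(h, beta) ** x for h in fields) / len(fields)
    return math.log(avg_zx) / x


# The estimate approaches avg_log_z as the replica number x shrinks.
estimates = {x: replica_estimate(x) for x in (1.0, 0.1, 0.001)}
```

For a natural number of replicas, `replicated_partition` coincides with `partition(h, beta) ** n_rep`, which is the step that allows the average over the field to be taken before the sum over configurations.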

Figure 3: (a) Factor graph representation of when . (b) Cluster decomposition for (a). In the decomposition, the three different types of clusters are employed: the clusters consist of , of , and of .

In accordance with the cluster decomposition based on the CVM shown in figure 3(b), by using the marginal distributions, , together with the one-variable marginal distributions of , , we approximate the Gibbs distribution of the -replicated system as

(11)

As in equation (3), is approximated by the product of the marginal distributions of the clusters divided by the product of the marginal distributions of the double-counted clusters. By applying this approximation to in the logarithmic function in the last term in equation (10), we obtain the expression of the variational free energy as

(12)

where the functionals, and , are defined as

(13)
(14)

In the context of the CVM, the -replicated free energy in equation (10) is approximated by the minimum of the variational free energy in equation (12) with respect to the marginal distributions , i.e., . Note that, at the minimum point, the normalization constraints for and the marginal constraints,

(15)

and

(16)

should hold.

In order to minimize the variational free energy with respect to , by using the Lagrange multipliers, we perform the variational minimization of

with respect to . From the result of this minimization, we obtain

(17)

The Lagrange multipliers, , are determined such that they satisfy equation (15). By substituting equation (17) into equation (13), while noting the marginal constraints in equation (15), we obtain the partially minimized variational free energy,

as

(18)

where the notation “extr” denotes the extremum with respect to the assigned parameters.

III.2 Bethe Approximation and Replica Symmetric Ansatz

The functional can be interpreted as the variational free energy for the interaction term of the original system (see equation (14)). Since this variational free energy is intractable in general, we approximate it by the Bethe approximation. As in equation (3), we approximate by

(19)

where the distributions, and , are the marginal distributions of , so that the marginal constraints,

and

are satisfied. By applying equation (19) to in the logarithmic function in the last term of equation (14), we obtain the Bethe approximation of as

(20)

By substituting the Bethe approximation in equation (20) into equation (18), we obtain the Bethe approximation of equation (18): . After the Bethe approximation, we make the replica symmetric (RS) assumption Mezard et al. (1987); Nishimori (2001) in , and subsequently, by taking the limit as , we finally reach the variational free energy expressed as

(21)

The detailed derivation of this variational free energy is shown in appendix A. We expect that the minimum of this variational free energy is the approximation of the quenched average of the Bethe free energy in equation (8): , where

(22)

At the minimum of the variational free energy, , the normalization constraints,

(23)

and the marginal constraints,

(24)

and

(25)

should hold.

III.3 Message-passing Equation

In this section, we show the message-passing equation for minimizing the variational free energy in equation (21) obtained in the previous section.

The message-passing equation for our method is obtained as

(26)
(27)
(28)

The quantity is the message from vertex to vertex , and the Lagrange multipliers in equation (28) satisfy the extremal conditions in the first term in equation (21). By using the messages and , the two-vertex marginal distributions are obtained as

(29)

The detailed derivation of equations (26)–(29) is shown in appendix B. The order of the computational cost of the proposed message-passing equation is equivalent to that of standard LBP in section II.2.

When the distributions of the random fields are Dirac delta functions, , namely, the fields are not the quenched parameters but the fixed parameters, the method presented in equations (26)–(29) is reduced to the standard LBP in equations (5)–(7). In this case, in equation (22) is equivalent to the Bethe free energy .

After numerically solving the simultaneous equations in equations (26)–(29), by substituting the solutions, , , and , into equation (21), we obtain the minimum value of the variational free energy, , and regard it as the approximation of in equation (8).

For the moment, we suppose that the function can be divided as . The variations in the quenched average of the Bethe free energy in equation (8) with respect to and are

(30)

and

(31)

respectively, which are the quenched average of the beliefs obtained from LBP. On the other hand, the variations in with respect to and are obtained as

(32)

and

(33)

respectively. By comparing equations (30) and (31) with equations (32) and (33), it can be expected that, if is a good approximation of , the marginal distributions, and , are also good approximations of the quenched averages of the beliefs, and , respectively.

III.4 Numerical Experiment

In this section, we describe the evaluation of the validity of our method by using numerical experiments. In the experiments, we used the model expressed as

(34)

which is defined on a certain graph, where takes real values in the interval as , i.e., when , when , and so on. The fields are i.i.d. random fields drawn from the Gaussian distribution with a mean of zero and a variance of , .

We compared the free energy per variable obtained by our method, , with the quenched average of the Bethe free energy (per variable) shown in equation (8), which was obtained by numerically averaging the Bethe free energy in equation (4) over the random fields, and we compared the behaviors of the quenched average of the magnetizations obtained by the two different methods. The magnetizations obtained from LBP and our method are given by and , respectively.
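The numerical baseline described above, averaging a free energy over many realizations of the random fields and reporting the standard deviation, can be sketched as follows. To stay self-contained, the sketch uses a loop-free chain, on which the Bethe free energy equals the true free energy (section II.2), and computes it by exhaustive enumeration instead of by LBP. The energy form, the ±1 states, the parameter values, and the function names are illustrative assumptions and are not the settings used in the paper.

```python
import itertools
import math
import random
import statistics


def free_energy(h, edges, J, beta, states=(-1, 1)):
    """Exact free energy -(1/beta) ln Z for an assumed field-plus-coupling energy."""
    n = len(h)
    Z = 0.0
    for x in itertools.product(states, repeat=n):
        energy = (-sum(h[i] * x[i] for i in range(n))
                  - sum(J * x[i] * x[j] for i, j in edges))
        Z += math.exp(-beta * energy)
    return -math.log(Z) / beta


def quenched_average(edges, n, J, beta, sigma, samples, seed=0):
    """Sample mean and standard deviation of the free energy per variable."""
    rng = random.Random(seed)
    values = []
    for _ in range(samples):
        # One realization of i.i.d. zero-mean Gaussian fields.
        h = [rng.gauss(0.0, sigma) for _ in range(n)]
        values.append(free_energy(h, edges, J, beta) / n)
    return statistics.mean(values), statistics.stdev(values)


chain = [(0, 1), (1, 2), (2, 3)]
mean_f, std_f = quenched_average(chain, n=4, J=0.5, beta=1.0,
                                 sigma=1.0, samples=2000)
```

Setting `sigma=0.0` makes every realization identical, so the average reduces to the free energy of a system with fixed fields and the standard deviation vanishes.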

III.4.1 Square Lattice

We show the results when the model in equation (34) is defined on a graph of an square lattice with the free boundary condition and when all of the interactions are uniform, . Figures 4–6 show the plots for , , and . “LBP” represents the results obtained by the numerically averaged Bethe free energy, and “RLBP” represents the results obtained by our method. Each plot of LBP is numerically averaged over 10000 realizations of the random fields.

Figure 4: Quenched Bethe free energies per variable for . The left panel shows the free energies versus with , and the right panel shows the free energies versus with . The error bars are the standard deviation.

Figure 5: Quenched Bethe free energies per variable for . The left panel shows the free energies versus with , and the right panel shows the free energies versus with . The error bars are the standard deviation.

Figure 6: Quenched Bethe free energies per variable for . The left panel shows the free energies versus with , and the right panel shows the free energies versus with . The error bars are the standard deviation.

In almost all cases, the results of our method are consistent with the numerically averaged Bethe free energies, as expected. However, in the cases of large , mismatches between the two methods are observed.

Figure 7 shows the plot of the quenched average of the magnetizations, and , when the model shown in equation (34) is defined on a graph of a square lattice with periodic boundary conditions and when .

Figure 7: Quenched magnetizations versus , where and .

We observe that the two methods yield qualitatively different magnetization behavior. The magnetization obtained by standard LBP continuously increases with the increase in , whereas that obtained by the proposed method drastically increases, like a first-order transition, around . This different physical picture probably causes the mismatches between the two methods in the cases of large in figures 4–6.

Our formulation allows for the approximate evaluation of the quenched average of Bethe free energy over the random fields with disordered interactions. We show the results when the model in equation (34) is defined on a graph of a square lattice with free boundary conditions and when the interactions are independently drawn from . Figure 8 shows the results of the free energies versus for .

Figure 8: Quenched Bethe free energies per variable versus for . The left panel shows the free energies when , and the right panel shows the free energies when . The error bars are the standard deviation.

Each plot obtained by LBP is numerically averaged over 100 realizations of the random fields and over 200 realizations of the interactions, and that obtained by our method is averaged over 200 realizations of the interactions. Since the error bars of our method are quite small compared to LBP, we omit them in the figure. The results obtained by our method are consistent with the numerically averaged Bethe free energies.

To see the effect of the disorder in the interactions on the behavior of the magnetization, figure 9 plots the quenched average of the magnetizations when the model shown in equation (34) is defined on a graph of a square lattice with periodic boundary conditions and when and the interactions are independently drawn from .

Figure 9: Quenched magnetizations versus for and when and .

We observe that the magnetizations obtained by our method show the first-order transition, as in figure 7. However, the values of the magnetizations after the transition are quite small compared to those in figure 7.

III.4.2 Random Regular Graph

A random regular graph (RRG) is a random graph in which the degrees of all vertices are fixed by the constant .
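One simple way to sample such a graph is the pairing (configuration) model with rejection, sketched below. The paper does not state how its graphs were generated, so this procedure, the function name, and the degree of 3 used in the example are assumptions for illustration only.

```python
import random


def random_regular_graph(n, degree, seed=0, max_tries=1000):
    """Sample a simple graph in which every vertex has the given degree.

    Each vertex gets `degree` stubs; the stubs are shuffled and paired up.
    A pairing with a self-loop or a repeated edge is rejected and redrawn.
    """
    if (n * degree) % 2 != 0 or degree >= n:
        raise ValueError("need n * degree even and degree < n")
    rng = random.Random(seed)
    for _ in range(max_tries):
        stubs = [v for v in range(n) for _ in range(degree)]
        rng.shuffle(stubs)
        edges = set()
        ok = True
        for a, b in zip(stubs[::2], stubs[1::2]):
            edge = (min(a, b), max(a, b))
            if a == b or edge in edges:
                ok = False
                break
            edges.add(edge)
        if ok:
            return sorted(edges)
    raise RuntimeError("no simple graph found; increase max_tries")


# 200 vertices as in the experiments; the degree 3 is an assumed value.
rrg_edges = random_regular_graph(200, 3)
```

Rejection becomes slow for large degrees, because the fraction of pairings without self-loops or repeated edges shrinks quickly as the degree grows; for small degrees it is usually adequate.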

Figure 10: Quenched Bethe free energies per variable for on the RRG. The left panel shows the free energies versus with , and the right panel shows the free energies versus with . The error bars are the standard deviation.

Figure 10 shows the results when the model in equation (34) is defined on an RRG with 200 vertices and and when and . Each plot obtained by LBP is numerically averaged over 100 realizations of the random fields and over 100 realizations of the graph structure, and that obtained by our method is averaged over 100 realizations of the graph structure. Since the error bars of our method are quite small compared to LBP, we omit them in the figure, as in figure 8. The behaviors of the quenched magnetizations, and , obtained by the two methods in this case are shown in figure 11.

Figure 11: Quenched magnetizations versus on the RRG, where and .

The behaviors of the quenched magnetizations in this figure are similar to those shown in figure 7.

LBP is asymptotically justified on an RRG Krzakala et al. (2010), because an RRG is quite sparse. Therefore, we can expect that the results obtained by LBP are close to the exact solutions. Except for the RS assumption, our method consists of two approximations: the approximation in equation (11) and the approximation in equation (19). Since the latter approximation is the Bethe approximation, it can be justified on a sparse graph such as an RRG. This suggests that the mismatch between the two methods in the right panel in figure 10 is mainly caused by the first approximation, and that the first approximation produces the metastable state that causes the first-order transition in figure 11.

As in section III.4.1, we again consider the case with disordered interactions. Figure 12 shows the plots of the quenched Bethe free energies versus for when the model in equation (34) is defined on an RRG with 200 vertices and and when the interactions are independently drawn from . Each plot in the figure is obtained in the same manner as the case in figure 8.

Figure 12: Quenched Bethe free energies per variable versus for on the RRG. The left panel shows the free energies when , and the right panel shows the free energies when . The error bars are the standard deviation.

As is the case in figure 8, the results of our method are consistent with the numerically averaged Bethe free energies. Figure 13 shows the plot of the quenched average of the magnetizations when the model in equation (34) is defined on an RRG with 200 vertices and and when and are independently drawn from .

Figure 13: Quenched magnetizations versus on the RRG for and when and .

It can be observed that the transition of the magnetization obtained by our method becomes nearly continuous as the magnitude of the disorder increases. This suggests that the disorder in the interactions breaks the metastable state of the quenched Bethe free energy obtained by our method in the case of an RRG.

III.5 Exactly Solvable Case – Ferromagnetic Mean-field Model in Random Fields

In this section, we consider the ferromagnetic mean-field model in random fields expressed as Schneider and Pytte (1977)