Bayesian image modeling based on Markov random fields (MRFs) and loopy belief propagation (LBP) is one of the interesting research topics in statistical-mechanical informatics [2, 3, 4, 5, 6]. Its advantages are twofold. First, Bayesian analysis provides useful statistical models for probabilistic information processing of massive and realistic datasets. Second, statistical-mechanical informatics provides powerful algorithms based on advanced mean field methods, including LBP, which is equivalent to the Bethe approximation in statistical mechanics [6, 7, 8, 9, 10, 11].
Because MRFs usually include hyperparameters that correspond to the temperature and interactions in classical spin systems, these hyperparameters can be determined by maximizing the marginal likelihood in Bayesian modeling. The marginal likelihood is constructed from the probability of the observed data given the hyperparameters and is expressed in terms of the free energies of the prior and posterior probabilities. Practical algorithms can often be constructed on the basis of the expectation-maximization (EM) algorithm. From the statistical-mechanical standpoint, EM algorithms used in Bayesian image analysis have been investigated by applying LBP to some classical spin systems [13, 14]. Note that, in the EM algorithm, the differentiability of the marginal likelihood with respect to the hyperparameters is very important. The classical spin systems in Refs. [13, 14] have only second order phase transitions, and their marginal likelihoods are always differentiable with respect to the hyperparameters.
Image segmentation, one of the primary but challenging topics in image processing, corresponds to labeling the pixels in terms of the three chromatic values at each pixel of the observed image. Because image segmentation is usually defined on a finite square lattice of pixels, the MRF can be formulated so that a labeling has a high probability when the number of neighbouring pairs of pixels with the same label is large. Such MRF modeling can be realized by considering ferromagnetic Potts models on the square lattice in statistical mechanics. The state at each pixel corresponds to the label used in clustering the observed data. Bayesian modeling for image segmentation typically provides a posterior probabilistic model of the labeling when a natural image is given. It is often reduced to a q-state Potts model () with spatially non-uniform external fields and uniform nearest-neighbour interactions.
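As a minimal sketch of the idea that a labeling is more probable when many neighbouring pixels share a label, the following counts equal nearest-neighbour pairs on a square lattice and turns the count into an unnormalized log-weight. The function names and the parametrization (a weight of `alpha` per equal pair) are assumptions made for illustration and need not match the paper's notation.

```python
import numpy as np

def same_label_pairs(labels: np.ndarray) -> int:
    """Count nearest-neighbour pairs (horizontal and vertical) with equal labels."""
    horizontal = np.sum(labels[:, 1:] == labels[:, :-1])
    vertical = np.sum(labels[1:, :] == labels[:-1, :])
    return int(horizontal + vertical)

def potts_log_weight(labels: np.ndarray, alpha: float) -> float:
    """Unnormalized log-probability: alpha times the number of equal pairs.

    Assumed ferromagnetic convention: alpha > 0 favours smooth labelings.
    """
    return alpha * same_label_pairs(labels)

# A constant labeling maximizes the count; a checkerboard minimizes it.
uniform = np.zeros((3, 3), dtype=int)
checker = np.indices((3, 3)).sum(axis=0) % 2
print(same_label_pairs(uniform), same_label_pairs(checker))
```

A 3 x 3 lattice has 12 nearest-neighbour pairs, so the constant labeling attains the maximum count and the checkerboard attains zero.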
Various useful probabilistic inference algorithms for image segmentation have been proposed [16, 17, 18, 19, 20, 21, 22, 23, 24, 25] within the maximum likelihood framework for MRFs. In particular, the inference algorithms in Refs. [17, 20, 21, 22, 23, 24, 25] are based on advanced mean field methods, including LBP, and their MRFs for image segmentation use q-state Potts models as prior probabilities. Carlucci and Inoue adopted q-state Potts models with infinite-range interactions as prior probability distributions, and they investigated the statistical performance of Bayesian image modeling by using the replica method of spin glass theory. As shown in Fig.1, it is known that, for the q-state Potts model with , the approximate free energies of the advanced mean field methods are continuous functions of the temperature but have non-differentiable points. Such singularities are often referred to as first order phase transitions in statistical mechanics. Applying LBP often leads to phase transitions for systems that include cycles in their graphical representations, even if they are finite-size systems. In Bayesian image restoration, the approximate marginal likelihood in LBP for a three-state Potts prior has been computed for some artificial images, and the above singularities have been shown to appear in the approximate marginal likelihood. Recently, an efficient iterative inference algorithm has been proposed that estimates hyperparameters from the standpoint of a conditional maximization of entropy for Bayesian image restoration, by means of a generalized sparse MRF prior and LBP. The scheme works well for prior probabilities with a first order phase transition. In addition, this scheme is equivalent to the EM algorithm for maximization of the marginal likelihood when the derivative of the marginal likelihood with respect to the hyperparameters is always a continuous function and the prior probability has second order phase transitions or no phase transitions.
In the present paper, we explain, for Bayesian image segmentation, how the first order phase transitions in LBP for q-state Potts models influence EM algorithms in the maximum likelihood framework and how the inference algorithm in Ref. works from the standpoint of statistical-mechanical informatics. In §2, we construct a Potts prior probability distribution for Bayesian image segmentation modeling from the standpoint of the constrained maximization of entropy. In §3, we propose a novel inference scheme, based on a conditional maximum likelihood framework, for estimating hyperparameters from an observed natural color image in terms of our Potts prior distribution and LBP. In §4, we survey the inference procedure for estimating hyperparameters using the conventional maximum likelihood framework and give numerical experiments in the frameworks with LBP. We also clarify how the first order phase transition appears in the conventional scheme with LBP. In §5, we give some concluding remarks.
2 Potts Prior for Probabilistic Image Segmentation
We consider an image as defined on a set of pixels arranged on a square grid graph , where is the set of all the pixels and is defined by . There is a link between every nearest-neighbour pair of pixels and , and denotes the set of all the nearest-neighbour pairs of pixels . The total numbers of elements in the sets and are denoted by and , respectively. The goal of image segmentation is to classify the pixels into several regions. Each pixel will be assigned one of the integers as its region label. In the present section, we give the prior probability distribution of labeled configurations on the square grid graph .
The label at each pixel is regarded as a random variable, denoted by . Then the random field of labels is represented by , and every labeled configuration is denoted by . The prior probability of a labeled configuration is assumed to be specified by a constant as
where and . By introducing the Lagrange multipliers for the constraints, we reduce the prior probability to
where is a normalization constant. The interaction is a function of and should be determined to satisfy the following constraint condition
In order to calculate the estimate of the hyperparameter , we have to solve the following equation:
In the above mathematical framework, as shown in the deterministic equation (4) together with eqs.(2) and (5), computation of the two terms and is critical to . In LBP[9, 10, 11, 30], the marginal prior probability distributions in eq.(5) and can be approximately reduced to
where denotes the set of neighbouring pixels of pixel . The quantities and in eqs.(6) and (7) correspond to normalization constants of approximate representations of marginal probabilities in LBP. Here are messages in the LBP for the prior probability in eq.(2), and the free energy per pixel in the Potts prior (2) is also approximately expressed as
The messages (, , ) are determined so as to satisfy the following simultaneous equations:
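To illustrate how such message equations behave for a Potts prior without external fields, the sketch below iterates a translation-invariant LBP (Bethe) fixed point, in which every pixel has four neighbours and all messages are identical. It returns the approximate probability that two neighbouring pixels carry the same label. The pair factor `exp(alpha)` for equal labels, the function name, and the defaults are assumptions for illustration; the paper's own parametrization of the interaction may differ.

```python
import numpy as np

def bethe_pair_agreement(alpha, q=3, degree=4, m0=None, iters=2000, tol=1e-12):
    """Homogeneous LBP for a q-state Potts prior on a graph of fixed degree.

    Returns the LBP estimate of Pr(neighbouring labels are equal).
    Assumed pair factor: exp(alpha) if the two labels agree, 1 otherwise.
    """
    m = np.full(q, 1.0 / q) if m0 is None else np.asarray(m0, dtype=float)
    m = m / m.sum()
    K = np.ones((q, q)) + (np.exp(alpha) - 1.0) * np.eye(q)
    for _ in range(iters):
        # A message combines the (degree - 1) other incoming messages.
        new = K @ m ** (degree - 1)
        new /= new.sum()
        converged = np.max(np.abs(new - m)) < tol
        m = new
        if converged:
            break
    w = m ** (degree - 1)
    pair = K * np.outer(w, w)  # unnormalized two-pixel belief
    return float(np.trace(pair) / pair.sum())

# Start from messages biased towards label 0 to follow the ordered branch.
ordered_start = [0.9, 0.05, 0.05]
for alpha in (0.0, 0.5, 3.0):
    print(alpha, bethe_pair_agreement(alpha, m0=ordered_start))
```

Sweeping `alpha` on a fine grid from a biased start and from a uniform start is one way to look for the two coexisting solutions associated with a first order transition. The uniform start sits exactly on the symmetric fixed point, which can be numerically unstable at large `alpha`.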
3 Segmentation Algorithm for Potts Posterior and Loopy Belief Propagation
In this section, we provide a posterior probability and a hyperparameter estimation scheme in terms of the Potts prior constructed in the previous section. We combine the conditional maximization of entropy with Bayesian modeling to derive simultaneous deterministic equations for estimating hyperparameters from the given data.
The intensities of red, green, and blue channels at pixel in the observed image are regarded as random variables denoted by , and , respectively. The random fields of red, green and blue intensities in the observed color image are then represented by the -dimensional vector , where . The actual color image is denoted by , where . The random variables , and at each pixel can take any real numbers in the interval . The generative process of natural color images is assumed to be the following conditional probability:
Another way of defining the posterior probability of a labeling can be introduced through the following definition:
By introducing Lagrange multipliers , , () and () for the constraints and by considering the extremum condition with respect to , the right-hand side of eq.(16) is reduced to the following expression:
up to the normalization constant including . The Lagrange multipliers , () and () are determined so as to satisfy the constraint conditions:
Moreover, in order to ensure eq.(17) as an identity with respect to every label configuration of , we have to impose the following equalities:
as sufficient conditions for eq.(15) with respect to the right-hand sides of equations (13) and (17). Because () are symmetric matrices, we can show that () in eq.(17) by using eq.(23). By combining the above arguments (13), (17), (18)-(20), and (21)-(22) with the ones in eq.(2) and eq.(3), the simultaneous deterministic equations of estimates and of and should then be reduced to the following constraints:
Given the estimates and , the estimate of labeling is determined by
The above method of producing the labeling is called maximum posterior marginal (MPM) estimation.
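In code, MPM estimation amounts to taking, at each pixel, the label with the largest one-pixel posterior marginal. A minimal sketch, assuming the marginals are stored in an array of shape `(H, W, q)` and that labels are numbered from 0 (both are illustrative conventions, not the paper's):

```python
import numpy as np

def mpm_labels(marginals: np.ndarray) -> np.ndarray:
    """Return the (H, W) label map that maximizes each pixel's marginal."""
    return np.argmax(marginals, axis=-1)

# Two pixels, three labels: the first prefers label 2, the second label 0.
example = np.array([[[0.1, 0.2, 0.7],
                     [0.6, 0.3, 0.1]]])
print(mpm_labels(example))
```

Unlike maximum a posteriori estimation, which maximizes the joint posterior over whole configurations, this rule decides each pixel separately from its own marginal.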
In LBP, the marginal probability distributions and can be approximately reduced to
The quantities and in eqs.(30) and (31) correspond to normalization constants of approximate representations of marginal probabilities in LBP. Here are messages in the LBP for the posterior probabilities in eq.(13). They are determined so as to satisfy the following simultaneous fixed-point equations:
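A fixed-point iteration of this kind can be sketched as follows for a Potts model with pixel-dependent fields on a finite square lattice. This is a generic parallel-update LBP written under assumed conventions: `log_phi` holds the local log-evidence of each label at each pixel (for example a log-likelihood of the observed color), and equal neighbouring labels contribute a factor `exp(alpha)`. It is not a transcription of the paper's update rules.

```python
import numpy as np

def lbp_potts_marginals(log_phi, alpha, iters=50):
    """Approximate one-pixel marginals of a Potts model with local fields.

    log_phi : (H, W, q) local log-evidence per pixel and label (assumed input).
    alpha   : coupling; each equal neighbouring pair contributes exp(alpha).
    """
    H, W, q = log_phi.shape
    phi = np.exp(log_phi - log_phi.max(axis=-1, keepdims=True))
    K = np.ones((q, q)) + (np.exp(alpha) - 1.0) * np.eye(q)
    # msg[d, i, j] is the message arriving at pixel (i, j) from the neighbour
    # above (d=0), below (d=1), left (d=2) or right (d=3). Slots without a
    # neighbour keep a uniform message, which leaves the beliefs unchanged.
    msg = np.full((4, H, W, q), 1.0 / q)
    opposite = (1, 0, 3, 2)
    for _ in range(iters):
        belief = phi * msg.prod(axis=0)
        new = np.full_like(msg, 1.0 / q)
        for d in range(4):
            # Sender's belief without the message it got from the receiver.
            cavity = belief / msg[opposite[d]]
            out = cavity @ K
            out /= out.sum(axis=-1, keepdims=True)
            if d == 0:
                new[0, 1:, :] = out[:-1, :]
            elif d == 1:
                new[1, :-1, :] = out[1:, :]
            elif d == 2:
                new[2, :, 1:] = out[:, :-1]
            else:
                new[3, :, :-1] = out[:, 1:]
        msg = new
    belief = phi * msg.prod(axis=0)
    return belief / belief.sum(axis=-1, keepdims=True)

rng = np.random.default_rng(0)
log_phi = rng.normal(size=(8, 8, 3))
marginals = lbp_potts_marginals(log_phi, alpha=0.5)
labels = marginals.argmax(axis=-1)  # MPM-style decision from the marginals
```

On a graph without cycles, such as a single row of pixels, these marginals are exact once the messages have propagated across the row. On the square lattice they are approximations, and a fixed number of sweeps is used here in place of a convergence test.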
The practical segmentation algorithm for an observed image is summarized as follows:
- Step 1
Input the data . Set initial values for , , and , and .
- Step 2
Set initial values for and repeat the following update rules until and converge:
[Eqs. (33)-(37)]
- Step 3
Update , and according to the following rules:
[Eqs. (38)-(45)]
- Step 4
Output the following quantities:
[Eqs. (46)-(48)]
and stop if and converge. Go to Step 2 otherwise.
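The update formulas themselves are not recoverable from the text here, so the following is only an illustrative sketch of one typical ingredient of such a loop: re-estimating a mean color and a symmetric covariance matrix for each label from the current one-pixel posterior marginals. It assumes a Gaussian color model per label. The function name, the array shapes, and the small regularizer `eps` are hypothetical choices and are not taken from the paper.

```python
import numpy as np

def weighted_gaussian_update(pixels, marginals, eps=1e-6):
    """Re-estimate per-label color statistics from posterior marginals.

    pixels    : (N, 3) RGB values, one row per pixel.
    marginals : (N, q) one-pixel posterior marginals (rows sum to 1).
    Returns means of shape (q, 3) and covariances of shape (q, 3, 3).
    Assumes every label has non-zero total weight.
    """
    weights = marginals / marginals.sum(axis=0, keepdims=True)
    means = weights.T @ pixels
    centered = pixels[:, None, :] - means[None, :, :]
    covs = np.einsum('nk,nki,nkj->kij', weights, centered, centered)
    covs += eps * np.eye(3)  # hypothetical regularizer to keep covs invertible
    return means, covs

# Four pixels with hard assignments to two labels.
pixels = np.array([[0.0, 0.0, 0.0], [2.0, 2.0, 2.0],
                   [10.0, 10.0, 10.0], [12.0, 12.0, 12.0]])
hard = np.array([[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]])
means, covs = weighted_gaussian_update(pixels, hard)
print(means)
```

With hard assignments the update reduces to the ordinary per-cluster mean and covariance. With soft marginals from LBP each pixel contributes to every label in proportion to its marginal.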
To demonstrate the effectiveness of our method, we use six test images, shown in Figs.3(a)-(f): three are from the Berkeley Segmentation Data Set 500 (BSDS500)[31, 32], and the other three are from the image database of the Signal and Image Processing Institute, University of Southern California (SIPI-USC). The processes of the proposed hyperparameter estimation for the images in Figs.3(a)-(f) are plotted in Figs.4(a)-(f) and 5(a)-(f) under and , respectively. The solid circles in Figs.4 and 5 correspond to in Step 4, and the solid lines are for various values of and are also given in Fig.2. In Table 1, we show the estimates and in the cases of and for the images in Fig.3. The segmentation results for the test images in Fig.3 are shown in Figs.