Pull Message Passing for Nonparametric Belief Propagation

07/27/2018 · Karthik Desingh et al. · University of Michigan

We present a "pull" approach to approximate products of Gaussian mixtures within message updates for Nonparametric Belief Propagation (NBP) inference. Existing NBP methods often represent messages between continuous-valued latent variables as Gaussian mixture models. To avoid computational intractability in loopy graphs, NBP necessitates an approximation of the product of such mixtures. Sampling-based product approximations have shown effectiveness for NBP inference. However, such approximations used within the traditional "push" message update procedures quickly become computationally prohibitive for multi-modal distributions over high-dimensional variables. In contrast, we propose a "pull" method, as the Pull Message Passing for Nonparametric Belief propagation (PMPNBP) algorithm, and demonstrate its viability for efficient inference. We report results using an experiment from an existing NBP method, PAMPAS, for inferring the pose of an articulated structure in clutter. Results from this illustrative problem found PMPNBP has a greater ability to efficiently scale the number of components in its mixtures and, consequently, improve inference accuracy.




1 Introduction

We present the Pull Message Passing Nonparametric Belief Propagation (PMPNBP) algorithm as a "pull" approach to approximating message products for loopy belief propagation in a Markov Random Field model. Building on existing methods for Nonparametric Belief Propagation (NBP), PMPNBP aims to perform inference over continuous, high-dimensional, and multi-modal random variables. Propagation of belief in NBP involves updating messages that inform the belief of one random variable based on the belief of another. This message update typically requires taking the product of $d$ incoming mixture models, each with $M$ Gaussian components. In exact form, the distribution resulting from this product is comprised of $M^d$ Gaussian components. Consequently, the number of components needed to represent messages grows towards intractability with the number of update iterations.
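This blow-up is easy to quantify. A minimal sketch (function name is illustrative, not from the paper) counts the components in the exact product of mixtures:

```python
def exact_product_size(num_messages: int, components_per_message: int) -> int:
    """Number of Gaussian components in the exact product of
    num_messages mixtures, each with components_per_message components:
    one product component per choice of one component from every mixture."""
    return components_per_message ** num_messages

# e.g. the product of 3 incoming messages of 100 components each already
# has 100**3 = 1,000,000 components, before any further iterations.
```

After each update iteration the resulting mixture feeds into the next product, so without approximation the count compounds again, which is why sampling-based approximations are unavoidable.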

Existing methods for NBP commonly address this intractability through a sampling-based approximation within a “push” message update procedure. This push procedure updates a message by first approximating the mixture product, often by Gibbs sampling, and then propagating this product to the message update. While effective for accurate inference, Gibbs sampling is computationally costly and prohibitive for many applications with time-critical demands and bounded computational resources, such as in robotics. More specifically, push updating in this manner suffers from two critical issues: 1) the computational cost incurred for iterative sampling of the approximated product, and 2) the limited number of mixture components that can be asymptotically accommodated.

Consider the problem of robot perception in cluttered scenes desingh2016physically; narayanan2016discriminatively; papazov2012rigid; sui2017goal. Such scene perception requires inference over continuous-valued pose spaces for a varying number of objects. Inference in these continuous spaces must also contend with high dimensionality, scaling with the number of objects, and multi-modal distributions, due to partial and ambiguous observations. A vast body of existing literature has explored methods to address this type of inference problem sigal2004tracking; sudderth2004visual; vondrak2013dynamical. Among these methods, we focus our attention on algorithms for inference by belief propagation in loopy probabilistic graphical models. In particular, Nonparametric Belief Propagation isard2003pampas; sudderth2003nonparametric has demonstrated considerable potential to address the challenges of inference over continuous, high-dimensional, and multi-modal random variables. However, direct application of these methods remains a substantial computational investment and intellectual challenge.

In this paper, we address the computational challenges of existing NBP methods and provide a more efficient “pull” message passing approach through the PMPNBP algorithm. The key idea of pull message updating is to evaluate samples taken from the belief of the receiving node with respect to the densities informing the sending node. The mixture product approximation can then be performed individually per sample, and then later normalized into a distribution. This pull updating avoids the computational pitfalls of push updating of message distributions, which requires exponential growth in the number of components or expensive iterative methods. We demonstrate the accuracy and efficiency of inference by PMPNBP with respect to PAMPAS isard2003pampas , a pioneering method for NBP. These results focus on an experiment for finding an articulated 2D pattern, reconstructed from the description of PAMPAS. These results indicate PMPNBP enables both faster convergence to an appropriate inference and greater scaling of message mixture components for improved accuracy.
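The core of the pull idea can be sketched in a few lines. This is a minimal 1-D illustration (names and structure are ours, not the paper's code): samples are drawn from the receiving node's belief, and each is weighted independently against the densities informing the sending node, so no iterative Gibbs product is needed.

```python
def pull_update(receiver_belief_samples, density_fns):
    """Weight samples from the RECEIVING node's belief against the
    densities informing the sending node, then normalize.

    receiver_belief_samples: list of sample values
    density_fns: list of callables, each evaluating one informing density
    """
    weights = []
    for x in receiver_belief_samples:
        w = 1.0
        for f in density_fns:       # product over informing densities,
            w *= f(x)               # evaluated per sample -- no Gibbs loop
        weights.append(w)
    total = sum(weights) or 1.0
    return [w / total for w in weights]   # normalize into a distribution
```

The key property is that the cost is linear in the number of samples times the number of densities, independent of how many components the exact product would have.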

2 Related work

Probabilistic graphical models, such as the Markov Random Field (MRF), are widely used in computational perception to model problems involving inference of random variables under considerable uncertainty. Many algorithms have been proposed to compute the joint probability of such graphical models. Belief propagation algorithms are a category of algorithms that are guaranteed to converge on tree-structured graphs. For graph structures with loops, Loopy Belief Propagation (LBP) murphy1999loopy has been empirically shown to perform well for discrete variables. Recently, Chua et al. chua2016scene proposed belief propagation over factor graphs to generate scenes satisfying scene grammars. The problem becomes much more challenging when the latent variables take continuous values. Sudderth et al. sudderth2003nonparametric and Isard isard2003pampas introduced methods for nonparametric belief propagation to address such continuous-valued cases. Both approaches approximate a continuous-valued function as a mixture of weighted Gaussians and use local Gibbs sampling to approximate the product of mixtures. This approach to message passing has been used effectively in applications such as human pose estimation sigal2004tracking and hand tracking sudderth2004visual by modeling the graph as a tree-structured particle network. In order to viably pursue NBP for robotic problems, such as scene perception, the computational efficiency of NBP methods needs to be revisited.

Some recent works address the computational efficiency of Nonparametric Belief Propagation. Similar in spirit to PMPNBP, Ihler et al. ihler2009particle describe a conceptual theory of particle belief propagation, in which a target node's samples are used to generate a message going from source to target. Their work emphasizes the advantages of using a large number of particles to represent incoming messages, along with theoretical analysis. However, it relies on an expensive iterative Markov Chain Monte Carlo sampling step, mimicking the Gibbs sampling step in NBP isard2003pampas; sudderth2003nonparametric. PMPNBP avoids this cost through a resampling step.

Kernel-based methods have been proposed to improve the efficiency of NBP. Song et al. song2011kernel propose a kernel belief propagation method, in which messages are represented as functions in a Reproducing Kernel Hilbert Space (RKHS) and message updates are linear operations in the RKHS. Their results are reported to be more accurate and faster than NBP with Gibbs sampling isard2003pampas; sudderth2003nonparametric and particle belief propagation ihler2009particle on applications such as image denoising, depth prediction, and angle prediction in the protein folding problem. We consider comparisons with kernel-based approximators as a direction for future work. Han et al. han2006efficientnb introduce mode propagation to approximate the slow sampling-based products in NBP isard2003pampas; sudderth2003nonparametric with a few mode propagation and kernel fitting steps. However, their approach is limited to non-occluded observations. Our proposed PMPNBP algorithm handles occlusions with convergence characteristics comparable to PAMPAS isard2003pampas.

3 Nonparametric Belief Propagation

Let $G = (V, E)$ be an undirected graph with nodes $V$ and edges $E$. The nodes in $V$ are each random variables that have dependencies with each other in the graph through the edges $E$. If $G$ is a Markov Random Field (MRF), then it has two types of variables, $X$ and $Y$, denoting the collections of hidden and observed variables, respectively. Each variable is considered to take assignments of continuous-valued vectors. The joint probability of the graph $G$, considering only second-order cliques, is given as

$$p(X, Y) = \frac{1}{Z} \prod_{(s,t) \in E} \psi_{s,t}(x_s, x_t) \prod_{s \in V} \phi_s(x_s, y_s) \qquad (1)$$

where $\psi_{s,t}(x_s, x_t)$ is the pairwise potential between hidden nodes $x_s$ and $x_t$ (note, the dimensionality of the hidden variables remains the same across nodes, e.g., in the case of estimating 6 degree-of-freedom object poses), $\phi_s(x_s, y_s)$ is the unary potential between the hidden node $x_s$ and its observed node $y_s$, and $Z$ is a normalizing factor. The problem is to infer the belief over possible states assigned to the hidden variables $X$ such that the joint probability is maximized. This inference is generally performed by passing messages between hidden variables until their belief distributions converge over several iterations.

A message is denoted $m_{t \to s}$, directed from node $t$ to node $s$ if there is an edge between the nodes in the graph $G$. The message represents the distribution of what node $t$ believes node $s$ should take in terms of the hidden variable $x_s$. Typically, if $x_s$ is in the continuous domain, then $m_{t \to s}$ is represented as a Gaussian mixture to approximate the real distribution:

$$m_{t \to s}(x_s) = \sum_{i=1}^{M} w_{ts}^{(i)} \, \mathcal{N}\!\left(x_s;\, \mu_{ts}^{(i)}, \Lambda_{ts}^{(i)}\right) \qquad (2)$$

where $\sum_{i=1}^{M} w_{ts}^{(i)} = 1$, $M$ is the number of Gaussian components, $w_{ts}^{(i)}$ is the weight associated with the $i$th component, and $\mu_{ts}^{(i)}$ and $\Lambda_{ts}^{(i)}$ are the mean and covariance of the $i$th component, respectively. We use the terms components, particles, and samples interchangeably in this paper. Hence, a message can be expressed as a set of triplets:

$$m_{t \to s} = \left\{ \left( w_{ts}^{(i)}, \mu_{ts}^{(i)}, \Lambda_{ts}^{(i)} \right) \right\}_{i=1}^{M} \qquad (3)$$
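As a concrete reference point, a 1-D mixture message of this form can be evaluated pointwise with a short helper (illustrative code, not the paper's implementation; the triplet layout mirrors the representation above with scalar variances):

```python
import math

def eval_message(x, triplets):
    """Evaluate a 1-D Gaussian mixture message at point x.

    triplets: list of (weight, mean, variance) with weights summing to 1.
    """
    total = 0.0
    for w, mu, var in triplets:
        norm = 1.0 / math.sqrt(2.0 * math.pi * var)   # Gaussian normalizer
        total += w * norm * math.exp(-(x - mu) ** 2 / (2.0 * var))
    return total
```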


Whether the graph $G$ has a tree or a loopy structure, computing these message updates is computationally nontrivial. A message update in a continuous domain at iteration $n$ from node $t$ to node $s$ is given by

$$m^{n}_{t \to s}(x_s) = \int_{x_t} \psi_{s,t}(x_s, x_t) \, \phi_t(x_t, y_t) \prod_{u \in \rho(t) \setminus s} m^{n-1}_{u \to t}(x_t) \, dx_t \qquad (4)$$

where $\rho(t)$ is the set of neighbor nodes of $t$. The marginal belief over each hidden node $x_s$ at iteration $n$ is given by

$$bel^{n}_s(x_s) \propto \phi_s(x_s, y_s) \prod_{t \in \rho(s)} m^{n}_{t \to s}(x_s) \qquad (5)$$

where the belief is represented nonparametrically by a set of components. NBP sudderth2003nonparametric provides a Gibbs sampling approach to compute an approximation of the product $\prod_{u \in \rho(t) \setminus s} m^{n-1}_{u \to t}(x_t)$. Assuming that $\phi_t(x_t, y_t)$ is pointwise computable, a "pre-message" ihler2009particle is defined as

$$M^{n-1}_t(x_t) = \phi_t(x_t, y_t) \prod_{u \in \rho(t) \setminus s} m^{n-1}_{u \to t}(x_t) \qquad (6)$$

which can be computed in the Gibbs sampling procedure. This reduces Equation 4 to

$$m^{n}_{t \to s}(x_s) = \int_{x_t} \psi_{s,t}(x_s, x_t) \, M^{n-1}_t(x_t) \, dx_t \qquad (7)$$
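Pointwise evaluation of the pre-message is straightforward once the unary potential and the incoming messages are callable. A hedged sketch (all callables here are hypothetical stand-ins for $\phi_t$ and $m_{u \to t}$):

```python
def pre_message(x_t, unary, incoming, exclude):
    """Evaluate the "pre-message" at x_t: the unary potential times the
    product of all incoming messages except the one headed back to the
    destination node.

    unary: callable for the unary potential phi_t(x_t, y_t)
    incoming: dict mapping neighbor name -> callable message density
    exclude: name of the destination neighbor s to leave out
    """
    value = unary(x_t)
    for u, m in incoming.items():
        if u != exclude:
            value *= m(x_t)
    return value
```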


Algorithm 1 - Message update

Given input messages $m^{n-1}_{u \to t} = \{(w^{(i)}_{ut}, \mu^{(i)}_{ut})\}_{i=1}^{M}$ for each $u \in \rho(t) \setminus s$, and methods to compute the functions $\psi_{s,t}$ and $\phi_t$ point-wise, the algorithm computes the outgoing message $m^{n}_{t \to s} = \{(w^{(i)}_{ts}, \mu^{(i)}_{ts})\}_{i=1}^{M}$.

  1. Draw $M$ independent samples $\{\mu^{(i)}_{ts}\}_{i=1}^{M}$ from $bel^{n-1}_s(x_s)$.

    1. If $n = 1$, the belief $bel^{0}_s(x_s)$ is a uniform distribution or informed by a prior distribution.

    2. If $n > 1$, the belief $bel^{n-1}_s(x_s)$ is the belief computed at iteration $n-1$ using importance sampling.

  2. For each sample $\mu^{(i)}_{ts}$, compute its weight $w^{(i)}_{ts}$:

    1. Sample $\hat{x}^{(i)}_t \sim \psi_{s,t}(x_s = \mu^{(i)}_{ts}, x_t)$.

    2. The unary weight $w^{(i)}_{unary}$ is computed using $\phi_t(\hat{x}^{(i)}_t, y_t)$.

    3. The neighboring weight $w^{(i)}_{neigh}$ is computed using the incoming messages $m^{n-1}_{u \to t}$.

      1. For each $u \in \rho(t) \setminus s$, compute $W^{(i)}_u = \sum_{j=1}^{M} w^{(j)}_{ut} \, \psi_{s,t}(\mu^{(i)}_{ts}, \mu^{(j)}_{ut})$.

      2. Each neighboring weight is computed by $w^{(i)}_{neigh} = \prod_{u \in \rho(t) \setminus s} W^{(i)}_u$.

    4. The final weights are computed as $w^{(i)}_{ts} = w^{(i)}_{unary} \times w^{(i)}_{neigh}$.

  3. The weights $\{w^{(i)}_{ts}\}_{i=1}^{M}$ are associated with the samples $\{\mu^{(i)}_{ts}\}_{i=1}^{M}$ to represent $m^{n}_{t \to s}$.
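The message update above can be sketched in a few lines under simplifying assumptions: 1-D states, a caller-supplied pairwise sampler, and illustrative function names that are ours rather than the paper's API.

```python
import random

def message_update(belief_samples, pairwise_sample, unary, psi, incoming, M):
    """Return M (weight, mean) pairs approximating the message t -> s.

    belief_samples: samples of the receiving node's previous belief
    pairwise_sample: draws a sender state given a receiver sample (step 2.1)
    unary: pointwise unary potential phi_t (step 2.2)
    psi: pointwise pairwise potential (step 2.3)
    incoming: messages into t, each a list of (weight, mean) pairs
    """
    mus = random.choices(belief_samples, k=M)      # step 1: draw from bel_s
    weighted = []
    for mu in mus:                                 # step 2: weight each sample
        x_t = pairwise_sample(mu)                  # 2.1: sample sender state
        w_unary = unary(x_t)                       # 2.2: unary weight
        w_neigh = 1.0
        for msg in incoming:                       # 2.3: per-message weight
            w_neigh *= sum(w_u * psi(mu, mu_u) for w_u, mu_u in msg)
        weighted.append((w_unary * w_neigh, mu))   # 2.4: final weight
    total = sum(w for w, _ in weighted) or 1.0
    return [(w / total, mu) for w, mu in weighted]  # step 3: normalize
```

Note that each sample's weight is computed independently, which is what makes the update embarrassingly parallel in principle, even though the paper's experiments deliberately avoid parallelization.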

Algorithm 2 - Belief update

Given incoming messages $m^{n}_{t \to s} = \{(w^{(i)}_{ts}, \mu^{(i)}_{ts})\}_{i=1}^{M}$ for each $t \in \rho(s)$, and a method to compute the function $\phi_s$ point-wise, the algorithm computes the belief $bel^{n}_s(x_s)$.

  1. For each message $m^{n}_{t \to s}$:

    1. Update the weights $w^{(i)}_{ts} \leftarrow w^{(i)}_{ts} \times \phi_s(\mu^{(i)}_{ts}, y_s)$.

    2. Normalize the weights such that $\sum_{i=1}^{M} w^{(i)}_{ts} = 1$.

  2. Combine all the incoming messages to form a single set of samples $\{\mu^{(k)}_s\}_{k=1}^{T}$ and their weights $\{w^{(k)}_s\}_{k=1}^{T}$, where $T$ is the sum of the numbers of samples over all the incoming messages.

  3. Normalize the weights such that $\sum_{k=1}^{T} w^{(k)}_s = 1$.

  4. Perform a resampling step to sample a new set $\{\mu^{(k)}_s\}_{k=1}^{T}$ that represents the marginal belief of $x_s$.
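The belief update, including the resampling step that replaces the explicit message product, can be sketched as follows (1-D states, illustrative names, not the paper's code):

```python
import random

def belief_update(incoming, unary, T):
    """Approximate the marginal belief from incoming messages.

    incoming: list of messages, each a list of (weight, mean) pairs
    unary: pointwise unary potential phi_s
    T: number of resampled belief samples to return
    """
    combined = []
    for msg in incoming:
        # step 1.1: reweight each sample by the unary potential
        reweighted = [(w * unary(mu), mu) for w, mu in msg]
        z = sum(w for w, _ in reweighted) or 1.0
        # step 1.2 + 2: normalize per message, then pool into one set
        combined.extend((w / z, mu) for w, mu in reweighted)
    z = sum(w for w, _ in combined) or 1.0            # step 3: global normalize
    weights = [w / z for w, _ in combined]
    samples = [mu for _, mu in combined]
    return random.choices(samples, weights=weights, k=T)  # step 4: resample
```

The resampling at the end is where the product of incoming messages is approximated implicitly, avoiding the Gibbs iterations that a push-style product would require.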

The pairwise term $\psi_{s,t}(x_s, x_t)$ can be approximated using a marginal influence function to make the right-hand side of Equation 7 independent of $x_s$. The marginal influence function provides the influence of $x_s$ for sampling $x_t$. However, this function can be ignored if the pairwise potential function is based on the distance between the variables. This assumption lets Equation 7 avoid the integration step: a sample is drawn from the "pre-message", followed by a pairwise sampling step, with the pre-message sample acting as the conditioning variable, to obtain a sample of $x_s$. To represent the message $m^{n}_{t \to s}$, these samples are taken as the component means, and the weights and covariances are computed using Kernel Density Estimation methods. PAMPAS isard2003pampas uses slightly different notation and methods to compute the samples.

The Gibbs sampling procedure is itself iterative, and hence makes the computation of the "pre-message" (called the foundation function in PAMPAS) expensive as $M$ increases. In the next section, we present our proposed message representation, followed by the algorithm to compute $m^{n}_{t \to s}$ at iteration $n$.

4 Nonparametric Belief Propagation using Pull Message Passing

Given the overview of Nonparametric Belief Propagation in Section 3, we now describe our "pull" message passing algorithm. We represent the message $m_{t \to s}$ as a set of pairs instead of the triplets in Equation 3:

$$m^{n}_{t \to s} = \left\{ \left( w^{(i)}_{ts}, \mu^{(i)}_{ts} \right) \right\}_{i=1}^{M} \qquad (8)$$

Similarly, the marginal belief is summarized as a sample set

$$bel^{n}_s(x_s) = \left\{ \mu^{(k)}_s \right\}_{k=1}^{T} \qquad (9)$$

where $T$ is the number of samples representing the marginal belief. We assume that a marginal belief $bel^{n-1}_s(x_s)$ is available from the previous iteration. To compute the message $m^{n}_{t \to s}$ at iteration $n$, we initially sample $\{\mu^{(i)}_{ts}\}$ from the belief $bel^{n-1}_s(x_s)$, pass these samples over to the neighboring nodes, and compute the weights $\{w^{(i)}_{ts}\}$. This procedure is described in the Message update algorithm; the computation of $bel^{n}_s(x_s)$ is described in the Belief update algorithm. The key difference between the "push" approach of the earlier methods (NBP and PAMPAS) sudderth2003nonparametric; isard2003pampas and our "pull" approach is the message generation. In the "push" approach, the incoming messages to node $t$ determine the outgoing message $m^{n}_{t \to s}$. In the "pull" approach, the samples representing $m^{n}_{t \to s}$ are instead drawn from the belief of $s$ at the previous iteration and weighted by the incoming messages to $t$. This weighting strategy is computationally efficient. Additionally, the product of incoming messages needed to compute $bel^{n}_s(x_s)$ is approximated by a resampling step, as described in the Belief update algorithm.

5 Experimental Setup

We compare our proposed PMPNBP method with PAMPAS isard2003pampas on their 2D illustrative example (Figure 3). The pattern has a circle node with a state variable denoting its position in the 2D image and the radius of the circle. This circle node has four arms with two links each. These links are nodes in the graph, with state variables denoting their 2D positions and orientations. The links connected to the circle are indexed as inner links, with their connected outer links indexed accordingly. In our recreation of this illustrative example, we define the unary potential as an overlap measure between the observed image and a rendered template: let the patch of the image centered at the node's position, with the same size as the template image rendered with the state of the node (circle/link), be compared against that template, in terms of the numbers of white/observed pixel locations in each. Figure 8 illustrates the computation of the unary potential for the circle and link nodes visually.
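The exact ratio used in the paper's unary potential is not recoverable from the text here, so the following is a hedged sketch of one overlap-style score on binary images (a Dice-style coefficient; the function and its normalization are our assumptions, not the paper's formula):

```python
def overlap_potential(patch, template):
    """Score how well a rendered binary template matches an observed
    binary image patch of the same size.

    patch, template: equal-size 2-D lists of 0/1 pixels.
    Returns a value in [0, 1]; 1 means identical "on" pixel sets.
    """
    on_patch = sum(sum(row) for row in patch)          # observed white pixels
    on_template = sum(sum(row) for row in template)    # rendered white pixels
    matched = sum(
        p * t
        for prow, trow in zip(patch, template)
        for p, t in zip(prow, trow)
    )
    denom = on_patch + on_template
    return 2.0 * matched / denom if denom else 0.0     # Dice-style overlap
```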

(a) Graphical model
(b) Geometrical structure
Figure 3: The pattern used for the experiments has 9 nodes: one circle at the center and four arms with two links each. This forms the graphical model shown in (a), where hidden nodes are connected to their neighbors and informed by observed nodes. Geometrically, the circle and links are defined by their location, orientation, and dimensions, as shown in (b). Color coding is used to distinguish the links for the qualitative results in the paper.
(a) Image observation
(b)-(d) Unary potentials
Figure 8: (a) shows the actual pattern used in the experiments of the paper. (b)-(d) show the unary potential for the circle, a vertical rectangular link, and a horizontal rectangular link, respectively, evaluated over all the pixels in image (a). For ease of understanding, the orientations of the nodes in this illustration are held fixed.
(a)-(d) Pairwise samples
Figure 13: Sampling the neighbors based on a given current node sample, illustrated for the relations among the circle, inner link, and outer link nodes. Each sub-figure shows 20 samples (green) drawn given a neighboring node (red) at its ground truth location, constrained by their geometrical relationship.

The pairwise sampling is done similarly to the original description in PAMPAS isard2003pampas. The procedure to generate samples is described in Appendix A. Figure 13 visually illustrates the pairwise sampling for neighboring circle and link nodes. With the unary potential and pairwise sampling in place, we perform inference and report convergence over iterations in the next section.
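A hedged sketch of such pairwise sampling: a neighbor sample is generated by applying the fixed geometric offset between the two nodes, rotated by the current node's orientation, plus Gaussian noise. The function, its parameters, and any specific values are illustrative assumptions, not the paper's equations.

```python
import math
import random

def sample_neighbor(x, y, theta, offset, sigma_pos, sigma_theta):
    """Draw a neighbor state (x, y, theta) given a node sample.

    offset: link length between the two nodes along the orientation theta
    sigma_pos, sigma_theta: std. deviations of the Gaussian perturbation
    """
    nx = x + offset * math.cos(theta) + random.gauss(0.0, sigma_pos)
    ny = y + offset * math.sin(theta) + random.gauss(0.0, sigma_pos)
    ntheta = theta + random.gauss(0.0, sigma_theta)
    return nx, ny, ntheta
```

Drawing 20 such samples around a fixed node reproduces the kind of scatter shown in Figure 13.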

Our implementations of PAMPAS and PMPNBP are in Matlab, run on an Ubuntu 14.04 Linux machine with a Core i7 6700HQ CPU and 16 GB RAM for all experiments. The implementations do not involve any parallelization, to avoid bias in the comparisons.

6 Results

(a) Belief at iteration 0
(b) Belief at iteration 1
(c) Belief at iteration 10
(d) Belief at iteration 24
(e) MLE at iteration 0
(f) MLE at iteration 1
(g) MLE at iteration 10
(h) MLE at Iteration 24
Figure 22: PMPNBP results with the circle node observed. Each message contains 200 particles, initialized randomly at locations given by coarse feature detections. The top row shows the belief samples for each of the nodes and the bottom row shows their Maximum Likely Estimate (MLE). The MLE at iteration 24 has all the links and the circle converged to their ground truth states (best viewed in color).

We show the convergence of PMPNBP qualitatively in Figure 22 and Figure 31. The pattern referred to in Figure 3 is placed in clutter made of 12 circles and 100 rectangles. There are 16 messages: 4 from the circle to the inner links, 4 from the inner links to the circle, 4 from the inner links to the outer links, and 4 from the outer links to the inner links. Messages are initialized with particles at image locations given by coarse feature detections of the circle and rectangles, replicating the initialization in isard2003pampas. In subsequent iterations, a fraction of each message's samples are drawn uniformly over the image to keep exploring, while the remaining samples are drawn from the marginal belief. As can be seen in Figure 22, the initialization (belief at iteration 0) is distributed across the image. At iteration 1, the message passing starts to influence the beliefs of the nodes, and by iteration 10 they form the spatial arrangement satisfying their geometrical structure. At iteration 24, the most likely estimates of all the links and the circle are close to the pose of the ground truth pattern.

The second example, in Figure 31, has no circle in the center of the pattern, demonstrating an occlusion scenario. This scenario shows that the proposed algorithm retains the power of the probabilistic modeling underlying belief propagation. The initialization is done similarly to the first example, so there are no samples near the "occluded" circle. Convergence proceeds similarly to the first example but takes 34 iterations.

In Figure 34(a), we show the convergence of PMPNBP with respect to the previous algorithm PAMPAS isard2003pampas, which uses Gaussian mixtures to represent the messages and a Gibbs sampler to perform message products (for the circle). Convergence is shown as the average error of the Maximum Likely Estimate from its ground truth, with respect to the number of belief iterations, over 10 trials. We plot this convergence for PMPNBP with varying numbers of components $M$ versus our implementation of PAMPAS. The convergence of PMPNBP is better than that of our implementation of PAMPAS. It can also be noted that PMPNBP's average error decreases with increasing numbers of particles; in essence, the larger $M$ is, the better the inference. To evaluate whether PMPNBP accommodates larger $M$ in practice, we plot the CPU run time per message update iteration in Figure 34(b). An entire message generation in PAMPAS requires a number of operations that grows with the number of messages in the "pre-message" product and the number of Gibbs sampler iterations, and quadratically with the number of components $M$ representing a message. In contrast, PMPNBP scales only linearly in $M$. For the plots in Figure 34(b), PAMPAS uses a fixed number of Gibbs sampler iterations.
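The scaling argument can be made concrete with back-of-envelope operation counts. The constants and functional forms below are illustrative assumptions consistent with the qualitative claim (quadratic vs. linear in $M$), not the paper's exact cost model:

```python
def push_cost(d, kappa, M):
    """Illustrative operation count for a push-style (PAMPAS-like) update:
    d product messages, kappa Gibbs sweeps, M components per message,
    with each sweep touching all M components for each of M samples."""
    return d * kappa * M * M

def pull_cost(d, M):
    """Illustrative operation count for a pull-style (PMPNBP-like) update:
    one pass over M samples per incoming message."""
    return d * M
```

Doubling $M$ quadruples the push cost but only doubles the pull cost, which matches the widening gap in Figure 34(b) as the number of particles grows.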

These results indicate that the proposed PMPNBP has similar convergence properties as the earlier approaches with greater computational efficiency.

(a) Belief at iteration 0
(b) Belief at iteration 1
(c) Belief at iteration 10
(d) Belief at iteration 34
(e) MLE at iteration 0
(f) MLE at iteration 1
(g) MLE at iteration 10
(h) MLE at iteration 34
Figure 31: PMPNBP results with the circle node "occluded". Each message contains 200 particles, initialized randomly at locations given by coarse feature detections. The top row shows the belief samples for each of the nodes and the bottom row shows their Maximum Likely Estimate (MLE). The MLE at iteration 34 has all the links and the circle converged to their ground truth states (best viewed in color).
(a) Average error vs. iterations
(b) CPU time vs. particles
Figure 34: Convergence and execution time: (a) shows the average position error of the Maximum Likely Estimate (MLE) achieved by PMPNBP in comparison to PAMPAS (our implementation) for the experiment in Figure 22. (b) shows the CPU time per iteration required for PMPNBP and PAMPAS as the number of particles grows. PMPNBP achieves comparable convergence with more efficient computation.

7 Conclusion

We proposed a new message passing scheme that uses a "pull" approach to update messages in Nonparametric Belief Propagation. We represent messages as weighted particles instead of the Gaussian mixtures proposed in earlier algorithms. The proposed scheme avoids the Gibbs-sampling-based message products of the earlier methods and provides faster product approximations. We show the efficiency of the proposed algorithm both in terms of its convergence properties and its computing time with respect to PAMPAS. The 2D illustration chosen in this paper is suggestive of real-world problems in scene estimation. We believe these results are promising enough to stimulate further research in applying the PMPNBP algorithm to scene estimation problems, especially in robotic perception, where a notion of uncertainty in the inference is indispensable.


  • (1) J. Chua and P. F. Felzenszwalb. Scene grammars, factor graphs, and belief propagation. arXiv preprint arXiv:1606.01307, 2016.
  • (2) K. Desingh, O. C. Jenkins, L. Reveret, and Z. Sui. Physically plausible scene estimation for manipulation in clutter. In IEEE-RAS 16th International Conference on Humanoid Robots (Humanoids), pages 1073–1080, 2016.
  • (3) T. X. Han, H. Ning, and T. S. Huang. Efficient nonparametric belief propagation with application to articulated body tracking. In IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR), pages 214–221, 2006.
  • (4) A. Ihler and D. McAllester. Particle belief propagation. In Artificial Intelligence and Statistics, pages 256–263, 2009.
  • (5) M. Isard. PAMPAS: Real-valued graphical models for computer vision. In IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR), pages 613–620, 2003.
  • (6) K. P. Murphy, Y. Weiss, and M. I. Jordan. Loopy belief propagation for approximate inference: An empirical study. In Proceedings of the Fifteenth conference on Uncertainty in artificial intelligence, pages 467–475, 1999.
  • (7) V. Narayanan and M. Likhachev. Discriminatively-guided deliberative perception for pose estimation of multiple 3d object instances. In Robotics: Science and Systems, 2016.
  • (8) C. Papazov, S. Haddadin, S. Parusel, K. Krieger, and D. Burschka. Rigid 3d geometry matching for grasping of known objects in cluttered scenes. The International Journal of Robotics Research, 2012.
  • (9) L. Sigal, S. Bhatia, S. Roth, M. J. Black, and M. Isard. Tracking loose-limbed people. In IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR), pages 421–428, 2004.
  • (10) L. Song, A. Gretton, D. Bickson, Y. Low, and C. Guestrin. Kernel belief propagation. In Proceedings of the Fourteenth International Conference on Artificial Intelligence and Statistics, pages 707–715, 2011.
  • (11) E. B. Sudderth, A. T. Ihler, W. T. Freeman, and A. S. Willsky. Nonparametric belief propagation. In IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR), page 605, 2003.
  • (12) E. B. Sudderth, M. I. Mandel, W. T. Freeman, and A. S. Willsky. Visual hand tracking using nonparametric belief propagation. In IEEE Conference on Computer Vision and Pattern Recognition Workshop (CVPRW’04), pages 189–189, 2004.
  • (13) Z. Sui, L. Xiang, O. C. Jenkins, and K. Desingh. Goal-directed robot manipulation through axiomatic scene estimation. The International Journal of Robotics Research, 36(1):86–104, 2017.
  • (14) M. Vondrak, L. Sigal, and O. C. Jenkins. Dynamical simulation priors for human motion tracking. IEEE Transactions on Pattern Analysis and Machine Intelligence, 35(1):52–65, 2013.

Appendix A

The details of the pairwise sampling procedure described in Section 5 are provided here. The circle node, the inner link nodes, and the outer link nodes are related by fixed geometric transforms along the arms. Samples for an outer link are generated given an inner link sample by applying the geometric transform between the two links and adding Gaussian noise to the position and orientation; samples for an inner link given an outer link are generated by the inverse transform, likewise perturbed. Similarly, samples for the circle are generated given an inner link by transforming toward the circle center, and samples for an inner link given the circle by transforming outward from the circle. The parameters for the dimensions of the circle and links, and the standard deviation values used for the Gaussian perturbations, are held constant for all the results in this paper.