Recent years have seen significant interest in tractable probabilistic representations that allow exact inference on highly expressive models in polynomial time, thus overcoming the shortcomings of classical probabilistic graphical models. In particular, Sum-Product Networks (SPNs), deep probabilistic models augmenting arithmetic circuits (ACs) with a latent variable interpretation [4, 5], have been successfully employed as state-of-the-art models in many application domains such as computer vision [6, 7], speech [9, 10], and robotics.
Here, we introduce SPFlow (the latest source code and documentation are available at https://github.com/SPFlow), a Python library intended to enable researchers in probabilistic modeling to promptly leverage efficiently implemented SPN inference routines, while at the same time allowing them to easily adapt and extend the algorithms available in the literature on SPNs. In the following, we briefly review SPNs and some of the functions available in SPFlow. We also present a small example of how to extend the library.
2 Sum-Product Networks
As illustrated in Fig. 1, an SPN is a rooted directed acyclic graph comprising sum, product, and leaf nodes. The scope of an SPN is the set of random variables appearing in the network. An SPN can be defined recursively as follows: (1) a tractable univariate distribution is an SPN; (2) a product of SPNs defined over different scopes is an SPN; and (3) a convex combination of SPNs over the same scope is an SPN. Thus, a product node in an SPN represents a factorization over independent distributions defined over different random variables, while a sum node stands for a mixture of distributions defined over the same variables. From this definition, it follows that the joint distribution modeled by such an SPN is a valid probability distribution, i.e., each complete and partial evidence inference query produces a consistent probability value [2, 12].
To answer probabilistic queries in an SPN, we evaluate the nodes starting at the leaves. Given some evidence, the probability output of querying the leaf distributions is propagated bottom-up following the respective operations. To compute marginals, i.e., the probability of partial configurations, we set the probability at the leaves of the variables to be marginalized to 1 and then proceed as before. To compute MPE states, we replace sum nodes by max nodes and evaluate the graph with a bottom-up pass in which we pass along the weighted maximum value instead of the weighted sum. Finally, in a top-down pass, we select the paths that lead to the maximum value, finding approximate MPE states. All these operations traverse the graph at most twice and can therefore be achieved in linear time w.r.t. the size of the SPN.
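The three queries described above can be illustrated with a minimal, library-independent sketch. The tuple encoding of nodes, the function names `value` and `mpe`, and the example parameters are assumptions made for illustration; they are not SPFlow's API and do not reproduce Fig. 1.

```python
import math

# Hypothetical tuple encoding of nodes (an assumption, not SPFlow's API):
#   ("leaf", var, probs)         categorical leaf over variable index `var`
#   ("prod", children)           product over children with disjoint scopes
#   ("sum", weights, children)   mixture over children with the same scope


def value(node, x, use_max=False):
    """Bottom-up pass; x[var] is an observed state or None (marginalized)."""
    kind = node[0]
    if kind == "leaf":
        _, var, probs = node
        if x[var] is None:
            # Marginalized leaf: 1 for likelihoods, mode probability for MPE.
            return max(probs) if use_max else 1.0
        return probs[x[var]]
    if kind == "prod":
        return math.prod(value(c, x, use_max) for c in node[1])
    _, weights, children = node
    terms = [w * value(c, x, use_max) for w, c in zip(weights, children)]
    return max(terms) if use_max else sum(terms)


def mpe(node, x, out=None):
    """Top-down pass: follow the maximizing child of each sum node."""
    out = list(x) if out is None else out
    kind = node[0]
    if kind == "leaf":
        _, var, probs = node
        if out[var] is None:
            out[var] = max(range(len(probs)), key=probs.__getitem__)
    elif kind == "prod":
        for child in node[1]:
            mpe(child, x, out)
    else:
        _, weights, children = node
        # Recomputed here for brevity; caching the bottom-up values would
        # keep the whole query linear in the size of the network.
        scores = [w * value(c, x, use_max=True)
                  for w, c in zip(weights, children)]
        mpe(children[scores.index(max(scores))], x, out)
    return out


# Made-up two-variable example network.
spn = ("sum", [0.4, 0.6], [
    ("prod", [("leaf", 0, [0.2, 0.8]), ("leaf", 1, [0.3, 0.7])]),
    ("prod", [("leaf", 0, [0.9, 0.1]), ("leaf", 1, [0.6, 0.4])]),
])

joint = value(spn, [1, 0])         # P(X0=1, X1=0)
marginal = value(spn, [1, None])   # P(X0=1), with X1 marginalized out
completed = mpe(spn, [1, None])    # fill in X1 given X0=1
```

For this example, `joint` is 0.4·0.8·0.3 + 0.6·0.1·0.6 = 0.132, `marginal` is 0.4·0.8 + 0.6·0.1 = 0.38, and `completed` is `[1, 1]`.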
3 An Overview of the SPFlow Library
As most operations on SPNs are based on traversing the graph in a bottom-up or top-down fashion, we model the library as basic node structures and generic traversal operations on them. The rest of the SPN operations are then implemented as lambda functions that rely on the generic operations.
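This design can be sketched as follows. It is a minimal illustration under stated assumptions: the node classes, `eval_bottom_up`, and the two derived operations are hypothetical names chosen here, not SPFlow's actual interface.

```python
import math
from dataclasses import dataclass


@dataclass
class Leaf:
    var: int
    probs: list


@dataclass
class Product:
    children: list


@dataclass
class Sum:
    weights: list
    children: list


def eval_bottom_up(node, node_fns, cache=None):
    """Generic traversal: evaluate the children first, then apply the
    function registered for the node's type. The cache ensures that a
    node shared by several parents is evaluated only once."""
    cache = {} if cache is None else cache
    key = id(node)
    if key not in cache:
        child_vals = [eval_bottom_up(c, node_fns, cache)
                      for c in getattr(node, "children", [])]
        cache[key] = node_fns[type(node)](node, child_vals)
    return cache[key]


def likelihood(spn, x):
    # One lambda per node type is all that defines the operation.
    return eval_bottom_up(spn, {
        Leaf: lambda n, _: n.probs[x[n.var]],
        Product: lambda n, vals: math.prod(vals),
        Sum: lambda n, vals: sum(w * v for w, v in zip(n.weights, vals)),
    })


def scope(spn):
    # A different operation reusing the same traversal.
    return eval_bottom_up(spn, {
        Leaf: lambda n, _: {n.var},
        Product: lambda n, vals: set().union(*vals),
        Sum: lambda n, vals: set().union(*vals),
    })


spn = Sum([0.4, 0.6], [
    Product([Leaf(0, [0.2, 0.8]), Leaf(1, [0.3, 0.7])]),
    Product([Leaf(0, [0.9, 0.1]), Leaf(1, [0.6, 0.4])]),
])
```

Keeping the traversal separate from the per-node functions means that a new operation needs only a new table of functions, and a new node type needs only one extra entry per table.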
Therefore, the SPFlow library puts the graph structure at the center. All other operations receive or produce a graph that can then be used by the other operations. This increases the flexibility and potential uses. As an example, one can create a structure using different algorithms and then save it to disk. Later on, one can load it again and do parameter optimization using, e.g., TensorFlow, and then do inference to answer probabilistic queries. All those operations can be composed as they rely only on the given structure. More specifically, the functionality of SPFlow covers:
4 SPFlow Programming Examples
To create the SPN shown already in Fig. 1, one simply writes the following code after loading the library:
Alternatively, we can create the same SPN using the following code:
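The two construction styles, composing nodes with arithmetic operators or calling node constructors directly, can be illustrated with a small library-independent sketch. The classes `Leaf`, `Product`, `Sum`, and `Weighted` and the helper `describe` are hypothetical stand-ins, not SPFlow's own classes, and the parameters are made up rather than taken from Fig. 1.

```python
class Node:
    def __mul__(self, other):
        # node * node builds a product node
        if isinstance(other, Node):
            return Product([self, other])
        return NotImplemented

    def __rmul__(self, weight):
        # weight * node yields a weighted term that waits for a "+"
        return Weighted(float(weight), self)


class Leaf(Node):
    def __init__(self, p, scope):
        self.p, self.scope = list(p), scope


class Product(Node):
    def __init__(self, children):
        self.children = list(children)


class Sum(Node):
    def __init__(self, weights, children):
        if abs(sum(weights) - 1.0) > 1e-9:
            raise ValueError("sum-node weights must add up to 1")
        self.weights, self.children = list(weights), list(children)


class Weighted:
    """Intermediate 'weight * node' term; this sketch supports only
    two-term mixtures."""

    def __init__(self, weight, node):
        self.weight, self.node = weight, node

    def __add__(self, other):
        if isinstance(other, Weighted):
            return Sum([self.weight, other.weight], [self.node, other.node])
        return NotImplemented


def describe(node):
    """Nested-tuple view of a graph, used to compare two constructions."""
    if isinstance(node, Leaf):
        return ("leaf", node.scope, tuple(node.p))
    if isinstance(node, Product):
        return ("prod", tuple(describe(c) for c in node.children))
    return ("sum", tuple(node.weights),
            tuple(describe(c) for c in node.children))


# Operator style: the expression mirrors the mixture it denotes.
spn_a = (0.4 * (Leaf([0.2, 0.8], scope=0) * Leaf([0.3, 0.7], scope=1))
         + 0.6 * (Leaf([0.9, 0.1], scope=0) * Leaf([0.6, 0.4], scope=1)))

# Constructor style: the same graph, built node by node.
spn_b = Sum([0.4, 0.6], [
    Product([Leaf([0.2, 0.8], scope=0), Leaf([0.3, 0.7], scope=1)]),
    Product([Leaf([0.9, 0.1], scope=0), Leaf([0.6, 0.4], scope=1)]),
])
```

Both expressions yield structurally identical graphs; the operator style is more compact, while the constructor style makes it easier to name and reuse sub-networks.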
Actually, Fig. 1 itself was plotted by calling plot_spn(spn, 'basicspn.pdf'). To evaluate the likelihood of the SPN on some data, one can use Python or TensorFlow:
To learn the parameters of the SPN using TensorFlow, one calls
Marginal likelihoods just require setting "nan" on the features to be marginalized:
Sampling creates instances where samples are obtained for the cells that contain "nan":
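One way such conditional sampling can work is sketched below, independently of the library: a bottom-up pass scores each child of a sum node given the evidence, and a top-down pass then picks a child in proportion to that score and fills the "nan" cells at the leaves. The tuple encoding, the function names, and the parameters are assumptions for illustration, not SPFlow's implementation.

```python
import math
import random

# Hypothetical encoding: ("leaf", var, probs), ("prod", children),
# ("sum", weights, children). Rows are lists of floats; nan = missing.


def value(node, row):
    """Marginal likelihood of the observed cells below `node`."""
    kind = node[0]
    if kind == "leaf":
        _, var, probs = node
        return 1.0 if math.isnan(row[var]) else probs[int(row[var])]
    if kind == "prod":
        return math.prod(value(c, row) for c in node[1])
    _, weights, children = node
    return sum(w * value(c, row) for w, c in zip(weights, children))


def sample_missing(node, row, rng):
    """Fill the nan cells of `row` in place, conditioned on the rest."""
    kind = node[0]
    if kind == "leaf":
        _, var, probs = node
        if math.isnan(row[var]):
            states = range(len(probs))
            row[var] = float(rng.choices(states, weights=probs)[0])
    elif kind == "prod":
        # Children have disjoint scopes, so they can be filled in turn.
        for child in node[1]:
            sample_missing(child, row, rng)
    else:
        _, weights, children = node
        # Child posterior given the evidence: weight times child likelihood.
        posterior = [w * value(c, row) for w, c in zip(weights, children)]
        chosen = rng.choices(children, weights=posterior)[0]
        sample_missing(chosen, row, rng)
    return row


spn = ("sum", [0.4, 0.6], [
    ("prod", [("leaf", 0, [0.2, 0.8]), ("leaf", 1, [0.3, 0.7])]),
    ("prod", [("leaf", 0, [0.9, 0.1]), ("leaf", 1, [0.6, 0.4])]),
])

rng = random.Random(0)
row = sample_missing(spn, [1.0, math.nan], rng)  # X0 observed, X1 sampled
```

Observed cells are left untouched. The sketch assumes the evidence has non-zero probability; otherwise all posterior weights are zero and `rng.choices` raises an error.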
To learn the structure of an SPN, say for binary classification, let us first create a 2D dataset with a binary label. An instance has label 0 when its features are closer to the generating Gaussian with mean 5, and label 1 when they are closer to the generating Gaussian with mean 15:
Now we specify the statistical types of the random variables and learn an SPN classifier:
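A dataset of this shape can be generated with the standard library alone, as in the sketch below. The sample size of 500 per class, the unit standard deviation, the seed, and the list-of-rows layout are assumptions made here; the text only fixes the two means and the binary label.

```python
import random

rng = random.Random(123)  # fixed seed, chosen arbitrarily


def make_blob(mean, label, n=500, std=1.0):
    """n rows of [feature_0, feature_1, label] drawn around `mean`."""
    return [[rng.gauss(mean, std), rng.gauss(mean, std), label]
            for _ in range(n)]


# Label 0 around mean 5, label 1 around mean 15; the label is column 3.
train_data = make_blob(5.0, 0) + make_blob(15.0, 1)
```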
Doing MPE on the classification SPN gives us the classifications.
The third column is the label and we can see that it behaves as expected in this synthetic example.
5 Extending the SPFlow library
To illustrate the flexibility of SPFlow, we show how to extend inference to other leaf types. Here we implement the Pareto leaf distribution. It relies on the infrastructure already present.
The same kind of extension is possible for all other operations. This makes it easy to extend the library by adding new nodes or even new operations. For instance, one could easily interface probabilistic programming languages and tools such as PyMC or Pyro.
Acknowledgements. RP acknowledges support from the European Union’s Horizon 2020 research and innovation programme under the Marie Skłodowska-Curie Grant Agreement No. 797223 — HYBSPN. This work has benefited from the DFG project CAML (KE 1686/3-1), as part of the SPP 1999, and from the BMBF project MADESI (01IS18043B).
-  Daphne Koller and Nir Friedman. Probabilistic Graphical Models: Principles and Techniques. MIT Press, 2009.
-  Hoifung Poon and Pedro Domingos. Sum-Product Networks: a New Deep Architecture. Proc. of UAI, 2011.
-  Adnan Darwiche. A differential approach to inference in Bayesian networks. J. ACM, 2003.
-  Arthur Choi and Adnan Darwiche. On relaxing determinism in arithmetic circuits. In Proceedings of ICML, pages 825–833, 2017.
-  Robert Peharz, Robert Gens, Franz Pernkopf, and Pedro M. Domingos. On the latent variable interpretation in sum-product networks. IEEE Transactions on Pattern Analysis and Machine Intelligence, PP, Issue 99, 2016.
-  Robert Gens and Pedro Domingos. Discriminative Learning of Sum-Product Networks. In Advances in Neural Information Processing Systems 25, pages 3239–3247, 2012.
-  Mohamed Amer and Sinisa Todorovic. Sum product networks for activity recognition. Pattern Analysis and Machine Intelligence, IEEE Transactions on, 2015.
-  Matthias Zohrer, Robert Peharz, and Franz Pernkopf. Representation learning for single-channel source separation and bandwidth extension. Audio, Speech, and Language Processing, IEEE/ACM Transactions on, 23(12):2398–2409, 2015.
-  Wei-Chen Cheng, Stanley Kok, Hoai Vu Pham, Hai Leong Chieu, and Kian Ming Adam Chai. Language modeling with Sum-Product Networks. In INTERSPEECH 2014, pages 2098–2102, 2014.
-  Alejandro Molina, Sriraam Natarajan, and Kristian Kersting. Poisson sum-product networks: A deep architecture for tractable multivariate Poisson distributions. In Proc. of AAAI, 2017.
-  Andrzej Pronobis, Francesco Riccio, and Rajesh PN Rao. Deep spatial affordance hierarchy: Spatial knowledge representation for planning in large-scale environments. In ICAPS 2017 Workshop on Planning and Robotics, Pittsburgh, PA, USA, 2017.
-  Robert Peharz, Sebastian Tschiatschek, Franz Pernkopf, and Pedro Domingos. On theoretical properties of sum-product networks. In Proc. of AISTATS, 2015.
-  Martín Abadi et al. TensorFlow: Large-scale machine learning on heterogeneous systems, 2015. Software available from tensorflow.org.