Constraint satisfaction and optimization problems are usually assumed to be deterministic, meaning that all parameters of the problem, also known as problem data, are known with certainty. This ignores the complex, uncertain, and dynamic nature of real-world problems. Stochastic constraint programming is an attempt to address the problem of decision making under uncertainty using the constraint programming paradigm [Walsh02, hnich2011survey].
Recent developments in machine learning, together with an abundance of collected data, have made it possible to capture our uncertain knowledge about the world as probabilistic models. Probabilistic graphical models are a popular representation which assumes a factorized joint distribution over random variables [Koller09]. This has motivated work on Factored Stochastic Constraint Programming (FSCP), which assumes that random variables follow such a factored model. FSCP allows us to model many applications in which decision variables are set before the random variables in alternating stages, i.e. we act first and observe later. For example, practical problems arise in transportation, finance, and the energy sector [wallace2005].
The state-of-the-art method for solving FSCP problems, called And-Or Branch and Bound (AOBB), explores a search space consisting of two types of nodes: And nodes, which correspond to random variables, and Or nodes, which correspond to decision variables. To explore this search space efficiently, AOBB uses two pruning techniques that are commonly used in constraint satisfaction and optimization, namely constraint propagation and bounding [babaki2017stochastic]. However, these techniques are mainly applicable to the Or nodes. The presence of random variables calls for alternative techniques to improve search-space exploration. A similar issue has been encountered for search-based probabilistic inference algorithms, and has been addressed by identifying repeated subproblems, among other methods [BacchusDP09]. Identification of repeated subproblems has recently received attention in the constraint programming community, too [UnaGSS19, ChuBS12]. In this paper we apply this idea to FSCP problems and demonstrate the gains that it yields on problems with repeated subproblems. The contributions of this work are:
Proposing a method for identification of repeated subproblems in FSCP problems,
Compiling an And-Or search tree into a decision diagram,
Extending a generic CP solver with the capability of performing And-Or search and compilation, and evaluating this approach through comparison with existing alternatives.
The paper is organized as follows. We first present the background material in Section 2. In Section 3 we present a brief description of FSCP and And-Or search. Section 4 describes our method for caching the subproblems and compilation of FSCP into a decision diagram. We evaluate the proposed method in Section 5. We discuss the relation with existing work in Section 6, and conclude with directions for future research in Section 7.
In this section we review several key topics on which our proposed method relies. We start by reviewing multi-stage stochastic decision making problems and then review Bayesian networks and decision diagrams.
2.1 Multi-stage stochastic decision making
We study a class of multi-stage stochastic decision making problems. At each stage of such problems the decision variables need to be set before the random variables are observed. In other words, we act first and observe later. For example, we first need to decide how many workers to assign to a task, and only later do we observe the actual workload for the task. The goal is to assign values to decision variables at each stage in such a way that the expected utility is maximized (or, if desired, minimized). Note that in multi-stage stochastic problems the values chosen for the set of decision variables at each stage are conditioned both on the values of previously determined decision variables and on the previously observed random variables. An example problem follows.
Example 1 (Production Planning)
In each quarter we sell between 101 and 105 items of a product. We need to satisfy the uncertain demand with probability 0.8 in every quarter. At the start of each quarter we decide how many books to print for the quarter, and the demand is known at the end of that quarter. The optimal production plan should minimize the expected cost of storing surplus items.
The uncertainties of a multi-stage stochastic problem can be modeled as a factored distribution over the random and decision variables. Bayesian networks are one of the most popular representations of factored distributions, and we review them next.
2.2 Bayesian Network
A Bayesian network is a probabilistic graphical model which represents the conditional dependencies among a set of variables by edges in a directed graph. This representation facilitates compact encoding and efficient inference [Koller09]. In addition to the graph structure we must specify the conditional probability distribution at each node. If the variables are discrete, this can be represented as a conditional probability table, which lists the probability that the child node takes on each of its different values for each combination of values of its parents.
A hidden Markov model is a simple Bayesian network that captures a multi-stage stochastic process. Figure 1 shows the structure and probability tables of a hidden Markov model.
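As a minimal sketch of the conditional probability tables described above, the following Python snippet encodes a two-step hidden Markov model and evaluates the joint probability of one assignment as the product of its factors. The variable names and all probability values are illustrative assumptions; they are not the tables of Figure 1.

```python
# Hypothetical two-step hidden Markov model: hidden states h1 -> h2,
# each emitting one observation (o1, o2). All numbers are made up.
prior = {"low": 0.6, "high": 0.4}                 # P(h1)
transition = {                                    # P(h2 | h1)
    "low": {"low": 0.7, "high": 0.3},
    "high": {"low": 0.2, "high": 0.8},
}
emission = {                                      # P(o_t | h_t)
    "low": {"few": 0.9, "many": 0.1},
    "high": {"few": 0.25, "many": 0.75},
}


def joint(h1, o1, h2, o2):
    """Joint probability as the product of the network's factors."""
    return prior[h1] * emission[h1][o1] * transition[h1][h2] * emission[h2][o2]


def total_probability():
    """Sum of the joint over all assignments; should be 1 for valid tables."""
    states, observations = list(prior), ["few", "many"]
    return sum(joint(h1, o1, h2, o2)
               for h1 in states for o1 in observations
               for h2 in states for o2 in observations)
```

Each table row sums to one, so the joint distribution obtained from the product is normalized without any further work.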
2.3 Decision Diagram
Decision diagrams are compact alternatives to decision trees. A decision diagram is a directed acyclic graph where nodes are variables and edges represent the assignment of values to the variables. Every path from a root node to a terminal node represents an assignment to all variables.
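To illustrate why a decision diagram can be more compact than a decision tree, the sketch below merges structurally identical subtrees of a small tree by giving each distinct node signature a single id. The tree encoding (a terminal value, or a pair of a variable name and a dictionary of children) and the function names are assumptions made for this example only.

```python
def to_diagram(tree, table):
    """Return the id of the diagram node equivalent to `tree`.

    `tree` is a terminal (any non-tuple value) or (variable, {value: subtree}).
    `table` maps node signatures to ids, so equal subtrees share one node.
    """
    if not isinstance(tree, tuple):
        signature = ("terminal", tree)
    else:
        variable, children = tree
        signature = (variable, tuple(sorted(
            (value, to_diagram(child, table))
            for value, child in children.items())))
    # Reuse the existing node when the same signature was seen before.
    return table.setdefault(signature, len(table))


def count_tree_nodes(tree):
    """Number of nodes in the unmerged decision tree."""
    if not isinstance(tree, tuple):
        return 1
    return 1 + sum(count_tree_nodes(child) for child in tree[1].values())


# Toy tree over x then y: the y-subtree is the same under both values of x.
tree = ("x", {0: ("y", {0: "a", 1: "b"}),
              1: ("y", {0: "a", 1: "b"})})
table = {}
root = to_diagram(tree, table)
```

In this toy case the tree has seven nodes while the diagram needs four, because the two copies of the y-subtree and their terminals are shared.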
3 Stochastic Constraint Programming
Stochastic constraint programming (SCP) is a framework for modeling and solving multi-stage stochastic decision making problems. A multi-stage stochastic constraint satisfaction program is defined as a 7-tuple $\langle V, S, D, P, C, \theta, L \rangle$ [hnich2011survey]. $V$ and $S$ are decision variables and random (stochastic) variables, respectively. $D$ is the domain of variables in $V \cup S$, and $P$ is a function that for each variable in $S$ defines a probability distribution over its domain. $C$ is a set of constraints. Each constraint is specified over a non-empty subset of $V$ and a (possibly empty) subset of $S$. $\theta$ is a function that assigns a minimum satisfaction probability to each constraint in $C$. $L$ is a partial ordering over $V \cup S$: $V_1 \prec S_1 \prec \dots \prec V_m \prec S_m$. The sets $V_i$ and $S_i$ respectively partition $V$ and $S$ and can possibly be empty, and $m$ is the number of stages.
A solution to the stochastic constraint program is a policy tree where each path represents an assignment to the variables in $V \cup S$ and follows the ordering $L$. In this tree each decision variable has just one child (corresponding to the selected value) and each random variable has as many children as the number of values in its domain. For each constraint in $C$, the sum of probabilities of paths in which the constraint is satisfied should meet the minimum probability requirement specified by $\theta$.
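Given a utility function $u$, this definition can be extended to an optimization setting where the objective is to maximize (or minimize) the expected utility. Writing $V_i$ and $S_i$ for the decision and random variables of stage $i$ and $m$ for the number of stages, the objective takes the nested form

$$\max_{V_1} \sum_{S_1} \cdots \max_{V_m} \sum_{S_m} P(S)\, u(V, S) \qquad (1)$$

where $P(S)$ denotes the probability of the assignment to the random variables.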
3.1 Factored Stochastic Constraint Program
The assumption of independent random variables falls short of representing the existing correlations between random variables in the real world. This motivates a generalization in which the probability function $P$ specifies a joint probability distribution over the random variables $S$. In factored stochastic constraint programming (FSCP) the joint distribution is factorized, i.e. it is expressed as a product of local factors such as conditional probability tables. This is the assumption that we make in the rest of this work. We also make the same extra assumptions as those made by [babaki2017stochastic]: 1) The utility function is represented by a single utility variable. This is not a restriction as long as the utility function can be encoded by a set of constraints. 2) The threshold assigned to all constraints by the threshold function $\theta$ is one, i.e. all constraints are hard and should be satisfied in all possible (non-zero probability) paths.
Example 2 (Production Planning, continued)
Assume that in Example 1 the demand and supply in quarter are represented respectively by random variable and decision variable , both with domain . The demand depends on the market sentiment (represented by random variable ), which itself depends on the market sentiment in the previous quarter. The goal is to minimize the expected number of unsold books in the last quarter, while disallowing shortages.
Assuming quarters, the dependencies between random variables can be represented by the Bayesian network of Figure 1. The objective function, constraints, and domains of the corresponding FSCP are as follows:
The structure of an FSCP problem can be summarized in a graphical representation called the factor graph. We will later use the factor graph to identify identical subproblems during search.
Definition 1 (factor graph)
The factor graph is a bipartite graph which represents the factorization of a function with several variables. An FSCP factor graph can be represented by a graph $G = (X \cup F, E)$ where each node $x \in X$ corresponds to a variable, and each node $f \in F$ corresponds to a factor (that is, a constraint or conditional probability table). The nodes $x$ and $f$ are connected to each other if and only if the variable corresponding to $x$ appears in the scope of the factor corresponding to $f$.
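The definition can be made concrete with a short sketch that builds the bipartite graph from a list of factor scopes. The factor and variable names below describe a hypothetical two-quarter production model (market sentiment m, demand d, supply x); they are assumptions for illustration and not the model of Example 2.

```python
def build_factor_graph(factors):
    """Build a factor graph from {factor_name: tuple_of_variable_names}.

    Returns (variable_nodes, factor_nodes, edges), where each edge is a
    (variable, factor) pair meaning the variable is in the factor's scope.
    """
    factor_nodes = set(factors)
    variable_nodes = {v for scope in factors.values() for v in scope}
    edges = {(v, f) for f, scope in factors.items() for v in scope}
    return variable_nodes, factor_nodes, edges


# Hypothetical factors: four probability tables and two constraints.
factors = {
    "P(m1)": ("m1",),
    "P(m2|m1)": ("m1", "m2"),
    "P(d1|m1)": ("m1", "d1"),
    "P(d2|m2)": ("m2", "d2"),
    "no_shortage_1": ("x1", "d1"),
    "no_shortage_2": ("x1", "d1", "x2", "d2"),
}
variable_nodes, factor_nodes, edges = build_factor_graph(factors)
```

Edges only ever join a variable node to a factor node, which is what makes the graph bipartite.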
3.2 Solving FSCP using And-Or search
The expression in Equation 1 can be represented by a graphical structure called the And-Or search tree. Solving an SCP problem, i.e. evaluating this expression, is possible by traversing this tree. An And-Or search tree has two types of internal nodes: 1) And nodes, which correspond to random variables and the sum operator, and 2) Or nodes, which correspond to decision variables and the max operator. An edge represents the assignment of a value to the variable that corresponds to the source node. A path from the root to a leaf represents an assignment to every variable in $V \cup S$, the decision and random variables, and the order of variables on each path follows the ordering $L$.
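Given an assignment on a path, the value of the leaf node is defined as the probability of that assignment multiplied by its utility. The value of an internal node $n$ can be computed recursively: if $n$ corresponds to a random variable, its value is the sum of the values of its children. Otherwise, it is the maximum of the values of its children. The optimal policy can be extracted by examining the trace of a bottom-up traversal of the tree.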
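The recursive evaluation can be sketched in a few lines of Python. The tree encoding is an assumption of this example: a leaf is a number (path probability times utility), and an internal node is a pair of a kind, "and" or "or", and a list of children.

```python
def evaluate(node):
    """Value of an And-Or search tree node.

    Leaves are numbers. Internal nodes are ("and", children) for random
    variables (sum) or ("or", children) for decision variables (max).
    """
    if isinstance(node, (int, float)):
        return node
    kind, children = node
    values = [evaluate(child) for child in children]
    return sum(values) if kind == "and" else max(values)


# Toy problem: one decision with two options, then one random variable
# with two equally likely outcomes. Leaves hold probability * utility.
tree = ("or", [
    ("and", [0.5 * 10, 0.5 * 0]),   # option A: utility 10 or 0
    ("and", [0.5 * 4, 0.5 * 4]),    # option B: utility 4 in both outcomes
])
```

Here option A has expected utility 5 and option B has expected utility 4, so the Or node at the root takes the value 5.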
Instead of storing the value at the leaves, we can take advantage of the factorization of probabilities and store these values on edges of the tree. Recall that an edge represents the assignment of a value to a variable. This assignment might reduce some factors of the distribution to a value. It might also reduce the domain of the utility variable to a value. We define the weight of an edge as the product of these values. Figure 4 (left) shows the And-Or search tree of Example 2.
When constraints are present, only subtrees that satisfy the constraints are included in this evaluation procedure. In such cases we can use constraint propagation to explore the search space more efficiently. Since the hard constraints should be satisfied in all possible scenarios, two modifications should be made to the standard constraint programming machinery: First, failure of any child of an And node immediately fails the node itself. Second, a reduction in the domain of a random variable caused by propagation is considered failure, as it implies that there is a possible scenario in which a constraint is violated.
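The failure rule for And nodes can be illustrated with a small evaluator in which `None` marks a leaf that violates a hard constraint. This is only a sketch of the failure semantics: it does not perform constraint propagation, and the rule that an Or node fails only when all of its children fail is an assumption of the example. The node encoding (a number, `None`, or a pair of kind and children) is likewise hypothetical.

```python
def evaluate_with_failure(node):
    """Evaluate an And-Or tree whose infeasible leaves are marked None."""
    if node is None or isinstance(node, (int, float)):
        return node
    kind, children = node
    if kind == "and":
        total = 0.0
        for child in children:
            value = evaluate_with_failure(child)
            if value is None:
                return None   # one infeasible scenario fails the And node
            total += value
        return total
    # Or node: keep the best feasible decision, fail if none is feasible.
    feasible = [value for value in map(evaluate_with_failure, children)
                if value is not None]
    return max(feasible) if feasible else None


# Option A is better in expectation but violates a constraint in one
# scenario, so only option B remains.
tree = ("or", [("and", [5.0, None]), ("and", [2.0, 2.0])])
```

The And branch returns as soon as a failing child is found, which mirrors the statement that a failing child immediately fails the node.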
The procedure described above uses constraint reasoning to prune the search space. Another possible improvement is to establish bounds that will guarantee that a subtree cannot lead to a solution better than what is already obtained. The And-Or Branch-and-Bound method uses such bounds to further prune the search space [babaki2017stochastic].
4 Compiling FSCP to Decision Diagram
Processing a factorized model can sometimes result in solving identical subproblems repeatedly. Some search-based algorithms for processing graphical models avoid these redundant computations by identifying identical subtrees in the search tree and merging them, hence obtaining a compact equivalent graph [MateescuDM08].
The factorized nature of FSCP problems suggests the possibility of applying a similar approach to these problems. This will turn the search tree into a graph which we call an And-Or Decision Diagram (AODD). However, merging identical subtrees repeatedly is not a practical method for compiling FSCPs, as it requires construction of the And-Or search tree. In this section, we describe a method for generating AODDs during the search, without the need to materialize the full search tree.
We traverse the And-Or search tree in a depth-first manner. However, before expanding each node, we first check whether the subtree rooted at this node is identical to another subtree visited earlier during the search. If this is the case, instead of expanding the node we connect its parent to the root of the existing subtree.
The described procedure depends on a method for testing the equivalence of subproblems without exploring them. Each subtree is uniquely identified by the assignment to the variables preceding its root node in the tree. However, it can be the case that the subproblem only depends on a subset of those variables. Following the terminology used in the probabilistic reasoning community, we call this subset the context of the subproblem:
Definition 2 (context)
For every internal node in the And-Or search tree, the path from the root to that node defines a (partial) assignment. We call a factor (i.e. constraint or probabilistic factor) active if it has some unassigned variable in its scope. The context of a node is the set of assignments to variables on its path which are in the scope of some active factor.
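A direct reading of this definition can be sketched as follows. The function signature, the variable ordering, and the factor scopes are assumptions made for illustration: the scopes describe a hypothetical two-quarter model in which the market sentiment forms a chain and each constraint only links the supply and demand of a single quarter.

```python
def context(order, position, factor_scopes, assignment):
    """Context of the node that branches on order[position].

    order:         variables in search order.
    factor_scopes: one tuple of variable names per factor.
    assignment:    values of the already assigned variables order[:position].
    """
    assigned = set(order[:position])
    relevant = set()
    for scope in factor_scopes:
        if any(v not in assigned for v in scope):      # factor is active
            relevant.update(v for v in scope if v in assigned)
    return tuple((v, assignment[v]) for v in order[:position] if v in relevant)


order = ["x1", "m1", "d1", "x2", "m2", "d2"]
scopes = [("m1",), ("m1", "d1"), ("x1", "d1"),
          ("m1", "m2"), ("m2", "d2"), ("x2", "d2")]
first = context(order, 3, scopes, {"x1": 103, "m1": "good", "d1": 101})
second = context(order, 3, scopes, {"x1": 105, "m1": "good", "d1": 104})
```

Both partial assignments give the context `(("m1", "good"),)` for the node of `x2`, so under these assumed scopes the two subproblems are identical even though the paths leading to them differ.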
In Example 2, variable is assigned after . Figure 1 shows that the context of this variable is . As one can observe in Figure 4, the subtrees for assignments and are identical and can be merged. A similar case holds for subtrees of assignments and .
Algorithm 1 summarizes the procedure for compiling an FSCP into an AODD. Before solving each subproblem, the cache key is generated from the context of the subproblem root node (Line 6). The cache is then inspected (Line 7). If an identical subproblem is found, the node is merged with the existing subgraph. Otherwise, the search proceeds. The value of each node is calculated based on the values of its children (Lines 19-25), and the node is stored in the cache before backtracking (Line 29).
Figure 4 (right) shows the And-Or decision diagram of Example 2 obtained using the described method. It can be observed that the identical subproblems mentioned earlier are now merged. Once the AODD is generated, the optimal policy can be retrieved in the same way that it is obtained from an And-Or search tree.
The proposed method makes it possible to compile an FSCP to AODD on the fly, by introducing a small modification to the And-Or search procedure. This can lead to significant performance gains, as demonstrated in the next section.
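The idea of caching subproblems by context during a depth-first And-Or search can be sketched as below. This is not Algorithm 1: it is a simplified illustration that caches node values rather than diagram nodes, performs no propagation or bounding, and assumes that all weights (probabilities and utility factors) are non-negative and that every listed outcome of a random variable is possible. The function names and the toy investment model are hypothetical.

```python
def compile_aodd(order, weights, constraints, use_cache=True):
    """Depth-first And-Or search that reuses subproblems with equal context.

    order:       list of (name, kind, domain); kind is "decision" or "random".
    weights:     list of (scope, fn) with non-negative values (assumption).
    constraints: list of (scope, fn) returning True when satisfied.
    Returns (value, expanded_nodes); value is None if no policy is feasible.
    """
    names = [name for name, _, _ in order]
    pos = {name: i for i, name in enumerate(names)}

    def last(scope):
        # A factor is evaluated on the edge assigning its last variable.
        return max(pos[v] for v in scope)

    scopes = [s for s, _ in weights] + [s for s, _ in constraints]
    cache, stats = {}, {"expanded": 0}

    def context_key(i, assignment):
        # Assigned variables that share an active factor with unassigned ones.
        relevant = {v for s in scopes if last(s) >= i for v in s if pos[v] < i}
        return (i, tuple((v, assignment[v]) for v in names[:i] if v in relevant))

    def solve(i, assignment):
        if i == len(order):
            return 1.0                        # empty product at a leaf
        key = context_key(i, assignment)
        if use_cache and key in cache:        # identical subproblem: reuse
            return cache[key]
        stats["expanded"] += 1
        name, kind, domain = order[i]
        value = 0.0 if kind == "random" else None
        for choice in domain:
            assignment[name] = choice
            ok = all(fn(*(assignment[v] for v in s))
                     for s, fn in constraints if last(s) == i)
            child = solve(i + 1, assignment) if ok else None
            if child is not None:
                for s, fn in weights:
                    if last(s) == i:
                        child *= fn(*(assignment[v] for v in s))
            if kind == "random":
                if child is None:             # a failing scenario fails And
                    value = None
                    break
                value += child
            elif child is not None:
                value = child if value is None else max(value, child)
        del assignment[name]
        cache[key] = value
        return value

    return solve(0, {}), stats["expanded"]


def toy_investment(stages):
    """Hypothetical model: choose safe or risky each stage, never risky twice
    in a row; the market move follows a chain; utility is the product of
    per-stage growth factors."""
    growth = {("safe", "up"): 1.0, ("safe", "down"): 1.0,
              ("risky", "up"): 1.5, ("risky", "down"): 0.6}
    stay = 0.7   # made-up probability that the market repeats its last move
    order, weights, constraints = [], [], []
    for t in range(stages):
        a, r = f"a{t}", f"r{t}"
        order += [(a, "decision", ["safe", "risky"]),
                  (r, "random", ["up", "down"])]
        weights.append(((a, r), lambda x, y: growth[(x, y)]))
        if t == 0:
            weights.append(((r,), lambda y: 0.5))
        else:
            weights.append(((f"r{t-1}", r),
                            lambda prev, y: stay if prev == y else 1 - stay))
            constraints.append(((f"a{t-1}", a),
                                lambda p, x: not (p == "risky" and x == "risky")))
    return order, weights, constraints
```

Calling `compile_aodd(*toy_investment(4))` and `compile_aodd(*toy_investment(4), use_cache=False)` should return the same value, while the cached run expands fewer nodes because the context of each node in this toy model contains only the previous decision and the previous market move.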
To evaluate our approach, in this section we investigate the following research questions:
Q1: How effective is our method in identifying identical subproblems and compressing the search tree?
Q2: What is the effect of compilation on the performance of And-Or search compared with the existing methods?
Q3: When does compilation not help?
We address the above research questions using the knapsack and the investment problems.
Knapsack problem (based on a problem from [hnich2011survey]): Consider a knapsack with a certain capacity. Assume that at each stage an item arrives and we need to choose whether to pick the item or leave it. The weight and value of each item are stochastic and are observed after making the decision. There are 5 different possibilities for items’ weight, and 3 possibilities for their values. The goal is to maximize the expected sum of values of the collected items subject to the hard constraint that the total weight of the items is less than the capacity of the knapsack. We implemented two variations of this problem: in the independent version (Knapsack-I), all variables are independent, and in the chain version (Knapsack-C), the weight and value at each stage depend on the corresponding variables at the previous stage.
Investment problem (based on a problem from [babaki2017stochastic]): Consider a company that has two options for investment at the start of each season, and only at the end of the season observes the stochastic return. There are 4 possibilities of return for each investment option. The first option has a higher return on average but the second option brings more tax relaxation at the end of the horizon. The goal is to maximize the expected returns by considering the tax relaxations. Similar to the previous problem, we have two variations of this problem (denoted by Investment-I and Investment-C). Note that for this problem, we have a hard constraint that the total return of the second option should be less than or equal to the total return of the first option.
We ran experiments on machines with an Intel i5-4590 processor (3.3GHz) and 8GB of RAM running Linux Ubuntu 16.04. We extended the constraint programming solver Mini-CP [minicp] with And-Or search, caching, and compilation functionality. The time-out used was 1800 seconds. The MIP solver is Gurobi-8.1 (www.gurobi.com).
To address Q1, we compare the performance of our approach on both problems with and without compilation. We measure the effects of compilation by varying the number of stages in both problems.
As shown in Figure 5, compilation leads to significant reductions for both problems. As the number of stages increases, so does the number of identical subproblems, which in turn results in exponential reductions.
To investigate the performance of our approach compared with the existing methods and address Q2, we compare our algorithm with the scenario-based conversion to MIP and the And-Or branch and bound approach (AOBB) [babaki2017stochastic]. The results presented in Figure 6 show that without compilation, we are not able to solve problems beyond 6 stages using either the MIP or the AOBB approach. However, our approach scales to 25 stages and solves these problems in less than 5 minutes.
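A rough count suggests why explicit scenario enumeration becomes difficult as the number of stages grows. The sketch below assumes that each knapsack stage contributes one weight variable with 5 outcomes and one value variable with 3 outcomes, as in the problem description, so that a scenario is one joint outcome of all of them. The function name is hypothetical, and the count is plain arithmetic rather than a measured quantity.

```python
def scenario_count(stages, weight_outcomes=5, value_outcomes=3):
    """Number of joint outcomes of all random variables over the horizon."""
    return (weight_outcomes * value_outcomes) ** stages


for stages in (1, 6, 25):
    print(stages, scenario_count(stages))
```

Under this assumption there are 15 scenarios for one stage and 11,390,625 for six stages, and the count keeps growing by a factor of 15 with every additional stage.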
It is important to note that the compilation is more effective when the number of identical subproblems is high. Hence, the structure of the model affects its performance. To address Q3, we consider the knapsack problem and change the Bayesian network by including a hidden variable per item as described in [babaki2017stochastic]. We refer to this model as Knapsack-H. The results in Table 1 show that compilation is less effective for this variant of the knapsack problem compared to the chain and independent versions. While solving 25-stage Knapsack-C takes only 9 seconds and 20-stage Knapsack-I takes 1 minute using AODD, Knapsack-H is not solved beyond 5 stages. When solving Knapsack-H, all hidden variables appear last in the ordering. Since all other variables depend on these hidden variables, there are no identical subproblems before reaching the hidden variables in the search tree. This leads to less reduction in Knapsack-H compared to the other two variants (see Figure 7). The AOBB approach takes advantage of bounding to solve the 6-stage problem, which suggests exploring bounding in AODD as a future direction.
6 Related work
Our method is closely related to the And-Or search trees for graphical models (for example see [MateescuDM08]). In those studies the And-Or nodes have a different meaning from ours, where an And node corresponds to problem decomposition, and an Or node represents branching. Most of these works assume only one type of variable (only decision or random variables). Mixed deterministic-probabilistic networks [MateescuD08] include both deterministic and probabilistic factors, but only include decision variables, and solve the probabilistic reasoning problems (e.g. MPE and MAP inference) subject to constraints. To the best of our knowledge, [Marinescu09] is the only work in this area which includes both decision and random variables. This work evaluates influence diagrams using And-Or search graphs and uses a SAT solver to avoid exploring the subproblems with zero probability. Our method generalizes this approach by incorporating hard constraints on decision variables and using global constraints and the propagation power of CP.
Factored SCPs bridge the gap between influence diagrams and stochastic constraint programming by imposing probabilistic and deterministic factors over decision and random variables [babaki2017stochastic]. And-Or search with branch and bound has been previously used to evaluate influence diagrams when no constraint propagation is involved [YuanWH10]. And-Or search trees have also been used to solve stochastic constraint programs with independent random variables [Walsh02]. Neither of these methods exploits identical subproblems during the search.
Compilation to decision diagrams is a well-known technique in AI with celebrated success in model counting [MuiseMBH12], probabilistic inference [ChoiKD13], probabilistic logic programming [FierensBRSGTJR15], and planning [SpeckGM18, HoeySHB99], among others. Recently, decision diagrams have received attention in combinatorial optimization, too. However, in most of these studies the construction of decision diagrams is problem-specific. A notable exception is the work of [UnaGSS19], which proposes methods for compilation of CP subproblems to decision diagrams. This study proposes sophisticated methods for identification of identical subproblems, which require interaction with the propagation algorithms.
Scenario-based approaches are approximate methods that solve SCP problems by sampling a subset of possible scenarios from the probability distribution [TarimMW06, HemmiT018]. Our work demonstrates that it is possible to take all scenarios into account without the need to explicitly enumerate them.
7 Conclusion & Future Work
We presented a method for compiling factored stochastic constraint programs into And-Or decision diagrams. Our experiments demonstrate the advantages of such a compilation, especially when there is a lot of redundancy in the search space.
Decision diagrams have been successfully used in combinatorial optimization for obtaining bounds [BergmanCHH16]. A direction for future work is to devise compilation methods that create relaxed And-Or decision diagrams, which can then be used for obtaining bounds during search. There exists some recent work on using more sophisticated techniques for identification of equivalent and dominant subproblems [UnaGSS19, ChuBS12]. This motivates future work on subproblem identification in And-Or search.
SCP problems usually include chance constraints, i.e. constraints that should be satisfied in at least a certain fraction of possible scenarios. Our current formalism only considers hard constraints. Generalizing this work to include chance constraints is another promising direction for future work.