In this paper, we consider the following optimization problem:
where is a closed, convex subset of and is a closed, possibly non-convex, subset of . This model is a formal way to “decompose” the feasible region into the “convex” constraints and the “non-convexities” of the problem at hand. The bulk of this paper will be concerned with non-convexity coming from integrality constraints, i.e., , where ; the special case will be referred to as a pure-integer lattice and the general case as a mixed-integer lattice ( gives us standard continuous convex optimization). However, some of the ideas put forward apply to other non-convexities like sparsity or complementarity constraints as well (see Theorem 2.7 below, where the only assumption on is closedness).
Cutting Planes and Branch-and-Bound.
Cutting planes were first successfully employed for solving combinatorial problems with special structure, such as the traveling salesman problem [2, 43, 44, 45, 46, 16, 29, 30, 28, 26, 27], the independent set problem [51, 53, 17, 60], and the knapsack problem [3, 61], among others. For general mixed-integer problems, cutting plane ideas were introduced by Gomory [41, 40], but did not make any practical impact until the mid-1990s. Since then, cutting planes have been cited as the most significant component of modern solvers, where they are combined with a systematic enumeration scheme called Branch-and-Bound. Both of these ideas are based on the following notion.
Given a closed subset , a disjunction covering is a finite union of closed convex sets such that . Such a union is also called a valid disjunction.
Observe that the feasible region of (1.1) is always contained in any valid disjunction . This leads to a fundamental algorithmic idea: one iteratively refines the initial convex “relaxation” by intersecting it with valid disjunctions. More formally, a cutting plane for derived from a disjunction is any halfspace such that . The point is that the feasible region .
Thus, the convex region is refined or updated to a tighter convex set . The hope is that iterating this process with clever choices of disjunctions and cutting planes derived from them will converge to the convex hull of , where the problem can be solved with standard convex optimization tools. Since the objective is linear, solving over the convex hull suffices.
Figure 1: The convex set is a polytope. The dashed line shows the bounding hyperplane of the cutting plane; the dark triangle in both cases is the part of the convex set that is “shaved off”.
Split disjunctions for the mixed-integer lattice were introduced by Cook, Kannan and Schrijver. These are disjunctions that are a union of two rational halfspaces that cover the mixed-integer lattice. Note that this implies the bounding hyperplanes of the two halfspaces have to be parallel. See Figure 1, where the disjunctions are colored in light gray. The right figure illustrates an example of Chvátal-Gomory cuts, which are cutting planes derived from split disjunctions where one side of the disjunction does not intersect the convex set. Most cutting planes used in combinatorial optimization are Chvátal-Gomory cuts. For example, in the Maximum Matching problem the so-called odd set or blossom inequalities are an example of Chvátal-Gomory cuts [35, 36, 15], where the convex set is the standard polyhedral formulation for maximum matching with 0/1 variables for the edges of the graph.
We will now formally define algorithms based on cutting planes and branch-and-bound, assuming access to a continuous, convex optimization solver. Below, when we use the word “solve” to process a continuous convex optimization problem, we assume that the output of such a solver will either (i) report infeasibility, or (ii) report that a maximizer does not exist either because the problem is unbounded or because the supremum is not attained, or (iii) report an optimal solution to the convex optimization problem.
Cutting plane (CP) algorithm based on a family of disjunctions:
Solve . If , report “INFEASIBLE” and STOP. If no maximizer exists, report “EXCEPTION” and STOP.
If , report as OPTIMAL and STOP. Else, choose a disjunction and a cutting plane for derived from such that . Set . If no cutting plane can be derived, report “NO CUTTING PLANE” and STOP.
The outputs “EXCEPTION” and “NO CUTTING PLANE” correspond to situations in which the CP algorithm stops without finding an optimal solution to the given problem. Note however that if, e.g., is compact, then the output “EXCEPTION” will never occur. Also, if is the mixed-integer lattice and is the family of split disjunctions, the output “NO CUTTING PLANE” will never occur if is an extreme point of . We call the sequence of operations in Definition 1.3 an “algorithm”, even though it may not terminate in finitely many iterations.
In the framework above, at every iteration there are usually many possibilities for the choice of the disjunction from , and then many possibilities for the choice of the cutting plane from the chosen disjunction. Specific strategies for these two choices give a particular instance of a cutting plane algorithm based on the family of disjunctions .
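The loop of the CP algorithm can be illustrated on a one-dimensional toy instance. The following is a minimal sketch under stated assumptions, not a general implementation: the convex relaxation is taken to be an interval [lo, hi], the non-convexity is the integer lattice, the objective is to maximize x, and the disjunctions are the splits {x <= k} or {x >= k + 1}. The function name `cutting_plane_1d` and the iteration cap `max_iters` are hypothetical and introduced only for this illustration.

```python
import math
from fractions import Fraction


def cutting_plane_1d(lo, hi, max_iters=100):
    """Toy CP loop: maximize x over [lo, hi] intersected with the integers."""
    lo, hi = Fraction(lo), Fraction(hi)
    for iteration in range(max_iters):
        # Step 1: "solve" the relaxation; an empty interval is infeasible.
        if lo > hi:
            return "INFEASIBLE", None, iteration
        x_star = hi  # maximizer of x over the current interval
        # Step 2: stop if the relaxation optimum already lies in the lattice.
        if x_star.denominator == 1:
            return "OPTIMAL", int(x_star), iteration
        # Split disjunction {x <= k} or {x >= k + 1} with k = floor(x_star).
        k = math.floor(x_star)
        # The side {x >= k + 1} misses [lo, hi], so the halfspace {x <= k}
        # contains the interval intersected with the disjunction and cuts
        # off x_star. Intersect the relaxation with this cutting plane.
        hi = Fraction(k)
    # Assumption of this sketch: give up after max_iters rounds.
    return "ITERATION LIMIT", None, max_iters


print(cutting_plane_1d(0, Fraction(5, 2)))
print(cutting_plane_1d(Fraction(11, 5), Fraction(14, 5)))
```

In this toy setting each cut is of Chvátal-Gomory type, since one side of the split does not intersect the interval. The second call shows the case where a single cut empties the relaxation.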
Disjunctions can also be used to simply search, as opposed to convexification by cutting planes. This leads to the idea of branching with pruning by bounds.
Branch-and-bound (BB) algorithm based on a family of disjunctions:
Initialize a list , .
Choose and update . Solve . If , continue the loop. If no maximizer exists, report “EXCEPTION” and STOP.
If , then check if . If yes, update ; if no, choose a disjunction such that and update . If no such disjunction exists, report “NO DISJUNCTION FOUND” and STOP.
If report “INFEASIBLE”. Else, return corresponding to current as OPTIMAL and STOP.
The idea is to maintain a list of convex subsets of the initial convex set which are guaranteed to contain the optimal point. stores the objective value of the best feasible solution found so far, which is a lower bound for the optimal value. In the worst case, one will go through each integer point (or connected component of ) in .
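The list-based scheme above can be sketched on a small 0/1 knapsack instance with variable disjunctions. This is an illustrative sketch only: it assumes positive integer weights and values, so that the LP relaxation can be solved exactly by the greedy fractional rule, and it processes the list in depth-first order. The names `solve_relaxation` and `branch_and_bound`, and the representation of a node by a dictionary of fixed variables, are hypothetical choices and not taken from the paper.

```python
from fractions import Fraction


def solve_relaxation(values, weights, cap, fixed):
    """Greedy LP relaxation of 0/1 knapsack with some variables fixed.

    Returns (value, x), or None if the fixings already exceed the capacity.
    """
    n = len(values)
    x = [Fraction(0)] * n
    room = Fraction(cap)
    for i, v in fixed.items():
        x[i] = Fraction(v)
        room -= v * weights[i]
    if room < 0:
        return None
    # Fill the remaining capacity by decreasing value/weight ratio.
    free = sorted((i for i in range(n) if i not in fixed),
                  key=lambda i: Fraction(values[i], weights[i]), reverse=True)
    for i in free:
        take = min(Fraction(1), room / weights[i])
        x[i] = take
        room -= take * weights[i]
    return sum(v * xi for v, xi in zip(values, x)), x


def branch_and_bound(values, weights, cap):
    nodes = [dict()]                 # the list of open subproblems
    best_val, best_x = None, None    # lower bound and incumbent
    iterations = 0
    while nodes:
        fixed = nodes.pop()
        iterations += 1
        sol = solve_relaxation(values, weights, cap, fixed)
        if sol is None:
            continue                 # infeasible node
        val, x = sol
        if best_val is not None and val <= best_val:
            continue                 # pruned by bound
        fractional = [i for i, xi in enumerate(x) if xi.denominator != 1]
        if not fractional:
            best_val, best_x = val, x    # integral solution: update incumbent
        else:
            i = fractional[0]
            # Variable disjunction: x_i <= 0 or x_i >= 1.
            nodes.append({**fixed, i: 0})
            nodes.append({**fixed, i: 1})
    return best_val, best_x, iterations


print(branch_and_bound([60, 100, 120], [10, 20, 30], 50))
```

Other node-selection and branching rules fit the same loop; only the order in which the list is processed and the choice of the branching variable change.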
A branch-and-cut (BC) algorithm is a version of the algorithm defined in Definition 1.4, where there is an additional decision point in Step 2 (b), where one decides if one should add a cutting plane or branch as described in the step.
The literature on the complexity of cutting plane methods has often focused on the concepts of cutting plane proof and cutting plane rank, which are closely related to the efficiency of a cutting plane algorithm (Definition 1.3).
Let be a convex set, let model the non-convexity, and let be a family of valid disjunctions for (see Definition 1.1).
Suppose the inequality is valid for all points in . A cutting plane (CP) proof based on with respect to of length for this inequality is a sequence of halfspaces such that 1) there exists a sequence of disjunctions such that for each , is a cutting plane derived from the disjunction applied to (we make the standard notational convention for the trivial intersection), and 2) is valid for all points in .
The -closure is defined as the intersection of with all cutting planes that can be derived from all disjunctions in , which we denote by . We can iterate this operator times, which we will denote by . The -rank of a valid inequality for is the smallest such that this inequality is valid for (the rank is if the inequality is invalid for all , ).
The relation to the notion of a cutting plane algorithm (Definition 1.3) is simple: Suppose an instance of (1.1) has optimal value and a cutting plane algorithm ends with this value in iterations. Clearly, the cutting planes generated during the algorithm give a cutting plane proof of length for the inequality . Thus, establishing lower bounds on the length of cutting plane proofs is a way to derive lower bounds on the efficiency of a cutting plane algorithm. Also, proving a lower bound on the rank of an inequality gives a lower bound on the cutting plane proof length. See [31, 32, 33, 20, 19, 18, 24, 11, 37, 57] for a sample of upper and lower bound results on rank and proof length.
One can define analogous concepts for branch-and-bound. In fact, we will now define a generalization of the notion of cutting plane proof to branch-and-cut procedures.
Let be a convex set, let model the non-convexity, and let be a family of valid disjunctions for (see Definition 1.1). Suppose the inequality is valid for all points in A branch-and-cut (BC) proof based on with respect to for this inequality is a rooted tree such that 1) every node represents a convex subset of , 2) the root node represents itself, 3) every non-leaf node is labeled as a cutting node or a branching node, 4) any cutting node representing has exactly one child and there exists a disjunction and a cutting plane for derived from such that the child represents , 5) for any branching node representing , there exists a disjunction in given by such that the children of this node represent the sets , , and 6) is valid for all the subsets represented by the leaves of the tree. The size of the BC proof is the total number of nodes in the tree minus 1 (we exclude the root representing ).
Note that if all nodes in a branch-and-cut proof are cutting nodes, then we have a cutting plane proof and notions of length and size coincide. If all nodes are branching nodes, then we say we have a branch-and-bound (BB) proof.
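The tree structure in the definition of a BC proof can be mirrored by a small data structure. The sketch below tracks only the combinatorial shape of the tree (node labels and children), not the convex sets, disjunctions, or halfspaces attached to the nodes. The class `ProofNode` and the helper names are hypothetical and used only for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ProofNode:
    kind: str                      # "cut", "branch", or "leaf"
    children: List["ProofNode"] = field(default_factory=list)


def count_nodes(node: ProofNode) -> int:
    return 1 + sum(count_nodes(child) for child in node.children)


def proof_size(root: ProofNode) -> int:
    """Size of the proof: total number of nodes minus 1 (root excluded)."""
    return count_nodes(root) - 1


def is_well_formed(node: ProofNode) -> bool:
    """Leaves have no children; cutting nodes have exactly one child."""
    if node.kind == "leaf":
        return not node.children
    if node.kind == "cut":
        return len(node.children) == 1 and is_well_formed(node.children[0])
    if node.kind == "branch":
        return bool(node.children) and all(is_well_formed(c) for c in node.children)
    return False


def is_cp_proof(node: ProofNode) -> bool:
    """True if the tree contains no branching node, i.e. it is a chain of cuts."""
    return node.kind in ("cut", "leaf") and all(is_cp_proof(c) for c in node.children)


# One cut at the root, followed by a branching on a two-term disjunction.
tree = ProofNode("cut", [ProofNode("branch", [ProofNode("leaf"), ProofNode("leaf")])])
# A chain of two cuts: a cutting plane proof of length 2.
chain = ProofNode("cut", [ProofNode("cut", [ProofNode("leaf")])])
print(proof_size(tree), is_cp_proof(tree), proof_size(chain), is_cp_proof(chain))
```

For the chain, the size equals the number of cutting planes, which matches the remark that length and size coincide when all non-leaf nodes are cutting nodes.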
As with the case of cutting planes, a branch-and-bound/branch-and-cut algorithm also provides a branch-and-bound/branch-and-cut proof of optimality with size equal to the number of iterations of the algorithm. Also, note that by just changing the stopping criterion in the algorithms, one can use them to derive upper bounds on the objective function value, i.e., proving validity of , instead of stopping at optimality or infeasibility. Consequently, CP/BB/BC proofs can be seen as a way to prove unsatisfiability in logic. This led to a rich literature at the intersection of optimization and proof complexity [12, 56, 47, 14, 23, 38, 39, 21, 54, 55, 49, 42], to cite a few.
The difference between a CP/BB/BC algorithm and a CP/BB/BC proof of optimality (or validity of an upper bound) is that the algorithm is restricted to choose a disjunction (and a cutting plane based on this disjunction) that “shaves off” the current optimal solution. There is no such restriction on a proof. In fact, there are two-dimensional examples where every CP algorithm based on a standard family of disjunctions converges to a value that is 12.5% larger than the optimal value even after infinitely many iterations, while there are finite length CP proofs based on the same family that can certify the validity of any upper bound strictly bigger than the optimal value.
2 Main Results
We summarize our main results in Table 1 and the discussion below.
| |0/1 convex sets, variable disjunctions|general convex sets, variable disjunctions|0/1 convex sets, split disjunctions|general convex sets, split disjunctions|
|Variable dim.|CP ≤ BB (Thm. 2.1); CP vs. BB (Thm. 2.2)|BB vs. CP (Thm. 2.4); CP vs. BB (Thm. 2.2)|Open question|Open question|
|Fixed dim.|BB and CP are O(1) (Thm. 2.5)|BB ≤ poly(CP) (Thm. 2.7); BB vs. CP (Thm. 2.9)|BB and CP are O(1) (Thm. 2.5)|BB ≤ poly(CP) (Thm. 2.7); BB vs. CP (Remark 2.10)|
0/1 convex sets and variable disjunctions.
Consider the family of pure-integer instances where and is a convex set contained in the hypercube. This captures most combinatorial optimization problems, for instance. We focus on the most commonly used disjunctions in practice: variable disjunctions, i.e., every disjunction is a union of two halfspaces of the form
Let be any closed, convex set. Let and be the family of variable disjunctions. Let be a valid inequality for (possibly , if ). If there exists a branch-and-cut proof of size based on that certifies , then for any , there exists a cutting plane proof based on of size at most certifying .
If is a polytope, then the statement is also true with .
Thus, in this setting, cutting planes are always at least as good as branch-and-bound (up to slack in the general convex case and exactly for the polyhedral case). The question arises if CP can be provably much better. We show that this is indeed the case with an instance of maximum independent set problems. Let denote the standard formulation of the independent set problem on a graph : ; so the independent sets are represented by . The objective is to maximize .
Let be the graph given by disjoint copies of (cliques of size ). Then for with objective , and representing the family of all variable disjunctions, there is a cutting plane algorithm which solves the maximum independent set problem in iterations, but any branch-and-bound proof certifying an upper bound of on the optimal value has size at least for all .
General polytopes and variable disjunctions.
If one allows polytopes that are not necessarily subsets of the -hypercube, but sticks with and variable disjunctions, then the advantage of CP over BB discussed above disappears. We created examples of polytopes in every dimension such that there is a BB algorithm that solves the problem in iterations, but any CP proof will necessarily have infinite length (recall that only cutting planes based on variable disjunctions are allowed).
Let be defined as the convex hull of . For every , let be the Cartesian product of and the hypercube , i.e., . Consider instances of (1.1) with the objective , and , and let be the family of variable disjunctions. The optimal value is 0 and there is a branch-and-bound algorithm based on which certifies in steps. However, any cutting plane proof of has infinite length.
Fixing the dimension.
The following relatively straightforward result shows that in fixed dimensions, one must go beyond the setting for anything interesting to happen.
Let be compact, convex and let be valid for . There exist BB and CP proofs based on variable disjunctions that prove the validity of this inequality and whose size is bounded by a function only of the dimension. Thus, if we consider instances in a fixed dimension, there are O(1) size BB and CP proofs.
It is actually easy to present a BB algorithm that takes iterations (this is what we do in the proof of Theorem 2.5); however, it is not clear if the CP proof presented in the proof can be converted into a CP algorithm that takes iterations. This remains an open question.
The independent set example in Theorem 2.2 shows that if the dimension varies, CP can be exponentially better than BB. One can ask if a family of polyhedral instances can be constructed in some fixed dimension such that CP is better than BB by a factor that is exponential in the size of the input data (i.e., bit complexity of the numerical entries of the constraints). Interestingly, in fixed dimensions, the situation is reversed; roughly speaking BB is always as good as CP (at most polynomial blow-up), if the family of disjunctions has “bounded complexity”. Moreover, we can establish such a result for quite general non-convexities and convex relaxations.
The complexity of a disjunction is defined as follows: if any of the ’s are non-polyhedral, then the complexity of is . Else, the complexity of is the sum of the number of facets of each . The complexity of a family of disjunctions is the maximum complexity of any disjunction in (e.g., the complexity of the family of split disjunctions is ).
Fix . Let be any closed set modeling the non-convexity and let be a family of valid disjunctions for with complexity bounded by . For any convex set and any inequality valid for , if there is a cutting plane proof of length for its validity, then there is a branch-and-bound algorithm which proves the validity and takes at most iterations.
A notion that becomes important in some branch-and-bound type algorithms is the so-called flatness constant of a convex set. The flatness theorem [7, 6, 58] states that there exists a function , such that for every , if is a convex set with , then there exists such that . The number is usually referred to as the flatness constant (where the word “constant” is justified by the fact that this function has its main use when the dimension is fixed). Combining the flatness theorem with the proof techniques that go behind Theorem 2.7, we can also prove the following related theorem for the pure-integer lattice.
Fix . Let and let be the family of split disjunctions. For any convex set and any inequality valid for with , there is a branch-and-bound algorithm which proves the validity of and takes at most iterations, where is the flatness constant.
Theorems 2.7 and 2.8 provide some mathematical reasons for why the best complexity guarantees in fixed dimensions are for Lenstra-style algorithms, which can be interpreted as branch-and-bound algorithms. To complement this, the instances from Theorem 2.4 can be interpreted to be fixed dimension examples showing that BB can be infinitely better than CP. Nevertheless, in that instance, there is an size CP proof for -optimality, i.e., to prove for any . So the CP proof for approximate optimality is polynomial size in terms of the approximation parameter. We present another family of instances in fixed dimensions in Theorem 2.9 below where there are CP algorithms that finish in finite time but any such algorithm will take exponentially many steps in the data size (we state the result for exact CP proofs, but the CP proofs remain exponential in size even when allowing for -approximations).
Given a rational , let be the convex hull of . Consider the instances of (1.1) with the objective , and . The optimal value is and there is a branch-and-bound algorithm with variable disjunctions which certifies in steps. However, any cutting plane proof with variable disjunctions of has length. The instances can be created in any dimension by using the construction from Theorem 2.4 where one takes a Cartesian product with a hypercube.
For the case of general polytopes and general split disjunctions in fixed dimension, examples are known in which BB solves the problem in iterations while any CP proof of optimality takes a number of iterations that is at least polynomial in the size of the input data. An instance of this type can be found in [22, Lemma 19]: if is the convex hull of the points , , and , where , and , then the rank of the inequality with respect to general split disjunctions is .
The second row of Table 1 summarizes the results in fixed dimension.
3 Proofs of main results
We first recall the following well-known result from LP sensitivity analysis.
Let be a polyhedron given as the intersection of halfspaces, and let be a valid inequality for . Then for any , there exists such that is valid for .
See, for example, equation (22) in [59, Chapter 10]. ∎
We will need the following version of the above result.
Let be a compact, convex set. Let , and be such that is valid for . For any , there exists such that is valid for .
Let . Since is valid for , so is . Therefore, if we define , we have that . is a compact, convex set as it is a closed, convex subset of a compact set. Thus, there exists a (strongly) separating hyperplane given by such that is valid for and for all (see, for example, Problem 3 in Section III.1.3 in ). By appealing to Lemma 3.1, there exists such that is valid for , using the notation of Lemma 3.1. In particular, , i.e., . Therefore, is valid for , which is what we wish to establish. ∎
Let be a compact, convex set. Let be a valid inequality for that defines the face . Let be a valid inequality for . Then, for any , there exists such that is valid for where and .
Define the set . If , then is valid for and therefore works. Otherwise, note that is compact and any is not in and therefore . Define
which is a well-defined real number because we are maximizing a continuous function (the function has strictly positive denominator for all by the argument above) over a compact set. For any , either or . If , by definition of above, . If , then , where the first inequality follows from the fact that is a valid inequality for and the second inequality follows from the fact that . ∎
If is a polytope, then the above theorem holds with as well.
In this case, can be defined by maximizing over all vertices of not in ( should be defined as if the maximum is negative).∎
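For a polytope given by its vertices, the multiplier described above can be computed directly. The sketch below is a hedged illustration in notation of our own choosing, since the symbols are not shown here: the face is assumed to be defined by the valid inequality <a, x> <= b, the inequality <c, x> <= gamma is assumed valid on that face, and the rotated inequality is <c + lam*a, x> <= gamma + lam*b. The function name `rotate_inequality` is hypothetical.

```python
from fractions import Fraction


def dot(u, v):
    return sum(Fraction(ui) * vi for ui, vi in zip(u, v))


def rotate_inequality(vertices, a, b, c, gamma):
    """Return (lam, c_rot, gamma_rot) with c_rot = c + lam*a, gamma_rot = gamma + lam*b.

    lam is the maximum of (<c, v> - gamma) / (b - <a, v>) over the vertices v
    that are not on the face {<a, x> = b}, and 0 if that maximum is negative.
    """
    off_face = [v for v in vertices if dot(a, v) < b]
    ratios = [(dot(c, v) - gamma) / (b - dot(a, v)) for v in off_face]
    lam = max([Fraction(0)] + ratios)
    c_rot = tuple(ci + lam * ai for ci, ai in zip(c, a))
    gamma_rot = gamma + lam * b
    return lam, c_rot, gamma_rot


# Quadrilateral with face {x1 = 0}, written as -x1 <= 0.
vertices = [(0, 0), (0, 1), (1, 2), (1, 0)]
a, b = (-1, 0), 0
# x2 <= 1 holds on the face but is violated by the vertex (1, 2).
c, gamma = (0, 1), 1

lam, c_rot, gamma_rot = rotate_inequality(vertices, a, b, c, gamma)
print(lam, c_rot, gamma_rot)
# The rotated inequality holds at every vertex, hence on the whole polytope.
print(all(dot(c_rot, v) <= gamma_rot for v in vertices))
```

Here the rotated inequality is x2 - x1 <= 1, which agrees with x2 <= 1 on the face x1 = 0 and is tight at the vertex (1, 2).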
Let , and . An inequality is said to be an approximate rotation of with respect to if there exists such that and .
If is an approximate rotation of with respect to , then .
Let be a compact, convex set. Let be a valid inequality for that defines the face . Let be any non-convexity and be some family of valid disjunctions. Suppose is a valid inequality for and is a cutting plane proof of based on (with respect to ). Then, for any , there exists a sequence of inequalities and an inequality such that all of the following hold.
For each , is an -approximate rotation of with respect to .
is an -approximate rotation of with respect to .
is a cutting plane proof of based on , with respect to .
We prove the theorem by induction on the length of the CP proof. If , then is a valid inequality for itself and the result follows from Lemma 3.3. Consider . Fix an arbitrary as the “error parameter”. Apply Lemma 3.2 with , , , and to get . Set .
Let be the disjunction used to derive for . For each , let be defined by applying Lemma 3.2 with , , , , , and . Define .
By the induction hypothesis applied to the CP proof , viewed as a proof of , with , there exists a sequence of inequalities such that is an -approximate rotation of for each and is a CP proof of based on , with respect to (if , then we consider the trivial CP proof of length for the trivial inequality .)
By Remark 3.6, . Since is valid for , it is valid for for every . By Lemma 3.2 and the choice of , is valid for for every . Since is a face of induced by , we have that is a face of induced by the same inequality. By Lemma 3.3, there exists an -approximate rotation of valid for for every . In other words, there exist , such that is valid for . Set , and . Thus, is valid for for all , and therefore for . Since by choice, is an -approximate rotation of , and thus condition 1. is satisfied for .
From the hypothesis, is valid for . The definition of implies that is valid for . Also, by choice and so , where the second containment follows from Remark 3.6. Since is a face of induced by , by Lemma 3.3 with , there exists such that is valid for , with and . Thus, conditions 2. and 3. of the theorem are satisfied. ∎
If is a polytope, the statement of Theorem 3.7 holds with .
The above proof is inspired by ideas in Dash, where the polyhedral case is analyzed. In the general convex case, faces may not be exposed and the above proof deals with this issue.
3.2 Proofs of the main results
Proof of Theorem 2.1.
We prove this by induction on the number of branching nodes in the BC proof. If , then we have a CP proof and we are done. Now consider a BC proof with branching nodes. Note that all nodes in the tree represent subsets of obtained by intersecting with additional halfspaces (either cutting planes or disjunction inequalities of the form or ), i.e., each node in the tree is a compact, convex subset of . Consider any maximal depth branching node, that is, all its descendants are cutting nodes or leaves. Suppose this branching node represents a compact, convex subset and uses the disjunction . Let the two children of be and , which are both faces of .
Let . Since corresponds to a maximal depth branching node, the BC proof under consideration yields a CP proof of with respect to . Using Theorem 3.7 with , we can find a CP proof of with respect to such that is an -approximate rotation of with respect to . In other words, there exists such that and , where is the -th standard unit vector. Similarly, applying Theorem 3.7 to the CP proof of with respect to (the other branch obtained from ), we can find a CP proof of with respect to such that is an -approximate rotation of with respect to . In other words, there exists such that and .
Consider the set obtained by intersecting with the two inequalities and . Now observe that if we consider the face of defined by , the inequality reduces to , i.e., . Similarly, on the face defined by , the inequality also reduces to . Thus, is valid for both these faces of . Thus, we can derive this inequality as a cutting plane for using the disjunction . Thus, by concatenating the CP proofs of and and then deriving using the disjunction , we have replaced the entire tree below with a CP proof such that is valid for the leaf. Moreover, the length of this CP proof is exactly one less than the number of nodes below in the original branch-and-cut tree (since we do not have the two branching nodes, but have an extra cutting plane derivation). Thus, this replacement gives us a new BC proof of with one less branching node. We now appeal to the induction hypothesis with for the new, modified BC proof of . Thus, we obtain a CP proof of size at most the new BC proof (whose size is at most , the size of the original BC proof) for the inequality . By choice, , we have thus produced a CP proof of with size at most .
The case when is a polytope is handled by appealing to Corollary 3.8 in the above proof. ∎
Proof of Theorem 2.2.
We will use the fact that, for a clique with vertices, the LP value of the independent set problem is (and 1 if ).
We first prove a lower bound on the size of any BB proof that establishes the upper bound of on the objective value.
For , any branch-and-bound proof of has size at least .
Consider first any feasible node of the branch-and-bound tree at depth . Consider the path from the root node to this node at depth and suppose that of nodes on this path that were branched on, were set to and were set to (so ). Feasibility implies that all the vertices set to are from different cliques. From the remaining cliques, vertices were set to , so the LP value at this node is at least . Since we assumed , the LP has value strictly bigger than . Thus, if the branch-and-bound tree has only feasible nodes, only nodes at depth or more can have LP values at most . Since every branching node has at least two children, this means we have at least nodes in the tree, thus the size of the proof is at least .
Now, assume that some node of the BB tree is infeasible, which means at the node , we have set two variables, which are denoted as and , of the same clique to . Let be the parent node of , and let be the other child of . In , we have and (or vice versa). However, is a redundant constraint once is imposed since they belong to the same clique. Thus, and have the same LP objective value. So we can eliminate the node , and contract and to obtain an equivalent BB tree with fewer nodes. By doing the same thing to every infeasible node, we can get a new BB tree with fewer nodes than the original BB tree. Thus, the smallest size BB proofs are those with only feasible nodes. And we proved a lower bound of on the size of such BB proofs above.∎
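The growth of the branch-and-bound tree can be observed numerically on small instances. The sketch below is an illustration under assumptions that are ours, not a restatement of the theorem: the graph consists of m disjoint triangles, the target upper bound is m, and the tree is built with one fixed branching rule, so the count applies to that rule only and proves nothing about other strategies. The LP value of the edge relaxation is evaluated clique by clique: 1 if a vertex of the clique is fixed to 1, k/2 if k >= 2 vertices are free, and k otherwise. All function names are hypothetical.

```python
from fractions import Fraction


def lp_bound(free, picked):
    """LP value of the edge relaxation on disjoint cliques after fixings.

    free[i]   = number of unfixed vertices in clique i
    picked[i] = True if some vertex of clique i is fixed to 1
    """
    total = Fraction(0)
    for k, one in zip(free, picked):
        if one:
            total += 1                 # the fixed vertex forces its neighbours to 0
        elif k >= 2:
            total += Fraction(k, 2)    # all free vertices at value 1/2
        else:
            total += k                 # a single free vertex can be set to 1
    return total


def bb_nodes(free, picked, target):
    """Number of nodes of a BB tree whose leaves all have LP bound <= target."""
    if lp_bound(free, picked) <= target:
        return 1
    # Branch on a vertex of the first clique that still contributes more than 1.
    i = next(j for j, (k, one) in enumerate(zip(free, picked)) if not one and k >= 3)
    zero_free = list(free)
    zero_free[i] -= 1                  # child with the variable fixed to 0
    one_free = list(free)
    one_free[i] = 0                    # child with the variable fixed to 1
    one_picked = list(picked)
    one_picked[i] = True
    return (1 + bb_nodes(zero_free, picked, target)
            + bb_nodes(one_free, one_picked, target))


def bb_nodes_triangles(m):
    """Tree size for m disjoint triangles with target upper bound m."""
    return bb_nodes([3] * m, [False] * m, m)


for m in range(1, 6):
    print(m, bb_nodes_triangles(m))
```

With this rule every triangle has to be branched on along every root-to-leaf path, so the tree is a complete binary tree of depth m with 2^(m+1) - 1 nodes.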
Next we show that there is an efficient cutting plane algorithm that solves the problem.
There is a cutting plane algorithm based on variable disjunctions that solves the problem in iterations.
. Applying the disjunction