# Robust two-stage combinatorial optimization problems under convex uncertainty

In this paper a class of robust two-stage combinatorial optimization problems is discussed. It is assumed that the uncertain second stage costs are specified in the form of a convex uncertainty set, in particular a polyhedral or ellipsoidal one. It is shown that the robust two-stage versions of basic network and selection problems are NP-hard, even in very restrictive cases. Some exact and approximation algorithms for the general problem are constructed. Polynomial and approximation algorithms for the robust two-stage versions of basic problems, such as the selection and shortest path problems, are also provided.

## Authors


## 1 Introduction

In a traditional combinatorial optimization problem we seek a cheapest object composed of elements chosen from a finite element set. For example, the elements can be the arcs of a given graph with specified arc costs, and we wish to compute a path, spanning tree, perfect matching etc. with minimum cost (see, for example, [1, 28]). In many practical situations the exact values of the element costs are unknown. An uncertainty (scenario) set $\mathcal{U}$ is then provided, which contains all realizations of the element costs, called scenarios, which may occur. The probability distribution in $\mathcal{U}$ can be known, partially known, or unknown. In the last case the robust optimization framework can be used, which consists in computing a solution minimizing the cost in the worst case. Single-stage robust combinatorial optimization problems, under various uncertainty sets, have been extensively discussed over the last decade. Surveys of the results in this area can be found in [2, 24, 20, 10]. For these problems a complete solution must be determined before the true scenario is revealed.

In many practical applications a solution can be constructed in more than one stage. For combinatorial problems, a part of the object can be chosen now (in the first stage) and completed in the future (in the second stage), after the structure of the costs has changed. Typically, the first stage costs are known while the second stage costs can only be predicted to belong to an uncertainty set $\mathcal{U}$. The first such models were discussed in [16, 18, 26, 23], where the robust two-stage spanning tree and perfect matching problems were considered. In these papers, the uncertainty set contains explicitly listed scenarios. Several negative and positive complexity results for this uncertainty representation were established. Some of them have been recently extended in , where also the robust two-stage shortest path problem has been investigated. In  and  the robust two-stage selection problem has been explored. The problem is NP-hard for discrete uncertainty representation but it is polynomially solvable under a special case of polyhedral uncertainty set, called continuous budgeted uncertainty (see ).

Robust two-stage problems belong to the class of three-level, min-max-min optimization problems. In mathematical programming, this approach is also called adjustable robustness (see, e.g. [5, 31]). Namely, some variables must be determined before the realization of the uncertain parameters, while the others can be chosen after the realization. Several such models, which can be represented as 0-1 programming problems, have been recently considered in combinatorial optimization. Among them are the robust two-stage problem discussed in this paper, the robust recoverable models [11, 12], and the $K$-adaptability approach . In general, problems of this type can be hard to solve exactly. A standard approach is to apply row and column generation techniques, which consist in solving a sequence of MIP formulations (see, e.g., ). However, this method can be inefficient for larger problems, especially when the underlying deterministic problem is already NP-hard. Therefore, faster approximation algorithms can be useful in this case.

In this paper we consider the class of robust two-stage combinatorial problems under convex uncertainty, i.e. when the uncertainty set $\mathcal{U}$ is convex. Important special cases are polyhedral and ellipsoidal uncertainty, which are widely used in single-stage robust optimization. Notice that in the problems discussed in [16, 18, 26, 23], $\mathcal{U}$ contains a fixed number of scenarios, so it is not a convex set. The problem formulation and description of the uncertainty sets are provided in Section 2. The complexity status of basic problems, in particular network and selection problems, has been open to date. In Section 3 we show that all these basic problems are NP-hard, both under polyhedral and ellipsoidal uncertainty. In Section 4, we construct compact MIP formulations for a special class of robust two-stage combinatorial problems and show several of their properties. In Section 5, we propose an algorithm for the general problem, which returns an approximate solution with a guaranteed worst-case ratio. This algorithm does not run in polynomial time. However, it requires solving only one (possibly NP-hard) MIP formulation, while a compact MIP formulation for the general case is unknown. Finally, in Sections 6, 7, and 8 we study the robust two-stage versions of three particular problems, namely the selection, representatives selection and shortest path problems. We show some additional negative and positive complexity results for them. There is still a number of open questions concerning the robust two-stage approach. We state them in the last section.

## 2 Problem formulation

Consider the following generic combinatorial optimization problem $\mathcal{P}$:

$$\min\ C^T x \quad \text{s.t.}\quad x \in \mathcal{X} \subseteq \{0,1\}^n,$$

where $C \in \mathbb{R}^n_+$ is a vector of nonnegative costs and $\mathcal{X}$ is a set of feasible solutions. In this paper we consider the general problem $\mathcal{P}$, as well as the following special cases:

1. Let a network be given in which each arc has a cost. Set $\mathcal{X}$ contains the characteristic vectors of some objects in this network, for example the simple paths or spanning trees. Hence $\mathcal{P}$ is the Shortest Path or Spanning Tree problem, respectively. These basic network problems are polynomially solvable, see, e.g., [1, 28].

2. Let a set of $n$ items be given. Each item $i$ has a cost $C_i$ and we wish to choose exactly $p$ items out of this set to minimize the total cost. Set $\mathcal{X}$ contains the characteristic vectors of the feasible selections, i.e. $\mathcal{X}=\{x\in\{0,1\}^n : x_1+\dots+x_n=p\}$. We will denote by $[n]$ the set $\{1,\dots,n\}$. This is the Selection problem whose robust single and two-stage versions were discussed in [3, 14, 25, 13].

3. Let a set of $n$ tools (items) be given. This set is partitioned into a family of disjoint sets $T_l$, $l\in[\ell]$. Each tool $i$ has a cost $C_i$ and we wish to select exactly one tool from each subset $T_l$ to minimize their total cost. Set $\mathcal{X}$ contains the characteristic vectors of the feasible selections, i.e. $\mathcal{X}=\{x\in\{0,1\}^n : \sum_{i\in T_l}x_i=1,\ l\in[\ell]\}$. This is the Representatives Selection problem (RS for short) whose robust single-stage version was considered in [17, 15, 21].

Given a vector $x\in\{0,1\}^n$, let us define the following set of recourse actions:

$$\mathcal{R}(x)=\{y\in\{0,1\}^n : x+y\in\mathcal{X}\}$$

and a set of partial solutions is defined as follows:

$$\mathcal{X}'=\{x\in\{0,1\}^n : \mathcal{R}(x)\neq\emptyset\}.$$

Observe that $\mathcal{X}\subseteq\mathcal{X}'$ and $\mathcal{X}'$ contains all vectors which can be completed to a feasible solution in $\mathcal{X}$. A partial solution $x\in\mathcal{X}'$ is completed in the second stage, i.e. we choose $y\in\mathcal{R}(x)$ which yields $x+y\in\mathcal{X}$. The overall cost of the solution constructed is $C^Tx+c^Ty$ for a fixed second-stage cost vector $c$. We assume that the vector of the first-stage costs $C$ is known but the vector of the second-stage costs is uncertain and belongs to a specified uncertainty (scenario) set $\mathcal{U}\subset\mathbb{R}^n_+$. In this paper, we discuss the following robust two-stage problem:

$$\text{RTSt}:\quad \min_{x\in\mathcal{X}'}\ \max_{c\in\mathcal{U}}\ \min_{y\in\mathcal{R}(x)}\ (C^Tx+c^Ty).$$

The RTSt problem is a robust two-stage version of the problem $\mathcal{P}$. It is worth pointing out that RTSt is a generalization of four problems, which we also examine in this paper. Namely, given $x\in\mathcal{X}'$ and $c\in\mathcal{U}$, we consider the following incremental problem:

$$\text{Inc}(x,c)=\min_{y\in\mathcal{R}(x)} c^Ty.$$

Given scenario $c\in\mathcal{U}$, we study the following two-stage problem:

$$\text{TSt}(c)=\min_{x\in\mathcal{X}'}\ \min_{y\in\mathcal{R}(x)}\ (C^Tx+c^Ty).$$

Finally, given $x\in\mathcal{X}'$, we also consider the following evaluation problem:

$$\text{Eval}(x)=C^Tx+\max_{c\in\mathcal{U}}\ \min_{y\in\mathcal{R}(x)} c^Ty=C^Tx+\max_{c\in\mathcal{U}}\text{Inc}(x,c).$$

A scenario $c\in\mathcal{U}$ which maximizes $\text{Inc}(x,c)$ is called a worst scenario for $x$. The inner maximization problem is called the adversarial problem, i.e., the problem

$$\max_{c\in\mathcal{U}}\ \min_{y\in\mathcal{R}(x)}\ (C^Tx+c^Ty).$$

Notice that the robust two-stage problem can be equivalently represented as follows:

$$\text{RTSt}:\quad \min_{x\in\mathcal{X}'}\text{Eval}(x).$$

Further notice that the two-stage problem is a special case of RTSt, where $\mathcal{U}$ contains only one scenario. The following fact is exploited later in this paper:

###### Observation 1.

Computing $\text{TSt}(c)$ for a given $c$ (solving the two-stage problem) boils down to solving the underlying deterministic problem $\mathcal{P}$.

###### Proof.

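Let $\hat{c}_i=\min\{C_i,c_i\}$ for each $i\in[n]$ and let $z$ be an optimal solution to problem $\mathcal{P}$ for the costs $\hat{c}$. Consider solution $(x,y)$ constructed as follows: set $x_i=0$, $y_i=0$ if $z_i=0$; set $x_i=1$, $y_i=0$ if $z_i=1$ and $C_i\le c_i$; set $x_i=0$, $y_i=1$ if $z_i=1$ and $C_i>c_i$. Of course, $x\in\mathcal{X}'$ and $y\in\mathcal{R}(x)$. It is easy to verify that $(x,y)$ is an optimal solution to the two-stage problem with the objective value of $\hat{c}^Tz$. ∎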
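The construction in the proof of Observation 1 can be illustrated on the Selection problem. The following is a minimal sketch, not the authors' code: the function names, the parameter `p` (number of items to select) and the sample data are assumptions made for illustration. It solves the deterministic problem once with the costs $\min\{C_i, c_i\}$, splits the solution into first-stage and second-stage parts, and compares the value with a brute-force enumeration of all pairs $(x, y)$.

```python
from itertools import product


def solve_selection(costs, p):
    """Deterministic Selection problem: indices of p cheapest items."""
    order = sorted(range(len(costs)), key=lambda i: costs[i])
    return set(order[:p])


def two_stage_selection(C, c, p):
    """Solve TSt(c) for Selection with one call to the deterministic solver."""
    n = len(C)
    c_hat = [min(C[i], c[i]) for i in range(n)]
    z = solve_selection(c_hat, p)
    # An item chosen in z is bought in the stage where it is cheaper.
    x = [1 if i in z and C[i] <= c[i] else 0 for i in range(n)]
    y = [1 if i in z and C[i] > c[i] else 0 for i in range(n)]
    value = sum(c_hat[i] for i in z)
    return x, y, value


def two_stage_brute_force(C, c, p):
    """Reference value: enumerate all binary pairs (x, y) with x + y feasible."""
    n = len(C)
    best = float("inf")
    for x in product((0, 1), repeat=n):
        for y in product((0, 1), repeat=n):
            disjoint = all(xi + yi <= 1 for xi, yi in zip(x, y))
            if disjoint and sum(x) + sum(y) == p:
                cost = sum(C[i] * x[i] + c[i] * y[i] for i in range(n))
                best = min(best, cost)
    return best


C = [3, 1, 4, 1, 5]   # first-stage costs (illustrative data)
c = [2, 7, 1, 8, 2]   # one fixed second-stage scenario (illustrative data)
x, y, value = two_stage_selection(C, c, p=3)
print(x, y, value)                      # [0, 1, 0, 1, 0] [0, 0, 1, 0, 0] 3
print(two_stage_brute_force(C, c, 3))   # 3
```

The brute-force routine is exponential and serves only as a check on small instances; the point of the observation is that the first routine needs a single call to a solver for the deterministic problem.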

In this paper, we examine the following three types of convex uncertainty sets:

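$$\begin{aligned}
\mathcal{U}^{HP} &= \{\underline{c}+\delta : A\delta\le b,\ \delta\ge 0\}\subset\mathbb{R}^n_+, && \text{(1)}\\
\mathcal{U}^{VP} &= \mathrm{conv}\{c_1,\dots,c_K\}\subset\mathbb{R}^n_+, && \text{(2)}\\
\mathcal{U}^{E} &= \{\underline{c}+A\delta : \|\delta\|_2\le 1\}\subset\mathbb{R}^n_+, && \text{(3)}
\end{aligned}$$

where $\underline{c}$ is the vector of nominal second stage costs, $\delta$ represents deviations of the second stage costs from their nominal values and $A$ is the deviation constraint matrix. There is no loss of generality in assuming that all the sets are bounded. The uncertainty sets $\mathcal{U}^{HP}$ and $\mathcal{U}^{VP}$ are two representations of the polyhedral uncertainty. By the decomposition theorem [29, Chapter 7.2], both representations are equivalent, i.e. a bounded $\mathcal{U}^{HP}$ can be represented as $\mathcal{U}^{VP}$ and vice versa. However, the corresponding transformations need not be polynomial. Thus the complexity results for one type of polytope do not carry over to the other, and we consider them separately. The set $\mathcal{U}^{E}$ represents ellipsoidal uncertainty, which is a popular uncertainty representation in robust optimization (see, e.g., ). We also study the following special cases of $\mathcal{U}^{HP}$:

$$\begin{aligned}
\mathcal{U}^{HP}_0 &= \{\underline{c}+\delta : 0\le\delta\le d,\ \|\delta\|_1\le\Gamma\},\\
\mathcal{U}^{HP}_1 &= \Big\{\underline{c}+\delta : \sum_{i\in U_j}\delta_i\le\Gamma_j,\ j\in[K],\ \delta\ge 0\Big\}.
\end{aligned}$$

Set $\mathcal{U}^{HP}_0$ is called continuous budgeted uncertainty [27, 13] and can be seen as a continuous and convex version of the nonconvex uncertainty set proposed in . In set $\mathcal{U}^{HP}_1$ we have $K$ budget constraints defined for some (not necessarily disjoint) subsets $U_j\subseteq[n]$, $j\in[K]$.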

## 3 General hardness results

The robust two-stage problem is not easier than the underlying deterministic problem $\mathcal{P}$. So, it is interesting to characterize the complexity of RTSt when $\mathcal{P}$ is polynomially solvable. In this section we focus on a core problem, which is a special case of all the particular problems described in Section 2. We will show that it is NP-hard under $\mathcal{U}^{HP}$, $\mathcal{U}^{VP}$ and $\mathcal{U}^{E}$. Hence we get hardness results for all the particular problems. Consider the following set of feasible solutions

$$\mathcal{X}_1=\{x\in\{0,1\}^n : x_1+\dots+x_n=n\}=\{\mathbf{1}\},$$

i.e. $\mathcal{X}_1$ contains only the vector of ones. We have $\mathcal{X}'_1=\{0,1\}^n$ and $\mathcal{R}(x)$ contains only one solution, as there is only one recourse action $y=\mathbf{1}-x$ for each $x\in\mathcal{X}'_1$. Hence, the robust two-stage version of the problem with $\mathcal{X}_1$ can be rewritten as follows:

$$\text{RTSt}_1:\quad \min_{x\in\mathcal{X}'_1}\Big(C^Tx+\max_{c\in\mathcal{U}}c^T(\mathbf{1}-x)\Big). \tag{4}$$
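To make formulation (4) concrete, the sketch below evaluates it by brute force for a finite list of scenarios. The function name and the data are hypothetical and chosen only for illustration. Since the inner objective $c^T(\mathbf{1}-x)$ is linear in $c$, maximizing over the listed scenarios gives the same value as maximizing over their convex hull, so the same routine also evaluates (4) under $\mathcal{U}^{VP}$.

```python
from itertools import product


def rtst1_brute_force(C, scenarios):
    """Enumerate all x in {0,1}^n and evaluate the objective of (4)."""
    n = len(C)
    best_value, best_x = float("inf"), None
    for x in product((0, 1), repeat=n):
        first_stage = sum(C[i] * x[i] for i in range(n))
        # Every element not bought now must be bought in the second stage.
        worst_second_stage = max(
            sum(c[i] * (1 - x[i]) for i in range(n)) for c in scenarios
        )
        value = first_stage + worst_second_stage
        if value < best_value:
            best_value, best_x = value, x
    return best_value, best_x


C = [1, 1, 5]                        # first-stage costs (illustrative)
scenarios = [[4, 0, 1], [0, 4, 1]]   # vertices c_1, c_2 (illustrative)
print(rtst1_brute_force(C, scenarios))   # (3, (1, 1, 0))
```

The enumeration takes $2^n$ steps, which is consistent with the hardness results that follow: no polynomial method for the general case is claimed here.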

The following result is known:

###### Theorem 1 ([25, 19]).

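The $\text{RTSt}_1$ problem with $\mathcal{U}=\{c_1,c_2\}$ is NP-hard. Furthermore, if $\mathcal{U}=\{c_1,\dots,c_K\}$ and $K$ is a part of the input, then $\text{RTSt}_1$ is strongly NP-hard.

We use Theorem 1 to prove the next complexity results. First observe that the problem under consideration will not change if we replace $\mathcal{U}=\{c_1,\dots,c_K\}$ with $\mathrm{conv}\{c_1,\dots,c_K\}$ in (4), because the inner objective $c^T(\mathbf{1}-x)$ is linear in $c$ and so attains its maximum over the convex hull at one of the points $c_1,\dots,c_K$. Hence, we immediately get the following corollary:

###### Corollary 1.

The $\text{RTSt}_1$ problem with uncertainty set $\mathcal{U}^{VP}$ is NP-hard when $K=2$ and strongly NP-hard when $K$ is a part of the input.

###### Theorem 2.

The $\text{RTSt}_1$ problem with uncertainty set $\mathcal{U}^{HP}$ is strongly NP-hard.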

###### Proof.

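Let $(n, C, \mathcal{U}=\{c_1,\dots,c_K\})$ be an instance of the strongly NP-hard $\text{RTSt}_1$ problem with a discrete scenario set (see Theorem 1). Consider an instance of $\text{RTSt}_1$ with $n+K$ variables, where $C'=[C^T, 0^T]^T\in\mathbb{R}^{n+K}_+$ are the first stage costs and

$$\mathcal{U}^{HP}=\left\{0+\begin{bmatrix}\delta\\ \lambda\end{bmatrix} : \delta=\sum_{j\in[K]}\lambda_jc_j,\ \sum_{j\in[K]}\lambda_j=1,\ \delta_i\ge 0\ \forall i\in[n],\ \lambda_j\ge 0\ \forall j\in[K]\right\}\subset\mathbb{R}^{n+K}.$$

Since the first stage costs of the variables $x_{n+1},\dots,x_{n+K}$ are 0, we can fix $x_{n+1}=\dots=x_{n+K}=1$ in every optimal solution to the new instance. The problem then reduces to

$$\min_{x\in\mathcal{X}'_1}\ \max_{\{\lambda\ge 0:\ \|\lambda\|_1=1\}}\Big(C^Tx+\sum_{j\in[K]}\lambda_jc_j^T(\mathbf{1}-x)\Big)=\min_{x\in\mathcal{X}'_1}\ \max_{c\in\{c_1,\dots,c_K\}}\big(C^Tx+c^T(\mathbf{1}-x)\big),$$

where $\mathcal{X}'_1=\{0,1\}^n$. Consequently, the $\text{RTSt}_1$ problem with the constructed instance is equivalent to the strongly NP-hard $\text{RTSt}_1$ problem with the instance $(n, C, \mathcal{U})$. ∎

Note that the reduction in the proof of Theorem 2 constructs an uncertainty set with a non-constant number of constraints. We will show in Section 4 that if the number of constraints in the description of $\mathcal{U}^{HP}$ (except for the nonnegativity constraints) is constant, then the $\text{RTSt}_1$ problem is polynomially solvable.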

###### Theorem 3.

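The $\text{RTSt}_1$ problem with uncertainty set $\mathcal{U}^{E}$ is NP-hard.

###### Proof.

Given an instance $(n, C, \mathcal{U}=\{c_1,c_2\})$ of $\text{RTSt}_1$, define $\underline{c}=c_1+c_2$ and $y=\mathbf{1}-x$. We use the following equality (see ):

$$2\cdot\max\{c_1^Ty,c_2^Ty\}=(c_1^Ty+c_2^Ty)+\sqrt{y^T(c_1-c_2)(c_1-c_2)^Ty}=(c_1^Ty+c_2^Ty)+\sqrt{y^TAA^Ty},$$

where $A=[c_1-c_2,0,\dots,0]\in\mathbb{R}^{n\times n}$ is a square matrix (we append $n-1$ zero columns to $c_1-c_2$). We get

$$\begin{aligned}
2\cdot\min_{x\in\mathcal{X}'_1}\big(C^Tx+\max\{c_1^Ty,c_2^Ty\}\big)
&=\min_{x\in\mathcal{X}'_1}\Big(2C^Tx+\underline{c}^Ty+\sqrt{y^TAA^Ty}\Big)\\
&=\min_{x\in\mathcal{X}'_1}\big(2C^Tx+\underline{c}^Ty+\|A^Ty\|_2\big)\\
&=\min_{x\in\mathcal{X}'_1}\Big(2C^Tx+\max_{c\in\{\underline{c}+A\delta:\ \|\delta\|_2\le 1\}}c^Ty\Big).
\end{aligned}$$

The last equality follows from the fact that $\max_{\|\delta\|_2\le 1}\delta^TA^Ty=\|A^Ty\|_2$ (see, e.g., ). In consequence, the NP-hard $\text{RTSt}_1$ problem with the instance $(n, C, \{c_1,c_2\})$ is equivalent to $\text{RTSt}_1$ with the first stage costs $2C$ and ellipsoidal uncertainty set $\mathcal{U}^{E}=\{\underline{c}+A\delta : \|\delta\|_2\le 1\}$. ∎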
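The identity behind this reduction, $2\max\{a,b\}=a+b+|a-b|$ applied to $a=c_1^Ty$ and $b=c_2^Ty$, can be checked numerically. The sketch below is illustrative only: the helper names and the sample vectors are assumptions, and the matrix $A$ is built as the vector $c_1-c_2$ followed by zero columns, as in the proof.

```python
import math


def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))


def two_scenario_max(c1, c2, y):
    """Left-hand side: 2 * max{c1^T y, c2^T y}."""
    return 2 * max(dot(c1, y), dot(c2, y))


def ellipsoid_form(c1, c2, y):
    """Right-hand side: c_bar^T y + ||A^T y||_2 with c_bar = c1 + c2."""
    n = len(y)
    c_bar = [a + b for a, b in zip(c1, c2)]
    # Rows of A^T are the columns of A = [c1 - c2, 0, ..., 0].
    A_T = [[a - b for a, b in zip(c1, c2)]] + [[0] * n for _ in range(n - 1)]
    A_T_y = [dot(row, y) for row in A_T]
    return dot(c_bar, y) + math.sqrt(dot(A_T_y, A_T_y))


c1, c2 = [3, 1, 2], [1, 4, 2]
for y in ([1, 0, 1], [0, 1, 1], [1, 1, 1]):
    print(y, two_scenario_max(c1, c2, y), ellipsoid_form(c1, c2, y))
```

For $y=(1,0,1)$ both sides equal 10, and for $y=(0,1,1)$ both equal 12, so the worst case over two scenarios is reproduced by the worst case over the ellipsoid with doubled first-stage costs.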

###### Theorem 4.

The robust two-stage versions of the Selection, RS, Spanning Tree, and Shortest Path problems are strongly NP-hard under $\mathcal{U}^{HP}$ and $\mathcal{U}^{VP}$, and NP-hard under $\mathcal{U}^{E}$.

###### Proof.

It is easy to see that $\text{RTSt}_1$ is a special case of the RTSt Selection problem, in which all $n$ items must be chosen, and of the RTSt RS problem, in which each subset of the partition contains exactly one tool. To see that it is also a special case of the basic network problems, consider the (chain) network shown in Figure 1. This network contains exactly one path and exactly one spanning tree. So the problem is only to decide, for each arc, whether to choose it in the first or in the second stage, which is equivalent to solving $\text{RTSt}_1$. ∎

In Section 8 we will show that the hardness result from Theorem 4 can be strengthened for the two-stage version of the Shortest Path problem.

## 4 Compact formulations

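In this section we construct compact formulations for a special class of problems under uncertainty sets $\mathcal{U}^{HP}$ and $\mathcal{U}^{E}$. We will assume that

$$\mathcal{X}=\{x\in\{0,1\}^n : Hx\ge g\} \tag{5}$$

and the polyhedron

$$\mathcal{N}=\{x\in\mathbb{R}^n : Hx\ge g,\ 0\le x\le \mathbf{1}\} \tag{6}$$

is integral, i.e. $\mathcal{N}$ is the convex hull of all integral vectors in $\mathcal{N}$ or, equivalently, $\min\{c^Tx : x\in\mathcal{N}\}$ ($\max\{c^Tx : x\in\mathcal{N}\}$) is attained by an integral vector, for each $c\in\mathbb{R}^n$ for which the minimum (maximum) is finite (see [29, Chapter 16.3]). Important examples, where the set of feasible solutions can be described in this way, are the shortest path and the selection problems discussed in Section 2. We can also use the constraints $Hx=g$ to describe $\mathcal{X}$ and the further reasoning will be the same. We can rewrite the inner adversarial problem (notice that $x$ is fixed) as follows:

$$\begin{aligned}
\max_{c\in\mathcal{U}}\ \min_{y\in\mathcal{R}(x)}c^Ty
&=\max_{c\in\mathcal{U}}\ \min_{\{y\in\{0,1\}^n:\ y+x\in\mathcal{X}\}}c^Ty\\
&=\max_{c\in\mathcal{U}}\ \min_{\{y\in\{0,1\}^n:\ H(y+x)\ge g,\ y\le \mathbf{1}-x\}}c^Ty\\
&=\max_{c\in\mathcal{U}}\ \min_{\{y\in\mathbb{R}^n:\ H(y+x)\ge g,\ 0\le y\le \mathbf{1}-x\}}c^Ty,
\end{aligned}$$

where the last equality follows from the integrality assumptions and the fact that $x$ is a fixed binary vector. Since $\mathcal{U}$ and $\{y\in\mathbb{R}^n : H(y+x)\ge g,\ 0\le y\le\mathbf{1}-x\}$ are convex (compact) sets and $c^Ty$ is a concave-convex function, by the minimax theorem  we can rewrite the adversarial problem as follows:

$$\min_{\{y\in\mathbb{R}^n:\ H(y+x)\ge g,\ 0\le y\le \mathbf{1}-x\}}\ \max_{c\in\mathcal{U}}c^Ty. \tag{7}$$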

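The robust two-stage problem thus becomes the following min-max problem:

$$\min_{x\in\mathcal{X}'}\ \min_{\{y\in\mathbb{R}^n:\ H(y+x)\ge g,\ 0\le y\le \mathbf{1}-x\}}\ \max_{c\in\mathcal{U}}\ (C^Tx+c^Ty). \tag{8}$$

If $\mathcal{U}=\mathcal{U}^{HP}$, then we can dualize the inner maximization problem in (8), obtaining

$$\max_{c\in\mathcal{U}}c^Ty=\max_{\{\delta\ge 0:\ A\delta\le b\}}(\underline{c}+\delta)^Ty=\underline{c}^Ty+\min_{\{u\ge 0:\ u^TA\ge y^T\}}u^Tb.$$

As a result, we get the following compact MIP formulation for RTSt under $\mathcal{U}^{HP}$:

$$\begin{array}{ll}
\min & C^Tx+\underline{c}^Ty+u^Tb\\
\text{s.t.} & H(y+x)\ge g\\
& x+y\le \mathbf{1}\\
& u^TA\ge y^T\\
& x\in\{0,1\}^n\\
& y,u\ge 0
\end{array} \tag{9}$$

###### Observation 2.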

The integrality gap of (9) is at least for the RTSt Shortest Path problem under the uncertainty set $\mathcal{U}^{HP}_0$.

###### Proof.

Consider an instance of RTSt Shortest Path shown in Figure 2. Set $\mathcal{X}$ contains characteristic vectors of the simple paths from to of the form , . Notice that . It is easy to see that the optimal objective value of (9) equals . In the relaxation of (9) (see also the relaxation of (8)) we can fix , and and for each . The cost of this solution is 1, which gives the integrality gap of . ∎

Figure 2: An instance of the robust two-stage shortest path problem with $\mathcal{U}^{HP}_0$, $\Gamma=m$, and $M$ is a big constant.

Problem (9) can be solved in polynomial time for RTSt Selection under $\mathcal{U}^{HP}_0$. In Section 7 we will show that the same result holds for RTSt RS under $\mathcal{U}^{HP}_0$. On the other hand, (9) is strongly NP-hard for arbitrary $\mathcal{U}^{HP}$, when the constraint $x+y\le\mathbf{1}$ becomes $x+y=\mathbf{1}$, i.e. when (9) models the $\text{RTSt}_1$ problem (see Section 3). We now show that $\text{RTSt}_1$ is polynomially solvable, when there is only a constant number of constraints in $\mathcal{U}^{HP}$, except for the nonnegativity constraints (note that the hardness result in Section 3 requires an unbounded number of constraints).

###### Theorem 5.

The $\text{RTSt}_1$ problem can be solved in polynomial time if the matrix $A$ in $\mathcal{U}^{HP}$ has a constant number of rows.

###### Proof.

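Consider the formulation (9) for $\text{RTSt}_1$ with $A\in\mathbb{R}^{m\times n}$ for a constant $m$. Let us assume that $x$ and $y$ are fixed. The remaining optimization problem can be rewritten as the following linear program with additional slack variables $s$:

$$\begin{array}{lll}
\min & b^Tu & \\
\text{s.t.} & A^Tu-s=y & \\
& u_i\ge 0 & i\in[m]\\
& s_i\ge 0 & i\in[n]
\end{array}$$

where $u\in\mathbb{R}^m$ and $s\in\mathbb{R}^n$. The coefficient matrix of this problem is $[A^T\ \ {-I}]$, where $I$ denotes the $n\times n$ identity matrix. Since $\mathcal{U}^{HP}$ is nonempty and bounded, there is an optimal basis matrix $B$ to this problem, corresponding to basic variables $u_B$, $s_B$, so that

$$\begin{bmatrix}u_B\\ s_B\end{bmatrix}=B^{-1}y. \tag{10}$$

We will use the fact that the matrix $B$ has a special structure. Namely, by reordering the constraints and variables, we can assume that

$$B^{-1}=\begin{bmatrix}A_1 & \begin{matrix}O\\ -I_{(n-m')}\end{matrix}\end{bmatrix}^{-1}=\begin{bmatrix}A_2 & \begin{matrix}O\\ -I_{(n-m')}\end{matrix}\end{bmatrix}\in\mathbb{R}^{n\times n}$$

with $A_1, A_2\in\mathbb{R}^{n\times m'}$ and $O$ being the zero matrix, where $m'$ is the size of $u_B$. Fixing a basis matrix $B$, problem $\text{RTSt}_1$ thus simplifies to

$$\begin{array}{ll}
\min & C^Tx+\big(\underline{c}^T+[b_B^T\ \ 0_B^T]B^{-1}\big)y\\
\text{s.t.} & B^{-1}y=\begin{bmatrix}A_2 & \begin{matrix}O\\ -I_{(n-m')}\end{matrix}\end{bmatrix}y\ge 0\\
& x+y=\mathbf{1}\\
& x\in\{0,1\}^n\\
& y\in\{0,1\}^n
\end{array}$$

where $b_B$ and $0_B$ are coefficients corresponding to $u_B$ and $s_B$, respectively. Notice that for each , because and for all . If we fix the values of the first $m'$ variables in $y$, corresponding to matrix $A_2$, the resulting problem can be solved in polynomial time. Indeed, in this case all the remaining variables in $y$ are either forced to 1, to 0, or are kept free. There are many different candidates to choose a basis, and for each candidate, we enumerate values for the $y$-variables involved. For fixed $m$, the resulting complexity is thus polynomial in the input size. ∎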

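Let us now focus on ellipsoidal uncertainty. If $\mathcal{U}=\mathcal{U}^{E}$, then (7) can be rewritten as

$$\min_{\{y:\ H(y+x)\ge g,\ 0\le y\le \mathbf{1}-x\}}\ \underline{c}^Ty+\|A^Ty\|_2.$$

Consequently, we get the following compact program for RTSt under $\mathcal{U}^{E}$:

$$\begin{array}{ll}
\min & C^Tx+\underline{c}^Ty+\|A^Ty\|_2\\
\text{s.t.} & H(y+x)\ge g\\
& x+y\le \mathbf{1}\\
& x\in\{0,1\}^n\\
& y\ge 0
\end{array} \tag{11}$$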

Problem (11) is a quadratic 0-1 optimization problem, which can be difficult to solve. In Section 5 we will propose some methods of computing approximate solutions to (11).

###### Observation 3.

The integrality gap of (11) is at least for the RTSt Shortest Path problem under the uncertainty set $\mathcal{U}^{E}$.

###### Proof.

Consider the same network as in the proof of Observation 2. For each arc we fix and and for each arc we fix an , . Let be a diagonal matrix having the values of on the diagonal. Hence

$$\mathcal{U}^{E}=\{\underline{c}+m\delta : \|\delta\|_2\le 1\}\subset\mathbb{R}^{2m}.$$

The reasoning is then the same as in the proof of Observation 2. ∎

## 5 Computing approximate solutions

A compact formulation for the general RTSt problem is unknown. Therefore, solving the problem requires applying special row and column generation techniques (see, e.g. ). As this method may consist of solving many hard MIP formulations, it can be inefficient for large problems. In this section we propose algorithms, which return solutions with some guaranteed distance to the optimum. We will discuss a general case as well as cases that can be modeled as the min-max problem (8).

### 5.1 General approximation results

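Let $\mathcal{X}$ be expressed as (5), but now no assumptions on the polyhedron $\mathcal{N}$ (see (6)) are imposed. So, the underlying deterministic problem $\mathcal{P}$ can be NP-hard and also hard to approximate. By interchanging the min-max operators we get the following lower bound on the optimal objective value of the RTSt problem:

$$LB=\max_{c\in\mathcal{U}}\ \min_{x\in\mathcal{X}'}\ \min_{y\in\mathcal{R}(x)}\ (C^Tx+c^Ty)=\max_{c\in\mathcal{U}}\ \min_{(x,y)\in\mathcal{Z}}\ (C^Tx+c^Ty),$$

where

$$\mathcal{Z}=\{(x,y) : H(x+y)\ge g,\ x+y\le\mathbf{1},\ x\in\{0,1\}^n,\ y\in\{0,1\}^n\}.$$

Consider the following relaxation of $\mathcal{Z}$:

$$\mathcal{Z}'=\{(x,y) : H(x+y)\ge g,\ x+y\le\mathbf{1},\ 0\le x\le\mathbf{1},\ 0\le y\le\mathbf{1}\}.$$

Since $\mathcal{Z}'$ and $\mathcal{U}$ are convex sets, by the minimax theorem , we have

$$LB\ge\max_{c\in\mathcal{U}}\ \min_{(x,y)\in\mathcal{Z}'}\ (C^Tx+c^Ty)=\min_{(x,y)\in\mathcal{Z}'}\ \max_{c\in\mathcal{U}}\ (C^Tx+c^Ty).$$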

We also get the following upper bound on the optimal objective value (the min-max problem):

$$UB=\min_{(x,y)\in\mathcal{Z}}\ \max_{c\in\mathcal{U}}\ (C^Tx+c^Ty). \tag{12}$$

We thus get

$$\frac{UB}{LB}\le\frac{\min_{(x,y)\in\mathcal{Z}}\max_{c\in\mathcal{U}}(C^Tx+c^Ty)}{\min_{(x,y)\in\mathcal{Z}'}\max_{c\in\mathcal{U}}(C^Tx+c^Ty)}=\rho. \tag{13}$$

Let $(x^*,y^*)$ be an optimal solution to the min-max problem (12). Then

$$\text{Eval}(x^*)\le C^Tx^*+\max_{c\in\mathcal{U}}c^Ty^*=UB.$$

We thus get

$$\text{Eval}(x^*)\le\rho\cdot LB \tag{14}$$

and $x^*$ is a $\rho$-approximate first-stage solution to RTSt, i.e. a solution whose value is within a factor of $\rho$ of the value of an optimal solution to RTSt. For the uncertainty sets and the value of LB can be computed in polynomial time by solving convex optimization problems and for by solving an LP problem. On the other hand, the upper bound $UB$ and the approximate solution $x^*$ can be computed by solving a compact 0-1 problem (after dualizing the inner maximization problem in (12)). In the next part of this section we will show a special case of the problem for which $x^*$ can be computed in polynomial time.

We now consider the polyhedral uncertainty. Using duality, the min-max problem (12) under $\mathcal{U}^{HP}$ can be represented as the following MIP formulation:

$$\begin{array}{ll}
\min & C^Tx+\underline{c}^Ty+u^Tb\\
\text{s.t.} & H(y+x)\ge g\\
& x+y\le \mathbf{1}\\
& u^TA\ge y^T\\
& x,y\in\{0,1\}^n\\
& u\ge 0
\end{array} \tag{15}$$

The relaxation of (15), which gives the lower bound in the denominator of (13), is an LP problem, so it can be solved in polynomial time. The problem (15) itself can be more complex. However, it can be easier to solve than the original robust two-stage problem. Using (13) and (14), we get the following theorem:

###### Theorem 6.

Let $(x^*,y^*,u^*)$ be optimal to (15). Then $x^*$ is a $\rho$-approximate first-stage solution to the RTSt problem, where $\rho$ is the integrality gap of (15).

We now describe the case in which $x^*$ can be computed in polynomial time, which yields a $\rho$-approximation algorithm for the robust two-stage problem. Namely, we consider the continuous budgeted uncertainty $\mathcal{U}^{HP}_0$. Fix $y\in\{0,1\}^n$ and consider the following problem:

$$\max_{c\in\mathcal{U}^{HP}_0}c^Ty.$$

This problem can be solved by observing that either the whole budget $\Gamma$ is allocated to $y$ or the allocation is blocked by the upper bounds $d$ on the deviations. So

$$\max_{c\in\mathcal{U}^{HP}_0}c^Ty=\min\{\underline{c}^Ty+\Gamma,\ (\underline{c}+d)^Ty\}.$$

Hence the min-max problem can be rewritten as follows:

$$\begin{aligned}
\min_{(x,y)\in\mathcal{Z}}\ \max_{c\in\mathcal{U}^{HP}_0}\ (C^Tx+c^Ty)
&=\min_{(x,y)\in\mathcal{Z}}\ \min\{C^Tx+\underline{c}^Ty+\Gamma,\ C^Tx+(\underline{c}+d)^Ty\}\\
&=\min\{\text{TSt}(\underline{c})+\Gamma,\ \text{TSt}(\underline{c}+d)\}.
\end{aligned}$$

In consequence, the min-max problem reduces to solving two two-stage problems, which can be done in polynomial time if the underlying problem $\mathcal{P}$ is polynomially solvable (see Observation 1). So, in this case a $\rho$-approximate solution can be computed in polynomial time.
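The reduction of the min-max problem to two two-stage problems can be illustrated on the Selection problem under continuous budgeted uncertainty. The sketch below is an illustration under stated assumptions, not the authors' implementation: the function names, the parameter `p` and the data are hypothetical. The worst-case cost for a fixed binary `y` is computed independently by spending the budget greedily, and the closed form $\min\{\text{TSt}(\underline{c})+\Gamma,\ \text{TSt}(\underline{c}+d)\}$ is compared with a brute-force enumeration of all binary pairs $(x,y)$.

```python
from itertools import product


def tst_selection(C, c, p):
    """TSt(c) for Selection: sum of the p smallest values min{C_i, c_i}."""
    return sum(sorted(min(Ci, ci) for Ci, ci in zip(C, c))[:p])


def minmax_closed_form(C, c_nom, d, gamma, p):
    """min{TSt(c_nom) + Gamma, TSt(c_nom + d)} for the budgeted set."""
    upper = [ci + di for ci, di in zip(c_nom, d)]
    return min(tst_selection(C, c_nom, p) + gamma, tst_selection(C, upper, p))


def worst_case_greedy(c_nom, d, gamma, y):
    """max of c^T y over the budgeted set for binary y, spending the budget."""
    total, budget = 0, gamma
    for ci, di, yi in zip(c_nom, d, y):
        if yi:
            spent = min(di, budget)   # deviation is capped by d_i and the budget
            total += ci + spent
            budget -= spent
    return total


def minmax_brute_force(C, c_nom, d, gamma, p):
    """Enumerate all binary (x, y) with x + y selecting exactly p items."""
    n = len(C)
    best = float("inf")
    for x in product((0, 1), repeat=n):
        for y in product((0, 1), repeat=n):
            disjoint = all(xi + yi <= 1 for xi, yi in zip(x, y))
            if disjoint and sum(x) + sum(y) == p:
                first = sum(C[i] * x[i] for i in range(n))
                best = min(best, first + worst_case_greedy(c_nom, d, gamma, y))
    return best


C, c_nom, d = [4, 3, 6, 5], [1, 2, 1, 3], [5, 1, 4, 2]   # illustrative data
print(minmax_closed_form(C, c_nom, d, gamma=3, p=2))     # 5
print(minmax_brute_force(C, c_nom, d, gamma=3, p=2))     # 5
```

Note that the value computed here is the upper bound $UB$ of the min-max problem, not the optimal value of RTSt itself; the first-stage part of a minimizing pair is the approximate solution discussed above.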

### 5.2 Approximating the problems with the integrality property

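In this section we propose some methods of constructing approximate solutions for the RTSt problem if the polyhedron $\mathcal{N}$ (see (6)) satisfies the integrality property. Recall that in this case we can represent RTSt as the min-max formulation (8), so from now on we explore the approximability of (8). Let $\tilde{c}\in\mathcal{U}$ be any fixed scenario. The two-stage problem $\text{TSt}(\tilde{c})$ (see Section 2), with the costs $\tilde{c}$ in the second stage, can then be formulated as follows:

$$\begin{array}{ll}
\min & C^Tx+\tilde{c}^Ty\\
\text{s.t.} & H(x+y)\ge g\\
& x+y\le \mathbf{1}\\
& x,y\in\{0,1\}^n
\end{array} \tag{16}$$

Using Observation 1, we can solve (16) in polynomial time, by solving one underlying deterministic problem $\mathcal{P}$. We now show how to obtain an approximate solution to (8) by solving (16) for an appropriately chosen scenario $\tilde{c}$. Let $(\hat{x},\hat{y})$ be an optimal solution to (16).

###### Lemma 1.

If $c_i\le t\,\tilde{c}_i$, $i\in[n]$ (shortly $c\le t\tilde{c}$) for each $c\in\mathcal{U}$, then $(\hat{x},\hat{y})$ is a $t$-approximate solution to (8).

###### Proof.

Let $(x^*,y^*)$ be an optimal solution to (8). We then have

$$\max_{c\in\mathcal{U}}(C^T\hat{x}+c^T\hat{y})\overset{(1)}{\le}t\,(C^T\hat{x}+\tilde{c}^T\hat{y})\overset{(2)}{\le}t\,(C^Tx^*+\tilde{c}^Ty^*)\le t\max_{c\in\mathcal{U}}(C^Tx^*+c^Ty^*).$$

The inequality (1) follows from the assumption that $c\le t\tilde{c}$ for each $c\in\mathcal{U}$ and $t\ge 1$. The inequality (2) holds because $(\hat{x},\hat{y})$ is an optimal solution to (16) and this optimal solution will not change when we relax $y\in\{0,1\}^n$ with $0\le y\le\mathbf{1}$ in (16) due to the integrality property assumed. The last inequality holds because $\tilde{c}\in\mathcal{U}$. ∎

Accordingly, we can construct the best guarantee $t$ by solving the following convex optimization problem:

$$\begin{array}{lll}
\max & t^{-1} & \\
\text{s.t.} & t^{-1}\max_{c\in\mathcal{U}}c_i\le\tilde{c}_i & i\in[n]\\
& \tilde{c}\in\mathcal{U} & \\
& t^{-1}\ge 0 &
\end{array}$$

where the values $\max_{c\in\mathcal{U}}c_i$, $i\in[n]$, have to be precomputed by solving $n$ additional convex problems.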

#### 5.2.1 Polyhedral uncertainty

The next two theorems are consequences of Lemma 1.

###### Theorem 7.

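Problem (8) with $\mathcal{U}^{VP}$ is approximable within $K$.

###### Proof.

Fix $\tilde{c}=\frac{1}{K}\sum_{j\in[K]}c_j\in\mathcal{U}^{VP}$. Then for each $c\in\mathcal{U}^{VP}$, the inequality $c\le K\tilde{c}$ holds. Thus by fixing $t=K$ in Lemma 1 the theorem follows. ∎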
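The way Lemma 1 is applied to a convex hull of $K$ scenarios can be sketched numerically. The code below assumes, for illustration, that the fixed scenario is the average of the $K$ vertices; the function names and the data are hypothetical. It computes the smallest $t\ge 1$ with $c\le t\tilde{c}$ for every $c$ in the convex hull, using the fact that each component $c_i$ is maximized at a vertex, and checks that this $t$ does not exceed $K$.

```python
from fractions import Fraction


def average_scenario(scenarios):
    """Average of the K vertices; it lies in their convex hull."""
    K, n = len(scenarios), len(scenarios[0])
    return [Fraction(sum(c[i] for c in scenarios), K) for i in range(n)]


def guarantee(scenarios, c_tilde):
    """Smallest t >= 1 with c <= t * c_tilde for all c in conv(scenarios)."""
    t = Fraction(1)
    for i, ct in enumerate(c_tilde):
        top = max(c[i] for c in scenarios)   # max of c_i over the hull
        if top > 0:
            if ct == 0:
                raise ValueError("no finite t exists for this c_tilde")
            t = max(t, Fraction(top) / ct)
    return t


scenarios = [[4, 0, 2], [0, 6, 2], [2, 0, 2]]   # illustrative vertices, K = 3
c_tilde = average_scenario(scenarios)
t = guarantee(scenarios, c_tilde)
print(c_tilde, t)
assert t <= len(scenarios)
```

For this data the average is $(2,2,2)$ and the guarantee is $t=3=K$. Other choices of the fixed scenario may give a smaller $t$, which is what the convex program after Lemma 1 optimizes.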

###### Theorem 8.

If , , in , then (8) with is approximable within .

###### Proof.

Fix . Then for each scenario , we get . Thus by fixing in Lemma 1 the result follows. ∎

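The next result characterizes the approximability of the problem under $\mathcal{U}^{HP}_1$.

###### Theorem 9.

Assume that the number of budget constraints in $\mathcal{U}^{HP}_1$ is constant and the following problem is polynomially solvable:

$$\begin{array}{ll}
\min & C^Tx+\underline{c}^Ty\\
\text{s.t.} & H(y+x)\ge g\\
& x+y\le \mathbf{1}\\
& 0\le y\le d\\
& x\in\{0,1\}^n
\end{array} \tag{17}$$

where , . Then (8) under $\mathcal{U}^{HP}_1$ admits an FPTAS.

###### Proof.

The compact MIP formulation (9) for (8) takes the following form:

$$\begin{array}{lll}
\min & C^Tx+\underline{c}^Ty+\sum_{j\in[K]}u_j\Gamma_j & \\
\text{s.t.} & H(y+x)\ge g & \\
& x+y\le \mathbf{1} & \\
& \sum_{\{j\in[K]:\ i\in U_j\}}u_j\ge y_i & i\in[n]\\
& x\in\{0,1\}^n & \\
& y,u\ge 0 &
\end{array} \tag{18}$$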

Since for each , we get for each . Let us fix for some integer , and consider the numbers . Fix vector , where . The problem (18) reduces then to (17), where , . Let us enumerate all vectors , with components ,