In many applications, data evolve over time and systems have to adapt to this evolution by providing near-optimal solutions over time. However, moving from one solution to a new one may induce non-negligible transition costs. This cost may represent, e.g., the cost of turning servers on or off in a data center [Albers17], the cost of changing the quality level in video streaming [Joseph], or the cost of turning nuclear plants on or off in electricity production [thesececile]. Various models and algorithms have been proposed for modifying (re-optimizing) the current solution by making as few changes as possible (see [Anthony, Blanchard, Cohen, Gu, Megow, Nagarajan] and the references therein). In this paper, we follow a new trend, popularized by the works of Eisenstat et al. [Eisenstat] and Gupta et al. [Gupta], known as the multistage model: given a time horizon and a sequence of instances (one for each time step), the goal is to find a sequence of solutions (one for each time step) optimizing the quality of the solution in each time step and the stability (transition cost or profit) between solutions in consecutive time steps. Surprisingly, even in the offline case, where the sequence of instances is known in advance, some classic combinatorial optimization problems become much harder in the multistage model [An, Bampis, Eisenstat, Gupta]. For instance, the Minimum Cost Perfect Matching problem, which is polynomial-time solvable in the one-step case, becomes hard to approximate even for bipartite graphs and for only two time steps [Bampis]. In a more recent work, Fluschnik et al. [Fluschnik] study multistage optimization problems from a parameterized complexity point of view, focusing on the multistage Vertex Cover problem.
In this article, we focus on the complexity and the approximability of various multistage minimization problems including Min Cut, Vertex Cover, Prize-Collecting Steiner Tree and Prize-Collecting Traveling Salesman. The central question that we address in this work is to what extent linear-programming based methods can be used in the multistage framework. Arguably one of the main techniques for the design of approximation algorithms for usual (static) problems, LP-based methods have already been fruitfully applied in the multistage setting for facility location problems in [Eisenstat, An].
Our contribution. We first note that the multistage variants of monotone minimization problems (as defined in [Hochbaum]) such as Min Cut, or IP2-non-monotone binarized minimization problems [Hochbaum] such as Vertex Cover, remain monotone and IP2-non-monotone, respectively. Hence, the multistage variants of monotone static (one-step) problems can be solved in polynomial time, while the multistage variants of IP2-non-monotone static problems have the half-integrality property and consequently, they can be solved by a 2-approximation algorithm. Though obtained using simple arguments, we quote these results as they contrast with the several hardness results for multistage versions of classical problems such as Spanning Tree or Minimum Cost Perfect Matching. Indeed, Min Cut is, to the best of our knowledge, the first problem that is shown to remain polytime solvable in the multistage setting. Also, multistage Vertex Cover has the same approximation guarantee (2-approximation) as the static version (the ratio being tight under UGC).
Then, we focus on minimization problems which are neither monotone nor IP2. We introduce a new rounding scheme, called two-threshold rounding scheme, designed for multistage problems in the sense that it is able to take into account both transition costs and individual costs of solutions when rounding. We show a first application of this rounding scheme leading to a $2f$-approximation algorithm for the multistage variant of the $f$-Set Cover problem (Set Cover where each element belongs to at most $f$ sets). As main results of this article, we then give a more involved use of this rounding scheme for a multistage variant of two prize-collecting problems: the multistage Prize-Collecting Steiner Tree, for which we obtain a 3.53-approximation algorithm, and the multistage Prize-Collecting Traveling Salesman problem, for which we obtain a 3.034-approximation algorithm.
1.1 Related Work
Multistage framework. The multistage model considered in this paper has been introduced in Eisenstat et al. [Eisenstat] and Gupta et al. [Gupta]. Eisenstat et al. [Eisenstat] studied the multistage version of facility location problems for which they proposed logarithmic approximation algorithms. An et al. [An] obtained constant factor approximation algorithms for some related problems. Interestingly, these results are obtained using LP-based techniques, and in particular (randomized) roundings. These roundings are tailored for facility location problems, and thus quite different from the approach followed in this article.
Gupta et al. [Gupta] studied the multistage Matroid Maintenance problem for both the offline and the online settings. They presented a logarithmic approximation algorithm for this problem, which includes as a special case a natural multistage version of the Spanning Tree problem. They also considered the online version of the problem and they devised an efficient randomized competitive algorithm against any oblivious adversary. The same paper also introduced the study of the multistage Minimum Cost Perfect Matching problem, which they proved to be hard to approximate even for a constant number of stages. Bampis et al. [Bampis] improved this negative result by showing that the problem is hard to approximate even for bipartite graphs and for the case of two time steps. Olver et al. [Olver] studied a multistage version of the Minimum Linear Arrangement problem, which is related to a variant of the List Update problem [Sleator], and provided a logarithmic lower bound for the online version and a polylogarithmic upper bound for the offline version. Very recently, Fluschnik et al. [Fluschnik] studied the multistage Vertex Cover problem from a parameterized complexity perspective. They show that the multistage (offline) problem is computationally hard in fairly restricted cases and prove fixed-parameter tractability results for some natural parameterizations in some restricted cases.
Some multistage maximization problems have been studied as well. Let us quote the multistage Max-Min Fair Allocation problem [Bampis+], which is a multistage variant of the Santa Claus problem [Maxim]. Constant factor approximation algorithms are obtained for the offline setting. More recently, in [Bampis++], a multistage variant of the Knapsack problem has been studied. A PTAS has been proposed and it has been proved that there is no FPTAS for the problem even in the case of two time steps, unless $P = NP$. In [Bampis+++], general techniques for a family of online multistage problems, called Subset Maximization, are developed, thereby characterizing the models (given by the type of data evolution and the type of similarity measure) that admit a constant-competitive online algorithm.
Finally, let us mention the works by Buchbinder et al. [Buchbinder] and Buchbinder, Chen and Naor [Buchbinder+], who also considered a multistage model and studied the relation between the online learning and competitive analysis frameworks, mostly for fractional optimization problems.
Static framework. In [Hochbaum], Hochbaum described an easily recognizable class of integer programming problems, called monotone problems, and proposed an algorithm that solves these problems in polynomial time, even in the case where the objective function is nonlinear. She also considered a large class of nonmonotone problems and she proposed a polynomial time algorithm that finds (fractional) optimal solutions that are half integral. These solutions can be used in order to devise 2-approximate solutions. In addition, the proposed 2-approximations are the best results that one can hope for these problems, unless there is a better approximation for Vertex Cover.
For the Prize-Collecting Traveling Salesman problem that has been introduced in [Balas], Bienstock et al. [Bienstock] proposed a 2.5-approximation algorithm. They have also proposed a 3-approximation algorithm for the Prize-Collecting Steiner Tree problem. Both algorithms use a rounding procedure of appropriate linear programming relaxations. Goemans and Williamson [GW] improved these results providing a 2-approximation for both problems, based on the primal-dual scheme. Then, Archer et al. [Archer] devised a -approximation algorithm for both problems. All these results hold for both the rooted and unrooted versions of the problems. This is true since there are approximation-preserving reductions from the rooted to the unrooted version of the problems, and vice versa [Archer].
2 Monotone and IP2 minimization problems
Multistage problem and ILP formulation
Let us consider a minimization problem where we want to minimize some linear cost function under some set of constraints defined as and . In the multistage model defined in [Eisenstat, Gupta], we are given a sequence of instances of the problem, i.e., cost functions , , and matrices and , . The goal is to find a sequence of feasible solutions (i.e., and ), so as to minimize an objective function which is the sum of:
The costs of individual solutions, i.e., ;
A transition cost: each time differs from , this induces a cost . Thus the global transition cost is .
This can be easily modeled as an ILP as follows.
As mentioned in the introduction, several such multistage problems are hard to approximate, even when the underlying problem is polynomial. Here, we note that for some other problems the structure of the ILP formulation keeps some properties in the multistage setting, fitting the setting of Hochbaum [Hochbaum] and leading for instance to an exact (polytime) algorithm for multistage Min Cut or a 2-approximation for multistage Vertex Cover.
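The multistage objective described above (sum of the individual costs plus a charge for every coordinate that changes between consecutive time steps) can be illustrated with a small sketch. The function name, the argument layout and the assumption of a single uniform transition cost are illustrative choices, not notation taken from the paper.

```python
from typing import Sequence


def multistage_cost(
    costs: Sequence[Sequence[float]],
    solutions: Sequence[Sequence[int]],
    transition_cost: float,
) -> float:
    """Evaluate a sequence of 0/1 solutions in the multistage model.

    costs[t][i]      -- cost of variable i at time step t (assumed layout)
    solutions[t][i]  -- 0/1 value of variable i at time step t
    transition_cost  -- charge paid each time a variable changes value
                        between steps t and t+1 (assumed uniform here)
    """
    # Sum of the costs of the individual solutions.
    individual = sum(
        c * x
        for cost_t, sol_t in zip(costs, solutions)
        for c, x in zip(cost_t, sol_t)
    )
    # Number of coordinates that differ between consecutive solutions.
    changes = sum(
        abs(a - b)
        for prev, nxt in zip(solutions, solutions[1:])
        for a, b in zip(prev, nxt)
    )
    return individual + transition_cost * changes


if __name__ == "__main__":
    costs = [[1, 2], [3, 1]]
    # Switching to the cheapest solution at each step pays two transitions.
    print(multistage_cost(costs, [[1, 0], [0, 1]], transition_cost=5))
    # Keeping the same solution pays a higher individual cost but no transition.
    print(multistage_cost(costs, [[1, 0], [1, 0]], transition_cost=5))
```

With a large transition cost, the stable sequence is preferable even though it is not optimal at the second time step, which is the trade-off the model is meant to capture.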
2.1 Minimum cut and monotone problems
Let us first consider the (static) Min Cut problem, where we are given a graph , two distinguished vertices and of , and a function associating a weight to each edge of the graph. A cut of is a partition of into and such that and . The cost of the cut is . A minimum cut in a graph is a cut whose cost is minimum over all cuts of the graph.
In the multistage version, as mentioned above, we consider that at every time step , we are given a graph , where the edge set and the cost of the edges may change over time. We denote by the edge set of the graph at time and by the cost of edge . We denote by the vertex in . There is a transition cost, , for every vertex changing its partition-set from time to time .
We can solve the problem in polynomial time: indeed, the multistage problem on instances can be seen as a single (static) Min Cut problem on (nearly) vertices. To see this, we construct a new graph in which we copy the graphs and we connect vertex to vertex , for every with an edge of weight . We also add two vertices and . We connect (resp. ) with every vertex (resp. ), for , with an edge (resp. ) of weight . Then, it is sufficient to determine a minimum cut between and in . Clearly, the minimum cut in has a cost equal to the total cost, i.e., the sum of the costs of the cuts at every time step and of the transition costs between consecutive instances.
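The construction above can be sketched as follows, under stated assumptions: each copy of a vertex is linked to its copy at the next time step by an edge weighted with the transition cost, and a super-source and super-sink are attached to the per-stage terminals with an edge heavy enough never to be cut. The function names, the input layout and the use of a plain Edmonds-Karp max-flow routine are illustrative choices, not taken from the paper.

```python
from collections import defaultdict, deque


def max_flow(cap, source, sink):
    """Edmonds-Karp on a residual-capacity dict of dicts (modified in place)."""
    total = 0
    while True:
        # Breadth-first search for a shortest augmenting path.
        parent = {source: None}
        queue = deque([source])
        while queue and sink not in parent:
            u = queue.popleft()
            for v, c in cap[u].items():
                if c > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:
            return total
        # Find the bottleneck capacity along the path, then augment.
        bottleneck = float("inf")
        v = sink
        while parent[v] is not None:
            bottleneck = min(bottleneck, cap[parent[v]][v])
            v = parent[v]
        v = sink
        while parent[v] is not None:
            u = parent[v]
            cap[u][v] -= bottleneck
            cap[v][u] += bottleneck
            v = u
        total += bottleneck


def multistage_min_cut(vertices, stages, s, t, transition_cost):
    """Optimal multistage cut value via one static min cut.

    stages[k] maps an undirected edge (u, v) to its weight at time step k.
    """
    cap = defaultdict(lambda: defaultdict(int))

    def add_undirected(u, v, w):
        cap[u][v] += w
        cap[v][u] += w

    # Finite stand-in for an infinite weight: larger than any feasible cut.
    big = 1 + sum(sum(e.values()) for e in stages)
    big += transition_cost * len(vertices) * len(stages)

    for k, edges in enumerate(stages):
        for (u, v), w in edges.items():
            add_undirected((k, u), (k, v), w)
        if k + 1 < len(stages):
            # Cutting this edge means the vertex changes side between k and k+1.
            for v in vertices:
                add_undirected((k, v), (k + 1, v), transition_cost)
        cap["SRC"][(k, s)] += big
        cap[(k, s)]["SRC"] += 0
        cap[(k, t)]["SNK"] += big
        cap["SNK"][(k, t)] += 0
    return max_flow(cap, "SRC", "SNK")


if __name__ == "__main__":
    stages = [{("s", "a"): 1, ("a", "t"): 5}, {("s", "a"): 5, ("a", "t"): 1}]
    print(multistage_min_cut(["s", "a", "t"], stages, "s", "t", 1))
    print(multistage_min_cut(["s", "a", "t"], stages, "s", "t", 10))
```

In this toy instance the vertex `a` switches sides when the transition cost is small and stays put when it is large.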
Another way to prove this result is to formulate the multistage Min Cut problem as an ILP and then prove that every optimal solution of the relaxation of this ILP is always an integer solution. In order to do that, let us first write the multistage Min Cut problem as an ILP where if vertex belongs to subset , 0 otherwise. Also, let if , 0 otherwise.
In order to show that the relaxation of this ILP is integral, we use the result of Hochbaum [Hochbaum] that a class of minimization integer programming problems, known as monotone, is solvable in polynomial time. The problems in this class are characterized by constraints of the form , where and the variable appears only in that constraint. The direction of the inequality is not important and the coefficients and can be any real number if . The objective function is unrestricted, but the function of must be convex. It is not difficult to verify that the ILP of the multistage Min Cut problem is monotone. It is important here to notice that a minimization problem which is monotone in the static case continues to be monotone in the multistage framework. This is a consequence of the fact that the constraints containing the transition variables and the way the transition cost is added in the objective function induce a monotone integer program.
Multistage Min Cut, as well as the multistage version of any monotone minimization problem, is polynomial-time solvable.
Note that the same holds in the more general case where the transition costs also depend on the vertex, i.e., there is a cost for changing decision about vertex from time to .
2.2 Vertex Cover and IP2 problems
As is well known, the relaxation of the standard formulation of the minimum Vertex Cover problem has the half-integrality property, meaning that any extremal solution has coordinates in $\{0, 1/2, 1\}$. This has been generalized by Hochbaum [Hochbaum] to a class of nonmonotone minimization integer programs, known as IP2. These optimization problems have constraints of the type without restriction in the sign of and . Some NP-hard problems can be modeled this way, but in the case where all coefficients in the constraint matrix are in $\{-1, 0, 1\}$, the IP2 problem is said to be binarized. For binarized IP2 integer programs, Hochbaum [Hochbaum] proposes a polynomial time algorithm that provides half integral (fractional) optimal solutions for the continuous relaxation. When the half integral solution can be rounded to a feasible integral solution, this gives a polynomial time 2-approximation algorithm. As for monotone integer programs, it is easy to see that the multistage variant of an IP2 problem belongs to the class of IP2 problems and hence, a 2-approximation algorithm exists for all these problems, including for instance multistage Vertex Cover.
Multistage Vertex Cover, as well as the multistage version of any binarized IP2 minimization problem, has a 2-approximation algorithm.
As previously, the same holds in the model where the transition cost also depends on the vertex/variable .
Note that for Vertex Cover this is the best we can hope for, since even the static version is hard to approximate within ratio $2-\epsilon$ under UGC [KhotR08].
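To make the rounding step for multistage Vertex Cover concrete, here is a minimal sketch that assumes a half-integral fractional solution is already available (the polynomial-time algorithm of [Hochbaum] that produces it is not reproduced). Rounding every value of at least 1/2 up to 1 is the standard choice for Vertex Cover; the function names and data layout are hypothetical.

```python
def round_half_integral(x_frac):
    """Round a half-integral multistage solution up.

    x_frac[t] maps each vertex to a value in {0, 0.5, 1} at time step t.
    Every value >= 1/2 becomes 1, so each coordinate at most doubles, and a
    change between two consecutive steps costs at most twice the fractional
    change (a fractional move from 0 to 1/2 becomes a full change from 0 to 1).
    """
    return [{v: 1 if val >= 0.5 else 0 for v, val in x_t.items()} for x_t in x_frac]


def is_vertex_cover(edges, chosen):
    """Check that every edge has at least one chosen endpoint."""
    return all(chosen[u] or chosen[v] for u, v in edges)


def changes(seq):
    """Total amount of change between consecutive time steps."""
    return sum(
        abs(nxt[v] - prev[v]) for prev, nxt in zip(seq, seq[1:]) for v in prev
    )


if __name__ == "__main__":
    # Two time steps: a path a-b-c, then a triangle on the same vertices.
    stage_edges = [[("a", "b"), ("b", "c")], [("a", "b"), ("a", "c"), ("b", "c")]]
    x_frac = [{"a": 0, "b": 1, "c": 0}, {"a": 0.5, "b": 0.5, "c": 0.5}]
    x_int = round_half_integral(x_frac)
    print(x_int)
    print(all(is_vertex_cover(e, x) for e, x in zip(stage_edges, x_int)))
    print(changes(x_int), "<=", 2 * changes(x_frac))
```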
3 Rounding scheme
We now tackle problems which are neither monotone nor IP2. We propose a new rounding method, called 2-threshold rounding scheme, which makes it possible to take into account both individual costs of solutions and transition costs in the multistage setting. We first show in Section 3.1 an easy application of this rounding to the $f$-Set Cover problem (Set Cover where each element appears in at most $f$ sets), leading to a $2f$-approximation algorithm for the multistage version of this problem. We then show in Sections 3.2 and 3.3 how to use the 2-threshold rounding scheme for a multistage version of the two problems Prize-Collecting Steiner Tree and Prize-Collecting Traveling Salesman. We obtain respectively a 3.53- and a 3.034-approximation algorithm.
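Let us now introduce the rounding scheme. A classical rounding scheme for a static problem expressed as an ILP is as follows: starting from an optimal solution $x$ of a continuous relaxation, fix a threshold $\alpha$, and fix $\hat{x}_i = 1$ if $x_i \geq \alpha$, and $\hat{x}_i = 0$ otherwise. This gives for instance an $f$-approximation algorithm for $f$-Set Cover. However, this rounding is not suitable for multistage problems as it may induce very large transition costs: if a variable takes the values $x_t = \alpha - \epsilon$ and $x_{t+1} = \alpha + \epsilon$ at two consecutive time steps, then $|\hat{x}_{t+1} - \hat{x}_t| = 1$ while $|x_{t+1} - x_t| = 2\epsilon$.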
To overcome this, we introduce the following 2-threshold rounding scheme.
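(2-threshold rounding scheme) Let $x = (x_1, \dots, x_T)$ be a sequence of $T$ values in $[0,1]$, and two parameters $\alpha, \beta$ with $0 \leq \alpha < \beta \leq 1$. Then RS$_{\alpha,\beta}(x)$ is the sequence $(\hat{x}_1, \dots, \hat{x}_T)$ of $T$ values in $\{0,1\}$ defined as:
If $x_t \geq \beta$ then $\hat{x}_t = 1$.
If $x_t \leq \alpha$ then $\hat{x}_t = 0$.
Consider a maximal interval $[t_1, t_2]$ with $\alpha < x_t < \beta$ for all $t \in [t_1, t_2]$. Then: if simultaneously (1) $t_1 = 1$ or $x_{t_1 - 1} \leq \alpha$, and (2) $t_2 = T$ or $x_{t_2 + 1} \leq \alpha$, then fix $\hat{x}_t = 0$ for all $t \in [t_1, t_2]$. Otherwise ((1) or (2) is not verified) then fix $\hat{x}_t = 1$ for all $t \in [t_1, t_2]$.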
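The following sketch implements one reading of the 2-threshold rounding scheme: values at or above the upper threshold become 1, values at or below the lower threshold become 0, and a maximal run of intermediate values is set to 0 only when it is bounded on both sides by the start or end of the sequence or by a value at or below the lower threshold. The function names are hypothetical, and the treatment of intermediate runs is an assumption consistent with how the scheme is used in the proofs of this section.

```python
from typing import List, Sequence


def two_threshold_rounding(x: Sequence[float], alpha: float, beta: float) -> List[int]:
    """Round a sequence of values in [0, 1] with thresholds alpha < beta."""
    if not 0 <= alpha < beta <= 1:
        raise ValueError("need 0 <= alpha < beta <= 1")
    n = len(x)
    rounded = [0] * n
    t = 0
    while t < n:
        if x[t] >= beta:
            rounded[t] = 1
            t += 1
        elif x[t] <= alpha:
            rounded[t] = 0
            t += 1
        else:
            # Maximal run [t1, t2] of values strictly between the thresholds.
            t1 = t
            while t < n and alpha < x[t] < beta:
                t += 1
            t2 = t - 1
            left_low = t1 == 0 or x[t1 - 1] <= alpha
            right_low = t2 == n - 1 or x[t2 + 1] <= alpha
            value = 0 if (left_low and right_low) else 1
            for s in range(t1, t2 + 1):
                rounded[s] = value
    return rounded


def single_threshold_rounding(x: Sequence[float], threshold: float) -> List[int]:
    """Classical rounding, shown only for comparison."""
    return [1 if v >= threshold else 0 for v in x]


def jumps(seq: Sequence[int]) -> int:
    """Number of changes between consecutive values."""
    return sum(abs(a - b) for a, b in zip(seq, seq[1:]))


if __name__ == "__main__":
    oscillating = [0.49, 0.51, 0.49, 0.51]
    print(single_threshold_rounding(oscillating, 0.5))     # changes at every step
    print(two_threshold_rounding(oscillating, 0.25, 0.75)) # no change at all
    print(two_threshold_rounding([0.1, 0.4, 0.6, 0.9, 0.5, 0.1, 0.3, 0.0], 0.25, 0.75))
```

On the oscillating sequence the classical rounding changes value at every step although the fractional values barely move, whereas the two-threshold variant only changes value when the fractional sequence travels all the way from one threshold to the other.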
As we will see, these two thresholds allow to bound both transition costs and individual costs of solutions when rounding.
3.1 Rounding scheme for $f$-Set Cover
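We consider the Set Cover problem where we are given a ground set $U$, and a collection $\mathcal{S} = \{S_1, \dots, S_m\}$ of subsets of $U$. Each set $S_i$ has a nonnegative weight $w_i$. The goal is to find a subcollection of $\mathcal{S}$ of minimum weight such that each element of the ground set is in at least one of the chosen sets. The $f$-Set Cover problem corresponds to instances where each element is in at most $f$ sets. Note that for any $\epsilon > 0$, $f$-Set Cover is not $(f-1-\epsilon)$-approximable if $P \neq NP$ [DinurGKR05], and not $(f-\epsilon)$-approximable under UGC [KhotR08]. In the sequel we show that a general multistage version of the problem is $2f$-approximable.
In the multistage version, we consider that a set $S_i$ may change over time (it may contain an element $e$ at time $t$ but not at time $t'$), and its weight may also change: we denote by $S_{t,i}$ the set at time $t$, and by $w_{t,i}$ its weight. We get a penalty $M$ if we change our decision about set $S_i$ between time $t$ and $t+1$.
We write this problem as an ILP, where $x_{t,i} = 1$ if set $S_{t,i}$ is taken at time $t$, 0 otherwise, and $z_{t,i} = 1$ if $x_{t,i} \neq x_{t+1,i}$, 0 otherwise.
$$\min \sum_{t=1}^{T} \sum_{i=1}^{m} w_{t,i}\, x_{t,i} + M \sum_{t=1}^{T-1} \sum_{i=1}^{m} z_{t,i}$$
$$\text{s.t.} \quad \sum_{i \,:\, e \in S_{t,i}} x_{t,i} \geq 1 \quad \forall t, \forall e \in U; \qquad z_{t,i} \geq x_{t+1,i} - x_{t,i}, \quad z_{t,i} \geq x_{t,i} - x_{t+1,i} \quad \forall t < T, \forall i; \qquad x_{t,i}, z_{t,i} \in \{0,1\}.$$
The first constraint ensures that each element is covered at each time step. In $f$-Set Cover instances, this constraint involves at most $f$ variables. Notice that we can also allow the ground set to change.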
Let us consider the following algorithm RS-MSC (Rounding Scheme for multistage $f$-Set Cover):
Compute an optimal (fractional) solution of the relaxation of the ILP formulation.
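For each set $S_i$, apply the 2-threshold rounding scheme on $(x_{1,i}, \dots, x_{T,i})$ with parameters $\alpha = 1/(2f)$ and $\beta = 1/f$, i.e., let $(\hat{x}_{1,i}, \dots, \hat{x}_{T,i}) =$ RS$_{\alpha,\beta}(x_{1,i}, \dots, x_{T,i})$. This defines a solution where $S_i$ is taken at time $t$ iff $\hat{x}_{t,i} = 1$.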
RS-MSC is a $2f$-approximation algorithm for the multistage $f$-Set Cover problem.
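We first show that the solution is feasible: for each element $e$ and each time step $t$, there exists a set $S_{t,i}$ containing $e$ such that $x_{t,i} \geq 1/f$ (feasibility constraint, which sums at most $f$ variables to at least 1). Then the rounding scheme fixes $\hat{x}_{t,i} = 1$, i.e., $S_i$ is taken at time $t$ and $e$ is covered.
Now let us show the approximation ratio. First notice that if $\hat{x}_{t,i} = 1$ then $x_{t,i} \geq \alpha = 1/(2f)$, so we have $\hat{x}_{t,i} \leq 2f\, x_{t,i}$ for any $t, i$. Then $\sum_{t,i} w_{t,i}\, \hat{x}_{t,i} \leq 2f \sum_{t,i} w_{t,i}\, x_{t,i}$.
It remains to bound the transition cost of the solution. Let $M$ denote the penalty for changing the decision about a set. In the computed solution, $\hat{x}_{\cdot,i}$ jumps (once) from 0 to 1 only if $x_{t_1,i} \leq \alpha$, $x_{t_2,i} \geq \beta$ for some $t_2 > t_1$ and $\alpha < x_{t,i} < \beta$ for all $t_1 < t < t_2$ (or $t_2 = t_1 + 1$). But then, the global transition cost of $x$ on the period between $t_1$ and $t_2$ is at least $(\beta - \alpha) M = M/(2f)$, while the transition cost of $\hat{x}$ is $M$ (one jump from 0 to 1).
Similarly, $\hat{x}_{\cdot,i}$ jumps (once) from 1 to 0 only if $x_{t_1,i} \geq \beta$, $x_{t_2,i} \leq \alpha$ for some $t_2 > t_1$ and $\alpha < x_{t,i} < \beta$ for all $t_1 < t < t_2$ (or $t_2 = t_1 + 1$). Again, the global transition cost of $x$ on the period between $t_1$ and $t_2$ is at least $(\beta - \alpha) M = M/(2f)$, while the transition cost of $\hat{x}$ is $M$. Globally, the transition cost of the computed solution is at most $2f$ times the one of the optimal (continuous) solution. ∎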
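The rounding step of RS-MSC can be sketched as follows, assuming a feasible fractional solution of the relaxation is already at hand (solving the LP is outside this sketch). The thresholds $1/(2f)$ and $1/f$ are the ones suggested by the proof above; the function names, the data layout and the handling of intermediate runs in the rounding routine are assumptions made for illustration.

```python
def two_threshold_rounding(x, alpha, beta):
    """One reading of the 2-threshold scheme (see the definition in Section 3)."""
    n = len(x)
    rounded = [0] * n
    t = 0
    while t < n:
        if x[t] >= beta:
            rounded[t] = 1
            t += 1
        elif x[t] <= alpha:
            t += 1
        else:
            t1 = t
            while t < n and alpha < x[t] < beta:
                t += 1
            t2 = t - 1
            left_low = t1 == 0 or x[t1 - 1] <= alpha
            right_low = t2 == n - 1 or x[t2 + 1] <= alpha
            if not (left_low and right_low):
                for s in range(t1, t2 + 1):
                    rounded[s] = 1
    return rounded


def rs_msc_round(x_frac, f):
    """x_frac[i] is the fractional sequence (over time) for set i.

    Note: with values coming from an LP solver, comparisons against 1/f may
    need a small numerical tolerance; none is applied in this sketch.
    """
    alpha, beta = 1 / (2 * f), 1 / f
    return [two_threshold_rounding(seq, alpha, beta) for seq in x_frac]


def covers(sets_at_t, chosen_at_t, ground):
    """Check that the chosen sets cover every element of the ground set."""
    covered = set()
    for s, taken in zip(sets_at_t, chosen_at_t):
        if taken:
            covered |= s
    return ground <= covered


if __name__ == "__main__":
    ground = {"a", "b", "c"}
    sets = [{"a", "b"}, {"b", "c"}, {"a", "c"}]     # each element is in f = 2 sets
    # Hand-made feasible fractional solution over two time steps.
    x_frac = [[0.5, 1.0], [0.5, 0.2], [0.5, 0.8]]
    x_int = rs_msc_round(x_frac, f=2)
    print(x_int)
    for t in range(2):
        print(covers(sets, [row[t] for row in x_int], ground))
```

Here the sets do not change over time, which is a simplification of the general model described above.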
3.2 Prize-Collecting Steiner Tree
In this section, we show how to use the 2-threshold rounding scheme as an important ingredient in order to derive approximation algorithms for the Prize-Collecting Steiner Tree problem. In the (static) Prize-Collecting Steiner Tree problem we have: a graph , a root , each edge has a cost , each vertex has a penalty . We assume the graph to be complete (w.l.o.g., by putting large weights). Given a subset of edges , we denote the set of vertices connected to the root by this set of edges. The value associated to is . Given that costs are nonnegative, there always exists an optimal solution which forms a tree (with the root inside).
We formulate the (static) problem with the following ILP, where if is taken in the solution, and if is connected to the root. We denote by the set of sets of vertices with and . is the set of edges with one endpoint in and one endpoint outside .
The constraint ensures that if we choose to connect to the root (), at least one edge goes outside .
We consider a multistage version where we have a transition cost induced by modifying our decision about vertex between time steps and . Namely, there is a cost if we connect to the root at time step , but not at time step , or vice-versa. We formulate the problem as an ILP as follows: if edge is taken at time , if is connected to at time , if (transition cost).
Let us consider the algorithm RS-MPCST (Rounding Scheme for Multistage Prize-Collecting Steiner Tree) which works as follows.
Find an optimal solution to the relaxation of the ILP where variables are in .
Apply the rounding scheme RS to each sequence of variables with parameters and (to be specified), let be the corresponding (integer) values. At this point, we have decided for each time which vertices we want to connect to the root (those for which ).
For each , apply the algorithm ST-Algo of [GW] for the Steiner Tree problem on the instance whose costs on edges are the ones at time , and the set of terminals is (those we have to connect). This gives the set of edges chosen at time .
The algorithm ST-Algo has the following property [GW]: it computes an (integral) solution the value of which is at most times the value of a (feasible) solution of a dual formulation of the problem. More precisely, consider the following ILP formulation of an instance of Steiner Tree, where is the set of terminals, and is the set of sets such that and .
Associating a variable to each , the following LP is the dual of the relaxation (LP-ST) of (ILP-ST) where we relax as .
Then ST-Algo computes a feasible solution of the dual and an integral solution of the primal such that .
With parameters $\alpha = 1/2$ and $\beta = 3/4$, RS-MPCST is a 4-approximation algorithm.
We note that the LP can be solved in polynomial time by the Ellipsoid method. The separation problem for the constraints is reduced to the Min Cut problem and can be solved in polynomial time.
By solving a Steiner Tree problem in step 3, we connect at each time step the vertices we chose (during step 2) to connect to , so the solution is clearly feasible.
Let be the computed solution ( being immediately deduced from ). First, we note that in the rounding scheme, if then , so for any , and we can bound the loss on the cost of not connecting some vertices:
Now, as for the case of Set Cover, a variable jumps (once) from 0 to 1 only if , for some and for all (or ). But then, the global transition cost of on this period between and is at least , while the transition cost of is . The same argument holds for the jumps of from 1 to 0. Then we can bound the loss on the transition costs:
Now we have to deal with the cost of taking edges. We consider some time step . Since is feasible, for all , all :
Now, since if , we have . Then
Considering the formulation (LP-ST) of the Steiner Tree problem where the set of terminals is , we have that for all containing at least one terminal (one with ), and not containing . In other words, is a (continuous) feasible solution for (LP-ST) (with ). By duality, its value is at least the value of any dual feasible solution :
Since ST-Algo computes a solution of cost at most , we get, for all :
In all, the computed solution has ratio at most $1/(1-\beta)$ for the cost of not connecting vertices, ratio at most $1/(\beta-\alpha)$ for the transition costs, and ratio at most $2/\alpha$ for the cost of edges. By choosing $\alpha = 1/2$ and $\beta = 3/4$, we get ratio at most 4 in each case. ∎
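The choice of thresholds in the proof can be checked with a short computation. The sketch below assumes the three ratios take the form $1/(1-\beta)$ (penalties of unconnected vertices), $1/(\beta-\alpha)$ (transition costs) and $\rho/\alpha$ (edge costs), where $\rho$ is the guarantee of the Steiner Tree subroutine relative to the LP; these expressions are a reading of the argument above, and the function names are hypothetical. Equating the three ratios gives $\alpha = \rho/(\rho+2)$ and $\beta = (\rho+1)/(\rho+2)$, with common value $\rho + 2$.

```python
from fractions import Fraction


def balanced_thresholds(rho):
    """Thresholds (alpha, beta) equalizing the three assumed ratios."""
    rho = Fraction(rho)
    alpha = rho / (rho + 2)
    beta = (rho + 1) / (rho + 2)
    return alpha, beta


def ratios(alpha, beta, rho):
    """(penalty ratio, transition ratio, edge ratio) under the assumed forms."""
    return (1 / (1 - beta), 1 / (beta - alpha), Fraction(rho) / alpha)


if __name__ == "__main__":
    alpha, beta = balanced_thresholds(2)   # rho = 2 assumed for ST-Algo
    print(alpha, beta)                     # 1/2 3/4
    print(ratios(alpha, beta, 2))          # all three equal 4
```

Exact rational arithmetic is used so that the equality of the three ratios can be checked without floating-point error.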
Improvement via (de)randomization
Following an approach used for the classical Prize-Collecting Steiner Tree problem (see [WilliamsonBook]), we give here a randomized algorithm for the multistage Prize-Collecting Steiner Tree problem leading to a better (expected) ratio. We then show how it can be derandomized.
The randomized algorithm is the same as RS-MPCST except that we choose at random uniformly from the range where , and we setis the constant We have
From (2) and linearity of expectations, we obtain:
Finally, from (3) we have
Thus, the expected cost of the obtained solution is no greater than
Now, let us deal with derandomization. The random selection of determines the splitting of the values into three sets , and Since any possible value of corresponds to a splitting of the set of values into three sets. It is clear that there are at most such partitions ( or taking one of the at most values ). Let us call I-RS-MPCST (for improved RS-MPCST) the corresponding derandomized algorithm. Following the above discussion, we get:
I-RS-MPCST is a (deterministic) 3.53-approximation algorithm.
The derandomization can be done in a more efficient way, by considering no more than different values of For each , we set and Let Consider an arbitrary We show that there exists such that and Let and We have and If
then we set so It is clear that From (6) we have
Hence, and If
then we set Then and we obtain that From (7) we have
It follows that and
3.3 Prize-Collecting Traveling Salesman
In this section we consider the Prize-Collecting Traveling Salesman problem. We have a complete graph , a depot , each edge has a cost , each vertex has a penalty . We assume that the vertex must be in the tour, i.e. the tour starts and ends at vertex . The edge costs are assumed to satisfy the triangle inequality. In the Prize-Collecting Traveling Salesman problem, it is required to find a tour that visits a subset of the vertices such that the length of the tour plus the sum of penalties of all vertices not in the tour is as small as possible. We consider the multistage version of the problem in which the costs of edges and the penalties may change over time. Additionally, we have a transition cost induced by modifying our decision about vertex between time and . Namely, we pay a cost if we visit at time step but not at time step , or vice-versa.
We adapt the ILP for the Prize-Collecting Traveling Salesman problem introduced in [Bienstock]. Let be the complete graph resulting from by adding a dummy vertex . We set and for all . Let if edge is in the tour at time and zero otherwise, if is in the tour at time and zero otherwise, if (transition cost).