The inventory problem studied in this paper captures a number of related models studied in the supply chain literature. One of the simplest is the dynamic economic lot size model . Here we have varying demand for a single item over time units. Demand at time or later can be satisfied by an order at time (but not vice versa). For each day, there is a per-unit cost for holding the items in storage; there is also a fixed setup cost for ordering any positive quantity of the item, which is the same for each day. We want to decide on how many items to order on each day so as to minimize the total ordering and holding cost.
The joint replenishment problem (JRP) generalizes this problem to multiple items. We now have a unique holding cost for each day and each item, and a per item setup cost for ordering any quantity of that item. Furthermore, there is a general setup cost for ordering any items at all. This setup cost structure is called additive. While having limited expressive power in comparison to some of the more complex generalizations that have been studied, the additive joint replenishment problem is long known to be NP-hard . This problem has attracted considerable attention from the theory community in the past, resulting in a line of progressively stronger approximation algorithms [12, 13], the best of which gives an approximation ratio of 1.791 .
A more general version of this problem uses an ordering cost structure in which the setup cost is a submodular function of the items ordered. This model, introduced by Cheung et al. , is called the submodular joint replenishment problem (SJRP) and captures both the additive cost structure as well as other sensible models. In the same work, the authors give constant factor approximation algorithms for special cases of submodular cost functions, such as tree cost, laminar cost (i.e., coverage functions where the sets form a laminar family) and cardinality cost (where the cost is a concave function of the number of distinct items). For the general case, they provide an approximation algorithm.
In the inventory routing problem (IRP) setup costs are routing costs in a metric space. There is a root node and every item corresponds to a point in the metric. The setup cost for a given item set is then given by the length of the shortest tour visiting the depot and every item in the set. The usual interpretation of the model is that the root node represents a central depot and every other point in the metric represents a warehouse to be supplied from the depot. (To streamline terminology with the joint replenishment problem, we will keep using the term “item” rather than “location”.)
The IRP has been extensively studied in the past three decades (see  for a recent survey), but primarily from a computational perspective. But very little is known about its approximability. Fukunaga et al.  presented a constant factor approximation under the restriction that orders for a given item must be scheduled periodically. This restriction appears to significantly simplify the construction of an approximation algorithm, as prior to this work the best known polynomial time approximation algorithms gave (incomparable)  (obtained by combining a constant factor approximation ratio for trees with standard metric embedding arguments) and  performance guarantees.
Nagarajan and Shi  break the logarithmic barrier for both IRP and SJRP, under the condition that holding costs grow as a fixed degree monomial. This is a very natural restriction; in particular it captures the case where holding an item incurs a fixed rate per unit per day, depending only on the item. Building on their approach, we improve exponentially on their approximation factor. We also provide some general techniques to turn (sub)logarithmic approximation algorithms in terms of into equivalent algorithms in terms of ; and we are able to obtain results without restriction on the holding costs. Our main contributions are summarized in the following theorems.
There is a polynomial time -approximation algorithm for the inventory routing problem.
There is a polynomial time -approximation algorithm for the submodular joint replenishment problem.
We also mention the works on submodular partitioning problems [4, 5, 8]. In these problems, a ground set must be partitioned across different sets to minimize a submodular cost function. They use rounding of a relaxation based on the Lovász extension to unify and improve several prior results. Their approach inspired our use of the Lovász extension in the rounding algorithm for SJRP.
2 Preliminaries, model and technical overview
We use for the base logarithm. We write for , and for , for any integers . The support of a function is the subset of for which maps to nonzero values.
The general framework of the inventory problems we investigate is defined by a set of items of size , ordering cost function and a time horizon . We will assume throughout this paper that is monotone and subadditive, with . We will typically refer to the atomic time units as days.
For each item and day , there is a demand . The collection of item-day pairs for which there is positive demand is denoted . Demand for day can be satisfied on or before day . If we satisfy demand for item on day using an order on some day , we need to store the items in the intervening days, and we pay a holding cost of per unit we store. The magnitude of the demand only plays a role in the holding cost; the ordering cost is determined by the unique items ordered and is independent of how many units are ordered.
Given these inputs, we need to place an order for items to be delivered on each day so as to minimize the total ordering cost plus the holding cost. Since the cost of delivering an item does not depend on the size of the order and we want to store items as briefly as possible, it is always optimal to deliver just enough units of an item to satisfy demand until the next order for that item is scheduled. Hence, once we decide which items to order on which days, the optimal schedule is completely determined.
The inventory routing problem is the special case where we have a metric on , some distinguished root node , and the ordering cost of a set is the minimum possible length of a tree containing . Here we differ from the usual definition, where is defined to be the length of a shortest tour on ; but as is well known, these two definitions differ only by a factor of at most 2, which will not concern us. The submodular joint replenishment problem is the special case where is submodular (in addition to the required properties already listed).
An integer programming formulation for this problem is given in (1). Here the variable indicates whether item set is ordered on day , and indicates whether the demand for item on day is satisfied by an order on day .
Let (lp) denote the LP relaxation obtained by replacing the integrality constraints of ILP (1) by nonnegativity constraints; this LP has an exponential number of variables. To efficiently solve (lp), it suffices to provide an efficient separation oracle for the dual. This can be done for both SJRP and IRP (in the latter case, in an approximate sense); see .
Nagarajan and Shi  show that in order to round (lp), it suffices to round an associated covering problem. Essentially, given a solution to (lp), we require that each demand is served within an interval , where is the median of the distribution . Serving anywhere within this interval will incur cost at most twice what the fractional solution pays; and moreover, they show that enforcing this restriction cannot make the optimal solution more than a constant factor more expensive. The holding costs can then be dropped from the objective function. All in all, we obtain an instance of the following subadditive cover over time problem: for each item we are given an associated set of demand windows . We must choose a subset for each day such that every item is covered in each of its demand windows—that is, for some , for each . The goal is to find a feasible solution minimizing the total cost .
We also associate the canonical LP (2) with the subadditive cover over time problem.
Our goal, given a fractional solution to this LP, is to round it to an integral solution. Note that the instance of subadditive cover over time is constructed from a solution to (lp) in such a way that is already a feasible fractional solution to (2).
We now come to our first contribution. We show that this reduction can be taken much further: we can reduce to covering problems where the set of intervals have a very special structure. This structure, which we call left aligned, shares many of the benefits of a laminar family. For example, they have a natural notion of depth, which is always logarithmically bounded by ; the approximation factors of our algorithms are essentially logarithmic in this depth. We describe this reduction, which is rather general and applies identically to both SJRP and IRP, in Section 3. We also show, again generally, how the time horizon can be polynomially bounded in terms of the number of items .
So to obtain our main theorems, it suffices to give -factor approximation algorithms for these well-structured covering variants of SJRP and IRP. Here the approaches diverge; we give rather different algorithms for these two problems, albeit based on iterative rounding  in both cases.
For IRP, the algorithm uses randomized iterative rounding  of a certain path-based relaxation. We can show that after sampling every path in the support of the relaxation times, we can remove a constant fraction of the edges and reorder the remaining paths such that we retain a feasible solution. Details are given in Section 4.
For SJRP, by contrast, the iterative rounding approach is naturally deterministic in nature. Instead of randomly rounding item sets in the support of a relaxation, we carefully try to pick a set for each day such that we win a constant fraction of its cost back in the subsequent reduction of the cost of the relaxation. If such a set cannot be found, we show that we can shrink the time horizon by merging some adjacent time units (or put differently, we are able to remove the bottom “leaf” layer of the left-aligned family); we can then recurse on the smaller instance. We discuss this further in Section 5.
3 Reducing to structured covering problems
The results of this section will not use any properties of the ordering cost function that differ between IRP and SJRP. All that we will need, other than the general properties of assumed at the start of Section 2, is that we are given an (approximately) optimal solution to the LP relaxation of (1).
Let denote the family of dyadic intervals over the nonnegative integers; the value of for one of these intervals in we call the level of that interval.
A family of intervals is called
left aligned if for all there exists with ,
right aligned if for all there exists with .
The level of an interval is the level of the minimal interval of containing .
Observe that if is both left and right aligned, then it is a laminar family. An example of a left-aligned family is shown in Figure 1.
We will call an instance of subadditive cover over time left (right) aligned if is left (right) aligned.
At the loss of a constant factor, we can reduce an instance of the subadditive cover over time problem to a pair of instances, one left aligned and the other right aligned. 333See Figure 2 for an illustration.
Let be a solution to (2). We will first generate two new instances of the subadditive cover over time problem, one being left aligned and the other right aligned.
Given an interval , define the right-aligned part and the left-aligned part by
where are integers such that and is maximal. If , then , and if then by convention. It is clear from this definition that forms a left-aligned family, and similarly the right-aligned parts form a right-aligned family.
Any LP solution must cover every item by at least half in either the right-aligned or left-aligned part of its demand window. For each and demand window , if receives half a unit of coverage under , add as a time window for in the left-aligned instance; otherwise put in the right-aligned instance.
It is immediate from the way in which we constructed the two instances that is a feasible solution to each. Hence the combined cost of the optimal solutions to the LP relaxations of the generated instances is at most 4 times that of the original instance. Furthermore, we can translate integral solutions to the left and right aligned instances back to one for the original instance by adding them together, which does not increase the cost by subadditivity of . ∎
Right-aligned instances can be handled identically to left-aligned ones, so we consider only left-aligned instances in the sequel.
3.1 Bounding the time horizon
Here we will show how to reduce the problem to the case where the time horizon is bounded by the size of the item set . This allows us to focus on proving approximation ratios in terms of , as they carry over to approximation ratios in terms of without further adjustments.
At the loss of a constant factor we can reduce a left-aligned subadditive cover over time problem to a polynomial-sized collection of left-aligned subadditive cover over time problems with time horizons equal to .
This theorem could already be used to improve the dependence of the Nagarajan-Shi algorithm from to .
The remainder of this subsection is devoted to proving this theorem. We will sometimes write as shorthand for in what follows.
First, we show that the cost of singleton items are not too far apart.
At the loss of a constant factor, we can reduce the left-aligned subadditive cover over time problem to a collection of left-aligned subadditive cover over time problems with .
We will iteratively split the item set into a sequence of sets with items of descending singleton cost . Initially, let . Then, for and while , let . Remove all items with from and add them to .
Now each set induces an instance of the required form. We claim that if we treat all these instances separately and take the union of their schedules the resulting schedules cost at most three times the original optimum. The following construction gives a set of schedules that satisfies this bound.
Take an optimal solution to the original instance and generate a schedule for each instance by selecting item set on day .
On each day , the total cost of the resulting schedule is dominated by the cost of two item sets where is the smallest index for which is nonempty. To be precise, it holds that:
Here, in the first inequality we use subadditivity of . For the second, we use the definition of the cutoff value for items that are removed from and put in . In the last inequality, we use that and therefore .
Then, using , , and , we get that:
In particular, the cost of the union of the schedules is at most
as required. Finally, we note we did not touch the demand windows and therefore the instances generated are also left aligned. This finishes the proof. ∎
We also need the following sparsity result on near-optimal solutions to LP relaxation (2) of the subadditive cover over time problem. It allows us to assume that every day, the fractional coverage of the sets in the LP solution adds up to at least a full unit, or else it covers no items at all. The intuition for this is that if it is not the case, we can group time intervals with less than a unit coverage, and copy all coverage inside those intervals to the endpoints.
Take a solution to (2). Then there exists a solution whose cost is within twice the cost of , so that for each day ,
Copy to a new solution . Call a day bad if and good otherwise. We iteratively fix , maintaining that at each point there exists an index such that the first days are good and .
Starting with , suppose the first days are fixed. Find the first day that is bad, and let be the first index such that the fractional coverage of sets on the interval sums to at least one, i.e., , if such an index exists. We will deal with the other case, i.e., later.
Since is the minimal, it must be that fractionally covers strictly less than a full set on the interval . It follows that every node whose time window overlaps with must have or . So, we can copy all coverage inside in to each of the endpoints and and retain a feasible solution, i.e.,
After doing this all days up to are good, and the additional cost can be charged against , so we maintain the cost bound. We only need to deal with the edge case where . In this case, any node whose time window overlaps with , must have , otherwise it could not receive at least a unit coverage under . Hence, we can copy all coverage to without losing feasibility:
The cost of this action can be charged against . Since, we already know is a good day and we increase its coverage, we only need to worry about days to , which get set to and thus become good. So, we conclude all days are good and costs at most twice the cost of , finishing the proof. ∎
Proof of Theorem 3.3.
By Lemma 3.4 we can reduce a left-aligned instance to a collection of instances with . By Lemma 3.5, we may assume we have for each of those instances an LP solution with or for all . At this point, we might as well delete the days with from the instance. So, we can assume that for all .
Suppose that we simply schedule the entire set at time . By Lemma 3.4, this costs at most
By Lemma 3.5, on the other hand, the cost of the LP solution for any day is at least
So, the LP solution for the first days costs at least . This means that we can charge the cost of covering all elements at day against the LP, losing only a constant factor in the cost of the solution.
If we cover all elements on day , we effectively reset the instance and we only need a schedule for the first days. Hence, we can delete any remaining time periods for the instance and we complete the proof. ∎
3.2 Further simplifications
Finally, we make some assumptions that are mainly useful to reduce notation in the coming sections. We assume that every item has only one time for which . If not we make multiple copies of each item, one for every day with positive demand. Theorem 3.3 ensures that this only increases the number of items by at most a factor of . We also assume that is of the form for , so that is a positive integer. If this is not the case, we can simply round up . Our assumptions are summarized by the following definition.
An instance of subadditive cover over time is nice if
the time windows form a left-aligned family,
each item has positive demand on exactly one day, and
for some .
4 Steiner tree over time
So in order to prove Theorem 1.1, we need to give an -approximation algorithm to nice instances of subadditive cover over time, for the appropriate class of order functions . More precisely, is the set of nodes of a semimetric space with distance function ; a root node is specified, and for all other nodes , a time window is given. Since denotes the cost of a cheapest tree containing , we will consider a solution to be described by a collection of trees , all containing the root. To be feasible, each node must be contained in for some . The cost of a tree (i.e., the sum of the length of its edges) is denoted ; the objective is to minimize the total cost . We will call this the Steiner tree over time problem.444 The Steiner tree over time problem should not be confused with the problem known in the literature as the minimum spanning tree with time windows problem . This also involves a root and a set of nodes with time windows, but the goal is to construct a single tree, and to provide an order on the nodes such that a vehicle travelling at unit speed on the tree (optionally waiting at nodes) can visit the nodes within their time windows.
For ease of notation, we associate with each node a level in the natural way, namely as the level of in the left aligned family .
4.1 Fractional path relaxation
The main part of our result works by iteratively massaging a specific type of fractional solution until it becomes integral. We now describe this type of solution.
We let denote the collection of directed paths in . For each such directed path , let signify that connects to , i.e., if there is a directed subpath on from to a node in the tree containing the root on day . If we let hold for all by convention.
A fractional path solution (FPS) is a pair , giving for each day a tree rooted at and weights , with the property that
The cost of the fractional path solution is given by the sum of the cost of the trees and the cost of the paths weighted by :
Note that a fractional path solution with for all and yields a feasible solution to the Steiner tree over time problem. Moreover, we can start with a solution to (2) and convert it to a fractional path solution at the loss of a constant (or more directly, we can solve a compact LP corresponding to fractional path solution). To do this, start by initializing all trees to contain only the root. Then for each day and set in , construct a minimum spanning tree on , and use this to define a path to that contains and has cost at most twice the cost of this spanning tree (simply shortcut the doubled tree). Add to the solution with weight .
Hence, we focus on turning a fractional path solution into an integral one without losing too much in the cost of the solution.
We need some preliminary notation and definitions. Given a directed path , we use and to denote the node set and edge set, respectively. We similarly define and for a tree . The head of , denoted , is the unique node in without an outgoing edge; similarly is the unique node of without an incoming edge. Given a collection of edges , we denote by the collection of directed paths obtained by deleting all edges in from , disconnecting into multiple subpaths (we allow paths consisting of only a single node). Given an edge , we use to denote the path in containing . Given a tree and path with in , adding to (which we may write as ) results in a spanning tree of the union of and . In particular, is a tree, costing no more than , and spanning .
The rounding algorithm will consist of a number of iterations. Each iteration will increase the size of the integral part , while reducing the size of the fractional part , until the solution is entirely integral. We will ensure that the cost increase in the integral part is an factor times the cost decrease in the fractional part, which clearly yields the required approximation guarantee.
Each iteration of the rounding scheme involves two steps. The first step is, for each , to independently sample the paths according to the weights , upscaled by a factor ; is a fixed constant we will choose later. These paths are added (one by one) to . The total cost increase due to this step is , where is the total cost of the fractional part. Our goal will now be to adjust the fractional paths in a way that reduces the total fractional cost by a constant factor, while maintaining feasibility. This will clearly lead to our desired approximation ratio: each iteration we pay times the decrease in the total fractional cost, and once the total fractional cost reaches zero, we have an integral solution.
The main operation that the algorithm will perform in order to achieve this is a “split and shift” operation. Let be the fractional path solution solution after the above sampling step. Let be some path in . Our goal will be to
(split) remove some edges from , resulting in a collection of subpaths , which may not have their heads in ; and then
(shift) for each one of these paths , assign its weight to some day , in such a way that now is in , and is still in the time windows of all the nodes in .
This would ensure feasibility, while reducing the fractional cost by
times the total cost of the removed edges. If each edge is removed with constant probability, we obtain the required cost decrease.
In order to make this work, we need some control of the interaction of the different time windows of the nodes on a given path. The most important fact, immediate from the left aligned structure, is the following.
If the time windows of and overlap, with , then .
This means that if we consider a path with head (which might be a subpath of a path in say) where is minimal amongst all nodes in , then any with will be in for all .
The following definition will aid us in shifting always to earlier days, so that the above can be applied.
Given a fractional path solution , for each , let be maximal such that
Then for any , we call the sow phase of and the reap phase of . (Note that the sow and reap phases both contain .)
Let and denote the sow and reap phases of , at the start of this iteration. The idea is that we will first adjust each path in day so that, other than the head, for each on the path. We do this by simply shortcutting past other nodes, which cannot increase the cost. This of course reduces the connectivity of the nodes removed from the path. To fix this, we simply double all weights; since the total connectivity of a node attributable to paths not in the reap phase of that node is at most , this suffices to regain feasibility. Let denote the resulting weights.
At this point, for each path in , for each . This means that if were to be shifted to any time in for some , then this is certainly a shift to an earlier time. We now split by removing all fully redundant edges from it, as per the following definition.
We say that node has germinated if it lies in for some . For and a path , we say an edge is -redundant if the last node on with level has germinated, or if all nodes in have level strictly larger than . An edge is fully redundant if it is redundant for all .
Note that if is fully redundant, then certainly must have germinated, by considering . An important property is the following.
Let be the set of fully redundant edges on a path . Then for any path with , has the largest level amongst all nodes of .
We prove the following, from which the lemma is an immediate consequence.
Given a fully redundant edge and , let be the last node on with . Then the edge leaving is also fully redundant.
Suppose not. Then there is some such that the last node with has not germinated. But since is the last node on with , clearly is the last node on with . This implies that is not fully redundant, a contradiction. ∎
The final step of the iteration proceeds as follows. Consider each and in turn. Let be the collection of paths obtained by removing all fully redundant edges from . Then for each , shift it to the latest day for which .
The solution is still feasible after the final phase of an iteration.
All that remains for the cost bound is to show that each edge is fully redundant with sufficiently large constant probability (enough to counteract the doubling of the weights).
For each , , and , is fully redundant with probability at least , assuming is chosen sufficiently large.
Consider some . Since , standard calculations imply that the probability none of the paths in this sum are sampled is at most , where .
Now consider an arbitrary path for some and an arbitrary . The edge is fully redundant unless it is not -redundant for some . Now, is not -redundant if the last node on with has not germinated; by the above, this happens with probability at most . By a union bound, is not fully redundant with probability at most
By choosing large enough (and hence small enough) this is less than . ∎
To complete the proof of Theorem 1.1, we should observe that the algorithm can be implemented to run in polynomial time. As a consequence of Lemma 4.7, each iteration any unconnected node gets connected to the root with constant probability. It follows that all nodes are connected after iterations, in expectation and with high probability. Implementing each iteration in time polynomial in and is straightforward: we just need to make sure that the path splitting operations does not cause the number of paths in the support of the fractional path solution to explode. This is not a problem though: since every path must contain at least one node, every path in the input fractional path solution can ‘generate’ at most new paths down the line, a polynomial increase.
5 Submodular cover over time
Here we consider the subadditive cover over time problem where in addition to the previously required properties, is submodular(in other words, is a polymatroid set function). We assume throughout that we have a nice instance, and use to denote the single time window for .
Recall that the Lovasz-extension of a submodular function is the function defined as:
Note that corresponds exactly to for integral inputs. It is well-known that the Lovász extension of a submodular function is convex [9, Sec. 6.3]. Hence the following program is a convex relaxation of the submodular cover over time problem.
In fact, it is precisely equivalent to (2). Since is submodular, there is an optimal solution to (2) such that the support of every day is a chain. Then it is easy to verify that is exactly equal to where .
For and , we define the level set
This perspective has a nice graphical interpretation which will prove useful in understanding the proof of an important technical lemma later on, and the algorithm in general. If we plot , for in the interval , then is the area between the graph and the horizontal and vertical axes. See Figure 3 for an example.
Define the truncation by . Observe that
Finally, we introduce the concept of an -supported set in Definition 5.1. For now, think of these as level sets for which setting the corresponding entries to zero would yield a large reduction in the cost of the Lovász extension.
Given and , we say that the set is -supported if:
We will provide some more intuition for this definition later, but first we describe the algorithm.
5.2 The algorithm
The only steps where the coverage for an item could possibly decrease are steps 7 and 10. In step 7, is added to , so this is clearly no problem. In step 10, we are shifting weight from some time to (see Figure 4 for an illustration of this step). This cannot leave the time window of unless . But if has not been covered by the end of iteration , all of the fractional coverage for will have been merged into the left endpoint of its time window . This means that contains for any , ensuring will be added to in step 5 of iteration .
This is where the key insights lie. The main driver is the following technical lemma.
For any and , at least one of the following holds:
there exists such that is -supported.
Before we formally prove this lemma, let us generate some graphical intuition. Recall that is -supported if
In Figure 5, the plots of the Lovász extension on two different inputs are shown. On the left we have Case 1 of the lemma. If we pick the set as indicated, the cost is the area of the light blue square. However, the dark blue area containing covers a large fraction of that cost. On the right, no such set exists. This implies the graph must decrease quite quickly everywhere and therefore lie entirely above some exponential curve. Consequently the cost of the set , which is the light green rectangle, is small compared to , which is the area between the graph and the axes.
Proof of Lemma 5.2.
Let be such that . Note that this implies . Let .
If , there exists such that
Before we prove the claim, let’s see that it implies the lemma. Suppose that , since otherwise we are done. The condition of the claim then holds, so take the smallest that satisfies (5), and let . We claim that
To see this we first rewrite the right hand side as an integral.
Recall that is monotonically decreasing and that so that . Then
In the final inequality of (8) we use that the fact that we chose to satisfy and .
Now we proceed to prove the claim. Suppose for contradiction that the condition of the claim holds but no satisfies inequality (5). Then, in particular it must hold that and therefore we obtain
Since , . Since also , we deduce