 # Star Routing: Between Vehicle Routing and Vertex Cover

We consider an optimization problem posed by an actual newspaper company, which consists of computing a minimum length route for a delivery truck, such that the driver only stops at street crossings, each time delivering copies to all customers adjacent to the crossing. This can be modeled as an abstract problem that takes an unweighted simple graph G = (V, E) and a subset of edges X and asks for a shortest cycle, not necessarily simple, such that every edge of X has an endpoint in the cycle. We show that the decision version of the problem is strongly NP-complete, even if G is a grid graph. Regarding approximate solutions, we show that the general case of the problem is APX-hard, and thus no PTAS is possible unless P = NP. Despite the hardness of approximation, we show that given any α-approximation algorithm for metric TSP, we can build a 3α-approximation algorithm for our optimization problem, yielding a concrete 9/2-approximation algorithm. The grid case is of particular importance, because it models a city map or some part of it. A usual scenario is having some neighborhood full of customers, which translates as an instance of the abstract problem where almost every edge of G is in X. We model this property as |E - X| = o(|E|), and for these instances we give a (3/2 + ε)-approximation algorithm, for any ε > 0, provided that the grid is sufficiently big.



## 1 Introduction

Every morning, a well-known newspaper in Buenos Aires needs to deliver a copy to each subscriber by truck (for confidentiality reasons, we cannot disclose the newspaper’s identity). For now, assume there is only one truck. Traditionally, the truck stops in front of each customer’s house, every time delivering a single copy of the paper. But now, the company thinks there could be a better (that is, cheaper) way to do it: instead of stopping to make a single delivery, the truck will only stop at street crossings, and each time the driver will pick up a pile of copies and deliver them to all customers located on any of the (typically four) adjacent streets. The goal is to minimize the number of blocks traveled by the truck.

We model the city topology as a simple graph, and the set of customers as a subset of edges. In other words, we distinguish blocks that have at least one customer, but we don’t care if there is more than one customer in a single block. If $C$ is a cycle and $X$ is a subset of edges of a simple graph, we say that $C$ covers $X$ if every edge of $X$ has an endpoint in $C$. The formal description of the problem is the following:

• STAR ROUTING

• INSTANCE: A simple graph $G = (V, E)$, a non-empty subset of edges $X \subseteq E$, and a positive integer $K$.

• QUESTION: Does $G$ have a cycle, not necessarily simple, of length at most $K$ that covers $X$?
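The definition can be made concrete with a small certificate checker. This is a minimal sketch, not part of the original text: it assumes a cycle is given as the list of vertices it visits (the walk closes from the last vertex back to the first), and the names `is_closed_walk`, `covers` and `is_star_certificate` are hypothetical helpers introduced only for illustration.

```python
def cycle_length(cycle):
    """Number of edges of a closed walk given by its vertex sequence.

    The walk returns from the last listed vertex to the first one, so k >= 2
    listed vertices give k edges; a single vertex is a walk of length 0.
    """
    return 0 if len(cycle) <= 1 else len(cycle)


def is_closed_walk(cycle, edges):
    """Check that consecutive vertices (cyclically) are adjacent in G."""
    adjacent = {frozenset(e) for e in edges}
    k = len(cycle)
    if k <= 1:
        return k == 1
    return all(frozenset((cycle[i], cycle[(i + 1) % k])) in adjacent
               for i in range(k))


def covers(cycle, customers):
    """A cycle covers X if every edge of X has an endpoint in the cycle."""
    visited = set(cycle)
    return all(u in visited or v in visited for u, v in customers)


def is_star_certificate(edges, customers, bound, cycle):
    """Verify a YES-certificate for the instance (G, X, K)."""
    return (is_closed_walk(cycle, edges)
            and covers(cycle, customers)
            and cycle_length(cycle) <= bound)


# A 4-cycle with two opposite customer edges: walking 0 -> 3 -> 0 covers both.
square = [(0, 1), (1, 2), (2, 3), (3, 0)]
print(is_star_certificate(square, [(0, 1), (2, 3)], 2, [0, 3]))
```

Note that the walk may reuse an edge, which matches the "not necessarily simple" clause of the question.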

Since all edges can be traversed in both directions, STAR ROUTING (or simply STAR) models all streets as two-way streets. Also, note that STAR doesn’t ask about which (or how many) road crossings the truck should stop at and deliver during its journey.

Figure 1(a): A possible set of customers, marked as red dots, on a small part of Boedo neighborhood in Buenos Aires. The light blue area is an arbitrary boundary for the truck.

Figure 2: Cleaned-up version of the STAR instance of Figure 1(b). The arrows show a feasible solution.

Consider the example of Figure 1(a), which represents a real-life setting with subscribers shown as red dots. This is mapped to the STAR instance shown in Figure 1(b). Each block that contains at least one customer is mapped as a red edge, and $X$ is the set of red edges. A feasible solution is presented in Figure 2. Indeed, this cycle of length given by the arrows is a feasible solution because every red edge has one endpoint in the cycle. In contrast, if we wanted to stop precisely at every customer’s address, we would need to go through at least edges: one per red edge, plus more edges to move between the two connected components induced by the red edges. Thus, STAR’s solution is at least better, in terms of number of blocks traversed. This improvement may (or may not) be at the cost of greater overall time to perform the delivery, since now the driver has to walk from street crossings, carrying the newspapers. Clearly, the more packed the customers are, the better this alternative delivery model works, since a single vertex can cover many edges.

Keep in mind that a rigorous comparison between STAR and other delivery models is beyond the scope of this paper, as there are several practical considerations, like street orientation, speed limits or overall transit time, that we are not taking into account. Our focus is on studying STAR’s theoretical properties.

Although newspaper delivery was the original motivation for this problem, it is worth noting that STAR may be applicable in other contexts as well, such as police patrol planning. In general, STAR captures characteristics of situations that resemble covering problems but also involve vehicle routing features.

### Related work.

A remarkable family of problems in combinatorial optimization are those known as vehicle routing problems (VRP). The basic components of a VRP are vehicles that move throughout a network, maybe starting and ending at some depot point, and moving between customers located over the network to deliver some sort of merchandise. The goal is usually minimizing some metric related to the total consumed time or the traveled distance. The origin of these problems can be traced back to the 1954 paper of Dantzig, Fulkerson and Johnson, in which they considered the TSP, which is a particular case of VRP. This work was followed by several other papers about the TSP. Clarke and Wright added more than one vehicle to the problem, which led to the first proper formulation of VRP, though that name was not coined until the work of Golden, Magnanti and Nguyen.

In 1974, Orloff identified a class of routing problems of a single vehicle, which he called GENERAL ROUTING PROBLEM (GRP). The GRP takes a weighted graph $G = (V, E)$, and two sets $V' \subseteq V$ and $E' \subseteq E$, and asks to find a shortest cycle of $G$ that traverses every vertex in $V'$ and every edge in $E'$. This is a generalization of other well-known routing problems, like the CHINESE POSTMAN PROBLEM ($V' = \emptyset$ and $E' = E$), the RURAL POSTMAN PROBLEM ($V' = \emptyset$), and the TSP ($G$ complete, $V' = V$ and $E' = \emptyset$). Notably, the first can be solved in polynomial time, whereas the decision versions of the latter two are NP-complete.

The STAR problem is a simple VRP with a single vehicle fleet, where we want to minimize the delivery cost, which we model as the total distance traveled by the vehicle. However, in contrast with traditional VRPs, an edge containing customers can be covered just by visiting either of its two endpoints, rather than by traveling along it. There are some variants of TSP with a similar flavor to that of STAR, in which the objective is to cover vertices with a more relaxed criterion than standard TSP. One of them is the COVERING SALESMAN PROBLEM (CSP) [5, 16] that takes a directed weighted graph $G$ and a positive integer $k$, and asks to find a minimum-length tour over a subset of vertices of $G$ such that every vertex not in the tour is within distance $k$ of some vertex in the tour. Current and Schilling devised a simple heuristic for this problem, but its performance guarantee cannot be bounded due to the arbitrary weights. Interestingly, our approximation algorithm for the general version of STAR is similar to theirs, but since we assume unit weights we are able to derive a bound on the approximation ratio. Shaelaie et al. presented metaheuristics for the CSP but, once again, they do not provide any theoretical guarantees. Another related problem is the TSP WITH NEIGHBORHOODS (TSPN) [1, 9, 2], that takes a set of regions in the Euclidean plane, and asks for a shortest closed curve that visits each region. We should note that the grid version of STAR, which we will discuss later on, can be reduced to the rectilinear version of TSPN, in which each edge from $X$ is a region, but unfortunately the rectilinear TSPN has not been studied.

To the best of our knowledge, STAR hasn’t been considered before, existing literature has little overlap with it, and it’s the first VRP based on the notion of vertex cover.

### Organization.

In Section 2 we show that STAR is strongly NP-complete, even when the input graph is a grid. In Section 3 we study how well it is possible to approximate the general version of STAR. First, we give a lower bound by showing that STAR is APX-hard. Second, we provide a factory of approximation algorithms, which takes an $\alpha$-approximation algorithm for metric TSP and produces a $3\alpha$-approximation algorithm for STAR. This yields a $9/2$ approximation factor for the general case, and a $(3 + \varepsilon)$ factor when the graph is planar. In Section 4 we develop a $(3/2 + \varepsilon)$-approximation algorithm for grid graphs, assuming there are asymptotically more edges with customers than not and that the grid is large enough. Finally, in Section 5 we state some open problems.

### Notation.

Let $\Pi$ be an optimization problem. Let $I$ be a valid input of $\Pi$. We write $\Pi^*(I)$ for the value of an optimal solution of the problem $\Pi$ for $I$.

If $A$ is a finite set, $|A|$ is the cardinality of $A$. We denote by $K_A$ the complete graph whose set of vertices is $A$.

All graphs we consider in this paper are simple. All cycles and paths we consider are not necessarily simple. If $G$ is a graph, $\tau(G)$ is the cardinality of any minimum vertex cover of $G$. If $X$ is a subset of edges of $G$, $G[X]$ is the subgraph of $G$ induced by $X$. If $P$ is a path of $G$, $\ell(P)$ is the number of edges of $P$ (counting repetitions). If the edges of $G$ have weights given by a function $w$, $\ell_w(P)$ is the sum of the weights of the edges of $P$ (counting repetitions). If $Q$ is another path of $G$ that starts where $P$ ends, $P \cdot Q$ is the path we get from first traversing $P$ and then $Q$. If $u$ and $v$ are two vertices of $G$, $d_G(u, v)$ is the minimum of $\ell(P)$ over every path $P$ between $u$ and $v$. If $p$ and $q$ are two points in $\mathbb{R}^2$, $d_1(p, q)$ is the Manhattan distance between them.

A grid graph with $n$ rows and $m$ columns is the cartesian product of graphs $P_n \times P_m$, where $P_k$ is the path of $k$ vertices. A star graph is a complete bipartite graph $K_{1,k}$, for some $k \geq 1$.

## 2 STAR is hard, even for grids

In this section we show that STAR is NP-complete when we restrict to the class of grid graphs. We call this version of the problem grid STAR. These instances are of practical interest, since grids are the simplest way of modelling a city layout. In particular, the problem is hard for planar graphs and for bipartite graphs, among other superclasses of grid graphs.

To prove completeness, we will reduce from the rectilinear variant of TSP. Recall the TSP takes a set $P$ of elements equipped with weights between each pair of elements, and a positive integer $L$, and asks if there exists a hamiltonian cycle in $K_P$ with total weight $L$ or less. In the rectilinear version, the input is a set $P$ of points in the plane, with positive integer coordinates, and a positive integer $L$, and the question is whether $K_P$ has a hamiltonian cycle with total Manhattan distance length $L$ or less.

The rectilinear TSP is NP-complete. In 1976, Garey et al. proved this, by reducing from EXACT COVER BY 3-SETS (X3C), which takes a family $\mathcal{F}$ of $3$-element subsets of a set $U$ of $3q$ elements, and asks if there exists a subfamily $\mathcal{F}' \subseteq \mathcal{F}$ of pairwise disjoint subsets such that $\bigcup \mathcal{F}' = U$. Since X3C has no numerical arguments, it is strongly NP-complete. The rectilinear TSP instance they build is such that both coordinates of every point in the set $P$, as well as the optimization bound $L$, are bounded by a polynomial in the size of the X3C instance. Thus, rectilinear TSP is strongly NP-complete.

The transformation we will use is similar in flavor to the one devised by Demaine and Rudoy to show that solving a certain puzzle is NP-complete.

###### Theorem 2.1

Grid STAR is strongly NP-complete.

###### Proof

Given a cycle of $G$, it’s easy to check in polynomial time if it covers all edges in $X$, and if it has length $K$ or less. Thus, the problem is in NP.

Now we reduce from rectilinear TSP. Let $P = \{p_1, \dots, p_n\}$ and the bound $L$ be an instance of rectilinear TSP. Let $M$ be the maximum coordinate of any point in $P$, so that all points lie in the rectangle $[1, M] \times [1, M]$. Let $c = 2(n + 1)$. We will build a grid graph $G = (V, E)$ by taking the rectangular grid of points with lower left corner at $(1, 1)$ and upper right corner at $(M, M)$, and expanding it by a factor of $c$. Formally, if $R = [c, cM] \times [c, cM]$, then $V$ is the set of all points with integer coordinates in $R$, and $E$ is the natural set of edges we need to produce a grid out of $V$. Note that $c\,p_i \in V$ for every $i$. That is, multiplying by $c$ we map points from $P$ to $V$.

Let $e_i$ be any edge incident to $c\,p_i$ in $G$, and let $X = \{e_1, \dots, e_n\}$. Finally, let $K = cL$.

Figure 3: Mapping an instance of rectilinear TSP (on the left) to grid STAR (on the right). The marked points on the first grid are the $p_i$, which are mapped to the second grid as $c\,p_i$. The light blue area denotes the graph $G$. The red edges make up the set $X$.

### Polynomial time.

Since rectilinear TSP is strongly NP-complete, we can assume $M$ and $L$ are polynomial in the input size. The grid has $O(c^2 M^2)$ vertices, which is polynomial because both $c$ and $M$ are. The coordinates of every vertex are bounded by $cM$. Computing $X$ is obviously polynomial. Finally, computing $K$ is also polynomial, since $L$ is polynomial. Thus, the reduction takes polynomial time, and every numerical value is bounded by a polynomial in the transformation’s input size.
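The construction can be sketched in a few lines. This is an illustration under stated assumptions rather than the authors' exact construction: the expansion factor $c = 2(n + 1)$ is the value that makes the final inequality of the proof work ($2n/c = n/(n+1)$), while the bounding square $[c, cM]^2$ and the rule for picking the incident edge $e_i$ are choices made here. The function name `rectilinear_tsp_to_grid_star` is hypothetical.

```python
def rectilinear_tsp_to_grid_star(points, L):
    """Map a rectilinear TSP instance (points, L) to a grid STAR instance.

    Returns (vertices, edges, customers, K). Assumes at least two distinct
    points with positive integer coordinates, so the grid is not degenerate.
    """
    n = len(points)
    c = 2 * (n + 1)                      # expansion factor (assumed value)
    M = max(max(x, y) for x, y in points)
    assert M >= 2, "need at least two distinct points"
    lo, hi = c, c * M                    # the grid spans [c, cM] x [c, cM]

    vertices = [(x, y) for x in range(lo, hi + 1) for y in range(lo, hi + 1)]
    edges = []
    for x, y in vertices:
        if x < hi:
            edges.append(((x, y), (x + 1, y)))   # horizontal grid edge
        if y < hi:
            edges.append(((x, y), (x, y + 1)))   # vertical grid edge

    # One customer edge e_i incident to each scaled point c * p_i.
    customers = []
    for px, py in points:
        x, y = c * px, c * py
        neighbor = (x + 1, y) if x < hi else (x - 1, y)
        customers.append(((x, y), neighbor))

    return vertices, edges, customers, c * L


V, E, X, K = rectilinear_tsp_to_grid_star([(1, 1), (2, 1)], L=2)
print(len(V), len(E), X, K)
```

The number of vertices grows as $(cM - c + 1)^2$, which is why the argument needs $M$ to be polynomially bounded.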

### Rectilinear TSP to grid STAR.

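Assume there is a hamiltonian cycle with Manhattan distance length $L$ or less, in $K_P$. W.l.o.g., suppose $T = (p_1, \dots, p_n)$ is such a cycle, and define $p_{n+1} = p_1$. For each $i$, let $S_i$ be a shortest path in $G$ from $c\,p_i$ to $c\,p_{i+1}$. Then $S = S_1 \cdot S_2 \cdots S_n$ is a cycle in $G$ that goes through every vertex $c\,p_i$, and thus covers $X$. Its length is

$$\ell(S) = \sum_{i=1}^{n} \ell(S_i) = \sum_{i=1}^{n} d_1(c\,p_i, c\,p_{i+1}) = c \sum_{i=1}^{n} d_1(p_i, p_{i+1}) = c\,\ell_{d_1}(T) \leq cL = K.$$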

### Grid STAR to rectilinear TSP.

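Suppose there is a cycle $S = (s_1, \dots, s_r)$ of length $K$ or less that covers $X$ in $G$. At some point while we traverse $S$, we must get close to each $c\,p_i$, since the cycle covers $X$. More specifically, for each $i$ there exists an index $j_i$ such that either $s_{j_i}$ is exactly $c\,p_i$, or $d_1(s_{j_i}, c\,p_i) = 1$. This implies that $d_1(s_{j_i}, c\,p_i) \leq 1$. Assume w.l.o.g. that $j_1 \leq j_2 \leq \dots \leq j_n$, since otherwise we can rearrange the indexes of the points $p_i$. Consider the hamiltonian cycle $T = (p_1, \dots, p_n)$ of $K_P$. (Define, for convenience, $p_{n+1} = p_1$ and $j_{n+1} = j_1$.) We need to show that $\ell_{d_1}(T) \leq L$. Since $\ell_{d_1}(T)$ is an integer, it suffices to prove $\ell_{d_1}(T) < L + 1$. We start by rewriting

$$\ell_{d_1}(T) = \sum_{i=1}^{n} d_1(p_i, p_{i+1}) = \frac{1}{c} \sum_{i=1}^{n} d_1(c\,p_i, c\,p_{i+1}).$$

Since $d_1$ is a metric, we can decompose

$$d_1(c\,p_i, c\,p_{i+1}) \leq d_1(c\,p_i, s_{j_i}) + d_1(s_{j_i}, s_{j_{i+1}}) + d_1(s_{j_{i+1}}, c\,p_{i+1}) \leq 2 + d_1(s_{j_i}, s_{j_{i+1}}).$$

Therefore

$$\ell_{d_1}(T) \leq \frac{1}{c} \left( 2n + \sum_{i=1}^{n} d_1(s_{j_i}, s_{j_{i+1}}) \right).$$

Consider the subpaths $S_i = (s_{j_i}, \dots, s_{j_{i+1}})$ of $S$ (here we are using the fact that the indexes are ordered). Then $d_1(s_{j_i}, s_{j_{i+1}}) \leq \ell_{d_G}(S_i)$. Since these subpaths are disjoint pieces of $S$, we have $\sum_{i=1}^{n} \ell_{d_G}(S_i) \leq \ell_{d_G}(S)$, so

$$\ell_{d_1}(T) \leq \frac{1}{c} \left( 2n + \ell_{d_G}(S) \right) \leq \frac{2n}{c} + \frac{K}{c} = \frac{n}{n+1} + L < 1 + L$$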

as desired. ∎

## 3 An approximation algorithm for the general case

Since finding exact solutions of STAR is hard, we investigate approximation algorithms. We start by showing that the general version of the problem is hard to approximate within a constant factor arbitrarily close to $1.3606$. For this, we reduce from approximating the VERTEX COVER (VC) problem, which is known to be APX-hard. Given a simple graph $G$, VC asks for a minimum cardinality vertex cover of $G$.

###### Theorem 3.1

For every $\alpha$-approximation algorithm for STAR there is an $\alpha$-approximation algorithm for VC.

###### Proof

Let $A$ be an $\alpha$-approximation algorithm for STAR. Given an input graph $G = (V, E)$, the approximation algorithm for VC proceeds as follows. If $E = \emptyset$, return an empty set. If $G$ is a star graph, return the central vertex. Otherwise, every feasible vertex cover has two or more vertices. Consider the instance $(K_V, E)$ of STAR, that is, a complete graph where the set of customers are the edges of $G$. The algorithm computes $A(K_V, E)$ and outputs its vertices as a set.

The algorithm is polynomial, since we can construct $K_V$ in polynomial time. Note that every cycle in $K_V$ that covers $E$ induces a vertex cover of $G$, and therefore $A(K_V, E)$ is a feasible vertex cover of $G$. Reciprocally, every vertex cover of $G$ induces a cycle in $K_V$ that covers $E$ (by fixing any order among the vertices in the cover), which implies that $\mathrm{STAR}^*(K_V, E) \leq \tau(G)$. Since $A$ is an $\alpha$-approximation, we have $|A(K_V, E)| \leq \alpha\,\mathrm{STAR}^*(K_V, E) \leq \alpha\,\tau(G)$. ∎

Dinur and Safra showed that it’s NP-hard to approximate VC within a factor of $1.3606$ of optimal. Thus, STAR is hard to approximate as well.

###### Corollary 1

It’s NP-hard to approximate STAR within a factor of $1.3606$ of optimal.

Therefore, STAR doesn’t admit a PTAS unless $\mathrm{P} = \mathrm{NP}$, and thus the best we can hope for is some constant-factor approximation algorithm. Indeed, we now show that STAR admits one.

During the rest of this paper, we denote by $(G, X)$ an instance of STAR, and write $\mathrm{OPT} = \mathrm{STAR}^*(G, X)$. Recall that $G = (V, E)$ and $X \subseteq E$.

###### Lemma 1

If $G[X]$ is not a star graph, then $\tau(G[X]) \leq \mathrm{OPT}$.

###### Proof

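Let $S$ be an optimal solution of STAR. Since $S$ covers $X$, we can extract a vertex cover $C$ of $G[X]$ from the set of vertices of $S$. Since $G[X]$ is not a star, it’s easy to see that $S$ has two or more vertices, and thus $|C| \leq \ell(S)$. Hence, $\tau(G[X]) \leq |C| \leq \ell(S) = \mathrm{OPT}$. ∎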

From now on, we assume $G[X]$ is not a star. It’s easy to both recognize a star graph and, in that case, return the optimal solution (the central vertex of the star) in polynomial time.

###### Lemma 2

Let $C$ be a vertex cover of $G[X]$. Starting from a feasible solution $T$ of TSP for $(C, d_G)$ we can build, in time polynomial in the size of $G$, a feasible solution $S$ of STAR for $(G, X)$, such that $\ell(S) = \ell_{d_G}(T)$.

###### Proof

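Let $T = (t_1, \dots, t_k)$, and define $t_{k+1} = t_1$. Let $P_i$ be any shortest path between $t_i$ and $t_{i+1}$, in $G$. Consider the path $S = P_1 \cdot P_2 \cdots P_k$ of $G$, which covers $X$, since it traverses every vertex in $C$. This path can be computed in polynomial time, since it’s the union of a polynomial number of shortest paths of $G$. We have $\ell(S) = \sum_{i=1}^{k} \ell(P_i) = \sum_{i=1}^{k} d_G(t_i, t_{i+1}) = \ell_{d_G}(T)$. ∎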

Recall the classic $2$-approximation for VC, shown in Algorithm 1. We will refer to it as the approximation via matching.
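Algorithm 1 is not reproduced in this text. The sketch below shows the textbook greedy version of the approximation via matching, which may differ in presentation from the authors' pseudocode; the function name `vertex_cover_via_matching` is an assumption made for illustration.

```python
def vertex_cover_via_matching(edges):
    """Greedy maximal matching; both endpoints of each matched edge form the cover.

    Returns (cover, matching). Every edge has an endpoint in the cover, because
    an edge is skipped only when one of its endpoints is already covered.
    Any vertex cover needs one endpoint per matched edge, hence the factor 2.
    """
    cover = set()
    matching = []
    for u, v in edges:
        if u not in cover and v not in cover:
            matching.append((u, v))
            cover.update((u, v))
    return cover, matching


# Path 1-2-3-4: the matching {(1,2), (3,4)} gives a cover of size 4,
# while the optimal cover {2, 3} has size 2.
cover, matching = vertex_cover_via_matching([(1, 2), (2, 3), (3, 4)])
print(sorted(cover), matching)
```

The cover always has exactly twice as many vertices as the matching has edges, which is the property the analysis of Theorem 3.2 relies on.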

###### Theorem 3.2

Let $C$ be a vertex cover of $G[X]$, built with the approximation via matching. Then $\mathrm{TSP}^*(C, d_G) \leq 3\,\mathrm{OPT}$.

###### Proof

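Let $C = \{u_1, v_1, \dots, u_m, v_m\}$, such that each $(u_i, v_i)$ is an edge of the maximal matching. Let $S = (s_1, \dots, s_r)$ be an optimal solution of STAR for $(G, X)$. The key observation is that since $(u_i, v_i) \in X$, and $S$ covers $X$, at least one of $u_i$ or $v_i$ is in $S$. W.l.o.g., assume $u_i$ is in $S$. Hence, for each $i$, there exists an index $j_i$ such that $s_{j_i} = u_i$ (we define $u_{m+1} = u_1$ and $j_{m+1} = j_1$). W.l.o.g., assume that $j_1 \leq j_2 \leq \dots \leq j_m$, since otherwise we can rearrange the elements of $C$ to satisfy it. Given this ordering, consider $T = (u_1, v_1, u_2, v_2, \dots, u_m, v_m)$, which is a feasible solution of TSP for $(C, d_G)$. It suffices to show that $\ell_{d_G}(T) \leq 3\,\mathrm{OPT}$. Figure 4 shows the sets and cycles defined so far.

Figure 4: Relation between $C$, $S$, and $T$. The curly blue arrows denote $S$, and the green arrows denote $T$. We do not show the edges that close the cycle. Also, $S$ may contain some of the $v_i$, but we don’t illustrate this.

We have that

$$\ell_{d_G}(T) = \sum_{i=1}^{m} \left( d_G(u_i, v_i) + d_G(v_i, u_{i+1}) \right) = \sum_{i=1}^{m} \left( 1 + d_G(v_i, u_{i+1}) \right).$$

Since $d_G$ is a metric, $d_G(v_i, u_{i+1}) \leq d_G(v_i, u_i) + d_G(u_i, u_{i+1}) = 1 + d_G(u_i, u_{i+1})$. Hence,

$$\ell_{d_G}(T) \leq \sum_{i=1}^{m} \left( 2 + d_G(u_i, u_{i+1}) \right) = 2m + \sum_{i=1}^{m} d_G(u_i, u_{i+1}).$$

Recall that $u_i = s_{j_i}$ for each $i$. Consider the subpaths $S_i = (s_{j_i}, \dots, s_{j_{i+1}})$. Then, $d_G(u_i, u_{i+1}) \leq \ell(S_i)$, and therefore

$$\ell_{d_G}(T) \leq 2m + \sum_{i=1}^{m} \ell(S_i) \leq 2m + \ell(S) = 2m + \mathrm{OPT}.$$

Since Algorithm 1 is a $2$-approximation, $2m = |C| \leq 2\,\tau(G[X])$. Finally, we use Lemma 1 to get $2m \leq 2\,\mathrm{OPT}$, and we arrive at the desired bound. ∎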

The proposed approximation algorithm for STAR is shown in Algorithm 2. Note that the instance of TSP that $A_{TSP}$ approximates is, indeed, a metric instance, because $d_G$ is a metric.
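Algorithm 2 is not reproduced in this text, so the following is a hedged sketch of the pipeline as the surrounding lemmas describe it: build a cover via matching, compute the metric $d_G$ on the cover, run a TSP routine, and expand the tour with shortest paths (Lemma 2). The `nearest_neighbor_tour` routine is only a stand-in for $A_{TSP}$ and carries no constant-factor guarantee; to obtain the stated ratio, a real $\alpha$-approximation would have to be plugged in. All function names are hypothetical, and $G$ is assumed connected.

```python
from collections import deque


def bfs(adj, source):
    """Return (dist, parent) maps of a breadth-first search from source."""
    dist, parent = {source: 0}, {source: None}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w], parent[w] = dist[u] + 1, u
                queue.append(w)
    return dist, parent


def nearest_neighbor_tour(nodes, d):
    """Placeholder for A_TSP: a greedy tour, NOT a constant-factor algorithm."""
    tour, left = [nodes[0]], set(nodes[1:])
    while left:
        nxt = min(left, key=lambda x: (d[tour[-1]][x], str(x)))
        tour.append(nxt)
        left.remove(nxt)
    return tour


def star_route(edges, customers, tsp=nearest_neighbor_tour):
    """Return a closed walk of G (as a vertex list) that covers the customers."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    # Step 1: vertex cover C of G[X] from a greedy maximal matching.
    cover = []
    for u, v in customers:
        if u not in cover and v not in cover:
            cover += [u, v]
    # Step 2: shortest-path metric d_G from every cover vertex.
    search = {s: bfs(adj, s) for s in cover}
    d = {s: search[s][0] for s in cover}
    # Step 3: a tour T of the metric instance (C, d_G).
    tour = tsp(cover, d)
    # Step 4: replace each tour edge (a, b) by a shortest path of G.
    route = []
    for a, b in zip(tour, tour[1:] + tour[:1]):
        parent = search[b][1]          # BFS tree rooted at b
        x = a
        while x != b:
            route.append(x)
            x = parent[x]
    return route


path_edges = [(i, i + 1) for i in range(5)]
print(star_route(path_edges, [(0, 1), (4, 5)]))
```

The returned list omits the repeated start vertex, so its length equals the number of edges of the walk, which in turn equals $\ell_{d_G}(T)$ as in Lemma 2.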

###### Theorem 3.3

If $A_{TSP}$ is an $\alpha$-approximation algorithm for metric TSP, then Algorithm 2 is a $3\alpha$-approximation algorithm for STAR.

###### Proof

The algorithm is polynomial, because each step is polynomial. The answer is a feasible solution of STAR, as stated in Lemma 2. Regarding the performance guarantee,

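$$\begin{aligned}
\ell(S) &= \ell_{d_G}(T) && \text{(Lemma 2)} \\
&\leq \alpha\,\mathrm{TSP}^*(C, d_G) && \text{($A_{TSP}$ is an $\alpha$-approximation)} \\
&\leq 3\alpha\,\mathrm{OPT} && \text{(Theorem 3.2)}
\end{aligned}$$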

Using Christofides’ $3/2$-approximation algorithm for metric TSP, we get the following concrete algorithm.

###### Corollary 2

There is a $9/2$-approximation algorithm for STAR.

If $G$ is restricted to some subclass of graphs, we could use a more specific approximation algorithm (one that doesn’t work for all metric instances), and get a better approximation guarantee. For example, if $G$ is a planar graph (for instance, if $G$ is a grid graph), then we can use a PTAS.

###### Corollary 3

For every constant $\varepsilon > 0$, there is a $(3 + \varepsilon)$-approximation algorithm for planar instances of STAR.

## 4 An approximation algorithm for grids full of customers

A typical and desired case in the newspaper delivery business is having neighborhoods full of customers. We model such a dense neighborhood with a grid graph, where almost every edge is in $X$. In this section, we propose a method to approximate the optimal solution, tailored for this dense setting.

The key idea is that since almost every edge is in $X$, any feasible solution will cover almost every edge of $G$. What if instead of covering just $X$, we cover the whole set $E$? We show that if $|E - X| = o(|E|)$, then there is such a naïve tour that is guaranteed to have length at most a factor $(3/2 + \varepsilon)$ of the optimal, for sufficiently large grids.

A cycle that covers every edge in a graph is somewhat similar to the concept of space-filling curve. Mathematically, a space-filling curve is a curve whose range contains a certain $2$-dimensional area, for example the unit square. Space-filling curves have been used before to compute tours for the TSP. In 1989, Platzman and Bartholdi proved that if we visit the vertices in the order given by a specific space-filling curve, we get an $O(\log n)$-approximation algorithm. In the graph-theoretical setting of STAR, filling means to cover edges, but not necessarily to visit every vertex. Our dense-case approximation can be thought of as a space-filling cycle.

Before constructing this particular cycle we prove some auxiliary results that will help us to analyze its performance.

###### Lemma 3

Let $e$ be an edge of a graph $G$. Then $\tau(G) \leq \tau(G - e) + 1$.

###### Proof

If we take any vertex cover of $G - e$ and add one of the endpoints of $e$ (if not already in the vertex cover), we get a vertex cover of $G$. ∎

In what follows, we will write $\bar{X} = E - X$.

###### Lemma 4

Let $(G, X)$ be an instance of STAR, such that $G[X]$ is not a star graph. Then $\tau(G) \leq \mathrm{OPT} + |\bar{X}|$.

###### Proof

If we repeatedly apply the previous lemma, each time subtracting a new edge of $\bar{X}$, we get $\tau(G) \leq \tau(G[X]) + |\bar{X}|$. Using Lemma 1 we arrive at the desired inequality. ∎

The proof plan is to construct a space-filling cycle, compare its length with $\tau(G)$ and then use Lemma 4 to bound its performance. It will come in handy to know the exact value of $\tau(G)$ when $G$ is a grid graph.

###### Lemma 5

Let $G$ be a grid graph with $n$ rows and $m$ columns. Then $\tau(G) = \lfloor nm/2 \rfloor$.

###### Proof

($\leq$) Note that $G$ is bipartite. Consider any bipartition of its vertices. Both subsets of the partition are vertex covers, and since there are $nm$ vertices in total, one of them must have size at most $\lfloor nm/2 \rfloor$.

($\geq$) We use the fact that the size of any matching is always less than or equal to the size of any vertex cover. It suffices to exhibit a matching of size $\lfloor nm/2 \rfloor$. To build such a matching, we go over every row, and for each one we take every other horizontal edge. If $m$ is odd, we also take every other vertical edge of the last column. It’s clear that this is a matching, and it’s a matter of simple algebra to verify that it has $\lfloor nm/2 \rfloor$ edges. ∎
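Both halves of the proof can be checked mechanically on small grids. The sketch below is only an illustration: it assumes rows and columns are indexed from $0$, takes one horizontal edge out of two in each row (plus alternate vertical edges of the last column when the number of columns is odd), and uses the chessboard coloring for the cover. The names `grid_matching` and `grid_cover` are hypothetical.

```python
def grid_matching(n, m):
    """Matching of the n x m grid with floor(n*m/2) edges."""
    matching = []
    for r in range(n):
        for c in range(0, m - 1, 2):          # every other horizontal edge
            matching.append(((r, c), (r, c + 1)))
    if m % 2 == 1:                            # last column is still unmatched
        for r in range(0, n - 1, 2):          # every other vertical edge
            matching.append(((r, m - 1), (r + 1, m - 1)))
    return matching


def grid_cover(n, m):
    """Smaller color class of the chessboard bipartition, a vertex cover."""
    return [(r, c) for r in range(n) for c in range(m) if (r + c) % 2 == 1]


for n in range(1, 6):
    for m in range(1, 6):
        matched = [v for e in grid_matching(n, m) for v in e]
        assert len(matched) == len(set(matched))          # it is a matching
        assert len(grid_matching(n, m)) == len(grid_cover(n, m)) == (n * m) // 2
print("Lemma 5 holds on all grids up to 5 x 5")
```

Since a matching and a vertex cover of the same size certify each other's optimality, equal sizes on every tested grid confirm $\tau(G) = \lfloor nm/2 \rfloor$ there.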

We are ready to exhibit and analyze our space-filling cycle.

###### Theorem 4.1

There is an approximation algorithm for grid STAR that computes solutions with length at most $\left(3/2 + O(1/n + 1/m)\right)\left(\mathrm{OPT} + |\bar{X}|\right)$, where $n$ and $m$ are the number of rows and columns, respectively, of the input grid graph.

###### Proof

We introduce some terminology to describe the cycle. Enumerate the grid’s rows from $1$ to $n$, $1$ being the uppermost row and $n$ the lowest one. We divide the grid into $\lceil n/2 \rceil$ horizontal stripes, such that the $i$-th stripe, $1 \leq i \leq \lfloor n/2 \rfloor$, consists of rows $2i - 1$ and $2i$. If $n$ is odd, the last stripe is formed only by the last row.

First we sketch a high-level description. Starting from the upper left corner, we will visit the stripes in order. Initially we move right, until we get to the right border of the grid, the end of the first stripe. Then we go down to the second stripe, and now move left until we get to the left border. Next we go down to the third stripe. The process continues until we finish visiting the last stripe. If the last one is a single row, we move in a straight line. Finally, we go back to the starting position.

More specifically, on stripe $i$, for some odd $i$, we move from left to right following a square wave pattern, whose repeating unit we call a period. A period is a sequence of the following single-edge moves: down, right, right, up, right, right. This is illustrated in Figure 5(a). We repeat this sequence of moves until it’s no longer possible, at the right border of the grid. At this point, we could be anywhere between the beginning and the end of a period. In any case, we stop, and move exactly two edges down. On stripe $i + 1$ we move in the opposite direction, from right to left, repeating the steps we did on stripe $i$, but in reverse order. When we get to the left border, we go down two edges again, and we are ready to repeat the process. When we reach the end of the grid, we close the cycle by adding a shortest path to the initial vertex. An example of this construction is shown in Figure 5(b).

Let $C$ be this cycle. It’s easy to see that $C$ covers each edge of $G$, and that it can be computed in polynomial time. Our approximation algorithm simply outputs $C$. We now show that $\ell(C) \leq \left(3/2 + O(1/n + 1/m)\right)\tau(G)$. By Lemma 4, this implies the desired bound.

Each of the $\lfloor n/2 \rfloor$ two-row stripes contains $m - 1$ horizontal and at most $\lceil m/2 \rceil$ vertical edges of $C$. To move between two consecutive two-row stripes, $C$ uses exactly $2$ edges. Additionally, if $n$ is odd, the last stripe is a single row, and we account $m - 1$ edges for moving along that row, plus $2$ edges to move from the previous stripe. Finally, we have at most $n + m - 2$ extra moves to go from the last stripe to the initial position. Summing everything,

$$\ell(C) \leq \lfloor n/2 \rfloor \left( m - 1 + \lceil m/2 \rceil \right) + \left( \lfloor n/2 \rfloor - 1 \right) 2 + (m - 1 + 2) + (n + m - 2).$$

The first term accounts for intra-stripe moves, the second for inter-stripe moves, the third for a potential single-row stripe, and the last one for the cost to go back to the initial position. A sloppy bounding of the floor and ceiling functions yields

$$\begin{aligned}
\ell(C) &\leq (n/2)(m - 1 + m/2 + 1) + 2(n/2 - 1) + (m + 1) + (n + m - 2) \\
&= (3/2)(nm/2) + 2n + 2m - 3 \\
&\leq (3/2)\,\tau(G) + 2n + 2m - 9/4 \\
&= \left(3/2 + O(1/m + 1/n)\right) \tau(G)
\end{aligned}$$
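To see how quickly the ratio approaches $3/2$, the upper bound on $\ell(C)$ can simply be evaluated. The sketch below computes the four-term bound exactly as summed above (including the single-row term even when $n$ is even, as the bound does) and divides by $\tau(G) = \lfloor nm/2 \rfloor$ from Lemma 5. The function names are hypothetical, and this evaluates the bound only; it does not construct the cycle.

```python
def cycle_length_bound(n, m):
    """Upper bound on the length of the space-filling cycle on an n x m grid."""
    stripes = n // 2
    intra = stripes * (m - 1 + (m + 1) // 2)   # (m + 1) // 2 == ceil(m / 2)
    inter = (stripes - 1) * 2                  # two edges between stripes
    single_row = m - 1 + 2                     # potential last single-row stripe
    back = n + m - 2                           # return to the starting corner
    return intra + inter + single_row + back


def ratio_to_tau(n, m):
    """Bound divided by tau(G) = floor(n*m/2), the value given by Lemma 5."""
    return cycle_length_bound(n, m) / ((n * m) // 2)


for size in (10, 100, 1000):
    print(size, cycle_length_bound(size, size), round(ratio_to_tau(size, size), 4))
```

For a $10 \times 10$ grid the bound is $107$ against $\tau(G) = 50$, so the additive $2n + 2m$ terms still dominate; they become negligible only for grids with hundreds of rows and columns.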

###### Corollary 4

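There is an approximation algorithm for grid STAR such that for every $\varepsilon > 0$, there exist positive numbers $n_0$ and $m_0$ such that the algorithm computes solutions with length at most $(3/2 + \varepsilon)\left(\mathrm{OPT} + |\bar{X}|\right)$, for every input grid with $n \geq n_0$ rows and $m \geq m_0$ columns.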

Recall that we are interested in the case where almost all edges belong to $X$, that is, $|\bar{X}| = o(|E|)$. As we can see, the smaller $|\bar{X}|$ is, the better the approximation, showing that the space-filling cycle is a promising strategy for the dense readership case.

###### Theorem 4.2

There is an approximation algorithm for grid STAR such that for every $\varepsilon > 0$, there exist positive numbers $n_0$ and $m_0$ such that the algorithm is $(3/2 + \varepsilon)$-approximated, for every input grid with $n \geq n_0$ rows, $m \geq m_0$ columns and $|\bar{X}| = o(|E|)$.

###### Proof

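If $G$ is a grid graph, then $\tau(G[X]) \geq |X|/4$, because a single vertex can cover up to $4$ edges. Hence, by Lemma 1, $\mathrm{OPT} \geq |X|/4$.

Since $|\bar{X}| = o(|E|)$, we have $|X| = |E| - |\bar{X}| = (1 - o(1))\,|E|$, and thus $|\bar{X}| = o(|X|)$. This in turn implies that $|\bar{X}| = o(\mathrm{OPT})$, which means that for all $\varepsilon' > 0$ there exist positive integers $n'$, $m'$ such that $|\bar{X}| < \varepsilon'\,\mathrm{OPT}$ for every $n \geq n'$ and $m \geq m'$.

Fix any $\varepsilon > 0$. Let $\varepsilon_1, \varepsilon_2$ be any two positive reals such that $\varepsilon_1 + (3/2)\varepsilon_2 + \varepsilon_1 \varepsilon_2 \leq \varepsilon$. Instantiate Corollary 4 with $\varepsilon_1$, and let $n_1$ and $m_1$ be the minimum numbers of rows and columns, respectively. Let $n_2, m_2$ be such that if $n \geq n_2$ and $m \geq m_2$, then $|\bar{X}| < \varepsilon_2\,\mathrm{OPT}$.

Under these choices, if $n \geq \max(n_1, n_2)$ and $m \geq \max(m_1, m_2)$, the performance guarantee is

$$\begin{aligned}
(3/2 + \varepsilon_1)\left(\mathrm{OPT} + |\bar{X}|\right) &< (3/2 + \varepsilon_1)\left(\mathrm{OPT} + \varepsilon_2\,\mathrm{OPT}\right) \\
&\leq (3/2 + \varepsilon_1)(1 + \varepsilon_2)\,\mathrm{OPT} \\
&= \left(3/2 + \varepsilon_1 + (3/2)\varepsilon_2 + \varepsilon_1 \varepsilon_2\right) \mathrm{OPT} \\
&\leq (3/2 + \varepsilon)\,\mathrm{OPT}
\end{aligned}$$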
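As an illustration of how the two constants can be chosen, here is one admissible instantiation. The specific values $\varepsilon/4$ are an assumption made for this example and are not taken from the proof; the argument only needs some pair satisfying the constraint. We also assume $\varepsilon \leq 1$, which loses no generality because a guarantee for $\varepsilon = 1$ implies the guarantee for any larger $\varepsilon$.

```latex
\varepsilon_1 = \varepsilon_2 = \frac{\varepsilon}{4}
\quad\Longrightarrow\quad
\varepsilon_1 + \frac{3}{2}\,\varepsilon_2 + \varepsilon_1 \varepsilon_2
= \frac{\varepsilon}{4} + \frac{3\varepsilon}{8} + \frac{\varepsilon^2}{16}
\leq \frac{5\varepsilon}{8} + \frac{\varepsilon}{16}
= \frac{11}{16}\,\varepsilon
\leq \varepsilon ,
\qquad \text{using } \varepsilon^2 \leq \varepsilon \text{ for } 0 < \varepsilon \leq 1 .
```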

## 5 Open questions

In this paper we only considered the unweighted case of STAR. If the input graph has weights, the problem obviously remains hard, in terms of finding both exact and approximate solutions. Unfortunately, for that case, the approximation strategy we proposed in Theorem 3.3 is no longer useful, because if the vertex cover is agnostic of the weights, then the constructed cycle may be forced to use heavy edges, and therefore the output can be made arbitrarily longer than an optimal solution. Is it possible to adapt the algorithm for the weighted case, or to devise a different constant-factor approximation algorithm?

On a separate note, we showed that there cannot be a PTAS for STAR unless $\mathrm{P} = \mathrm{NP}$. However, this doesn’t rule out the possibility of a PTAS for the grid case, for which the best we have achieved is a $(3/2 + \varepsilon)$-approximation algorithm that only works for a proper subset of instances. Since the grid case is of practical interest, it would be worthwhile to investigate this possibility.

Finally, the problem may be extended in natural ways, like using multiple trucks or considering the time it takes the driver to carry newspapers to the households.

## Acknowledgements

Thanks to Martín Farach-Colton for useful discussions and suggestions about the presentation.