# Reverse Greedy is Bad for k-Center

We demonstrate that the reverse greedy algorithm is a Θ(k)-approximation for k-center.


## 1 Introduction

Whereas greedy algorithms greedily add to a solution provided feasibility is maintained, reverse greedy algorithms greedily remove from a solution until feasibility is achieved. For instance, in the k-median problem we are tasked with picking k facilities such that the total distance of all points to their nearest chosen facility is minimized. The reverse greedy algorithm for k-median begins with all points as facilities and then repeatedly removes the facility whose removal increases the k-median cost the least until only k facilities remain.

Reverse greedy algorithms have proven useful for k-median where greedy algorithms have not. For example, in a surprising result, Chrobak et al. [CKY05] showed that while the greedy algorithm for k-median can perform arbitrarily badly, reverse greedy gives an O(log n)-approximation. (This guarantee is surpassed by constant-factor approximations based on LP rounding; see, for example, Charikar et al. [CGTS99] for the first constant-factor approximation, and Byrka et al. [BPR14] for the best current approximation algorithm for k-median, which achieves a (2.675+ε)-approximation.) Moreover, the fact that reverse greedy removes from rather than adds to a solution allowed Anthony et al. [AGGN08] to combine reverse greedy with multiplicative weights to solve a two-stage version of k-median. However, the only facility location problem for which reverse greedy has been studied is k-median.

We study reverse greedy for k-center. In the k-center problem we must choose k centers in a metric so that the maximum distance of any point in the metric to its nearest center is minimized. k-center is known to be NP-hard to approximate within a factor of 2−ε for any ε > 0, and the natural greedy algorithm achieves a 2-approximation [HS85]. Since reverse greedy outperforms the greedy algorithm for k-median and the greedy algorithm for k-center achieves the best possible approximation for a polynomial-time algorithm, one might naturally expect that the reverse greedy algorithm also achieves a good approximation for k-center. If reverse greedy gave reasonable approximation guarantees for k-center it would provide a single combinatorial algorithm which performs well for both k-center and k-median. Somewhat surprisingly, we show that this is not the case: we demonstrate that reverse greedy attains an approximation factor of Θ(k) for k-center.

###### Theorem 1.1.

Reverse greedy is a Θ(k)-approximation for k-center.

We show our lower bound in Lemma 3.1 and our upper bound in Lemma 4.3.

## 2 k-Center and Reverse Greedy

We now more formally describe the k-center problem and the reverse greedy algorithm. k-center tasks algorithms with picking k centers such that the maximum distance of any point to a chosen center is minimized. Formally, an instance of k-center is given by a metric space (V, d) and an integer k, where d gives the distance between any two points u, v ∈ V. An algorithm must output an F ⊆ V where |F| = k. Note that we make the standard assumption that every possible facility is also a client. The cost that an algorithm pays for solution F is cost_KC(F) := max_{c ∈ V} d(c, F), where d(c, F) := min_{f ∈ F} d(c, f) for c ∈ V and F ⊆ V. If f = argmin_{f′ ∈ F} d(c, f′), we say that f serves c.

The reverse greedy algorithm for k-center begins with every point as a facility and repeatedly removes the facility whose removal increases the k-center cost the least until only k facilities remain.
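As a concrete reference, the procedure can be sketched in a few lines of Python over an explicit distance matrix (the function names here are illustrative, not from the paper):

```python
def cost_kc(d, facilities):
    """k-center cost: the largest distance from any client to its nearest facility."""
    return max(min(d[c][f] for f in facilities) for c in range(len(d)))

def reverse_greedy_kcenter(d, k):
    """Begin with every point as a facility and repeatedly drop the facility
    whose removal increases the k-center cost the least."""
    facilities = set(range(len(d)))
    while len(facilities) > k:
        cheapest = min(facilities, key=lambda f: cost_kc(d, facilities - {f}))
        facilities.remove(cheapest)
    return facilities
```

Each iteration evaluates the cost once per candidate facility, so this direct implementation runs in O(n⁴) time; ties between candidates may be broken arbitrarily, and it is exactly this freedom that the lower bound below exploits.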

Throughout the remainder of this note, we let F_0 := V ⊇ F_1 ⊇ ⋯ ⊇ F_{n−k} denote reverse greedy's successive solutions, where n := |V| and F_i is the facility set after the ith removal, and we let δ_1, …, δ_{n−k} denote the incremental increases in cost as reverse greedy converts facilities into clients:

δ_i := cost_KC(F_i) − cost_KC(F_{i−1}).

Additionally, we let OPT := cost_KC(F*), where F* = {f*_1, …, f*_k} is the optimal k-center solution, and let B_1, …, B_k denote the corresponding OPT balls, where B_l := {v ∈ V : d(v, f*_l) ≤ OPT}.

## 3 Lower Bound

In this section we show that reverse greedy is at best an Ω(k)-approximation for k-center.

We aim to formalize the intuition that by greedily removing the facility which increases cost the least, reverse greedy can repeatedly remove peripheral facilities until the final k facilities lie in a single tightly packed region. Now, consider what is required of instances of k-center that force reverse greedy to behave in this manner. If reverse greedy ends with its facilities packed into a single region of the metric, we must ask ourselves: why did reverse greedy never remove one of the facilities in this tightly packed region? It must have been the case that for each such facility f and each iteration i, f served a client that had no alternative facility within the current cost of the solution. Thus, to produce an instance of k-center where reverse greedy performs badly we must produce a metric with a tightly packed region of centers where, in every iteration of reverse greedy, each one of these centers serves some client whose second-nearest center is substantially further away than its nearest.

Formally, we construct an instance of k-center given by (V, d) for a given k and n as follows. Consider the k-star S_k with edge weights 1, 2, …, k, and the (n/k)-clique K_{n/k} with unit edge weights. (For simplicity we assume that k divides n, though this assumption is easily dropped.) Then take the Cartesian product

G := S_k □ K_{n/k}

with edge weights inherited in the natural way. That is, G consists of k+1 cliques V_0, V_1, …, V_k, each of size n/k, with a perfect matching from V_0 to V_j for each j ∈ [k]. Each edge in the perfect matching from V_0 to V_j has weight j and each clique edge has weight 1. See Figure 0(a). We derive (V, d) by taking V to be the vertices of G and d to be the metric completion of G. Observe that OPT ≤ 2 in this instance since we may choose one vertex from each of V_1, …, V_k, as in Figure 0(b). We now argue that reverse greedy performs poorly on (V, d). (It is easy to adapt our construction to have unique edge weights at the cost of an arbitrarily small loss in the hardness of approximation.)

###### Lemma 3.1.

For every k and every n ≥ k(k+1), there exists an instance of k-center for which reverse greedy returns a solution of cost at least ((k+1)/2)·OPT.

###### Proof.

Consider (V, d) as described above and illustrated in Figure 1. We provide a particular series of choices that reverse greedy could make on (V, d) for any k and every n ≥ k(k+1). We will split our analysis of these choices into "phases" of facility removals, where the jth phase is those iterations during which reverse greedy's solution costs j, for j ∈ [k]. That is, the jth phase consists of those i such that cost_KC(F_i) = j (counting the initial cost-0 removals as part of phase 1).

Notice that after removing all facilities in V_1 and all but one facility in each of V_2, …, V_k, the cost of reverse greedy's solution is 1: every removed point lies at distance 1 from either its matched facility in V_0 or the surviving facility of its own clique. Thus, let phase 1 be this sequence of removals. This is illustrated in Figure 1(a).

We now argue inductively that, given the above phase 1, over the course of the jth phase for j = 2, …, k, reverse greedy removes the single remaining facility of V_j and removes no facilities from V_0. Notice that once all facilities from V_1, …, V_j have been removed, they "bind" the facilities in V_0: if any node in V_0 is removed then the node in V_j to which it is matched would have to travel distance at least j+1 to its nearest facility. Thus, we have that after the jth phase removing any facility in V_0 would increase our cost to j+1. Moreover, we know that removing the single facility from V_{j+1} also increases our cost to j+1, since doing so would cause every node in V_{j+1} to travel distance j+1 to its matched facility in V_0. Additionally, removing the single facility from V_{j′} for j′ > j+1 increases the cost to j′ > j+1. Thus, in the (j+1)st phase let reverse greedy remove the single facility of V_{j+1} and no facility from V_0. See Figure 1(b)-1(e).

After phase k, only the facilities in V_0 remain. Reverse greedy therefore removes all but k of them, forcing the clients in V_k that are no longer matched with remaining facilities of V_0 to travel distance k+1 to their nearest facility, as in Figure 1(f).

This series of choices therefore yields a final cost of k+1 ≥ ((k+1)/2)·OPT. ∎
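The removal sequence described in this proof can be checked mechanically. The sketch below (our own instantiation of the construction, with star edge weights 1, …, k and unit clique edges) replays the phases on a small instance and asserts that every removal is greedy-consistent, i.e., that the removed facility minimizes the resulting cost among all remaining facilities:

```python
import itertools

def build_instance(k, m):
    """Metric completion of S_k box-product K_m: cliques V_0, ..., V_k of size m
    with unit clique edges, and a weight-j perfect matching between V_0 and V_j."""
    n = (k + 1) * m
    idx = lambda j, u: j * m + u
    d = [[0 if a == b else float("inf") for b in range(n)] for a in range(n)]
    for j in range(k + 1):
        for u, v in itertools.combinations(range(m), 2):
            d[idx(j, u)][idx(j, v)] = d[idx(j, v)][idx(j, u)] = 1
    for j in range(1, k + 1):
        for u in range(m):
            d[idx(0, u)][idx(j, u)] = d[idx(j, u)][idx(0, u)] = j
    for w in range(n):  # Floyd-Warshall shortest paths
        for a in range(n):
            for b in range(n):
                if d[a][w] + d[w][b] < d[a][b]:
                    d[a][b] = d[a][w] + d[w][b]
    return d

def cost(d, F):
    return max(min(d[c][f] for f in F) for c in range(len(d)))

def replay_lower_bound(k, m):
    """Replay the proof's removal order; assert each step is greedy-consistent
    and return the final k-center cost."""
    d = build_instance(k, m)
    idx = lambda j, u: j * m + u
    order = [idx(1, u) for u in range(m)]                               # phase 1: empty V_1 ...
    order += [idx(j, u) for j in range(2, k + 1) for u in range(1, m)]  # ... all but one of V_2..V_k
    order += [idx(j, 0) for j in range(2, k + 1)]                       # phases 2..k: the bound singletons
    order += [idx(0, u) for u in range(m - k)]                          # finally cut V_0 down to k facilities
    F = set(range((k + 1) * m))
    for f in order:
        new_costs = {g: cost(d, F - {g}) for g in F}
        assert new_costs[f] == min(new_costs.values())                  # f is a valid greedy choice
        F.remove(f)
    assert len(F) == k
    return cost(d, F)
```

On k = 3, m = 5 this returns k+1 = 4, while choosing one vertex per leaf clique costs 2, matching the claimed Ω(k) gap.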

## 4 Upper Bound

We now prove that reverse greedy is an O(k)-approximation. Our proof builds on two observations. First, if F_{n−k} places at least two facilities in an OPT ball then we can demonstrate that in no previous iteration did we increase our cost by more than 2·OPT. See Figure 3.

###### Lemma 4.1.

If there exists an l ∈ [k] such that |F_{n−k} ∩ B_l| ≥ 2, then for every i ∈ [n−k] we have δ_i ≤ 2·OPT.

###### Proof.

Call the two facilities in F_{n−k} ∩ B_l f and f′. Since f and f′ are present in F_{n−k}, they are present in every iteration of reverse greedy. Moreover, by the triangle inequality removing f increases the distance of every client that f serves by at most d(f, f′) ≤ 2·OPT, and therefore increases the total cost of reverse greedy's solution by at most 2·OPT. Since neither f nor f′ was removed in any iteration and reverse greedy greedily removes centers, we know that in every iteration the cost increased by at most 2·OPT. ∎

Since there are n−k iterations, the above observation alone would only demonstrate that reverse greedy is an O(n)-approximation, provided it places at least two facilities in a single OPT ball. However, to demonstrate that reverse greedy is an O(k)-approximation we must use an additional observation.

Our second observation is that so long as we do not empty any OPT ball B_l, the distance from a client to our solution (and therefore the cost of our solution) does not increase too much. See Figure 4 for an illustration of this observation.

###### Lemma 4.2.

If at iteration i client c is served by some f ∈ B_l, and at iteration i′ > i we have F_{i′} ∩ B_l ≠ ∅, then d(c, F_{i′}) ≤ d(c, F_i) + 2·OPT.

###### Proof.

Let f′ be a facility in F_{i′} ∩ B_l (which exists by assumption). We have that d(f, f′) ≤ 2·OPT by the triangle inequality and the definition of an OPT ball. Moreover, applying the triangle inequality again yields

d(c, F_{i′}) ≤ d(c, f′) ≤ d(c, f) + d(f, f′) ≤ d(c, f) + 2·OPT = d(c, F_i) + 2·OPT. ∎

The above observations yield our approximation.

###### Lemma 4.3.

Reverse greedy is an O(k)-approximation for k-center.

###### Proof.

Let f_1, …, f_{n−k} and δ_1, …, δ_{n−k} be the sequences of discarded facilities and corresponding incremental increases in cost, as above, and note that reverse greedy's final cost is cost_KC(F_{n−k}) = ∑_{i=1}^{n−k} δ_i. Call a δ_i distinguished if the removal of f_i empties some OPT ball B_l. That is, δ_i is distinguished iff there exists an l such that F_{i−1} ∩ B_l ≠ ∅ but F_i ∩ B_l = ∅. Let D := {i : δ_i is distinguished}. Call all other δ_i regular.

If for every l ∈ [k] we have F_{n−k} ∩ B_l ≠ ∅, then reverse greedy is clearly a 3-approximation by the triangle inequality and therefore an O(k)-approximation.

On the other hand, suppose that there exists an l such that F_{n−k} ∩ B_l = ∅. Since every facility of F_{n−k} lies in some OPT ball, by pigeonhole we know there exists an l′ such that |F_{n−k} ∩ B_{l′}| ≥ 2, and so by Lemma 4.1 we have that δ_i ≤ 2·OPT for all distinguished δ_i. Thus we have

∑_{i∈D} δ_i ≤ 2|D|·OPT. (1)

Furthermore, over each contiguous run of regular δ_i no OPT ball is emptied, so we know by Lemma 4.2 that for every client c the distance d(c, F_i) increases by at most 2·OPT over the course of the run, and therefore the cost of reverse greedy's solution increases by at most 2·OPT over the run. There are at most |D|+1 contiguous runs of regular δ_i between distinguished δ_i, and so

∑_{i∉D} δ_i ≤ 2·OPT·(|D|+1). (2)

Combining Equations 1 and 2 yields

cost_KC(F_{n−k}) = ∑_{i=1}^{n−k} δ_i = ∑_{i∈D} δ_i + ∑_{i∉D} δ_i ≤ (2|D|+1)·2·OPT.

Since there are k OPT balls, |D| ≤ k, and so cost_KC(F_{n−k}) ≤ (4k+2)·OPT. ∎
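As a sanity check on this bound, one can compare reverse greedy against a brute-force optimum on small random metrics. The sketch below (an empirical aid of ours, not part of the paper) computes the worst observed ratio over random Euclidean instances; by the lemma it can never exceed 4k+2:

```python
import itertools
import random

def cost(d, F):
    return max(min(d[c][f] for f in F) for c in range(len(d)))

def reverse_greedy(d, k):
    F = set(range(len(d)))
    while len(F) > k:
        # drop the facility whose removal increases the k-center cost least
        F.remove(min(F, key=lambda g: cost(d, F - {g})))
    return F

def max_observed_ratio(trials, n, k, seed=0):
    """Max of (reverse greedy cost) / (brute-force OPT) over random
    Euclidean instances; Lemma 4.3 bounds this by 4k + 2."""
    rng = random.Random(seed)
    worst = 0.0
    for _ in range(trials):
        pts = [(rng.random(), rng.random()) for _ in range(n)]
        d = [[((ax - bx) ** 2 + (ay - by) ** 2) ** 0.5 for bx, by in pts]
             for ax, ay in pts]
        opt = min(cost(d, F) for F in itertools.combinations(range(n), k))
        worst = max(worst, cost(d, reverse_greedy(d, k)) / opt)
    return worst
```

In practice reverse greedy is far better than the worst case on such random instances; the Ω(k) gap requires the adversarial star-of-cliques structure of Section 3.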

## 5 Acknowledgements

Supported in part by NSF grants CCF-1618280, CCF-1814603, CCF-1527110 and NSF CAREER award CCF-1750808. Thanks to R. Ravi and Goran Zuzic for helpful discussions.

## References

• [AGGN08] Barbara M. Anthony, Vineet Goyal, Anupam Gupta, and Viswanath Nagarajan. A plant location guide for the unsure. In Proceedings of the Nineteenth Annual ACM-SIAM Symposium on Discrete Algorithms, pages 1164–1173. Society for Industrial and Applied Mathematics, 2008.
• [BPR14] Jarosław Byrka, Thomas Pensyl, Bartosz Rybicki, Aravind Srinivasan, and Khoa Trinh. An improved approximation for k-median, and positive correlation in budgeted optimization. In Proceedings of the Twenty-Sixth Annual ACM-SIAM Symposium on Discrete Algorithms, pages 737–756. SIAM, 2014.
• [CGTS99] Moses Charikar, Sudipto Guha, Éva Tardos, and David B. Shmoys. A constant-factor approximation algorithm for the k-median problem. In Proceedings of the Thirty-First Annual ACM Symposium on Theory of Computing, pages 1–10. ACM, 1999.
• [CKY05] Marek Chrobak, Claire Kenyon, and Neal E. Young. The reverse greedy algorithm for the metric k-median problem. In International Computing and Combinatorics Conference, pages 654–660. Springer, 2005.
• [HS85] Dorit S. Hochbaum and David B. Shmoys. A best possible heuristic for the k-center problem. Mathematics of Operations Research, 10(2):180–184, 1985.