# Constant factor FPT approximation for capacitated k-median

Capacitated k-median is one of the few outstanding optimization problems for which the existence of a polynomial time constant factor approximation algorithm remains an open problem. In a series of recent papers, algorithms producing solutions that violate either the number of facilities or the capacity by a multiplicative factor were obtained. However, producing solutions without violations appears to be hard and potentially requires different algorithmic techniques. Notably, if parameterized by the number of facilities $k$, the problem is also W[2]-hard, making the existence of an exact FPT algorithm unlikely. In this work we provide an FPT-time constant factor approximation algorithm preserving both cardinality and capacity of the facilities. The algorithm runs in time $2^{O(k \log k)} n^{O(1)}$ and achieves an approximation ratio of $7+\varepsilon$.

## 1 Introduction

For many years approximation algorithms and FPT algorithms were developed in parallel. Recently the two paradigms have been combined, yielding intriguing discoveries at the intersection of the two worlds. This is particularly interesting for problems where progress on improving polynomial-time approximation ratios has stalled. An excellent example of such a combination is the FPT approximation algorithm for the $k$-Cut problem by Gupta et al. [15].

In this work we focus on the Capacitated $k$-Median problem, whose approximability has attracted the attention of many researchers. Unlike in the case of the $k$-Cut problem, it is still not clear what approximation is possible for Capacitated $k$-Median in polynomial time. As discussed in more detail in the following section, the best true approximation known is based on a tree embedding of the underlying metric. The other algorithms violate either the bound on the number of facilities or the capacity constraints.

Our main result is a $(7+\varepsilon)$-approximation algorithm for the Capacitated $k$-Median problem running in FPT($k$) time, which exploits techniques from both the approximation and the FPT realms. The algorithm builds on the idea of clustering the clients into $\ell = O(k \log n)$ locations, which is similar to the approach from the $O(\log k)$-approximation algorithm, where one creates $O(k)$ clusters. This is followed by guessing the distribution of the facilities inside these clusters. Having such a structure revealed, we simplify the instance further by rounding particular distances and reduce the problem to linear programming over a totally unimodular matrix.

### 1.1 Problems overview and previous work

In the Capacitated $k$-Median problem (CKM), we are given a set $F$ of facilities, each facility $f \in F$ with a capacity $u_f \in \mathbb{Z}_{\ge 0}$, a set $C$ of clients, a metric $d$ over $F \cup C$ and an upper bound $k$ on the number of facilities we can open. A solution to the CKM problem is a set $F^{open} \subseteq F$ of at most $k$ open facilities and a connection assignment $\phi : C \to F^{open}$ of clients to open facilities such that $|\phi^{-1}(f)| \le u_f$ for every facility $f \in F^{open}$. The goal of the problem is to find a solution that minimizes the connection cost $\sum_{c \in C} d(c, \phi(c))$.

In the case when all the facilities can serve at most $u$ clients, for some integer $u$, we obtain the Uniform CKM problem.
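To make the definition concrete, here is a minimal Python sketch of a CKM instance together with a feasibility check and the connection cost of an assignment. The class name `CKMInstance`, its field names and the two helper functions are illustrative assumptions and do not come from the paper.

```python
from dataclasses import dataclass


@dataclass
class CKMInstance:
    facilities: list   # facility identifiers (the set F)
    clients: list      # client identifiers (the set C)
    capacity: dict     # facility -> capacity u_f
    dist: dict         # (client, facility) -> distance d(c, f)
    k: int             # upper bound on the number of open facilities


def is_feasible(inst, assignment):
    """Check that `assignment` (client -> facility) is a valid CKM solution."""
    if set(assignment) != set(inst.clients):
        return False                      # every client must be connected
    load = {}
    for facility in assignment.values():
        load[facility] = load.get(facility, 0) + 1
    if len(load) > inst.k:
        return False                      # too many facilities in use
    return all(f in inst.capacity and load[f] <= inst.capacity[f] for f in load)


def connection_cost(inst, assignment):
    """Sum of distances between each client and its assigned facility."""
    return sum(inst.dist[c, f] for c, f in assignment.items())
```

In this sketch the set of open facilities is implicit: it is the set of facilities that receive at least one client.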

#### Uncapacitated k-median

The standard $k$-median problem, where there is no restriction on the number of clients served by a facility, can be approximated up to a constant factor [9, 2]. The current best is the $(2.675+\varepsilon)$-approximation algorithm of Byrka et al. [4], which is a result of optimizing a part of the algorithm by Li and Svensson [21].

#### Approximability of CKM

As already stressed, Capacitated $k$-Median is among the few remaining fundamental optimization problems for which it is not clear whether polynomial time constant factor approximation algorithms exist. All the known constant factor algorithms violate either the number of facilities or the capacities. In particular, already the algorithm of Charikar et al. [9] gave a 16-approximate solution for uniform capacitated $k$-median violating the capacities by a factor of 3. Then Chuzhoy and Rabani [10] considered general capacities and gave a 50-approximation algorithm violating capacities by a factor of 40.

The difficulty appears to be related to the unbounded integrality gap of the standard LP relaxation. To obtain integral solutions that are bounded with respect to the fractional solution to the standard LP, one has to either allow the integral solution to open twice as many facilities or to violate the capacities by a factor of two. LP-rounding algorithms essentially matching these limits have been obtained [1, 3].

Subsequently, Li broke this integrality gap barrier by giving a constant factor algorithm for capacitated $k$-median that opens $(1+\varepsilon) \cdot k$ facilities [19, 20]. Afterwards analogous results, but violating the capacities by a factor of $1+\varepsilon$, were also obtained [5, 12].

The algorithms with violations are all based on strong LP relaxations containing additional constraints for subsets of facilities. Notably, it is not clear if these relaxations can be solved exactly in polynomial time; still, they suffice to construct an approximation algorithm via the “round-or-separate” technique that iteratively adds consistency constraints for selected subsets. While spectacularly breaking the standard LP integrality bound, these techniques appear insufficient to yield a proper approximation algorithm that does not violate constraints.

The only true approximation for CKM known is a folklore $O(\log k)$-approximation algorithm that can be obtained via metric tree embedding with expected logarithmic distortion [13]. To the best of our knowledge, this result has not been explicitly published, but it can be obtained similarly to the tree-embedding-based approximation for Uncapacitated KM by Charikar et al. [7]. For the sake of completeness, and since it follows easily from our framework, we give its proof in Section 4 without claiming credit for it.

This $O(\log k)$-approximation is in contrast with other capacitated clustering problems such as facility location and $k$-center, for which constant factor approximation algorithms are known [17, 11].

### 1.2 Parameterized Complexity

A parameterized problem instance is created by associating an input instance $I$ with an integer parameter $k$. We say that a problem is fixed parameter tractable (FPT) if any instance $(I, k)$ of the problem can be solved in time $f(k) \cdot |I|^{O(1)}$, where $f$ is an arbitrary computable function of $k$. An algorithm with a running time of this form is called a parameterized algorithm.

To show that a problem is unlikely to be FPT, we use parameterized reductions analogous to those employed in classical complexity theory. Here, the concept of W-hardness replaces the one of NP-hardness, and we need not only to construct an equivalent instance in FPT time, but also to ensure that the size of the parameter in the new instance depends only on the size of the parameter in the original instance. In contrast to the NP-hardness theory, there is a hierarchy of classes $\mathrm{FPT} \subseteq W[1] \subseteq W[2] \subseteq \dots$, and these containments are believed to be strict. If there exists a parameterized reduction transforming a problem known to be W[$t$]-hard for some $t \ge 1$ to another problem $\Pi$, then the problem $\Pi$ is W[$t$]-hard as well. This provides an argument that $\Pi$ is unlikely to admit an algorithm with running time $f(k) \cdot |I|^{O(1)}$.

We begin with an argument that allowing FPT time for (even uncapacitated) $k$-Median should not help in finding the optimal solution, so we still need to settle for approximation.

###### Fact 1.

The Uncapacitated $k$-Median problem is W[2]-hard when parameterized by $k$, even on metrics induced by unweighted graphs.

###### Proof.

Consider an instance $(G, k)$ of the Dominating Set problem, which is W[2]-hard when parameterized by the solution size $k$. A dominating set of size at most $k$ exists in the graph $G$ if and only if we can find a vertex set $S$ of size $k$ such that all other vertices are at distance 1 from $S$. This is equivalent to the optimal solution to Uncapacitated $k$-Median on the metric induced by $G$ having cost exactly $|V(G)| - k$. ∎
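The equivalence used in this proof can be checked by brute force on small graphs. The sketch below is only an illustration under stated assumptions: the graph is connected and given as an adjacency dictionary, every vertex acts both as a client and as a potential facility, and the function names are hypothetical.

```python
from collections import deque
from itertools import combinations


def graph_metric(adj):
    """Shortest-path distances in an unweighted, connected graph (BFS)."""
    dist = {}
    for src in adj:
        seen = {src: 0}
        queue = deque([src])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in seen:
                    seen[v] = seen[u] + 1
                    queue.append(v)
        dist[src] = seen
    return dist


def best_k_median_cost(adj, k):
    """Optimal uncapacitated k-median cost, trying every set of k vertices."""
    dist = graph_metric(adj)
    return min(
        sum(min(dist[v][s] for s in opened) for v in adj)
        for opened in combinations(adj, k)
    )


def has_dominating_set(adj, k):
    """True if some k vertices dominate every vertex of the graph."""
    return any(
        all(v in chosen or any(u in chosen for u in adj[v]) for v in adj)
        for chosen in combinations(adj, k)
    )
```

On a path with four vertices, for example, the optimal cost equals $|V(G)| - k$ for $k = 2$ (a dominating set of size 2 exists) but exceeds it for $k = 1$.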

#### Parameterized Approximation

In recent years new research directions emerged at the intersection of the theory of approximation algorithms and the FPT theory. It turned out that for some problems that are intractable in the exact sense, parameterization still comes in useful when we want to reduce the approximation ratio. Some examples are -approximation for $k$-Cut [15] or -approximation for Planar-$\mathcal{F}$ Deletion [14] for some implicit function . The dependency on was later improved, leading to -approximations for, e.g., $k$-Vertex Separator [18] and $k$-Treewidth Deletion [16].

On the other hand, some problems parameterized by the solution size have proven resistant to such improvements. Chalermsook et al. [6] observed that under the assumption of Gap-ETH there can be no parameterized approximation with ratio $o(k)$ for $k$-Clique and none with ratio $f(k)$ for $k$-Dominating Set (for any function $f$). Subsequently, Gap-ETH has been replaced with a better established hardness assumption for $k$-Dominating Set [23].

### 1.3 Organization of the paper

Our main result is stated in Theorem 11 (Section 3.3), where we present a $(7+\varepsilon)$-approximation algorithm for the Non-Uniform CKM problem running in FPT($k$) time.

To obtain this result we need two ingredients. The first is a metric embedding that reduces the problem to a simpler instance, called $\ell$-centered, which is described in Section 2. This reduction provides a richer structure, which can be exploited to obtain an $O(\log k)$-approximation via tree embeddings [13]. As already mentioned, a similar approach was presented by Charikar et al. [7] in their algorithm for the uncapacitated setting. We present this result for the sake of completeness in Section 4, after the main result.

The second ingredient is a parameterized algorithm for the $\ell$-centered instances. Since it is simpler in the uniform setting, we solve that case in Section 3.2 as a warm-up before the main result. This way the new ideas are revealed gradually to the reader.

## 2 ℓ-Centered instances

Suppose we work with a graph on nodes $F \cup C$, on which we are given a metric $d$. In our considerations the set $F \cup C$ will be fixed throughout; however, we will be modifying the metric over it. Consider an algorithm $ALG$ which produces a solution for a metric $d$. This solution can be seen as a mapping, which we explicitly denote by $\phi^{ALG(d)}$. Its cost in the metric $d$ equals $\sum_{c \in C} d(c, \phi^{ALG(d)}(c))$, which we shall briefly denote by $\mathrm{cost}(\phi^{ALG(d)}, d)$. The second argument is useful when an algorithm produces a solution (mapping) with respect to metric $d$, but later on we are interested in its cost over a different metric. Also, let $OPT(d)$ denote the optimum solution for the CKM problem on metric $d$.

In order to solve CKM, we shall invoke an algorithm for Uncapacitated KM as a subroutine. Let $ALG^{\ell}_{unc}(d)$ be a relaxed solution that opens up to $\ell$ facilities and can break the capacity constraints. It induces a mapping which, for consistency, we shall denote by $\phi^{ALG^{\ell}_{unc}(d)}$. Observe that in this mapping every client can be connected to the closest open facility. Since Uncapacitated KM admits constant approximation algorithms, we can work with solutions satisfying $\mathrm{cost}(\phi^{ALG^{\ell}_{unc}(d)}, d) \le O(1) \cdot \mathrm{cost}(\phi^{OPT(d)}, d)$. The larger $\ell$ we allow in the relaxation, the smaller the constant we will be able to achieve in the relation above.

Using such an algorithm for Uncapacitated KM as a subroutine, we can find a simpler metric to work with. First we build a graph which will induce the metric. Let $F^{\ell} \subseteq F$ be the set of facilities opened by $ALG^{\ell}_{unc}(d)$. For each such facility $f \in F^{\ell}$ we create a copy vertex $s_f$, which is at distance $0$ from $f$. We denote the set of copies by $S$, i.e., $S = \{s_f : f \in F^{\ell}\}$. Given that we demand the distance from $f$ to $s_f$ to be $0$, we can naturally extend the metric $d$ to the set $F \cup C \cup S$. To distinguish facilities from $F^{\ell}$ from their copies in $S$, we shall call each copy a center.

We build a complete graph on $S$ and preserve the metric $d$ therein. For every node $v$, be it either a client from $C$ or a facility from $F$, we place an edge to the closest (according to the extended $d$) center $s_v \in S$ and set its length to $d(v, s_v)$. We call such a graph $\ell$-centered and refer to its induced metric as $d_\ell$.

###### Definition 2.

An instance of CKM is called $\ell$-centered if the metric, which we shall denote by $d_\ell$, is induced by a weighted graph $G = (V, E)$ such that

1. $V = C \cup F \cup S$ with $|S| \le \ell$,

2. $\binom{S}{2} \subseteq E$, i.e., $S$ forms a clique,

3. for every $v \in C \cup F$ there is only one edge incident to $v$ in $E$, and it connects $v$ to some $s \in S$.

For a center $s \in S$ we shall say that all nodes from $C \cup F$ that are connected to $s$ form a cluster of $s$. If we consider only nodes from $F$, then we talk about an $F$-cluster of $s$, denoted $F(s)$.
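The construction of the $\ell$-centered metric can be sketched in a few lines of Python. This is an illustration under assumptions: the base metric is passed as a function `d`, the centers are given by the positions of the opened facilities (each copy is at distance 0 from its facility), and the name `centered_metric` is hypothetical.

```python
def centered_metric(points, centers, d):
    """Return the l-centered metric d_l on `points` (clients and facilities).

    Every point is attached to its nearest center, and any path between two
    distinct points goes through their centers.
    """
    nearest = {v: min(centers, key=lambda s: d(v, s)) for v in points}

    def d_l(u, v):
        if u == v:
            return 0
        s_u, s_v = nearest[u], nearest[v]
        # Edge u-s_u, then the clique edge s_u-s_v, then edge s_v-v.
        return d(u, s_u) + d(s_u, s_v) + d(s_v, v)

    return d_l
```

By the triangle inequality, the resulting distances never fall below the original ones, which is the domination property used later in Lemma 6.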

In the following lemma we relate the cost of embedding the optimum solution from a metric $d$ to $d_\ell$.

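###### Lemma 3 (Embedding $d$ into $\ell$-centered metric $d_\ell$).

Let $ALG^{\ell}_{unc}(d)$ be a solution for the Uncapacitated KM problem on metric $d$ from which we construct the $\ell$-centered instance. The optimal solution $OPT(d)$ can be embedded into the $\ell$-centered metric $d_\ell$ with the cost relation being

$$\mathrm{cost}(\phi^{OPT(d)}, d) \le \mathrm{cost}(\phi^{OPT(d)}, d_\ell) \le 3 \cdot \mathrm{cost}(\phi^{OPT(d)}, d) + 4 \cdot \mathrm{cost}(\phi^{ALG^{\ell}_{unc}(d)}, d).$$
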
###### Proof.

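Let $c$ be a client connected to facility $f_c$ in the optimal solution $OPT(d)$. Let $s_c$ be the center closest to $c$ within $S$ (the $\ell$-center), and let $s_{f_c}$ be the center closest to $f_c$. First let us note that $d_\ell(c, f_c) = d(c, s_c) + d(s_c, s_{f_c}) + d(s_{f_c}, f_c)$. Now let us bound the terms $d(s_{f_c}, f_c)$ and $d(s_c, s_{f_c})$ separately.

###### Fact 4.

For every client $c$ and its facility $f_c$ from $OPT(d)$ we have $d(f_c, s_{f_c}) \le d(f_c, c) + d(c, s_c)$.

###### Proof.

Since $s_{f_c}$ is the closest $\ell$-center to the facility $f_c$, we have that $d(f_c, s_{f_c}) \le d(f_c, s_c)$. At the same time, from the triangle inequality it follows that $d(f_c, s_c) \le d(f_c, c) + d(c, s_c)$. ∎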

###### Fact 5.

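For each $c \in C$ we have $d(s_c, s_{f_c}) \le 2\,(d(f_c, c) + d(c, s_c))$.

###### Proof.

From the triangle inequality we know that

$$d(s_c, s_{f_c}) \le d(s_c, c) + d(c, f_c) + d(f_c, s_{f_c}).$$

From Fact 4 we also know that $d(f_c, s_{f_c}) \le d(f_c, c) + d(c, s_c)$, and combining the two inequalities we get $d(s_c, s_{f_c}) \le 2\,(d(f_c, c) + d(c, s_c))$. ∎

These facts imply

$$\begin{aligned}
d_\ell(c, f_c) &= d(c, s_c) + d(s_c, s_{f_c}) + d(s_{f_c}, f_c) \\
&\le d(c, s_c) + 2\,(d(f_c, c) + d(c, s_c)) + (d(f_c, c) + d(c, s_c)) && \text{(from Facts 4 and 5)} \\
&= 3 \cdot d(f_c, c) + 4 \cdot d(c, s_c).
\end{aligned}$$

Summing over all clients, and recalling that the uncapacitated solution connects every client $c$ to its closest open facility at distance $d(c, s_c)$, we obtain the second inequality from the statement of Lemma 3. The first one directly comes from the triangle inequality

$$d(c, f_c) \le d(c, s_c) + d(s_c, s_{f_c}) + d(s_{f_c}, f_c) = d_\ell(c, f_c),$$

completing the whole proof. ∎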

The next lemma is quite simple. Its proof comes directly from the fact that the metric $d_\ell$ dominates the metric $d$, i.e., $d_\ell(u, v) \ge d(u, v)$ for all pairs of vertices $u, v$.

###### Lemma 6 (Going back from $\ell$-centered metric $d_\ell$ to $d$).

Any solution for the $\ell$-centered metric $d_\ell$ can be embedded back into $d$ without any loss:

$$\mathrm{cost}(\phi^{ALG(d_\ell)}, d_\ell) \ge \mathrm{cost}(\phi^{ALG(d_\ell)}, d).$$

Blending together Lemmas 3 and 6, we can state the following lemma about reducing the CKM problem to $\ell$-centered instances.

###### Lemma 7.

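Suppose we are given a solution $ALG^{\ell}_{unc}(d)$ for the Uncapacitated KM problem on metric $d$ which opens $\ell$ centers but $\beta$-approximates the optimum solution $OPT^{k}_{unc}(d)$ for the Uncapacitated KM problem with $k$ centers, i.e., $\mathrm{cost}(\phi^{ALG^{\ell}_{unc}(d)}, d) \le \beta \cdot \mathrm{cost}(\phi^{OPT^{k}_{unc}(d)}, d)$. Suppose also that we are given an $\alpha$-approximation algorithm for the CKM problem on $\ell$-centered instances. Then we can construct an $\alpha(3+4\beta)$-approximation algorithm for CKM on general instances.

###### Proof.

Suppose that we have an $\alpha$-approximate solution $ALG(d_\ell)$ for the $\ell$-centered instance with metric $d_\ell$, i.e., such that

$$\mathrm{cost}(\phi^{ALG(d_\ell)}, d_\ell) \le \alpha \cdot \mathrm{cost}(\phi^{OPT(d_\ell)}, d_\ell).$$

Since $OPT(d)$ is some solution for the $\ell$-centered instance with metric $d_\ell$, we have

$$\mathrm{cost}(\phi^{ALG(d_\ell)}, d_\ell) \le \alpha \cdot \mathrm{cost}(\phi^{OPT(d_\ell)}, d_\ell) \le \alpha \cdot \mathrm{cost}(\phi^{OPT(d)}, d_\ell).$$

And from Lemma 3 we have that

$$\begin{aligned}
\mathrm{cost}(\phi^{ALG(d_\ell)}, d_\ell) &\le \alpha \cdot \mathrm{cost}(\phi^{OPT(d_\ell)}, d_\ell) \\
&\le \alpha \cdot \mathrm{cost}(\phi^{OPT(d)}, d_\ell) \\
&\le \alpha \left(3 \cdot \mathrm{cost}(\phi^{OPT(d)}, d) + 4 \cdot \mathrm{cost}(\phi^{ALG^{\ell}_{unc}(d)}, d)\right).
\end{aligned}$$

Since the solution $ALG^{\ell}_{unc}(d)$ $\beta$-approximates the optimal solution for Uncapacitated KM with $k$ centers on metric $d$, we have that

$$\mathrm{cost}(\phi^{ALG^{\ell}_{unc}(d)}, d) \le \beta \cdot \mathrm{cost}(\phi^{OPT^{k}_{unc}(d)}, d) \le \beta \cdot \mathrm{cost}(\phi^{OPT(d)}, d).$$

The second inequality follows from the obvious fact that the uncapacitated version of the problem is easier than the capacitated one. Hence

$$\begin{aligned}
\mathrm{cost}(\phi^{ALG(d_\ell)}, d_\ell) &\le \alpha \left(3 \cdot \mathrm{cost}(\phi^{OPT(d)}, d) + 4 \cdot \mathrm{cost}(\phi^{ALG^{\ell}_{unc}(d)}, d)\right) \\
&\le \alpha \left(3 \cdot \mathrm{cost}(\phi^{OPT(d)}, d) + 4\beta \cdot \mathrm{cost}(\phi^{OPT(d)}, d)\right) \\
&= \alpha (3 + 4\beta) \cdot \mathrm{cost}(\phi^{OPT(d)}, d).
\end{aligned}$$

Since we can embed the solution for the $\ell$-centered metric $d_\ell$ into the initial metric $d$ without any loss (Lemma 6), we obtain an $\alpha(3+4\beta)$-approximation algorithm. The claim follows. ∎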
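As a worked instance of the ratio from Lemma 7, assume for illustration that both ingredients are $(1+\varepsilon)$-approximate, i.e., $\alpha = \beta = 1+\varepsilon$, with $\varepsilon \le 1$. These parameter values are an assumption made for this example only.

```latex
\begin{aligned}
\alpha(3+4\beta)\Big|_{\alpha=\beta=1+\varepsilon}
  &= (1+\varepsilon)\bigl(3+4(1+\varepsilon)\bigr)
   = (1+\varepsilon)(7+4\varepsilon) \\
  &= 7 + 11\varepsilon + 4\varepsilon^{2} \\
  &\le 7 + 15\varepsilon \qquad (\varepsilon \le 1).
\end{aligned}
```

Rescaling $\varepsilon$ by a constant then gives a ratio of the form $7+\varepsilon$. With an exact algorithm on $\ell$-centered instances ($\alpha = 1$) the same computation gives $7 + 4\varepsilon$.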

## 3 Constant factor approximation

In this section we present the main result of the paper, which is a $(7+\varepsilon)$-approximation algorithm for the Non-Uniform CKM problem. We precede it with a $(7+\varepsilon)$-approximation algorithm for the Uniform CKM problem to introduce the ideas gradually. Both algorithms enumerate configurations of open facilities’ locations, and as a subroutine we need an algorithm which, for a fixed configuration of open facilities, finds the optimal assignment of clients to facilities. This subroutine is presented in the following subsection.

### 3.1 Optimal mapping subroutine

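We are given an $\ell$-centered metric instance $d_\ell$ of the $k$-median problem. Suppose that we have already decided to open a fixed subset $F^{open} \subseteq F$ of the facilities and we look for a mapping $\phi : C \to F^{open}$. In the uncapacitated case we can just assign each client to the closest facility in $F^{open}$. It turns out that even in the capacitated setting we can find the mapping optimally in polynomial time for a given $F^{open}$. We state the problem of finding the optimal $\phi$ as an integer program:

$$\begin{aligned}
\text{minimize} \quad & \sum_{c \in C} \sum_{f \in F^{open}} d_\ell(c, f) \cdot x_{c,f} && \text{(MAPPING-IP)} \\
\text{subject to} \quad & \sum_{f \in F^{open}} x_{c,f} = 1 && \forall c \in C, \\
& \sum_{c \in C} x_{c,f} \le u_f && \forall f \in F^{open}, \\
& x_{c,f} \in \{0, 1\}.
\end{aligned}$$

In the above program $x_{c,f} = 1$ represents the fact that $\phi(c) = f$.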

###### Lemma 8.

We can find an optimal solution to the (MAPPING-IP) in polynomial time.

###### Proof.

The proof follows from the fact that the relaxation of the above integer program (a program which differs from (MAPPING-IP) only in having the constraints $x_{c,f} \ge 0$ instead of $x_{c,f} \in \{0, 1\}$) has an optimal solution which is integral. To see this, observe that the linear program is a formulation of the transportation problem. For such a linear program, the constraint matrix is totally unimodular, which implies the integrality of an extremal solution. See [24] for a reference. ∎
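Because (MAPPING-IP) is a transportation problem, it can also be solved combinatorially as a minimum-cost flow. The following self-contained Python sketch uses successive shortest augmenting paths with Bellman-Ford instead of an LP solver; this is a swapped-in technique chosen for illustration, not the method described in the proof. The function name `optimal_mapping` and its argument layout are assumptions, and the example is meant for integer distances.

```python
def optimal_mapping(clients, open_facilities, capacity, dist):
    """Cheapest assignment of clients to open facilities within capacities.

    Returns (cost, mapping), or None if the capacities are insufficient.
    """
    source, sink = 0, 1
    c_node = {c: 2 + i for i, c in enumerate(clients)}
    f_node = {f: 2 + len(clients) + j for j, f in enumerate(open_facilities)}
    n = 2 + len(clients) + len(open_facilities)
    adj = [[] for _ in range(n)]   # adj[u] = indices of edges leaving u
    edges = []                     # edge = [head, residual capacity, cost]

    def add_edge(u, v, cap, cost):
        # Forward edge at an even index, its reverse at the next odd index.
        adj[u].append(len(edges))
        edges.append([v, cap, cost])
        adj[v].append(len(edges))
        edges.append([u, 0, -cost])

    pair_edge = {}
    for c in clients:
        add_edge(source, c_node[c], 1, 0)
        for f in open_facilities:
            pair_edge[c, f] = len(edges)
            add_edge(c_node[c], f_node[f], 1, dist[c, f])
    for f in open_facilities:
        add_edge(f_node[f], sink, capacity[f], 0)

    total = 0
    for _client in clients:        # route one unit of flow per client
        best = [float("inf")] * n
        via = [None] * n
        best[source] = 0
        for _round in range(n - 1):    # Bellman-Ford on the residual graph
            changed = False
            for u in range(n):
                if best[u] == float("inf"):
                    continue
                for e in adj[u]:
                    v, cap, cost = edges[e]
                    if cap > 0 and best[u] + cost < best[v]:
                        best[v] = best[u] + cost
                        via[v] = e
                        changed = True
            if not changed:
                break
        if best[sink] == float("inf"):
            return None            # some client cannot be served
        total += best[sink]
        v = sink
        while v != source:         # push one unit along the path found
            e = via[v]
            edges[e][1] -= 1
            edges[e ^ 1][1] += 1
            v = edges[e ^ 1][0]
    # A saturated client-facility edge means the client uses that facility.
    mapping = {c: f for (c, f), e in pair_edge.items() if edges[e][1] == 0}
    return total, mapping
```

In the test instance, the greedy choice (each client to its closest facility) is infeasible, and the flow computation reroutes one client to the farther facility.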

### 3.2 Uniform case

As a warm-up, we begin with a parameterized algorithm for the uniform case. It is a bit simpler than the general case, because once we know the number of facilities to open in an $F$-cluster $F(s)$, we can choose them greedily.

###### Lemma 9.

Uniform CKM can be solved exactly in time $\ell^{k} \cdot n^{O(1)}$ on $\ell$-centered instances.

###### Proof.

Let $d_\ell$ be the $\ell$-centered metric. Note that the $F$-clusters partition the whole set of facilities, i.e., $F = \bigcup_{s \in S} F(s)$ with the sets $F(s)$ pairwise disjoint. Let $OPT$ be an optimal solution for the CKM problem on $d_\ell$. Every facility $f$ belongs to exactly one $F$-cluster $F(s)$. Hence, the $F$-clusters partition the set of facilities opened by $OPT$. Let us look at all the facilities from a particular $F$-cluster $F(s)$ opened by $OPT$, and suppose that $OPT$ opens $k_s$ facilities in $F(s)$. Since we consider the uniform capacity case, we can assume without loss of generality that these $k_s$ open facilities from $F(s)$ are exactly the ones that are closest to $s$.

Therefore, if we know the number $k_s$ of facilities that $OPT$ opens in each $F$-cluster, then we know the exact set of open facilities in $OPT$, due to the greediness in each $F$-cluster. To find this allocation we can simply enumerate all possibilities. We just need to scan over all configurations $(k_s)_{s \in S}$ where $\sum_{s \in S} k_s = k$. Since there are $k$ facilities to open, and each of them can belong to one of $\ell$ $F$-clusters $F(s)$, there are at most $\ell^{k}$ possible configurations. Of course some configurations may not be feasible, since it may happen that $k_s > |F(s)|$, but these can simply be ignored.

For each configuration we need to find the optimal mapping of clients to the set of open facilities that preserves their capacities. Let $F^{open}$ be the set of open facilities induced by configuration $(k_s)_{s \in S}$, that is, where we greedily open the $k_s$ facilities closest to $s$ in each $F$-cluster $F(s)$. Given $F^{open}$, to find the optimal mapping we use the polynomial time exact algorithm from Lemma 8.

Once we know the optimal assignment for each configuration, we can simply take the cheapest one, knowing that it is the optimal one. This proves the lemma. ∎
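The enumeration from this proof can be sketched as a Python generator. The sketch enumerates multisets of centers, which visits each configuration once and never more than $\ell^{k}$ of them. The function name `uniform_open_sets` and the input layout (a dictionary from centers to lists of `(distance_to_center, facility)` pairs) are assumptions made for this example.

```python
from collections import Counter
from itertools import combinations_with_replacement


def uniform_open_sets(clusters, k):
    """Yield one candidate set of open facilities per feasible configuration.

    A configuration fixes how many facilities are opened in each F-cluster;
    inside a cluster the facilities closest to the center are opened.
    """
    centers = sorted(clusters)
    by_distance = {s: [f for _, f in sorted(clusters[s])] for s in centers}
    for choice in combinations_with_replacement(centers, k):
        counts = Counter(choice)   # counts[s] = number opened in F(s)
        if any(counts[s] > len(by_distance[s]) for s in centers):
            continue               # cluster has too few facilities
        yield frozenset(
            f for s in centers for f in by_distance[s][:counts[s]]
        )
```

Each yielded set would then be passed to an assignment subroutine as in Lemma 8, keeping the cheapest result.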

This lemma suffices to obtain a $(7+\varepsilon)$-approximation for Uniform CKM with a reasoning that we will present in Theorem 11 in full generality.

### 3.3 Non-uniform case

###### Lemma 10.

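Non-Uniform CKM can be solved with approximation ratio $(1+\varepsilon)$ in time $O\big((\ell \cdot \tfrac{1}{\varepsilon} \ln \tfrac{n}{\varepsilon})^{k}\big) \cdot n^{O(1)}$ on $\ell$-centered instances.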

###### Proof.

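We begin by guessing the largest distance in $d_\ell$ between a client and a facility that appears in the optimal solution; let us denote this quantity by $D$. There are at most $|C| \cdot |F|$ choices for $D$, and from now on we assume that it is guessed correctly. Note that $D \le \mathrm{cost}(\phi^{OPT(d_\ell)}, d_\ell)$ and $d_\ell(s, f) \le D$ for all facilities $f \in F(s)$ opened by $OPT(d_\ell)$.

Consider the set $F(s)$ of facilities in the cluster of a center $s$. We can remove all facilities $f \in F(s)$ such that $d(s, f) > D$ (recall that $d_\ell(s, f) = d(s, f)$ for $f \in F(s)$), because they cannot be a part of the optimal solution. Let us partition the remaining facilities from $F(s)$ into buckets $F_0(s), F_1(s), \dots, F_{\lceil \log_{1+\varepsilon} \frac{n}{\varepsilon} \rceil}(s)$, such that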

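$$F_i(s) = \begin{cases}
\left\{ f \in F(s) \;\middle|\; d(s, f) \in \left[ (1+\varepsilon)^{-(i+1)} D,\ (1+\varepsilon)^{-i} D \right] \right\} & \text{for } i < \left\lceil \log_{1+\varepsilon} \frac{n}{\varepsilon} \right\rceil, \\[4pt]
\left\{ f \in F(s) \;\middle|\; d(s, f) \in \left[ 0,\ (1+\varepsilon)^{-\lceil \log_{1+\varepsilon} \frac{n}{\varepsilon} \rceil} D \right] \right\} & \text{for } i = \left\lceil \log_{1+\varepsilon} \frac{n}{\varepsilon} \right\rceil.
\end{cases}$$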

The number of buckets equals $\lceil \log_{1+\varepsilon} \frac{n}{\varepsilon} \rceil + 1 = O(\frac{1}{\varepsilon} \ln \frac{n}{\varepsilon})$. We modify the metric again by setting $d'_\ell(s, f) = (1+\varepsilon)^{-i} D$ for $f \in F_i(s)$. The distances within $S$ remain untouched. Observe that the distances can only increase.

We shall guess the structure of the solution similarly as in Lemma 9. For each of the $k$ facilities, we can choose its location as follows: first we choose one of the $\ell$-centers ($\ell$ choices), and then we choose one of the partitions ($O(\frac{1}{\varepsilon} \ln \frac{n}{\varepsilon})$ choices). Let us denote the number of facilities chosen in a particular partition $F_i(s)$ as $k_{s,i}$. We can assume that $k_{s,i} \le |F_i(s)|$, because otherwise we know that the guess was incorrect. Since $d'_\ell(s, f)$ is the same for all $f \in F_i(s)$, we can assume that the optimal solution opens the $k_{s,i}$ facilities with the biggest capacities.

Once we establish the set of facilities to open, we can find the optimal assignment in metric $d'_\ell$ using the polynomial time exact subroutine from Lemma 8. The total time complexity of solving the problem exactly over $d'_\ell$ equals the running time of the subroutine times the number of possible configurations, which is $O\big((\ell \cdot \tfrac{1}{\varepsilon} \ln \tfrac{n}{\varepsilon})^{k}\big)$.

It remains to prove that the algorithm yields a proper approximation. We will show that for any solution it holds that

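$$\mathrm{cost}(\phi^{SOL}, d_\ell) \le \mathrm{cost}(\phi^{SOL}, d'_\ell) \le (1+\varepsilon) \cdot \mathrm{cost}(\phi^{SOL}, d_\ell) + \varepsilon \cdot D. \tag{1}$$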

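By substituting $SOL = OPT(d_\ell)$ and using $D \le \mathrm{cost}(\phi^{OPT(d_\ell)}, d_\ell)$, we learn that there exists a solution over metric $d'_\ell$ of cost at most $(1+2\varepsilon) \cdot \mathrm{cost}(\phi^{OPT(d_\ell)}, d_\ell)$ for correctly guessed $D$. Therefore the cost of the solution found by our algorithm cannot be larger. Finally we substitute this solution as $SOL$ to see that its cost cannot increase when returning to metric $d_\ell$. The claim follows by adjusting $\varepsilon$.

The first inequality in (1) is straightforward because $d'_\ell$ dominates $d_\ell$. Consider now a pair $(c, f)$, where $\phi^{SOL}(c) = f$ and $f \in F_i(s)$. If $i < \lceil \log_{1+\varepsilon} \frac{n}{\varepsilon} \rceil$, then $d'_\ell(s, f) \le (1+\varepsilon) \cdot d_\ell(s, f)$, so the cost of connecting such pairs increases at most by a multiplicative factor $(1+\varepsilon)$ during the metric switch. If $i = \lceil \log_{1+\varepsilon} \frac{n}{\varepsilon} \rceil$, then $d'_\ell(s, f) \le \frac{\varepsilon}{n} \cdot D$. Since there are at most $n$ such pairs, the total additive cost increase is bounded by $\varepsilon \cdot D$. ∎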
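The distance rounding used in this proof can be illustrated with a short Python sketch. It rounds a center-to-facility distance up to the top of its bucket and thereby exhibits the per-distance version of inequality (1). The function names are hypothetical, and the loop that steps back one bucket is only a guard against floating-point error, not part of the argument in the paper.

```python
import math


def last_bucket(n, eps):
    """Index of the final bucket, which collects all very short distances."""
    return math.ceil(math.log(n / eps, 1 + eps))


def rounded_distance(dist, D, eps, n):
    """Round dist (0 <= dist <= D) up to the upper end of its bucket."""
    last = last_bucket(n, eps)
    if dist <= 0:
        return (1 + eps) ** (-last) * D
    i = min(last, max(0, math.floor(math.log(D / dist, 1 + eps))))
    # Guard against floating-point error at bucket boundaries.
    while i > 0 and (1 + eps) ** (-i) * D < dist:
        i -= 1
    return (1 + eps) ** (-i) * D
```

Every rounded value is at least the original distance and at most $(1+\varepsilon)$ times the original plus $\varepsilon D / n$, which is the bound summed over at most $n$ client-facility pairs in the proof.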

###### Theorem 11.

Non-Uniform CKM can be solved with approximation ratio $(7+\varepsilon)$ in time $\left(\frac{k}{\varepsilon}\right)^{O(k)} n^{O(1)}$.

###### Proof.

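From Lemma 10 we know that we can get a $(1+\varepsilon)$-approximation algorithm for the Non-Uniform CKM problem on $\ell$-centered instances in time $O\big((\ell \cdot \tfrac{1}{\varepsilon} \ln \tfrac{n}{\varepsilon})^{k}\big) \cdot n^{O(1)}$. We shall use the $(1+\varepsilon)$-approximation for Uncapacitated KM by Lin and Vitter [22], which opens at most $(1+\frac{1}{\varepsilon}) \cdot k \cdot (\ln n + 1)$ facilities. By plugging this subroutine to find $\ell$-centers into Lemma 7 together with Lemma 10, we obtain a $(1+\varepsilon)(3+4(1+\varepsilon)) = 7 + O(\varepsilon)$ approximation algorithm for the general Non-Uniform CKM problem with running time

$$O\left(\left(\left(1+\tfrac{1}{\varepsilon}\right) k \cdot (\ln n + 1) \cdot \tfrac{1}{\varepsilon} \ln \tfrac{n}{\varepsilon}\right)^{k}\right) n^{O(1)} = O\left(\left(\tfrac{1}{\varepsilon^{O(1)}}\, k \ln^{2} n\right)^{k}\right) n^{O(1)}.$$

Finally, we use standard arguments to show that $(\ln n)^{2k} \le \max\{n, k^{O(k)}\}$. Consider two cases. If $\frac{\ln n}{\ln \ln n} \le 2k$, then by inverting we know that $\ln n = O(k \log k)$, and so $(\ln n)^{2k} = k^{O(k)}$. Suppose now that $\frac{\ln n}{\ln \ln n} > 2k$. In this case

$$(\ln n)^{2k} < (\ln n)^{\frac{\ln n}{\ln \ln n}} = e^{\ln \ln n \cdot \frac{\ln n}{\ln \ln n}} = n.$$

∎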

## 4 O(log k)-approximation

In this section we present a polynomial-time $O(\log k)$-approximation algorithm for CKM. Constant-factor approximation algorithms for Uncapacitated KM exist [9], and so it is a clear consequence of Lemma 7 with $\ell = k$ and $\beta$ being constant that it is sufficient for us to construct an $O(\log k)$-approximation algorithm for the $k$-centered instances.

A standard tool to provide such a guarantee is the probabilistic tree embedding of Fakcharoenphol et al. [13]. This makes our algorithm a randomized one, but if needed, it is possible to derandomize it using the ideas from [8].

###### Definition 12.

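A set of metric spaces $\mathcal{T}$ together with a probability distribution $\pi_{\mathcal{T}}$ over $\mathcal{T}$ probabilistically $\alpha$-approximates the metric space $(X, d)$ if

1. Every metric $\tau \in \mathcal{T}$ dominates $d$, that is, $\tau(x, y) \ge d(x, y)$ for all $x, y \in X$.

2. For every pair of points $x, y \in X$ its expected distance is not expanded by more than $\alpha$, i.e.,

$$\mathbb{E}_{\tau \sim \pi_{\mathcal{T}}}[\tau(x, y)] \le \alpha \cdot d(x, y).$$

It is a well-known fact that any metric $(X, d)$ can be probabilistically $O(\log |X|)$-approximated by a distribution of tree metrics, such that the points in $X$ are the leaves in the resulting tree [13].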

As described in Definition 2, our $k$-centered metric $d_k$ is induced by a graph composed of two layers: the set of vertices $S$ connected in a clique, and the rest of the vertices, $C \cup F$, each connected to only one vertex in $S$. Let $T$ be a random tree embedding of the set $S$ (with a metric function $d_T$). A modified instance of our problem is created by replacing the clique $S$ with its tree approximation $T$.

###### Lemma 13.

An optimum solution for CKM on the instance $d_T$ is in expectation at most $O(\log k)$ times larger than the optimum for the metric $d_k$.

###### Proof.

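Let $\phi^{OPT(d_k)}$ denote the optimum mapping of clients to facilities in the $k$-centered metric $d_k$. Consider a client $c$ and the facility $f = \phi^{OPT(d_k)}(c)$. Let $s_c$ be the center of $c$ and $s_f$ the center of $f$. The cost of connecting client $c$ to $f$ amounts to

$$d_k(c, f) = d_k(c, s_c) + d_k(s_c, s_f) + d_k(s_f, f)$$

in the metric $d_k$.

The guarantee of tree embeddings gives us an upper bound on the cost of applying the same mapping in the instance $d_T$,

$$\begin{aligned}
\mathbb{E}[d_T(c, f)] &= d_k(c, s_c) + \mathbb{E}[d_T(s_c, s_f)] + d_k(s_f, f) \\
&\le d_k(c, s_c) + O(\log k) \cdot d_k(s_c, s_f) + d_k(s_f, f) \\
&\le O(\log k) \cdot d_k(c, f).
\end{aligned}$$

This means that $\mathbb{E}[\mathrm{cost}(\phi^{OPT(d_k)}, d_T)] \le O(\log k) \cdot \mathrm{cost}(\phi^{OPT(d_k)}, d_k)$. Moreover, $\phi^{OPT(d_k)}$ might not be the optimal solution for the metric $d_T$, yet the optimal solution for $d_T$ can only have smaller cost:

$$\mathrm{cost}(\phi^{OPT(d_T)}, d_T) \le \mathrm{cost}(\phi^{OPT(d_k)}, d_T).$$

∎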

###### Theorem 14.

The CKM problem admits an $O(\log k)$-approximation algorithm with polynomial running time.

###### Proof.

After applying the probabilistic tree embedding to the graph inducing $d_k$, as presented in Lemma 13, we obtain a tree instance $d_T$. It should come as no surprise that the problem is polynomially solvable on trees, and we explain how to find the optimum solution on $d_T$ in Lemma 16. The assignment $\phi^{OPT(d_T)}$, which yields the minimum cost on the tree $T$, can now be used to match clients to facilities in the original instance. It does not incur any additional cost, as

$$\mathrm{cost}(\phi^{OPT(d_T)}, d_T) \ge \mathrm{cost}(\phi^{OPT(d_T)}, d_k) \ge \mathrm{cost}(\phi^{OPT(d_T)}, d)$$

from property (1) of Definition 12 and Lemma 6. Combining this with the bound on $\mathbb{E}[\mathrm{cost}(\phi^{OPT(d_T)}, d_T)]$ from Lemma 13 finishes the proof. ∎

### 4.1 CKM on a tree

The second ingredient of the $O(\log k)$-approximation for CKM is solving the problem exactly on trees. We will now describe a simple, exact, polynomial algorithm for that special case. In our algorithm we can assume that all the clients and facilities reside in leaves, but the principle is easy to extend to the general problem on trees.

Imagine we have a subtree $t$ of the tree instance, hanging on an edge $e_t$. Once we have decided which facilities to open inside the subtree $t$, we know if their total capacity is sufficient to serve all the clients inside $t$. If not, then we need to route some clients’ connections to the facilities outside $t$ through the edge $e_t$. However, if the facilities we have opened in $t$ have enough total capacity to serve some clients from the outside, we will connect them through the edge $e_t$ (see Figure 3).

This insight lays out the dynamic programming algorithm for us. We first turn the tree into a complete binary tree by adding dummy vertices and edges of length $0$ (which may double its size). Then, for every subtree $t$ and numbers $k' \in \{0, \dots, k\}$ and $b \in \{-n, \dots, n\}$, we compute $D(t, k', b)$.

###### Definition 15.

$D(t, k', b)$, for subtree $t$, number of facilities $k'$ and balance $b$, is the minimum cost of opening exactly $k'$ facilities in $t$ and routing exactly $b$ clients down through $e_t$ ($b < 0$ would mean that we are routing $-b$ clients up). The cost of routing through $e_t$ is counted to the top endpoint of $e_t$.

###### Lemma 16.

The CKM problem on trees admits a polynomial time exact algorithm.

###### Proof.

Computing $D(t, k', b)$ on $t$ with two children $t_1$ and $t_2$ amounts to finding $k'_1$, $k'_2$, $b_1$ and $b_2$ that minimize

$$D(t_1, k'_1, b_1) + D(t_2, k'_2, b_2),$$

such that $k'_1 + k'_2 = k'$ and $b_1 + b_2 = b$. They can be trivially found in time $O(k \cdot n)$ for a single pair $(k', b)$. Once $k'_1$, $k'_2$, $b_1$ and $b_2$ are found, we set

$$D(t, k', b) = D(t_1, k'_1, b_1) + D(t_2, k'_2, b_2) + d(e_t) \cdot |b|,$$

where $d(e_t)$ is the length of the edge $e_t$ in our tree. For a leaf $t$, $D(t, k', b)$ is defined naturally, depending on whether the leaf holds a client or a facility. Note that for a leaf with a facility, $D(t, 1, b)$ is finite also for $b$ smaller than the capacity of the facility, as the optimal solution might not use it entirely. Finally, the optimum solution to the CKM problem on the entire tree $T$ is equal to

$$\min_{k' \in \{1, \dots, k\}} D(T, k', 0).$$

∎
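The recurrence above can be sketched in Python as follows. This is a simplified illustration under assumptions: the tree is already binary, it is encoded as nested tuples (`("client", edge_length)`, `("facility", edge_length, capacity)`, `("node", edge_length, left, right)`), the root has edge length 0, and each table stores only finite entries keyed by `(k', b)`. The function name `ckm_on_tree` and this encoding are hypothetical. Combining the two child tables by a double loop is less efficient than the bound in the proof but keeps the sketch short.

```python
from math import inf


def ckm_on_tree(tree, k):
    """Optimal CKM cost on a binary tree with clients/facilities in leaves.

    Returns None if no feasible solution opens at most k facilities.
    """

    def table(t):
        kind, length = t[0], t[1]
        if kind == "client":
            # The single client must be routed up through its edge.
            return {(0, -1): length}
        if kind == "facility":
            capacity = t[2]
            entries = {(0, 0): 0}          # facility stays closed
            for b in range(capacity + 1):  # open, serving b outside clients
                entries[1, b] = length * b
            return entries
        left, right = table(t[2]), table(t[3])
        entries = {}
        for (k1, b1), cost1 in left.items():
            for (k2, b2), cost2 in right.items():
                if k1 + k2 > k:
                    continue
                key = (k1 + k2, b1 + b2)
                # |b| clients cross the edge above this subtree.
                cost = cost1 + cost2 + length * abs(b1 + b2)
                if cost < entries.get(key, inf):
                    entries[key] = cost
        return entries

    root = table(tree)
    candidates = [cost for (_, b), cost in root.items() if b == 0]
    return min(candidates) if candidates else None
```

In the test instance, opening only the capacity-2 facility forces one client to travel across the tree, while a second facility lets both clients stay in their own subtrees.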

## 5 Conclusions and open problems

We have presented a $(7+\varepsilon)$-approximation algorithm for the CKM problem, which consists of three building blocks: approximation for Uncapacitated KM, metric embedding into a simpler structure, and a parameterized algorithm working on $\ell$-centered instances.

Whereas the first and the last ingredient are almost lossless from the approximation point of view, the embedding procedure seems to be the main bottleneck for obtaining a better approximation guarantee. One can imagine that a different technique would allow to obtain a -approximation in FPT time. We believe that finding such an algorithm or ruling out its existence is an interesting research direction.

Another avenue for improvement is processing $\ell$-centered instances in time $2^{O(\ell)} n^{O(1)}$. Such a routine would reduce the running time of the whole algorithm to single exponential. In order to do so, one could replace the subroutine for Uncapacitated KM by Lin and Vitter [22] with a standard approximation algorithm that opens exactly $k$ facilities, which would moderately increase the constant in the approximation ratio.

Finally, whereas we have used the framework of $\ell$-centered instances to devise an FPT approximation, it might be possible to explore the structure of special instances further and find a polynomial time approximation algorithm. This could yield an improvement over the $O(\log k)$-approximation ratio for CKM, which remains a major open problem.

## References

• [1] K. Aardal, P. L. van den Berg, D. Gijswijt, and S. Li. Approximation algorithms for hard capacitated k-facility location problems. European Journal of Operational Research, 242(2):358–368, 2015.
• [2] V. Arya, N. Garg, R. Khandekar, A. Meyerson, K. Munagala, and V. Pandit. Local search heuristics for k-median and facility location problems. SIAM Journal on Computing, 33(3):544–562, 2004.
• [3] J. Byrka, K. Fleszar, B. Rybicki, and J. Spoerhase. Bi-factor approximation algorithms for hard capacitated k-median problems. In Proceedings of the Twenty-Sixth Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 722–736. SIAM, 2015.
• [4] J. Byrka, T. Pensyl, B. Rybicki, A. Srinivasan, and K. Trinh. An improved approximation for k-median, and positive correlation in budgeted optimization. In Proceedings of the Twenty-Sixth Annual ACM-SIAM Symposium on Discrete Algorithms, pages 737–756. SIAM, 2015.
• [5] J. Byrka, B. Rybicki, and S. Uniyal. An approximation algorithm for uniform capacitated k-median problem with capacity violation. In

Integer Programming and Combinatorial Optimization - 18th International Conference, IPCO 2016, Liège, Belgium, June 1-3, 2016, Proceedings

, pages 262–274, 2016.
• [6] P. Chalermsook, M. Cygan, G. Kortsarz, B. Laekhanukit, P. Manurangsi, D. Nanongkai, and L. Trevisan. From Gap-ETH to FPT-inapproximability: Clique, dominating set, and more. In Foundations of Computer Science (FOCS), 2017 IEEE 58th Annual Symposium on, pages 743–754. IEEE, 2017.
• [7] M. Charikar, C. Chekuri, A. Goel, and S. Guha. Rounding via trees: Deterministic approximation algorithms for group Steiner trees and k-median. In Proceedings of the Thirtieth Annual ACM Symposium on the Theory of Computing, Dallas, Texas, USA, May 23-26, 1998, pages 114–123, 1998.
• [8] M. Charikar, C. Chekuri, A. Goel, S. Guha, and S. A. Plotkin. Approximating a finite metric by a small number of tree metrics. In 39th Annual Symposium on Foundations of Computer Science, FOCS ’98, November 8-11, 1998, Palo Alto, California, USA, pages 379–388, 1998.
• [9] M. Charikar, S. Guha, É. Tardos, and D. B. Shmoys. A constant-factor approximation algorithm for the k-median problem. In Proceedings of the Thirty-First Annual ACM Symposium on Theory of Computing, pages 1–10. ACM, 1999.
• [10] J. Chuzhoy and Y. Rabani. Approximating k-median with non-uniform capacities. In Proceedings of the Sixteenth Annual ACM-SIAM Symposium on Discrete Algorithms, pages 952–958. Society for Industrial and Applied Mathematics, 2005.
• [11] M. Cygan, M. Hajiaghayi, and S. Khuller. LP rounding for k-centers with non-uniform hard capacities. In Foundations of Computer Science (FOCS), 2012 IEEE 53rd Annual Symposium on, pages 273–282. IEEE, 2012.
• [12] H. G. Demirci and S. Li. Constant approximation for capacitated k-median with (1+ε)-capacity violation. In 43rd International Colloquium on Automata, Languages, and Programming, ICALP 2016, July 11-15, 2016, Rome, Italy, pages 73:1–73:14, 2016.
• [13] J. Fakcharoenphol, S. Rao, and K. Talwar. A tight bound on approximating arbitrary metrics by tree metrics. In Proceedings of the 35th Annual ACM Symposium on Theory of Computing, June 9-11, 2003, San Diego, CA, USA, pages 448–455, 2003.
• [14] F. V. Fomin, D. Lokshtanov, N. Misra, and S. Saurabh. Planar F-deletion: Approximation, kernelization and optimal FPT algorithms. In Proceedings of the 2012 IEEE 53rd Annual Symposium on Foundations of Computer Science, FOCS ’12, pages 470–479, Washington, DC, USA, 2012. IEEE Computer Society.
• [15] A. Gupta, E. Lee, and J. Li. An FPT algorithm beating 2-approximation for k-cut. In Proceedings of the Twenty-Ninth Annual ACM-SIAM Symposium on Discrete Algorithms, pages 2821–2837. Society for Industrial and Applied Mathematics, 2018.
• [16] A. Gupta, E. Lee, J. Li, P. Manurangsi, and M. Wlodarczyk. Losing treewidth by separating subsets. CoRR, abs/1804.01366, 2018.
• [17] M. R. Korupolu, C. G. Plaxton, and R. Rajaraman. Analysis of a local search heuristic for facility location problems. Journal of Algorithms, 37(1):146–188, 2000.
• [18] E. Lee. Partitioning a graph into small pieces with applications to path transversal. Mathematical Programming, 2018. Preliminary version in SODA 2017.
• [19] S. Li. On uniform capacitated k-median beyond the natural LP relaxation. In Proceedings of the Twenty-Sixth Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 696–707. SIAM, 2015.
• [20] S. Li. Approximating capacitated k-median with (1+ε)k open facilities. In Proceedings of the Twenty-Seventh Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), To Appear, 2016.
• [21] S. Li and O. Svensson. Approximating k-median via pseudo-approximation. In Proceedings of the Forty-Fifth Annual ACM Symposium on Theory of Computing, pages 901–910. ACM, 2013.
• [22] J.-H. Lin and J. S. Vitter. ε-approximations with minimum packing constraint violation (extended abstract). In Proceedings of the Twenty-Fourth Annual ACM Symposium on Theory of Computing, STOC ’92, pages 771–782, New York, NY, USA, 1992. ACM.
• [23] K. C. S., B. Laekhanukit, and P. Manurangsi. On the parameterized complexity of approximating dominating set. In Proceedings of the 50th Annual ACM SIGACT Symposium on Theory of Computing, STOC 2018, pages 1283–1296, New York, NY, USA, 2018. ACM.
• [24] A. Schrijver. Combinatorial Optimization - Polyhedra and Efficiency. Springer, 2003.