1 Introduction
1.1 Background and Motivation
In the min-max $r$-gathering problem, we are given a metric space that contains a set of users and a set of facilities. We open some of the facilities and assign each user to an open facility so that each open facility receives at least $r$ users. The objective is to minimize the maximum distance between a user and its assigned facility [6].
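To make the definition concrete, the following minimal sketch evaluates a given assignment. The function name `gathering_cost`, the dictionary layout, and the toy instance on a line are assumptions introduced here for illustration; they are not part of the original formulation.

```python
from typing import Dict, Optional

def gathering_cost(dist: Dict[str, Dict[str, float]],
                   assignment: Dict[str, str],
                   r: int) -> Optional[float]:
    """Return the min-max cost of `assignment`, or None if it is infeasible.

    dist[u][f] is the distance between user u and facility f.
    A facility counts as open exactly when some user is assigned to it.
    """
    load: Dict[str, int] = {}
    for facility in assignment.values():
        load[facility] = load.get(facility, 0) + 1
    # Every user must be assigned, and every open facility needs at least r users.
    if set(assignment) != set(dist) or any(c < r for c in load.values()):
        return None
    return max((dist[u][f] for u, f in assignment.items()), default=0.0)

# Hypothetical instance: four users and two facilities on a line.
user_pos = {"u0": 0.0, "u1": 1.0, "u2": 2.0, "u3": 10.0}
facility_pos = {"f1": 1.0, "f9": 9.0}
dist = {u: {f: abs(x - y) for f, y in facility_pos.items()}
        for u, x in user_pos.items()}

all_to_f1 = {u: "f1" for u in user_pos}
split = {"u0": "f1", "u1": "f1", "u2": "f1", "u3": "f9"}
print(gathering_cost(dist, all_to_f1, r=2))  # feasible, the far user pays 9.0
print(gathering_cost(dist, split, r=2))      # f9 has a single user: infeasible
```

In this toy instance the lower bound forces the distant user to travel to the far facility, which is exactly the tension the objective captures.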
This problem has an application to shelter evacuation [4]: there are people and evacuation shelters, and we divide the people among the shelters so that everyone can evacuate in the minimum possible time. Each shelter must receive at least $r$ people so that the evacuees can sustain themselves there. The problem also has an application to privacy protection [11]. A set of clusters satisfies $r$-anonymity if each cluster has at least $r$ users; this condition prevents personal information from being reconstructed from the clustering.
Several tractability and intractability results are known. There is a polynomial-time 3-approximation algorithm for a general metric space, and no better approximation ratio can be achieved unless P=NP [6]. If the metric space is a line, we can solve the problem exactly by dynamic programming (DP) [4, 7, 9], where the fastest algorithm runs in linear time [5]. When the metric space is a spider, which is a metric space constructed by joining half-lines at their endpoints, Ahmed et al. [3] proposed a fixed-parameter tractable algorithm parameterized by $r$ and the degree of the center. In our co-submitted paper [8], the authors showed that the problem is NP-hard if the metric space is a spider, and that it admits a fixed-parameter tractable algorithm parameterized by the degree of the center.
1.2 Our Contribution
The goal of this study is to explore the boundary of tractability of the min-max $r$-gathering problem. Specifically, we consider the problem on a tree, which is a natural graph class that contains spiders as a subclass.
It is easy to see that the problem does not admit a fully polynomial-time approximation scheme (FPTAS) unless P=NP (see Proposition 2.1). Therefore, the best possible positive result that we can expect is a polynomial-time approximation scheme (PTAS). Our main contribution is to establish a PTAS for this problem as follows.
Theorem 1.2.
There exists an algorithm for the min-max $r$-gathering problem on a tree such that, for any $\epsilon > 0$, it outputs a solution with an approximation ratio of $1+\epsilon$ in time polynomial in the input size for fixed $\epsilon$.
The proposed algorithm seeks the optimal value by a binary search, and in each step, it solves the corresponding decision problem by a DP on a tree. Here, the most difficult part is establishing an algorithm for the decision problem.
This technique can also be applied to other problems, for example, the $(r, \epsilon)$-gathering problem and the $r$-gathering problem with a constraint on the number of open facilities. It can also be shown by the same reduction that these problems are NP-hard and do not admit an FPTAS unless P=NP. Thus, these results are also tight.
On the other hand, there are variants of $r$-gathering that can be solved exactly in polynomial time on a tree. We provide polynomial-time algorithms via DP for two problems: the min-sum $r$-gathering problem and min-max (and min-sum) $r$-gathering with proximity requirement.
1.3 Organization
The rest of the paper is organized as follows. In Section 2, we give a PTAS for the min-max $r$-gathering problem on a tree. We also show problems that admit essentially the same PTAS. In Subsection 3.1, we provide a polynomial-time algorithm that solves the min-sum version of the $r$-gathering problem exactly on a tree. Finally, in Subsection 3.2, we provide a polynomial-time algorithm that solves min-max (and min-sum) $r$-gathering with proximity requirement exactly on a tree.
2 PTAS for min-max $r$-Gathering on Tree
A weighted tree $T = (V(T), E(T); \ell)$ is an undirected connected graph without cycles, where $V(T)$ is the set of vertices, $E(T)$ is the set of edges, and $\ell \colon E(T) \to \mathbb{R}_{\ge 0}$ is the nonnegative edge length. $T$ forms a metric space by the tree metric $d_T(u, v)$, which is the sum of the edge lengths on the unique simple $u$-$v$ path for any vertices $u, v \in V(T)$. We consider the min-max $r$-gathering problem on this metric space.
Without loss of generality, we assume that all users and facilities are located on different vertices; otherwise, we add new vertices connected by edges of length zero and move the users and facilities onto the new vertices. By performing similar operations, we also assume that $T$ is a full binary tree rooted at a special vertex $\mathrm{root}$ (that is, we can convert $T$ into a rooted tree in which every vertex has zero or two children). These operations increase the number of vertices (and edges) of the tree only by a constant factor, so they do not affect the time complexity of our algorithms. We denote the subtree of $T$ rooted at $v$ by $T_v$.
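As an illustration of the tree metric, the sketch below computes the distances from one vertex by a traversal that accumulates edge lengths along the unique simple paths. The adjacency-list representation and the name `tree_distances` are assumptions made here, not notation from the paper.

```python
from typing import Dict, List, Tuple

Adjacency = Dict[str, List[Tuple[str, float]]]

def tree_distances(adj: Adjacency, source: str) -> Dict[str, float]:
    """Tree metric from `source`: sum of edge lengths on the unique simple path."""
    dist = {source: 0.0}
    stack = [source]
    while stack:
        x = stack.pop()
        for y, length in adj[x]:
            if y not in dist:  # in a tree, each vertex is reached exactly once
                dist[y] = dist[x] + length
                stack.append(y)
    return dist

# Hypothetical weighted tree with edges a-b (2), b-c (3), b-d (1.5).
adj: Adjacency = {
    "a": [("b", 2.0)],
    "b": [("a", 2.0), ("c", 3.0), ("d", 1.5)],
    "c": [("b", 3.0)],
    "d": [("b", 1.5)],
}
from_c = tree_distances(adj, "c")
print(from_c["a"], from_c["d"])  # lengths of the c-a and c-d paths
```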
2.1 Hardness of the Problem
We first see that the problem does not admit an FPTAS. This is a simple consequence of our co-submitted paper [8], which proves the NP-hardness of the problem on a spider.
Proposition 2.1.
There is no FPTAS for the min-max $r$-gathering problem on a spider unless P=NP.
Proof.
In [8], the authors proved that the min-max $r$-gathering problem is NP-hard even if the input is a spider, the edge lengths are integral, and the diameter of the spider is polynomially bounded in the input size. Let us take such an instance. If there is an FPTAS for the min-max $r$-gathering problem on a spider, then by taking $\epsilon$ smaller than the reciprocal of this bound, we get an optimal solution in polynomial time, because the optimal value is an integer no larger than the diameter. This contradicts the hardness. ∎
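The rounding step in this argument can be made explicit as follows. The symbols $D$ (a polynomial bound on the diameter), $\mathrm{ALG}$ (the cost returned by the hypothetical FPTAS), and the constant $c$ are names introduced here for illustration only.

\[
\mathrm{OPT} \;\le\; \mathrm{ALG} \;\le\; (1+\epsilon)\,\mathrm{OPT}
\;=\; \mathrm{OPT} + \frac{\mathrm{OPT}}{cD}
\;\le\; \mathrm{OPT} + \frac{1}{c} \;<\; \mathrm{OPT} + 1
\qquad \text{for } \epsilon = \frac{1}{cD},\ c > 1 .
\]

Since every edge length is integral, both $\mathrm{ALG}$ and $\mathrm{OPT}$ are integers, so $\mathrm{ALG} = \mathrm{OPT}$. The running time stays polynomial because $1/\epsilon = cD$ is polynomial in the input size.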
2.2 Algorithm. Part 1: Binary Search
In the following subsections, we develop a PTAS for the problem. We employ a standard practice for min-max problems: we guess the optimal value by binary search and solve the corresponding decision problem, which asks whether the instance has a solution whose objective value is at most the guessed value.
First, we run Armon's 3-approximation algorithm [6] to obtain a value $B$ such that $\mathrm{OPT} \le B \le 3\,\mathrm{OPT}$ holds. Then we set $[B/3, B]$ as the range for the binary search. This step is needed to make the algorithm run in strongly polynomial time.
For the binary search, we design the following oracle $\mathrm{Solve}(I, b, \delta)$: given an instance $I$, a threshold $b$, and a positive number $\delta$, it reports YES if $\mathrm{OPT}(I) \le b$, and NO if $\mathrm{OPT}(I) > (1+\delta)\,b$. If $b < \mathrm{OPT}(I) \le (1+\delta)\,b$, then either answer is acceptable. Our oracle also outputs the corresponding solution as a certificate if it returns YES. Note that we cannot set $\delta = 0$, since the oracle would then solve the decision version of the min-max $r$-gathering problem, which is NP-hard on a tree [8].
If we have such an oracle, we can construct a PTAS as shown in Algorithm 1. The correctness of this algorithm is as follows.
Assume that there is a deterministic strongly polynomial-time oracle Solve as described above. Then, Algorithm 1 gives a solution to the min-max $r$-gathering problem whose cost is at most $(1+\epsilon)\,\mathrm{OPT}$, where $\mathrm{OPT}$ denotes the optimal value, in strongly polynomial time.
Proof.
By the definition of Solve and the algorithm, during the algorithm, always returns YES, and returns NO unless is YES and . Thus, we have . Therefore, the algorithm outputs the solution with cost at most , which is at most , because
The algorithm terminates in $O(\log(1/\epsilon))$ steps because the gap is halved in each step. This completes the proof. ∎
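Algorithm 1 itself is not reproduced in this section, so the following is only a minimal sketch of a binary search of this kind, not the authors' pseudocode. It assumes an oracle `solve(threshold, delta)` that returns a certificate `(cost, solution)` of cost at most $(1+\delta)$ times the threshold on YES and `None` on NO; the choice $\delta = \epsilon/3$, the stopping rule, and the toy oracle are assumptions made for illustration (with $0 < \epsilon \le 1$).

```python
from typing import Callable, Optional, Tuple

Answer = Optional[Tuple[float, str]]  # (certificate cost, certificate) or None for NO

def approximate_by_binary_search(solve: Callable[[float, float], Answer],
                                 upper: float, eps: float) -> Tuple[float, str]:
    """Binary search over [upper/3, upper], assuming OPT <= upper <= 3*OPT."""
    delta = eps / 3.0
    lo, hi = upper / 3.0, upper        # invariant: lo <= OPT and solve(hi) said YES
    best = solve(hi, delta)
    assert best is not None, "the oracle must answer YES when OPT <= threshold"
    while hi > (1.0 + delta) * lo:
        mid = (lo + hi) / 2.0
        answer = solve(mid, delta)
        if answer is not None:         # YES: a certificate of cost <= (1+delta)*mid
            hi, best = mid, answer
        else:                          # NO: this can only happen when OPT > mid
            lo = mid
    return best                        # cost <= (1+delta)^2 * OPT <= (1+eps) * OPT

def make_toy_oracle(opt: float) -> Callable[[float, float], Answer]:
    """Hypothetical oracle that is as lenient and as lossy as the contract allows."""
    def solve(threshold: float, delta: float) -> Answer:
        if opt <= (1.0 + delta) * threshold:
            return ((1.0 + delta) * threshold, "certificate")
        return None
    return solve

cost, _ = approximate_by_binary_search(make_toy_oracle(7.0), upper=20.0, eps=0.3)
print(cost)
```

The loop halves the gap between `lo` and `hi` in every iteration and stops once their ratio is at most $1+\delta$, which is what bounds the number of oracle calls.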
2.3 Algorithm. Part 2: Rounding Distance
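In this and the next subsections, we propose a DP algorithm for the oracle Solve. Our algorithm maintains “distance information” in the indices of the DP table. For this purpose, we round the distances so that all the vertices (and thus the users and facilities) are located at points whose distance from the root is a multiple of a positive number $t$, as follows.
For each edge $e = (u, v)$, where $u$ is closer to the root, we define the rounded length by $\bar\ell(e) = \lfloor d_T(\mathrm{root}, v)/t \rfloor - \lfloor d_T(\mathrm{root}, u)/t \rfloor$. Intuitively, this moves all the vertices “toward the root” and regularizes the edge lengths into integers. Then, we define the rounded distance $\bar d_T$ as the tree metric on $\bar T = (V(T), E(T); \bar\ell)$.
This rounding process changes the optimal value only slightly. For any pair of vertices $u, v$, $|d_T(u, v) - t\,\bar d_T(u, v)| < 2t$ holds. In particular, $|\mathrm{OPT}(T) - t\,\mathrm{OPT}(\bar T)| < 2t$.
Proof.
Let $w$ be the lowest common ancestor of $u$ and $v$. Then, $w$ is on the $u$-$v$ path; thus, $d_T(u, v) = d_T(u, w) + d_T(w, v)$ and $\bar d_T(u, v) = \bar d_T(u, w) + \bar d_T(w, v)$ hold. Since $\bar d_T(\mathrm{root}, x) = \lfloor d_T(\mathrm{root}, x)/t \rfloor$ and hence $d_T(\mathrm{root}, x) - t < t\,\bar d_T(\mathrm{root}, x) \le d_T(\mathrm{root}, x)$ for every vertex $x$, we have $|d_T(u, w) - t\,\bar d_T(u, w)| < t$. We also have $|d_T(w, v) - t\,\bar d_T(w, v)| < t$ by symmetry. Thus $|d_T(u, v) - t\,\bar d_T(u, v)| < 2t$ holds. Since the cost of the min-max $r$-gathering problem is the maximum length of some paths, the second statement follows from the first statement. ∎
This lemma implies that an algorithm that determines whether $\bar T$ has a solution with cost at most a suitably rounded threshold works as the oracle Solve if $t$ is a sufficiently small constant fraction of $\delta$ times the threshold.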
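The rounding can be checked numerically on a small example. The sketch below assumes that the rounded length of an edge is the difference of the floored root distances of its endpoints, measured in units of a rounding unit `t`; the function names, the adjacency-list representation, and the toy tree are hypothetical.

```python
import math
from itertools import combinations
from typing import Dict, List, Tuple

Adjacency = Dict[str, List[Tuple[str, float]]]

def distances_from(adj: Adjacency, source: str) -> Dict[str, float]:
    """Tree distances from `source` (sum of edge lengths on the unique path)."""
    dist = {source: 0.0}
    stack = [source]
    while stack:
        x = stack.pop()
        for y, length in adj[x]:
            if y not in dist:
                dist[y] = dist[x] + length
                stack.append(y)
    return dist

def round_tree(adj: Adjacency, root: str, t: float) -> Adjacency:
    """Snap every vertex toward the root onto a multiple of t; lengths become integers."""
    depth = distances_from(adj, root)
    level = {v: math.floor(depth[v] / t) for v in adj}
    return {u: [(v, abs(level[u] - level[v])) for v, _ in adj[u]] for u in adj}

# Hypothetical tree rooted at "a".
edges = [("a", "b", 3.0), ("b", "c", 4.0), ("b", "d", 1.0), ("a", "e", 5.0)]
adj: Adjacency = {v: [] for v in "abcde"}
for x, y, w in edges:
    adj[x].append((y, w))
    adj[y].append((x, w))

t = 2.0
rounded = round_tree(adj, "a", t)
# Largest deviation between a true distance and t times the rounded distance.
worst = max(abs(distances_from(adj, u)[v] - t * distances_from(rounded, u)[v])
            for u, v in combinations(adj, 2))
print(worst, "< 2t =", 2 * t)
```

On this instance the deviation stays strictly below twice the rounding unit for every pair of vertices, in line with the bound stated in the text.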
2.4 Dynamic Programming
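Now we propose an algorithm to determine whether the rounded instance has a solution with cost at most a given threshold. Since all the rounded edge lengths are integral, without loss of generality, we replace the threshold by its integer part. An important observation is that the resulting threshold is bounded by a constant depending only on $\delta$, since the rounding unit is proportional to the original threshold.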
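One consistent choice of constants can be worked out as follows. This is a sketch under the assumption that rounding with unit $t$ changes every distance by less than $2t$; the threshold $b$, the error parameter $\delta$, the rounded threshold $\bar b$, and the fraction $1/4$ are choices made here for illustration and need not match the original constants.

\[
t = \frac{\delta b}{4}, \qquad
\bar b = \left\lfloor \frac{b}{t} \right\rfloor + 2 = \left\lfloor \frac{4}{\delta} \right\rfloor + 2 = O(1/\delta).
\]

If $\mathrm{OPT} \le b$, then the rounded optimum satisfies $t\,\overline{\mathrm{OPT}} < \mathrm{OPT} + 2t \le b + 2t$, so $\overline{\mathrm{OPT}} < b/t + 2$, and integrality gives $\overline{\mathrm{OPT}} \le \bar b$. Conversely, a solution of rounded cost at most $\bar b$ has true cost less than $t\,\bar b + 2t \le b + 4t = (1+\delta)\,b$. Hence deciding the rounded instance against $\bar b$ meets the oracle's contract, and $\bar b$ depends only on $\delta$.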
Our algorithm is a dynamic programming on a tree. For a vertex $v$ and arrays $P = (p_0, p_1, \dots)$ and $Q = (q_0, q_1, \dots)$, we define a boolean value $\mathrm{DP}[v][P][Q]$. $\mathrm{DP}[v][P][Q]$ is true if there is a way to

open some facilities in the subtree $T_v$ rooted at $v$, and

assign some users in $T_v$ to the opened facilities so that

for all $i$, there are $p_i$ unassigned users in $T_v$ at distance $i$ from $v$, and no other users are unassigned, and

for all $i$, we will assign $q_i$ users outside $T_v$ at distance $i$ from $v$ to open facilities in $T_v$,
and false otherwise. $\mathrm{DP}[\mathrm{root}][\mathbf{0}][\mathbf{0}]$ is the solution to the problem. The entries of $P$ and $Q$ are nonnegative integers bounded by the number of users, and both arrays have constant length; thus, the number of DP states remains polynomial in the size of the input.
The remaining task is to write down the transitions. For arrays and , we denote by the elementwise addition, by the elementwise subtraction, and by the elementwise inequality. We denote by the array produced by shifting by rightwards if and the array produced by shifting by leftwards if ; the overflowed entries are discarded. Let be the two children of . We make a formula to calculate from the DP values for children. Let the cost of the edges in be . Then, is true if and only if

there are arrays of integers whose lengths are such that

is if there is a user on and otherwise, and

the sum of all elements in is zero or at least if there is a facility on and zero otherwise, and

if is nonzero, the sum of indices of last nonzero elements of and are at most , and

and , and

, and

for , for , for , for , and

, and

.
The meanings of the auxiliary variables are as follows.

The th entry of (resp. ) denotes the number of users in (resp. ) who are distant from by distance and assigned to the facility in (resp. ).

and decide whether we assign the user on to an open facility in or remain unassigned.

The th entry of (resp. ) denotes the number of users in (resp. outside of ) who are assigned to the facility on and distant from by distance .
We can enumerate all the possibilities of the arrays in polynomial time. Thus, the total time complexity is polynomial. We can reconstruct the solution by storing which transition candidates are chosen, so we obtain the desired algorithm. This proves Theorem 1.2.
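The array operations used in the transitions above can be sketched as follows. Index $i$ of an array stands for rounded distance $i$ from the current vertex, so moving from a child to its parent across an edge of rounded length $k$ corresponds to a shift by $k$. The function names and the convention of filling vacated positions with zeros are assumptions made here.

```python
from typing import List

def shift(arr: List[int], k: int) -> List[int]:
    """Shift `arr` rightwards by k if k >= 0, leftwards by -k otherwise.

    Entries pushed past either end are discarded; vacated positions become 0.
    """
    n = len(arr)
    if k >= 0:
        k = min(k, n)
        return [0] * k + arr[:n - k]
    k = min(-k, n)
    return arr[k:] + [0] * k

def add(a: List[int], b: List[int]) -> List[int]:
    """Elementwise addition of two arrays of equal length."""
    return [x + y for x, y in zip(a, b)]

def sub(a: List[int], b: List[int]) -> List[int]:
    """Elementwise subtraction of two arrays of equal length."""
    return [x - y for x, y in zip(a, b)]

def leq(a: List[int], b: List[int]) -> bool:
    """Elementwise inequality a <= b."""
    return all(x <= y for x, y in zip(a, b))

counts = [1, 2, 3, 4]      # counts[i] users at rounded distance i from a child
print(shift(counts, 1))    # the same users seen from one unit farther away
print(shift(counts, -1))   # the same users seen from one unit closer
```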
2.5 Variants
Our technique can be used for other variants of the gathering problem. In the $(r, \epsilon)$-gathering problem [1], at most an $\epsilon$ fraction of the users may be left unassigned. We can construct an algorithm to solve it just by adding the number of ignored users in $T_v$ to the DP states of vertex $v$. Note that this problem is also NP-hard and does not admit an FPTAS, because we can convert an $r$-gathering instance to an equivalent $(r, \epsilon)$-gathering instance just by adding an appropriate number of users at sufficiently distant points.
We can treat the constraint on the number of open facilities just by adding the number of open facilities in $T_v$ to the DP states of vertex $v$. Note that this problem is also NP-hard and does not admit an FPTAS, because in the gadget construction described in our other paper [8] we only have to decide whether there is a solution with a prescribed number of clusters, determined by the number of “long legs” of the spider.
Here we give a theorem to conclude this subsection.
Both min-max $(r, \epsilon)$-gathering and min-max $r$-gathering with constraints on the number of open facilities admit strongly polynomial-time approximation schemes.
We can also straightforwardly combine these additional states to solve combined problems.
3 Polynomial-Time Algorithms for other variants
In contrast to min-max $r$-gathering, there are variants that can be solved in polynomial time on a tree. In this section, we introduce them.
3.1 min-sum $r$-Gathering and Lower Bounded Facility Location Problem
Now we consider another objective function: min-sum instead of min-max. We can also introduce a cost $c_f$ to open each facility $f$: the total cost is the sum of the distances between the users and their assigned facilities, plus the sum of $c_f$ over all open facilities. In this setting, the problem is the so-called lower bounded facility location problem [10]. For the general metric case, a 448-approximation algorithm was given in [10]. Later, the approximation ratio was improved to 82.6 [2].
Unlike the min-max case, we can solve this problem exactly on a tree in polynomial time. For each vertex $v$ and an integer $k$ such that $-n \le k \le n$, where $n$ is the number of users, let us define the value $\mathrm{DP}[v][k]$ as the minimum total cost in the following situation.

If $k \ge 0$, all but $k$ users in $T_v$ are assigned to facilities in $T_v$, every open facility in $T_v$ has at least $r$ users, and we will assign the remaining $k$ users to facilities outside $T_v$. In other words, $k$ users go upwards from $v$, and no users go downwards to $v$.

Otherwise, all users in $T_v$ are assigned to facilities in $T_v$, and we will assign $-k$ additional users from outside $T_v$ to the facilities in $T_v$. In other words, $-k$ users go downwards to $v$, and no users go upwards from $v$.
We want the value $\mathrm{DP}[\mathrm{root}][0]$. The following observation ensures that we can obtain an optimal solution by calculating the DP values in a bottom-up way.
There is an optimal solution in which, for each edge $e$, all users who pass through $e$ on the way to their assigned facilities pass through it in the same direction.
Proof.
Assume that users $u_1, u_2$ go to facilities $f_1, f_2$, respectively, and that they pass through the edge $e$ in opposite directions. Then, by reassigning $u_1$ to $f_2$ and $u_2$ to $f_1$, we can decrease the total number of edges traversed by all users without increasing the total cost or breaking feasibility. ∎
Let us write down the transitions. Denote the two children of $v$ by $c_1, c_2$, and the distance between $v$ and $c_i$ by $d_i$. We also denote the number of users on $v$ by $n_v$. Then, $\mathrm{DP}[v][k]$ is calculated by
\[ \mathrm{DP}[v][k] = \min_{k_1 + k_2 + n_v = k} \bigl( \mathrm{DP}[c_1][k_1] + |k_1|\,d_1 + \mathrm{DP}[c_2][k_2] + |k_2|\,d_2 \bigr) \]
if $v$ contains no facilities. If $v$ contains a facility $f$, we also decide whether to open $f$. Thus, we additionally take a minimum with the value obtained by paying the opening cost of $f$ and assigning at least $r$ users to it. We can implement this algorithm to run in polynomial time, and we get the following theorem.
minsum gathering problem and lower bounded facility location problem on a tree admit an exact time algorithm.
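The bottom-up computation can be sketched as follows. This is a simplified illustration rather than the paper's algorithm: it handles arbitrary rooted trees by merging children one at a time, stores each table as a dictionary keyed by the net number of users crossing the edge to the parent (positive for upwards, negative for downwards), and makes no attempt to match the stated running time. All names and the input format are assumptions.

```python
import math
from typing import Dict, List

INF = math.inf

def min_sum_gathering(children: Dict[str, List[str]], length: Dict[str, float],
                      users: Dict[str, int], open_cost: Dict[str, float],
                      r: int, root: str) -> float:
    """Minimum of (sum of user-facility distances) + (opening costs).

    length[c] is the length of the edge from c to its parent, users[v] the
    number of users on v, and open_cost[v] the opening cost of the facility
    on v (vertices without a facility are absent from open_cost).
    """
    total = sum(users.values())

    def solve(v: str) -> Dict[int, float]:
        table = {users.get(v, 0): 0.0}   # balance at v before looking at children
        for c in children.get(v, []):
            child = solve(c)
            merged: Dict[int, float] = {}
            for a, cost_a in table.items():
                for k, cost_k in child.items():
                    # |k| users cross the edge (c, v), all in the same direction
                    cand = cost_a + cost_k + abs(k) * length[c]
                    if cand < merged.get(a + k, INF):
                        merged[a + k] = cand
            table = merged
        result = dict(table)             # option: keep the facility on v closed
        if v in open_cost:
            for a, cost_a in table.items():
                for m in range(r, total + 1):   # option: open it with m >= r users
                    cand = cost_a + open_cost[v]
                    if cand < result.get(a - m, INF):
                        result[a - m] = cand
        # More than `total` users can never cross a single edge.
        return {k: c for k, c in result.items() if abs(k) <= total}

    return solve(root).get(0, INF)

# Hypothetical path a - b - c with one user on a, two users on c,
# and facilities on b (opening cost 1) and c (opening cost 10).
children = {"a": ["b"], "b": ["c"]}
length = {"b": 2.0, "c": 3.0}
users = {"a": 1, "c": 2}
open_cost = {"b": 1.0, "c": 10.0}
print(min_sum_gathering(children, length, users, open_cost, r=2, root="a"))
```

In this toy instance, opening only the middle facility and sending all three users there is optimal; opening both facilities is impossible because three users cannot give two facilities at least two users each.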
3.2 Proximity Requirement
In real applications, it is natural to assume that users go to their nearest open facilities. This requirement is called the proximity requirement. It is discussed by Armon [6] for the min-max $r$-gathering problem, where an approximation algorithm is given. We assume that for every user $u$, there is no tie among the distances from $u$ to the facilities. This ensures that each user uniquely determines the facility to go to. In particular, there is a positive distance between any two distinct facilities.
Unlike the vanilla $r$-gathering problem, we can solve this problem exactly in polynomial time on a tree. The key observation is the following fact.
Lemma 1.
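Assume that users $u_1, u_2$ go to facilities $f_1, f_2$, respectively, in a feasible solution. If the $u_1$-$f_1$ path and the $u_2$-$f_2$ path have a common point, then $f_1 = f_2$.
Proof.
Denote this common point by $p$. Without loss of generality, we can assume that $d(p, f_1) \le d(p, f_2)$. Since $p$ is on the $u_2$-$f_2$ path, $d(u_2, f_1) \le d(u_2, p) + d(p, f_1) \le d(u_2, p) + d(p, f_2) = d(u_2, f_2)$ holds. Because there is no tie and $u_2$ goes to its nearest open facility, this means that both $u_1$ and $u_2$ should go to $f_1$. ∎
By the above lemma, we can argue that if there are two users who go to the same facility, so do all the users between them. From now on, we construct an algorithm by dynamic programming.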
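To illustrate how the proximity requirement changes feasibility, the sketch below evaluates a set of open facilities when every user must go to its nearest open facility. The function name, the distance-table layout, and the toy instance are assumptions introduced for illustration, and ties are assumed not to occur.

```python
from typing import Dict, List, Optional

def proximity_cost(dist: Dict[str, Dict[str, float]],
                   open_facilities: List[str], r: int) -> Optional[float]:
    """Min-max cost when each user goes to its nearest open facility.

    Returns None if no facility is open or if some open facility
    receives fewer than r users.
    """
    if not open_facilities:
        return None
    load = {f: 0 for f in open_facilities}
    worst = 0.0
    for row in dist.values():
        nearest = min(open_facilities, key=lambda f: row[f])
        load[nearest] += 1
        worst = max(worst, row[nearest])
    if any(c < r for c in load.values()):
        return None
    return worst

# Hypothetical instance: four users and two facilities on a line.
user_pos = {"u0": 0.0, "u1": 1.0, "u2": 2.0, "u3": 10.0}
facility_pos = {"f1": 1.0, "f9": 9.0}
dist = {u: {f: abs(x - y) for f, y in facility_pos.items()}
        for u, x in user_pos.items()}

print(proximity_cost(dist, ["f1", "f9"], r=2))  # only u3 is nearest to f9
print(proximity_cost(dist, ["f1"], r=2))        # everyone goes to f1
```

Here the set of open facilities fully determines the assignment, so opening both facilities is infeasible for a lower bound of two even though each facility could have been filled if users were free to choose.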
For each vertex , facility , and integer , we calculate the value , which represents the minimum possible cost to assign all users in and decide whether to open each facilities in and , in situation

there are at least users assigned to ,

the nearest open facility from is ,

users in is assigned to the facilities in or ,

and all open facilities in but have at least users.
If there is no solution that satisfies the above conditions, this value is $\infty$. We calculate these values in a bottom-up way.
We want the minimum value among all facility . Let us write down the transitions. Let two children of the vertex be and , and the number of users on vertex be . Let be when there are users on and when there is no user on . is calculated by the following minimum.

for all . That corresponds to the case remaining users in are assigned in .

for all and facility , which satisfies and . That corresponds to the case remaining users in are assigned to and we finish to choose users assigned to .

for all and facility , which satisfies and . That corresponds to the opposite case described above.

for all facilities , which satisfies and . That corresponds to the case which we finish to choose the users assigned to .
We can calculate all these transitions in time for each vertex , so we can solve this problem in time. Note that, minsum version of this problem can be solved in the same way. Here we conclude this subsection by the following theorem.
minmax and minsum gathering with proximity requirement admit an exact time algorithm.
References
 [1] Gagan Aggarwal, Rina Panigrahy, Tomás Feder, Dilys Thomas, Krishnaram Kenthapadi, Samir Khuller, and An Zhu. Achieving anonymity via clustering. ACM Transactions on Algorithms, 6(3):49:1–49:19, 2010.
 [2] Sara Ahmadian and Chaitanya Swamy. Improved approximation guarantees for lower-bounded facility location. In International Workshop on Approximation and Online Algorithms, pages 257–271. Springer, 2012.
 [3] Shareef Ahmed, Shinichi Nakano, and Md Saidur Rahman. r-gatherings on a star. In Proceedings of International Workshop on Algorithms and Computation, pages 31–42. Springer, 2019.
 [4] Toshihiro Akagi and Shinichi Nakano. On r-gatherings on the line. In Proceedings of International Workshop on Frontiers in Algorithmics, pages 25–32. Springer, 2015.
 [5] Sarker Anik, Sung Wingkin, and Rahman Mohammad Sohel. A linear time algorithm for the r-gathering problem on the line (extended abstract). In Proceedings of International Workshop on Algorithms and Computation, pages 56–66. Springer, 2019.
 [6] Amitai Armon. On min–max r-gatherings. Theoretical Computer Science, 412(7):573–582, 2011.
 [7] Yijie Han and Shinichi Nakano. On r-gatherings on the line. In Proceedings of International Conference on Foundations of Computer Science, pages 99–104, 2016.
 [8] Soh Kumabe and Takanori Maehara. r-gather clustering and r-gathering on spider: FPT algorithms and hardness.
 [9] Shinichi Nakano. A simple algorithm for r-gatherings on the line. In Proceedings of International Workshop on Algorithms and Computation, pages 1–7. Springer, 2018.
 [10] Zoya Svitkina. Lower-bounded facility location. ACM Transactions on Algorithms, 6(4):69, 2010.
 [11] Latanya Sweeney. k-anonymity: A model for protecting privacy. International Journal of Uncertainty, Fuzziness and Knowledge-Based Systems, 10(05):557–570, 2002.