Fast Fencing

03/31/2018
by Mikkel Abrahamsen, et al.

We consider very natural "fence enclosure" problems studied by Capoyleas, Rote, and Woeginger and Arkin, Khuller, and Mitchell in the early 90s. Given a set S of n points in the plane, we aim at finding a set of closed curves such that (1) each point is enclosed by a curve and (2) the total length of the curves is minimized. We consider two main variants. In the first variant, we pay a unit cost per curve in addition to the total length of the curves. An equivalent formulation of this version is that we have to enclose n unit disks, paying only the total length of the enclosing curves. In the other variant, we are allowed to use at most k closed curves and pay no cost per curve. For the variant with at most k closed curves, we present an algorithm that is polynomial in both n and k. For the variant with unit cost per curve, or unit disks, we present a near-linear time algorithm. Capoyleas, Rote, and Woeginger solved the problem with at most k curves in n^O(k) time. Arkin, Khuller, and Mitchell used this to solve the unit cost per curve version in exponential time. At the time, they conjectured that the problem with k curves is NP-hard for general k. Our polynomial time algorithm refutes this unless P equals NP.

1 Introduction

We consider some very natural “fence enclosure” problems studied by Capoyleas, Rote, and Woeginger [6] and Arkin, Khuller, and Mitchell [3] in the early 90s. Given a set S of n points in the plane, we aim at finding a set of closed curves such that (1) each point is enclosed by a curve and (2) the total length of the curves is minimized. We consider two main variants. In the first variant, we pay an opening cost per curve, which is part of the input. An equivalent formulation is that a circle of a fixed radius is centered at each point, with the radius chosen so that the circumference of the circle equals the opening cost, and we need to enclose these circles with curves of minimal total length, paying no opening cost. The equivalence is illustrated and explained in Figure 1. By a suitable scaling, we may assume that the circles are unit circles. We thus refer to this variant as the unit disk fencing problem. In the other variant, we are allowed to use at most k closed curves and pay no cost per curve. We can think of this as dividing the points into k clusters and then viewing the closed curves as perimeters of the convex hulls of the clusters. For this reason, we refer to the variant as the k-cluster fencing problem (also referred to as the minimum perimeter sum problem in the literature).

Capoyleas, Rote, and Woeginger [6] presented an algorithm for the k-cluster fencing problem that runs in n^O(k) time; the precise bound includes a factor equal to the number of nonisomorphic planar graphs on k nodes. This yields a polynomial running time when k is fixed. Arkin et al. [3] conjectured the problem to be NP-hard when k is part of the input, and neither an NP-hardness proof nor a polynomial time algorithm has been found so far. To solve the unit disk version, which is equivalent to having a unit cost per cluster, Arkin et al. suggested running the algorithm of Capoyleas et al. for the k-cluster fencing problem for all k from 1 to n, adding k to the total perimeter length. These were the best known bounds except in the special cases of the k-cluster fencing problem for k = 1 and k = 2 (see the discussion on related work for more details).

1.1 Our Results

We present polynomial time algorithms for both problems. More specifically, for the unit disk fencing problem, we present a near-linear time algorithm (Theorem 1). For the k-cluster fencing problem, we present an algorithm that is polynomial in both n and k (Theorem 3). In particular, this refutes the conjectured hardness unless P = NP.

Our algorithm for the unit disk fencing problem can be generalized to the case where the input consists of objects that are allowed to be disks with different diameters or polygonal objects that have to be fenced. For this variant, our running time increases by a factor that is logarithmic in the ratio between the maximum and the minimum object diameter.

In order to achieve near-linear bounds for the unit disk fencing problem, we introduce new techniques that we believe can have other applications in computational geometry. We give a detailed overview of the techniques below.

Throughout our paper, we assume that it is possible to compare the costs of two different clusterings efficiently. Note that this is a standard assumption in computational geometry.

1.2 Applications

The problem of fencing in disks or objects appears very commonly in the real world. A good example is the protection of trees, either at construction sites to protect the roots, or in the wild to protect rare trees from deer and other animals. When trees are planted by nature, we have no control over their location. In this context, each disk should have a sufficient diameter to protect rare trees from wildlife (see Figure 1).

There are many standards that specify how far fences should be from trees, and even discussions on different advantages of grouping trees beyond the fence cost (e.g., see [8]).

Figure 1: A set of points and the set of enclosing curves minimizing the total length plus the opening cost per curve. There are three curves, drawn in gray (one cannot be seen, as it consists of just one input point). A dashed circle of the same radius is drawn centered at each point. For each cluster, the curve enclosing the respective circles is drawn in black. The perimeter of each black curve exceeds the perimeter of the corresponding gray curve by exactly the circumference of one dashed circle, as the linear pieces sum to the perimeter of the gray curve and the circular arcs sum to one full circle. Hence, the problem of enclosing points with curves, minimizing the total length plus an opening cost per curve, is equivalent to that of enclosing circles with curves of minimal total length.
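The equivalence in the caption can be written out as a short calculation. The symbols used here are notation assumed for illustration only: r is the common radius of the circles, c is the opening cost, per(C) is the perimeter of the convex hull of a cluster C, and C_1, ..., C_m are the clusters of a solution.

```latex
% One cluster C: the shortest curve enclosing all radius-r circles centered
% at the points of C consists of straight pieces parallel to the hull edges
% (total length per(C)) and circular arcs that together make one full turn.
\[
  \operatorname{len}(\text{black curve around } C)
  \;=\; \operatorname{per}(C) + 2\pi r .
\]
% Summing over the m clusters of a solution:
\[
  \sum_{i=1}^{m} \bigl( \operatorname{per}(C_i) + 2\pi r \bigr)
  \;=\; \sum_{i=1}^{m} \operatorname{per}(C_i) \;+\; m \cdot c ,
  \qquad \text{with } c = 2\pi r .
\]
```

Under this reading, a singleton cluster has per(C) = 0 and a two-point cluster has per(C) equal to twice the distance between its points.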

1.3 Our Techniques

Approach for the unit disk fencing problem.

Our approach is as follows. We recursively subdivide the plane into four quadrants, using a quadtree dissection-type approach. This divides the plane into cells of geometrically decreasing sizes: each cell consists of four cells of half its width. We find optimal partitions for the smallest cells first (i.e., the lowest level of the quadtree). We then work with increasing cell sizes, solving the problem layer by layer in the quadtree. To obtain a solution at a given cell size, we rely on solutions to smaller cell sizes. For this to work, we need to have precomputed the solutions for various polyominoes made of some constant number of cells of smaller sizes.

For a given polyomino, we show how to obtain an optimal clustering of the points within the polyomino by merging the solutions for smaller polyominoes. In some cases, merging is not enough as there can be one large cluster intersecting all the cells of the polyomino, which we do not find by merging solutions of smaller polyominoes.

Hence, we need an efficient subroutine for finding the best cluster intersecting all the cells of the polyomino. The subroutine works in two steps: it first finds a point x that belongs to the cluster we are looking for together with a point p on the boundary of the cluster. Once we have x and p, the idea is to make an angular sweep of a ray from x, and consider the points in the order the ray sweeps over them. For each point t, we calculate the “best” path from p to t, in terms of both its length and how many clusters’ opening costs it may save. We prove that the “best” path consists of line segments between points, and save for each t information about the last line segment on the path to t (see the red lines in Figure 2). This information allows us to finally retrieve the boundary of the convex hull of the cluster recursively.
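The ordering step of the angular sweep can be sketched as follows. This is only an illustration of how points might be visited in sweep order; it does not implement the path computation itself. The function name `sweep_order`, the tuple representation of points, and the choice of sweep direction are assumptions made for this sketch.

```python
from math import atan2, hypot, tau


def sweep_order(x, p, points, clockwise=True):
    """Order `points` as a ray rotating around x, starting at direction x->p.

    Hypothetical helper: the sweep direction is an assumption of this sketch.
    Points equal to x or p are skipped; ties in angle are broken by distance.
    """
    base = atan2(p[1] - x[1], p[0] - x[0])

    def key(t):
        angle = atan2(t[1] - x[1], t[0] - x[0])
        # Angle swept since the start direction, normalised to [0, 2*pi).
        swept = (base - angle) % tau if clockwise else (angle - base) % tau
        return (swept, hypot(t[0] - x[0], t[1] - x[1]))

    return sorted((t for t in points if t != x and t != p), key=key)
```

A dynamic program over this order would then store, for each visited point, the last segment of its best path, which is the information the text describes as sufficient to recover the cluster boundary.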

Figure 2: Given the advice that p lies on the perimeter of the cluster containing x in an optimal clustering, we can compute the cluster containing x with an angular sweep. Here, the angular sweep has reached the point t.

Approach for the k-cluster fencing problem.

Our algorithm for the -cluster fencing problem shares some similarities with the work of Gibson et al. [10], in which a dynamic programming approach was used to solve the minimum radius sum problem. One main difference is that the running time complexity they obtain is , whereas our approach works in time. Their technique is to divide a given instance of the minimum radius sum problem into subproblems, where is the set of points to be clustered and is a constraint on the number of clusters we can use. The problem is solvable if we have a solution to all subproblems. However, the number of subproblems is exponential. To get a polynomial time algorithm, they showed that a solution can be found after considering only a polynomial number of subproblems.

We also use a dynamic programming approach for our problem, although we need new techniques since the number of candidate clusters for the minimum radius sum problem is only O(n^2) (as each disk is determined by two input points, one determining the center and the other determining the radius). In contrast, our problem has an exponential number of candidate clusters (dictated by all subsets of the input points). We define subproblems based on boxes, which are rectangles that cover some portion of the plane and some number of input points. Our key observation is that there is some separator of each box (i.e., a vertical line segment or a horizontal line segment) that splits the box into two strictly smaller boxes such that an optimal solution only has a constant number of line segments that intersect this separator (in fact, we give a bound of two on the number of such intersecting line segments).

A dynamic programming approach naturally follows: we simply guess the position of such a separator (for which we argue there are choices), and then guess which segments crossing this separator belong to an optimal solution. We first obtain solutions for smaller boxes, and then glue together solutions for smaller boxes to obtain solutions for larger boxes.

We note that there is an unpublished solution [12] (i.e., polynomial time algorithm) for the rectilinear version of the problem, where we must enclose points using axis-parallel rectangles rather than convex hulls (as in our setting). The solution to the axis-parallel version uses similar ideas. In particular, it is possible to argue the existence of separators that do not cut through any clusters of an optimal solution in this setting. This is not possible for our problem. For the k-cluster fencing problem, it is possible that any such vertical or horizontal separator cuts at least one cluster, and allowing skew separators would result in subproblems of high complexity. In particular, consider an example with sufficiently large and assume . We have points spread evenly on a circle as the corners of a regular -gon, and points spread evenly on a surrounding circle with the same center. The surrounding points are fairly close yet sufficiently far away that the optimal solution is to cluster the inner points together, and open a cluster for each of the points on the surrounding circle. Cutting away the points on the outside (with not necessarily axis-aligned separators) creates subproblems defined on polygonal regions with sides, resulting in high complexity subproblems.

1.4 Related Work

The literature on geometric clustering is vast [2], and thus we focus on the most relevant prior works. Arkin, Khuller, and Mitchell [3] considered many clustering variants related to the problems studied in the present paper. For the variant where points have a value associated with them, they showed that the problem of maximizing profit (i.e., sum of values of points enclosed minus the total perimeter) is NP-hard when values are unrestricted in sign. When values are strictly positive, they gave an time algorithm. For the version in which there is a budget on the total perimeter we can use, the problem of maximizing profit is also NP-hard, even when values are strictly positive (they provided a pseudo-polynomial algorithm when the values are integers).

The k-cluster fencing problem for k = 1 is the very well-known problem of computing the convex hull of a set of points in the plane [7]. There has also been some work for the special case of k = 2 clusters. The work of Mitchell and Wynters [13] studied four flavors of the problem: minimizing the sum of perimeters, the maximum of the perimeters, the sum of the areas enclosed by the fences, and the maximum of the areas. They gave polynomial time solutions for all four flavors, running in time (for some of them, they gave improved running time bounds of ). More recently, the work of Abrahamsen et al. [1] gave an algorithm running in time that solves the case of k = 2 clusters, yielding the first subquadratic time algorithm for this setting.

There have been many other papers studying related geometric clustering problems. Capoyleas, Rote, and Woeginger [6] studied a general geometric k-clustering framework in which the cost of a solution is determined by some weight function that assigns real weights to any subset of points in the plane (i.e., each cluster), after which a symmetric k-ary function over k-tuples is applied (e.g., the sum function). For the case when the weight function is the diameter, radius, or perimeter and the symmetric k-ary function is an arbitrary monotone increasing function (such as the sum or the maximum), they gave an algorithm running in time , where is the number of nonisomorphic planar graphs on k nodes. This is polynomial if k is fixed and not given as input.

In addition, the work of Behsaz and Salavatipour [4] studied objectives such as minimizing the sum of radii and minimizing the sum of diameters subject to the constraint of having at most k clusters. For minimizing the sum of radii, they gave a polynomial time algorithm for clustering points in metric spaces that are induced by unweighted graphs, assuming no singleton clusters are allowed. They also showed that finding the best single cluster for each connected component of the graph yields a -approximation algorithm, assuming no singleton clusters are allowed. For the problem of minimizing the sum of diameters, they gave a polynomial time approximation scheme when points lie in the plane with Euclidean distances, along with a polynomial time exact algorithm when k is constant (for the metric setting).

Many classical clustering problems are NP-hard when k is given as part of the input, though there are some notable exceptions. In 2012, Gibson et al. [10] devised a polynomial time algorithm for finding k disks, each centered at an input point, such that the sum of the radii of the disks is minimized subject to the constraint that their union must cover all input points. In their paper, they used a dynamic programming approach to get a running time of , where is the time needed to compare two candidate solutions.

2 The Unit Disk Fencing Problem

Given a set of points in the plane, we denote by the convex hull of , and by the perimeter of .

Let be a finite set of points in and let be the opening cost. Consider a partition of . We refer to each set as a cluster. The cost of with respect to is

The partition is optimal for if no partition of has a lower cost. We denote the cost of an optimal partition for as . When the opening cost is clear from the context, we might omit it. In the unit disk fencing problem, we are given a set of points and an opening cost , and the goal is to find an optimal partition for .
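The cost of a partition, as described above, is the sum over clusters of the convex hull perimeter plus the opening cost. A minimal sketch of that computation follows; the function names, the tuple representation of points, and the use of Andrew's monotone chain for the hull are choices made here for illustration, not taken from the original text.

```python
from math import hypot


def cross(o, a, b):
    """Twice the signed area of triangle o, a, b (positive if counterclockwise)."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def perimeter(points):
    """Perimeter of the convex hull of `points` (a list of (x, y) tuples).

    A single point has perimeter 0; two points (or collinear points) give
    twice the length of the segment, matching a curve that goes there and back.
    """
    pts = sorted(set(points))
    if len(pts) > 2:
        def half(seq):
            chain = []
            for p in seq:
                while len(chain) >= 2 and cross(chain[-2], chain[-1], p) <= 0:
                    chain.pop()
                chain.append(p)
            return chain[:-1]
        pts = half(pts) + half(pts[::-1])  # lower chain, then upper chain
    return sum(hypot(pts[i][0] - pts[i - 1][0], pts[i][1] - pts[i - 1][1])
               for i in range(len(pts)))


def cost(partition, opening_cost):
    """Total hull perimeter plus one opening cost per cluster."""
    return sum(perimeter(cluster) + opening_cost for cluster in partition)
```

For the four corners of a unit square and an opening cost of 2, one cluster costs 4 + 2 = 6 while four singleton clusters cost 4 * 2 = 8, so fencing the square as a whole is cheaper in that instance.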

Observation 1.

In an optimal partition, the clusters have pairwise disjoint convex hulls.

We say that is indivisible if is an optimal partition for .

Observation 2.

Each cluster of an optimal partition is indivisible.

We say an optimal partition for is maximal if there is no optimal partition of , where , such that for each , there is some such that .

2.1 Structural Results

Figure 3: The polygon from the proof of Lemma 1 is the set . Note that the perimeter is .
Lemma 1.

Let be two indivisible sets of points in under the opening cost . If and intersect, then the set is indivisible.

Proof.

Let be a maximal optimal partition of . Let be the clusters from such that and for . Assume for contradiction that . This means that and each cluster of is a subset of either or , and hence that . If , then , and cannot be optimal as and intersect. If , then cannot be maximal as and are indivisible. Hence .

We want to show that the set is indivisible. Let , where possibly , be the clusters of that are contained in . Then the set is a partition of . Note that must be an optimal partition of , as otherwise would not be optimal for .

The cost of is thus

Now, for each cluster , we define a polygon . As and are both convex, so is .

Note that all points in are contained in the polygon , see Figure 3. It hence follows that

Note that is a partition of . Hence, by the indivisibility of , we have

Combining these two inequalities yields

and so is indivisible.

As is indivisible, is the union of clusters of , and is maximal, it follows that is itself a cluster of , i.e., and . Recall furthermore that contains and intersects . In a similar way, it can be shown that is a cluster of that contains and intersects . Since and is optimal, we must have , so and is indivisible. ∎

Lemma 2.

Let be sets of points in , and let be indivisible. Let be a maximal optimal partition of . Then for some .

Proof.

Let be the set of clusters of that intersect . By Lemma 1, each of the sets , where , is indivisible. It thus follows that is also indivisible. Since is maximal, it must then be the case that consists of a single cluster that contains . ∎

Lemma 3.

Each instance of the unit disk fencing problem has a unique maximal optimal partition.

Proof.

Consider two maximal optimal partitions and of . Lemma 2 gives that each cluster of is contained in some cluster of . Likewise, each cluster of is contained in some cluster of . Since the clusters of an optimal partition are disjoint, it now follows that there is a one-to-one correspondence between the partitions and , i.e., they are the same partition of . ∎

Lemma 4.

Let be a set of points such that , where for . Let be the maximal optimal partition of , and the maximal optimal partition of . Any cluster of that intersects a cluster is also contained in , and it follows in particular that each cluster is the union of clusters of the partitions . Furthermore, each cluster is either a cluster of some partition or has a non-empty intersection with each set .

Proof.

In order to prove the first part, consider some cluster that intersects . Since is indivisible, Lemma 2 gives that there is a cluster of containing . That cluster must be , as the clusters of are disjoint. It hence also follows that each cluster is the union of some clusters of the partitions .

In order to prove the second part, consider a cluster that does not intersect some set . Then, . Since is indivisible, Lemma 2 gives that there is a cluster of such that . By the above, we also have that . Hence , i.e., is a cluster of . ∎

2.2 Partitioning into Independent Instances

Consider an instance of the unit disk fencing problem. Observe that any two points such that must be in the same cluster in any optimal partition of . We will prove that we can efficiently decompose the problem instance into a collection of independent subinstances , such that for each subinstance .

Lemma 5.

Any -point instance of the unit disk fencing problem can be reduced in time into a disjoint collection of subinstances , where and each subinstance is bounded by a box of side lengths at most . The subinstances are independent in the sense that an optimal partition for is the union of the optimal partitions for .

Proof.

Clearly, the optimal partition for has cost of at most , as a partition of cost can be obtained by opening singleton clusters. Therefore, if any two points are at a distance greater than , they must be in separate clusters of any optimal partition (as the perimeter of any cluster containing such two points would be greater than ).

We first sort all points from with respect to their -coordinate. Denote the sorted points by . Whenever for two consecutive points the difference in their -coordinate is greater than , we know that the sets of points and can be treated separately, i.e., each cluster of any optimal partition will be contained in either or . That gives us a partition of into subinstances, where each subinstance is contained in a vertical slab of width at most . Now, for each subinstance we sort the points according to their -coordinate, and perform a similar operation. Therefore, in time we partitioned into subinstances , such that , each subinstance has a bounding box of size at most , and an optimal partition for is the union of the optimal partitions for .

Note that if , then , and the size of the bounding box is as required. It therefore remains to consider subinstances for which and the smallest bounding box has size at least . In such an instance, there must be two consecutive points in the order of - or -coordinates where the respective coordinates differ by at least . Thus, the instance can be recursively partitioned into yet smaller subinstances. Since at each recursive level, at most half of the points from the previous level remain, the depth of the recursive tree is at most . The total running time is therefore . ∎
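The splitting procedure from the proof can be sketched as below. The threshold used here is an assumption of this sketch rather than the constant from the lemma: for a group of m points, opening m singleton clusters costs m times the opening cost c, and a cluster containing two points has perimeter at least twice their distance, so two points whose x- or y-coordinates differ by more than m * c / 2 cannot share a cluster. The function name `split_independent` is hypothetical.

```python
def split_independent(points, c):
    """Split `points` into groups that can be solved independently.

    A group is cut between two consecutive points (in x- or y-order) whose
    coordinate gap exceeds len(group) * c / 2; this threshold is an
    assumption of the sketch. Groups that were cut are processed again,
    since their smaller size gives a smaller threshold.
    """
    def split_axis(pts, axis):
        threshold = len(pts) * c / 2
        pts = sorted(pts, key=lambda q: q[axis])
        groups = [[pts[0]]]
        for prev, cur in zip(pts, pts[1:]):
            if cur[axis] - prev[axis] > threshold:
                groups.append([])  # gap too large: start a new group
            groups[-1].append(cur)
        return groups

    done = []
    todo = [list(points)] if points else []
    while todo:
        pts = todo.pop()
        parts = [g for xs in split_axis(pts, 0) for g in split_axis(xs, 1)]
        if len(parts) == 1:
            done.append(parts[0])  # no gap found on either axis
        else:
            todo.extend(parts)     # strictly smaller groups, try again
    return done
```

This version re-sorts each group from scratch, so it illustrates the recursion rather than the running time claimed in the lemma.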

2.3 Cells and Polyominoes

Consider an instance of the unit disk fencing problem with bounding box (which is an axis-parallel square of side length ). We will recursively subdivide into cells, starting with a single cell , and recursively partitioning each cell into four smaller squares, ending when the side length of the cells is at most . This happens after some recursive operations due to the partitioning described in Section 2.2. We call a cell a level cell if it has been obtained after levels of subdivision. The square is thus a level cell.

We consider level cells and define polyominoes to be simple polygons consisting of some level cells. Two cells are neighbouring if their boundaries share an edge or a corner. We will be particularly interested in monominoes, which consist of a single cell, dominoes, which are the union of two cells sharing an edge, L-trominoes, which are the union of three pairwise neighbouring cells (i.e., which are in the shape of the letter L), and square-tetrominoes, which are a -square of cells. A basic polyomino is a monomino, a domino, an L-tromino, or a square-tetromino. See Figure 4 for all the basic polyominoes. Note that any non-basic polyomino contains two non-neighbouring cells. Polyominoes consisting of level cells are called level polyominoes. We say that a polyomino is convex if the intersection of with any horizontal or vertical line has at most one connected component.
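The definitions of neighbouring cells, convex polyominoes, and basic polyominoes can be checked mechanically. The sketch below represents a cell by a hypothetical integer pair (column, row) and assumes its input is already a polyomino, that is, an edge-connected set of cells; the function names are illustrative.

```python
def neighbouring(a, b):
    """Two distinct cells are neighbouring if they share an edge or a corner."""
    return a != b and abs(a[0] - b[0]) <= 1 and abs(a[1] - b[1]) <= 1


def is_convex(cells):
    """Check that every row and every column of cells is one contiguous run."""
    cells = set(cells)
    for axis in (0, 1):
        lines = {}
        for cell in cells:
            # Group by the other coordinate, collect positions along `axis`.
            lines.setdefault(cell[1 - axis], []).append(cell[axis])
        for coords in lines.values():
            if max(coords) - min(coords) + 1 != len(coords):
                return False
    return True


def is_basic(cells):
    """For a polyomino: basic exactly when all cells are pairwise neighbouring."""
    cells = list(set(cells))
    return all(neighbouring(a, b)
               for i, a in enumerate(cells) for b in cells[i + 1:])
```

The `is_basic` test relies on the remark in the text that any non-basic polyomino contains two non-neighbouring cells, together with the fact that the four basic shapes all fit inside a 2 x 2 block of cells.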

Figure 4: Basic polyominoes.

Note that each level monomino, domino, tromino, or tetromino, for , contains , , or cells at level , respectively. We define a subpolyomino at level to be a convex polyomino at level that is contained in a basic polyomino at level . Note that each basic polyomino at level and at level is also a subpolyomino at level . For all subpolyominoes, see Figure 5.

Figure 5: The different subpolyominoes (up to rotations and reflections).

As we want each input point to belong to exactly one cell of a given level, we define a cell to include its right and bottom edge and the bottom-right corner. The other edges and corners then belong to the neighbouring cells. For any collection of cells (not necessarily a polyomino), we define to be the input points in . We say that a polyomino is empty if . We will consider subproblems of the original unit disk fencing problem instance , each subproblem corresponding to the input points lying in a non-empty basic polyomino of some level. Note that the number of non-empty basic polyominoes of a given level is , as each point belongs to a constant number of such polyominoes. Therefore, the total number of non-empty basic polyominoes is at most .
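The convention that a cell owns its right and bottom edges can be made concrete with a small indexing function. This is a sketch under stated assumptions: the bounding square is given by its lower-left corner (x0, y0) and side length `side`, the y-axis points upward so that the bottom edge is the one with the smaller y-coordinate, and points on the outer left or top boundary of the square are clamped into the nearest cell. None of these names appear in the original text.

```python
from math import ceil, floor


def cell_index(point, x0, y0, side, level):
    """Return the (column, row) of the level-`level` cell containing `point`.

    A cell covers x in (left, right] and y in [bottom, top), so it includes
    its right edge, its bottom edge, and its bottom-right corner.
    """
    n = 2 ** level           # cells per side at this level
    w = side / n             # cell width
    x, y = point
    col = ceil((x - x0) / w) - 1   # right edge belongs to the cell
    row = floor((y - y0) / w)      # bottom edge belongs to the cell
    # Clamp points on the outer boundary of the square (assumption).
    return (min(max(col, 0), n - 1), min(max(row, 0), n - 1))
```

With this convention every input point lies in exactly one cell per level, and the level-(i+1) cell of a point is always one of the four children of its level-i cell.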

We compute an optimal partition for each subproblem , where is a non-empty basic polyomino. We start with the polyominoes at level . At level , any two points in the same basic polyomino have distance less than and therefore is indivisible. Suppose now that we have already computed the optimal partition for each non-empty basic polyomino at some level . As we will see, this makes it possible to compute the optimal partitions for all non-empty subpolyominoes at level . Since each basic polyomino at level is a subpolyomino at level , we thus also know the optimal partitions of each basic polyomino at level . It follows that the process can be repeated until we reach level , i.e., we have computed an optimal partition of .

To compute and process the polyominoes efficiently, we will use a quadtree construction as described in the next section.

2.4 Quadtree Construction

A quadtree is a geometric data structure for objects in the Euclidean plane. The root of the quadtree corresponds to a square containing all the input objects, while the children of each node correspond to the four subsquares of the given square. The leaves correspond to subsquares that have small enough side length, or where the input objects have small complexity, e.g., subsquares containing at most some constant number of input objects. See [11, Chapter 2] for more information on quadtrees, and for their applications.

In our case, the root of the quadtree corresponds to the level cell . Each node corresponding to a level cell has at most children, which correspond to level cells contained in . We do not create nodes corresponding to empty cells, i.e., with no points from . The leaves correspond to the highest level cells, i.e., the side length of the leaf cells is at most . As there are at most non-empty cells at each level, the number of nodes of the quadtree is at most . The quadtree can be constructed in time , as at each of the levels, we have to compute the subsquares for the points.

While constructing the quadtree, for each node corresponding to each level cell, we can compute the nodes corresponding to the eight neighbouring cells (if such nodes exist, i.e., the corresponding cells are non-empty). Remembering this information will allow us to construct the polyominoes more easily.
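A quadtree that stores only non-empty cells, together with neighbour lookups, can be sketched with one dictionary per level. The representation below is an assumption made for illustration: cells are integer (column, row) pairs, the deepest level is given as a dictionary from cells to their points, and the parent of a cell is obtained by halving both indices.

```python
def build_levels(leaf_cells, depth):
    """Return levels[0..depth], each a dict from non-empty cell to its points.

    `leaf_cells` maps (col, row) at the deepest level to a list of points.
    Empty cells never get an entry, mirroring the construction in the text.
    """
    levels = [None] * (depth + 1)
    levels[depth] = {cell: list(pts) for cell, pts in leaf_cells.items()}
    for level in range(depth, 0, -1):
        parents = {}
        for (col, row), pts in levels[level].items():
            parents.setdefault((col // 2, row // 2), []).extend(pts)
        levels[level - 1] = parents
    return levels


def children(levels, level, cell):
    """Non-empty children (at most four) of `cell`, one level deeper."""
    col, row = cell
    candidates = [(2 * col + dx, 2 * row + dy) for dx in (0, 1) for dy in (0, 1)]
    return sorted(c for c in candidates if c in levels[level + 1])


def neighbours(cells, cell):
    """Non-empty cells among the eight cells surrounding `cell`."""
    col, row = cell
    around = [(col + dx, row + dy)
              for dx in (-1, 0, 1) for dy in (-1, 0, 1) if (dx, dy) != (0, 0)]
    return sorted(c for c in around if c in cells)
```

Storing the points at every level keeps the sketch short; an implementation concerned with memory would store them only at the leaves.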

From the quadtree, we can construct the set of all non-empty basic polyominoes in time , assigning each basic polyomino to the nodes of the quadtree corresponding to the cells of . We do that by considering all nodes of the tree, for each node corresponding to a cell considering the basic polyominoes containing , and either constructing a new polyomino, or assigning an existing polyomino to the currently considered node.

2.5 Finding an Optimal Partition for Each Basic Polyomino

We now describe an algorithm for finding an optimal partition for each basic polyomino.

Leaf polyominoes.

Consider a basic polyomino at level . As consists of level cells, and the side length of such cells is at most , the distance between any two points in is smaller than . Therefore, an optimal partition for consists of one indivisible set of points , and optimal partitions for leaf polyominoes can be computed efficiently.

At most one big cluster.

Suppose that we have already computed the optimal partitions for all basic polyominoes at level . In order to compute the optimal partitions for the basic polyominoes at level , we first compute the optimal partitions for all subpolyominoes at level . This suffices as the basic polyominoes at level are also subpolyominoes at level . To find an optimal partition for each subpolyomino efficiently, we make use of the following property.

Figure 6: A situation from the proof of Lemma 6.
Lemma 6.

Let be a non-basic polyomino at some level . Let be a maximal optimal partition of . For any pair of non-neighbouring cells of , there is at most one cluster such that intersects both and . In particular, has at most one cluster such that intersects all cells of .

Proof.

Assume that there are two clusters such that and both intersect and . The situation is depicted in Figure 6. Since is optimal, and are disjoint, and it follows that both boundaries and intersect both boundaries and . We may then define the point , for each choice , to be the intersection point of with such that (i) , and (ii) . Let be the quadrilateral with vertices at for , and consider the polygon . We will show that is not larger than . As contains all points of , this shows that is not optimal or maximal.

In the following, we show the inequality

(1)

The desired result then follows as the latter sum is at most the length of the perimeter of and contained in . Let be the side length of the cells. If and are in the same row or column of cells, then clearly , and inequality (1) holds. Otherwise, let be the corners of minimizing the distance between the cells. By considering cases of where the points are on , one can observe that it always holds that

and inequality (1) follows. ∎

Cluster unions.

Consider a given subpolyomino at some level for which we want to compute the maximal optimal partition of the input points . The overall approach is the following. We use maximal optimal partitions for for various smaller collections of level cells. We then construct the merger of the partitions , which is the partition of we get when we unite the partitions and keep merging clusters with overlapping convex hulls. The merger thus consists of clusters with pairwise disjoint convex hulls, but is in general not optimal. As we shall see, a maximal optimal partition of can be obtained by uniting one subset of the clusters of into one big cluster. This is the motivation for the following definitions.
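The merger operation described above (unite the partitions, then keep merging clusters whose convex hulls overlap) can be sketched directly. This is a simple quadratic illustration, not an efficient implementation. Assumptions made here: clusters are collections of (x, y) tuples, hulls that merely touch count as overlapping, and the overlap test uses separating axes taken from the hull edges.

```python
def cross(o, a, b):
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points):
    """Hull vertices via Andrew's monotone chain (1 or 2 vertices if degenerate)."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    def half(seq):
        chain = []
        for p in seq:
            while len(chain) >= 2 and cross(chain[-2], chain[-1], p) <= 0:
                chain.pop()
            chain.append(p)
        return chain[:-1]
    return half(pts) + half(pts[::-1])


def hulls_overlap(a, b):
    """True unless some candidate axis strictly separates the two hulls."""
    ha, hb = convex_hull(a), convex_hull(b)
    # Candidate axes: edge normals, edge directions (for degenerate hulls),
    # and the vector between two hull vertices (for point-like hulls).
    axes = [(hb[0][0] - ha[0][0], hb[0][1] - ha[0][1])]
    for hull in (ha, hb):
        for i in range(len(hull)):
            dx, dy = hull[i][0] - hull[i - 1][0], hull[i][1] - hull[i - 1][1]
            axes += [(dx, dy), (-dy, dx)]
    for ax, ay in axes:
        if ax == 0 and ay == 0:
            continue
        pa = [ax * x + ay * y for x, y in ha]
        pb = [ax * x + ay * y for x, y in hb]
        if max(pa) < min(pb) or max(pb) < min(pa):
            return False
    return True


def merger(partitions):
    """Unite the partitions and merge clusters until all hulls are disjoint."""
    clusters = [set(c) for part in partitions for c in part]
    merged = True
    while merged:
        merged = False
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                if hulls_overlap(clusters[i], clusters[j]):
                    clusters[i] |= clusters.pop(j)
                    merged = True
                    break
            if merged:
                break
    return clusters
```

Because the input sets may share points, a point that appears in clusters of two different partitions forces those clusters to merge, which is consistent with the result being a partition of the union.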

Let be a partition of a set of points such that for any . For any subset consider the set . Let be the partition consisting of the cluster and every . Consider the set minimizing the cost of the partition . If there is more than one such partition, we are interested in a maximal one. We say that is an optimal cluster union for the clustering .
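To make the definition of an optimal cluster union concrete, the sketch below finds one by exhaustive search over subsets of clusters. This brute force is exponential and serves only as a reference for small inputs; it is not the efficient method the text goes on to describe. The function names are hypothetical, and ties are resolved in favour of uniting more clusters, which is used here as a stand-in for the maximality requirement.

```python
from itertools import combinations
from math import hypot


def cross(o, a, b):
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def perimeter(points):
    """Convex hull perimeter (0 for one point, twice the length for a segment)."""
    pts = sorted(set(points))
    if len(pts) > 2:
        def half(seq):
            chain = []
            for p in seq:
                while len(chain) >= 2 and cross(chain[-2], chain[-1], p) <= 0:
                    chain.pop()
                chain.append(p)
            return chain[:-1]
        pts = half(pts) + half(pts[::-1])
    return sum(hypot(pts[i][0] - pts[i - 1][0], pts[i][1] - pts[i - 1][1])
               for i in range(len(pts)))


def optimal_cluster_union(clusters, opening_cost):
    """Return (cost, indices) of the cheapest way to unite one subset of clusters.

    Uniting a single cluster leaves the partition unchanged, so the original
    partition is always among the candidates.
    """
    best = None
    for size in range(1, len(clusters) + 1):
        for chosen in combinations(range(len(clusters)), size):
            united = [p for i in chosen for p in clusters[i]]
            rest = [clusters[i] for i in range(len(clusters)) if i not in chosen]
            total = perimeter(united) + opening_cost
            total += sum(perimeter(r) + opening_cost for r in rest)
            # "<=" with a small tolerance lets larger unions win ties.
            if best is None or total <= best[0] + 1e-12:
                best = (total, chosen)
    return best
```

For three singleton clusters at (0, 0), (1, 0), and (100, 0) with opening cost 4, uniting the first two costs 2 + 4 + 4 = 10, which beats keeping all three separate at cost 12.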

Lemma 7.

Let be a set of points such that , where for . Let be the maximal optimal partition of and suppose that there is at most one cluster in that intersects each set . Let be the maximal optimal partition of and let be the merger of the partitions . Let be an optimal cluster union for . Then and are the same partition.

Proof.

By Lemma 4, each cluster of is either a cluster of some , or has non-empty intersection with each set . Since there is at most one cluster in of the latter kind, it follows that has the form , where each , , is contained in a cluster of the partition . Each cluster of is indivisible by Lemma 1, so it is contained in some cluster by Lemma 2. For each cluster where , there must be a cluster contained in , and it follows that . Hence, has the form , where . Therefore, the optimal cluster union for is , and is the partition . ∎

Lemma 8.

Let be a non-basic convex polyomino. There are two cells of with the following properties:

  1. are non-neighbouring,

  2. and are convex, and either

  3. the horizontal distance between and is at least as large as the vertical distance, is leftmost and is rightmost in , or

  4. the vertical distance between and is at least as large as the horizontal distance, is topmost and is bottommost in .

Proof.

Since is non-basic, it has either width of at least cells or height of at least cells. Assume without loss of generality that the width of is at least as large as the height of . We will choose to be one of the leftmost cells of , and to be one of the rightmost cells of . As we want and to be convex, and have to be topmost or bottommost in their columns. If there is only one cell in the leftmost (rightmost) column, we take it as (respectively, ), as then clearly (respectively, ) remains a convex polyomino. If there are at least two cells in the column, then, by convexity of , at least one of them can be removed without disconnecting the polyomino. ∎

The following lemma states that we can find optimal cluster unions efficiently. The proof is in Sections 2.6–2.8.

Lemma 9.

Let be a collection of cells and two non-neighbouring cells of such that either

  • the horizontal distance between and is at least as large as the vertical distance, is leftmost and is rightmost in , or

  • the vertical distance between and is at least as large as the horizontal distance, is topmost and is bottommost in .

Let be a partition of the points such that for , restricted to the points of is the maximal optimal partition of . An optimal cluster union for can be found in time , where is the number of points in .

Solving non-basic subpolyominoes.

We can now describe how to find maximal optimal partitions of the points in non-basic subpolyominoes.

Figure 7: A demonstration of how the optimal partition of the set of points in the non-basic -cell polyomino is obtained from optimal partitions of points in smaller collections of cells , as described in the proof of Lemma 10. The cells have width and the opening cost is . The second, third and fourth figures show the optimal clusterings . The fifth figure shows the merger of these. The final figure shows (the optimal cluster union consists of all the clusters of ).

Lemma 10.

Let be a non-basic subpolyomino at level . Given maximal optimal partitions for basic polyominoes at level , we can compute an optimal partition of in time , where is the number of points in .

Proof.

We first find cells of as in Lemma 8. Let , , and . As each monomino is a basic polyomino at level , we know the maximal optimal partition of by assumption. Let be the merger of the maximal optimal partitions of and (which is in fact just the union of the partitions, as the cells are disjoint). Then, together with the partition satisfy the conditions of Lemma 9, and we can find an optimal cluster union for in time . Define . By Lemma 6, the maximal optimal partition of contains at most one cluster that intersects and . Hence, by Lemma 7, is the maximal optimal partition of .

See Figure 7. Denote the maximal optimal partition of as . Consider the sets , and suppose for now that we know their optimal partitions . By Lemma 4 (taking , , , and ), we get that a cluster of that is not a cluster of any must intersect both and . Due to Lemma 6, there is at most one such cluster in . Let be the merger of the partitions . Applying Lemma 9 for , , , and , we obtain an optimal cluster union for . By Lemma 7, we get that .

As the polyomino has at most cells, we need to find optimal partitions for a constant number of subpolyominoes before we get down to the basic polyominoes. That gives the total running time of . ∎

Summing up.

Consider the subpolyominoes at some level . By Lemma 10, the total computation for level takes time

where the equality follows since each point belongs to level subpolyominoes. Due to the preprocessing described in Lemma 5, the number of levels is , so the total running time of the algorithm is . We have thus proven the following theorem:

Theorem 1 (The unit disk fencing problem).

There is an algorithm running in time that, given any set of points in the plane and an opening cost , finds a set of closed curves such that each point in is enclosed by a curve and the total length of the curves plus is minimized.

2.6 Finding Optimal Cluster Unions

In order to find optimal cluster unions, we first solve a more specialized problem where we require a special point to be contained in the cluster union. To be more precise, the optimal cluster union for the pair , where is a partition, is a maximal subset of the clusters of the partition such that and the cost of is minimal. Note that we use the terms point and vertex interchangeably in the following.

In this section, we prove the following result.

Lemma 11.

Let be a partition of a given set of points such that for any , and let be an arbitrary point. A maximal optimal cluster union for can be found in time , where is the number of points in .

Our first goal is to solve the following more specialized problem. Given an “internal” point and a “perimeter” point , and given a set of input points and pre-clustered input points, the goal is to find the optimal cluster containing with an angle-monotone perimeter seen from and with on its perimeter. We present an algorithm to solve this problem. The algorithm can take into account that the cluster must be contained within some delimiting outer boundary.

The idea is to make an angular sweep of a ray from , and consider the points in the order the ray sweeps over them. For each point , we calculate the best path (see Section 2.6.4) from to , in terms of both its length and how many clusters’ opening costs it may save. In this process, we only store for each vertex its parent , which is an input point with the property that a best path to ends with the line segment from to . Finally, we calculate the parent of , and have thus specified the entire cluster and may generate its convex hull by recursively outputting the parent until we end back at .
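The order in which the sweep visits the points can be sketched as follows. This covers only the sorting step, not the computation of best paths or parents. We assume that angles are measured counterclockwise from the ray through a given start point (for instance the perimeter point); the names `ccw_angle` and `sweep_order` are hypothetical.

```python
from math import atan2, tau

def ccw_angle(center, start, p):
    # Counterclockwise angle in [0, 2*pi) from the ray center->start
    # to the ray center->p.
    a0 = atan2(start[1] - center[1], start[0] - center[0])
    a1 = atan2(p[1] - center[1], p[0] - center[0])
    return (a1 - a0) % tau

def sweep_order(center, start, points):
    # Points in the order a ray rotating counterclockwise around `center`,
    # starting in the direction of `start`, sweeps over them.
    return sorted(points, key=lambda p: ccw_angle(center, start, p))
```

Sorting dominates the cost of this step, so the sweep order of the points is available in time proportional to the number of points times its logarithm.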

2.6.1 Preliminaries

Definition 1.

Any closed simple curve divides the plane into two regions: the bounded interior, denoted , and the unbounded exterior, denoted . We write and .

Definition 2.

A region of Euclidean space is star shaped if there exists a point such that for all , the line segment is contained in . We say that is star shaped seen from .

Definition 3.

Given a point , a curve is angle-monotone with respect to if for . Here,

is the counterclockwise angle from the unit vector

to the unit vector on the unit circle.
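For a closed polygonal curve, angle-monotonicity can be checked edge by edge. The sketch below assumes the curve is given as a list of vertices and is traversed counterclockwise around the reference point; the function name is hypothetical. It relies on the fact that, along a line segment not passing through the reference point, the angle seen from that point changes monotonically, so it suffices that every edge turns counterclockwise and that the curve winds around the point exactly once.

```python
from math import atan2, tau

def is_angle_monotone(center, polygon):
    # polygon: list of vertices of a closed polygonal curve, in order.
    cx, cy = center
    total = 0.0
    n = len(polygon)
    for i in range(n):
        ax, ay = polygon[i][0] - cx, polygon[i][1] - cy
        bx, by = polygon[(i + 1) % n][0] - cx, polygon[(i + 1) % n][1] - cy
        cr = ax * by - ay * bx          # > 0 iff the edge turns counterclockwise
        if cr <= 0:
            return False
        total += atan2(cr, ax * bx + ay * by)  # angle swept by this edge
    # Exactly one full turn around the center.
    return abs(total - tau) < 1e-9
```

A curve that passes this check is the boundary of a region that is star shaped seen from the reference point.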

When a cluster only contains one point, we call it trivial, otherwise, it is non-trivial. If a curve intersects the interior of a cluster, we say that it dissects the cluster. Given a partition of some points, we denote by the set of points, .

We denote by the ball of radius around the point .

2.6.2 Problem formalization

We are given:

  • special points and ,

  • clusters with costs ,

  • an outer limit, corresponding to the perimeter of an unbounded cluster , which the cluster containing is not allowed to intersect.

We may assume that no two clusters touch or intersect, and that each cluster is convex. Unclustered points are treated as trivial clusters .

Definition 4.

is the set of all closed simple curves with , not dissecting any cluster in , and such that is angle-monotone from .

The cost of a curve is:

Note that we sometimes omit the subscript when it is clear from context.

Problem.

Given , , , and as described above, compute .

Note here that even if no outer limit is given as part of the construction, we may take any bounding box around containing all of as an outer limit.

Lemma 12.

We can compute and also output in time , where is the total number of vertices in .

The rest of this section is dedicated to a proof of the lemma above.

2.6.3 Reduction to the case where every cluster is non-trivial.

We reduce the original instance to one where every cluster has a non-empty interior.

This reduction follows the framework of symbolic perturbation [9, 15, 14]. We introduce an infinitesimal , and replace each vertex by three vertices in an equilateral triangle centered at . Each precluster will consist of the set for some . We thus have that every cluster in has a non-trivial interior.
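A numerical stand-in for this replacement step is sketched below. The reduction itself treats the perturbation symbolically as an infinitesimal; here we simply assume a small positive number `eps` as the circumradius of the triangle, and the orientation of the triangle (parameter `phase`) is an arbitrary choice. The function name is hypothetical.

```python
from math import cos, sin, pi

def triangle_around(v, eps, phase=pi / 2):
    # Three vertices of an equilateral triangle centered at v,
    # each at distance eps from v.
    x, y = v
    return [
        (x + eps * cos(phase + 2 * pi * k / 3),
         y + eps * sin(phase + 2 * pi * k / 3))
        for k in range(3)
    ]
```

Applied to every vertex of a trivial cluster, this yields a cluster with three vertices and hence a non-empty interior.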

Finally, we replace every vertex by , that is, we perturb each point by a very small random number, such that no three points lie on the same line. Therefore, in the following we can assume all the vertices are in general position.

Note that we may disregard any vertex that does not lie on the convex hull of its cluster.

2.6.4 Subproblem structure.

For any cluster not containing , note that the set of angles is either an interval with , or the union of intervals , with . Because of the symbolic perturbation introduced in Section 2.6.3, these values and are realized by unique vertices on the convex hull of . Denote by the vertex realizing , and by the vertex realizing .

Definition 5.

For , is the set of all simple angle-monotone curves from to , not dissecting any cluster .

Denote by the cone with apex through and . Denote by and the bounded, and unbounded, region of , respectively.

The cost of a curve is:

Observation 3.

Let and be as in the reduction (Section 2.6.3). Then, since was chosen to be infinitesimal, the difference between and is also infinitesimal.

Definition 6.

is the subset of polygonal curves consisting of line segments between points of .

Lemma 13.
Proof.

For any curve in , let denote the set of internal points, and denote the set of external points. Then, the shortest curve separating and will consist of line segments between vertices of . Since , and since they cover the same clusters, . ∎

Definition 7.

For , , let . Since is polygonal, we may write it as the polygonal curve on the vertex set . We say that is the parent of .

The cost of may be rewritten in the following way:

cost

We denote by the expression .

When