We consider geometric versions of the Discriminating Code problem, which are variations of classic geometric covering problems. A set $P$ of point sites in $\mathbb{R}^d$ is given. For a set $S$ of objects of $\mathbb{R}^d$, denote by $N_S(p)$ the set of objects of $S$ that contain $p$. The objective is to choose a minimum-size set $S^* \subseteq S$ of objects such that $N_{S^*}(p) \neq \emptyset$ for all $p \in P$ (covering), and $N_{S^*}(p) \neq N_{S^*}(q)$ for each pair of distinct sites $p, q \in P$ (discrimination). In the discrete version, the objects of $S^*$ must be chosen among a specified set $S$ of objects given in the input, while in the continuous version, only the points of $P$ are given, and the objects of $S^*$ can be chosen freely (among some infinite class of allowed objects).
The problem is motivated as follows. Consider a terrain that is difficult to navigate. A set of sensors, each assigned a unique identification number (ID), are deployed in that terrain, all of which can communicate with a single base station. If a region of the terrain suffers from some specific problem, a subset of sensors will detect that and inform the base station. From the IDs of the alerted sensors, one can uniquely identify the affected region, and a rescue team can be sent. The covering zone of each sensor can be represented by an object in . The arrangement of the objects divides the entire plane into regions. A representative point of each region may be considered as a site. The set consists of some of those sites. We need to determine the minimum number of sensors such that no two sites in are covered by the same set of sensor IDs. Apart from coverage problems in sensor networks, this problem has applications in fault detection, detection of heat-prone zones in VLSI circuits, disaster management, environmental monitoring, localization and contamination detection Laifenfeld et al. , Ray et al. , to name a few.
Minimum Discriminating Code (Min-Disc-Code) Input: A connected bipartite graph , where . Output: A minimum-size subset such that for all , and for every pair , .
In the geometric version of Min-Disc-Code which will be further referred to as the G-Min-Disc-Code, the two sets of nodes in the bipartite graph are = a set of geometric objects , and = a set of points in , and an object is adjacent to all the points it contains. The code of a point with respect to a subset is the subset of that contains . Given an instance , two points are called twins if each member in that contains also contains , and vice-versa. An instance of G-Min-Disc-Code is twin-free if no two points in are twins. Geometrically, if we consider the arrangement de Berg et al.  of the geometric objects , then the instance is twin-free if each cell of contains at most one point of . As mentioned earlier, for a twin-free instance, a subset of that can uniquely assign codes to all the points in is said to discriminate the points of and is called a discriminating code or disc-code in short. In the discrete version of the problem, our objective is to find a subset of minimum cardinality that is a disc-code for the points in . In the continuous version, we can freely choose the objects of . The two problems are formally stated as follows.
Discrete-G-Min-Disc-Code Input: A point set to be discriminated, and a set of objects to be used for the discrimination. Output: A minimum-size subset which discriminates all points in .
Continuous-G-Min-Disc-Code Input: A point set to be discriminated. Output: A minimum-size set of objects that discriminate the points in , and that can be placed anywhere in the region under consideration.
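To make the two conditions (covering and discrimination) concrete, here is a minimal Python sketch for the 1D discrete setting. It assumes closed intervals given as `(lo, hi)` pairs; the function names are illustrative and do not come from the paper, and the exhaustive search is exponential, so it is only meant for tiny instances.

```python
from itertools import combinations


def code(point, intervals):
    """Indices of the closed intervals (lo, hi) that contain the point."""
    return frozenset(i for i, (lo, hi) in enumerate(intervals) if lo <= point <= hi)


def is_disc_code(points, intervals):
    """True if every point has a nonempty code and all codes are distinct."""
    codes = [code(p, intervals) for p in points]
    return all(codes) and len(set(codes)) == len(codes)


def min_disc_code_bruteforce(points, intervals):
    """Smallest sub-collection of intervals that is a disc-code, or None.

    Tries all subsets by increasing size (exponential time; illustration only).
    """
    for k in range(len(intervals) + 1):
        for subset in combinations(intervals, k):
            if is_disc_code(points, subset):
                return list(subset)
    return None
```

For example, with points `[1, 2, 3]` and intervals `[(0.5, 1.5), (1.5, 3.5), (2.5, 3.5), (0, 4)]`, the long interval `(0, 4)` covers every point but separates none, so it is not needed in a minimum solution.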
Related work. The general Min-Disc-Code problem is NP-hard and hard to approximate Charbit et al. , Charon et al. , Laifenfeld and Trachtenberg . In the context of the above-mentioned practical applications, Discrete-G-Min-Disc-Code in 2D was defined in Basu et al. , where an integer linear programming (ILP) formulation of the problem was given along with an experimental study. Continuous-G-Min-Disc-Code was introduced in Gledel and Parreau , and shown to be NP-complete for disks in 2D, but polynomial-time solvable in 1D (even when the intervals are restricted to have bounded length). These two problems are related to the class of geometric covering problems, for which both the discrete and continuous versions have also been studied extensively Krupa R. et al. . A related problem is the Test Cover problem de Bontridder et al. , which is similar to Min-Disc-Code (but defined on hypergraphs). It is equivalent to the variant of Min-Disc-Code where the covering condition is not required. Thus, a discriminating code is a test cover, but the converse may not be true. Geometric versions of Test Cover have been studied under various names. For example, the separation problems in Boland and Urrutia , Călinescu et al. , Har-Peled and Jones  can be seen as continuous geometric versions of Test Cover in 2D, where the objects are half-planes. Similar problems are also called shattering problems, see Nandy et al. . A well-studied special case of Min-Disc-Code for graphs is the problem Minimum Identifying Code (Min-ID-Code). This problem was studied in particular for the related setting of geometric intersection graphs, for example on unit disk graphs Müller and Sereni  and interval graphs Bousquet et al. , Foucaud , Foucaud et al. .
More references on coding mechanisms on graphs that arise from different applications (such as locating-dominating sets, open locating-dominating sets and metric dimension), together with their computational hardness results, are available in Foucaud , Foucaud et al. .
Our results. We show that Discrete-G-Min-Disc-Code in 1D, that is, the problem of discriminating points on a real line by interval objects of arbitrary length, is NP-complete. For this we reduce from 3-SAT. Here, the challenge is to overcome the linear nature of the problem and to transmit the information across the entire construction without affecting intermediate regions. This result is in contrast with Continuous-G-Min-Disc-Code in 1D, which is polynomial-time solvable Gledel and Parreau . This is also in contrast with most geometric covering problems, which are usually polynomial-time solvable in 1D Krupa R. et al. . We then design a polynomial-time 2-factor approximation algorithm for Discrete-G-Min-Disc-Code in 1D. To this end we use the concept of minimum edge-covers in graphs, whose optimal solution can be found by computing a maximum matching of the graph. We also design a polynomial-time approximation scheme (PTAS) for both Discrete-G-Min-Disc-Code and Continuous-G-Min-Disc-Code in 1D, when the objects are required to all have the same (unit) length. We also study both problems in 2D for axis-parallel unit square objects, which form a natural extension of 1D intervals to the 2D setting. The continuous version is known to be NP-complete for unit disks Gledel and Parreau , and we show that the reduction can be adapted to our setting, for both the continuous and discrete case. We then design polynomial-time constant-factor approximation algorithms for both problems in the same setting, of factors for Continuous-G-Min-Disc-Code, and for Discrete-G-Min-Disc-Code (for any fixed ). To this end, we re-formulate the problem as an instance of stabbing a set of given line segments by placing unit squares in . (Here a line segment is stabbed by a unit square if exactly one end-point of is contained in the square.)
We propose an -factor approximation algorithm for this stabbing problem, which, to the best of our knowledge, is the first polynomial-time constant-factor algorithm for it. (Such algorithms exist for a related, but different, segment-stabbing problem by unit disks, where a disk stabs a segment if it intersects it once or twice Mustafa and Ray , Kobylkin .) We remark that in many cases, a polynomial-time algorithm for Discrete-G-Min-Disc-Code works for Continuous-G-Min-Disc-Code, as long as any instance of the former can be transformed in polynomial time into an equivalent instance of the latter. For example, this is the case for objects restricted to a fixed size, since there are only polynomially many kinds of intersections with the point set, such as for our PTAS for the unit interval case. Conversely, a hardness result for Continuous-G-Min-Disc-Code can often be applied to Discrete-G-Min-Disc-Code; this is the case for our hardness proof for axis-parallel unit squares in 2D. Our results are summarized in Table 1.
Table 1: Summary of results.
|Object type||Continuous: hardness||Continuous: algorithm||Discrete: hardness||Discrete: algorithm|
|1D intervals||-||Polynomial (Gledel and Parreau )||NP-hard (Thm. 2.1)||2-approximable (Thm. 2.2)|
|1D unit intervals||Open||PTAS (Thm. 2.4)||Open||PTAS (Thm. 2.4)|
|2D axis-parallel unit squares||NP-hard (Thm. 3.1)||-approximable (Thm. LABEL:8approx)||NP-hard (Thm. 3.1)||-approximable (Thm. 3.3)|
2 The one-dimensional case
It has been shown that Continuous-G-Min-Disc-Code is polynomial-time solvable in 1D Gledel and Parreau . Thus, in this section we mainly focus on Discrete-G-Min-Disc-Code.
An instance of Discrete-G-Min-Disc-Code is a set of points and a set of intervals of arbitrary lengths placed on the real line . Assuming that the points are sorted with respect to their coordinate values, we define gaps , where , for , and .
The time complexity of checking whether a given instance of points and intervals is twin-free is .
Observe that, if a subset is a disc-code for , then any superset of also produces unique codes for all the points in . Thus, the twin-free property can be verified by checking whether the full set of objects in produces unique codes for the points in .
We compute all the maximal cliques of the intervals in in time, and store them in an array . (A subset forms a maximal clique if the intersection of its members is nonempty, and no other member of intersects the region of intersection of the members in .) Next, we execute a merge pass among the sorted points in and the intervals in to check whether any maximal clique region in the array contains more than one point of . This needs time.
Thus, one can check whether is twin-free in because . Observe that (i) if both endpoints of an interval lie in the same gap of , then it cannot discriminate any pair of points; thus is useless, and (ii) if more than one interval in have both their endpoints in the same two gaps, say , then both of them discriminate the exact same point-pairs. Thus, they are redundant and we need to keep only one such interval. In a linear scan, we can first eliminate the useless and redundant intervals. From now onwards, will denote the number of intervals, none of which are useless or redundant. Hence, .
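The pruning of useless and redundant intervals can be sketched as follows. This is an illustrative Python sketch, not the paper's implementation: it assumes 0-based gap indices (gap `g` lies immediately to the left of the `g`-th sorted point, and the last gap is unbounded to the right), assumes that no interval endpoint coincides with a point, and uses a simple quadratic twin-free test rather than the merge pass described above.

```python
from bisect import bisect_left


def gap_pair(sorted_points, interval):
    """Return (a, b): the 0-based gaps holding the left and right endpoints."""
    lo, hi = interval
    return bisect_left(sorted_points, lo), bisect_left(sorted_points, hi)


def prune_intervals(points, intervals):
    """Map each distinct gap pair (a, b), a < b, to one representative interval."""
    pts = sorted(points)
    kept = {}
    for interval in intervals:
        a, b = gap_pair(pts, interval)
        if a == b:
            continue  # useless: both endpoints in the same gap, no point covered
        kept.setdefault((a, b), interval)  # redundant ones share a gap pair
    return kept


def is_twin_free(points, intervals):
    """True if no two points are contained in exactly the same intervals."""
    pairs = list(prune_intervals(points, intervals))
    n = len(points)
    # The i-th sorted point lies in an interval with gap pair (a, b) iff a <= i < b.
    codes = [frozenset(k for k, (a, b) in enumerate(pairs) if a <= i < b)
             for i in range(n)]
    return len(set(codes)) == n
```

Working with gap pairs instead of raw coordinates makes the two pruning rules a single dictionary lookup, and the same pairs can later serve as the edges of the gap graph.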
2.1 NP-completeness for the general 1D case
Discrete-G-Min-Disc-Code is in NP, since given a subset , in polynomial time one can test whether the problem instance is twin-free (i.e. whether the code of every point in induced by is unique). Our reduction for proving NP-hardness is from the NP-complete 3-SAT- problem Tovey  (defined below), to Discrete-G-Min-Disc-Code.
3-SAT- Input: A collection of clauses where each clause contains at most three literals, over a set of Boolean variables , and each literal appears at most twice. Output: A truth assignment of such that each clause is satisfied.
Given an instance of 3-SAT-, we construct in polynomial time an instance of Discrete-G-Min-Disc-Code on the real line . The main challenge of this reduction is to be able to connect variable and clause gadgets, despite the linear nature of our 1D setting. The basic idea is that we will construct an instance where some specific set of critical point-pairs will need to be discriminated (all other pairs being discriminated by some partial solution forced by our gadgets). Let us start by describing our basic gadgets.
A covering gadget consists of three intervals , , and four points , , and satisfying , , and as in Fig. 1. Every other interval of the construction will either contain all four points, or none. There may exist a set of points in , depending on the need of the reduction.
Points ,,, can only be discriminated by choosing all three intervals , , in the solution.
This follows from the fact that no interval in outside the covering gadget can discriminate the four points in . Moreover, if we do not choose , then are not discriminated. If we do not choose , are not discriminated. If we do not choose , are not discriminated.
The idea of the covering gadget is to forcefully cover the points placed in , so that they are covered by (which needs to be in any solution), and hence discriminated from all other points of the construction.
Let us now define the gadgets modeling the clauses and variables of the 3-SAT- instance.
Let be a clause of . The clause gadget for , denoted , is defined by a covering gadget along with two points placed in (see Fig. 2).
The idea behind the clause gadget is that some interval that ends between points will have to be taken in the solution, so that this pair gets discriminated.
Let be a variable of . The variable gadget for , denoted , is defined by a covering gadget , and five points placed consecutively in . We place six intervals , , , , , , as in Fig. 3.
Interval starts between and , and ends between and .
Interval starts between and , and ends between and .
Interval starts between and , and ends after .
Interval starts between and , and ends after .
Interval starts between and , and ends after .
Interval starts between and , and ends after .
(The ending point of the four latter intervals will be determined at the construction.)
In a variable gadget , the intervals and represent the occurrences of literal , while and represent the occurrences of . The right end points of each of these four intervals will be in the clause gadget of the clause that the occurrence of the literal belongs to. More precisely, is constructed as follows, shown in Figure 4. Note that we can assume that every literal appears in at least one clause (otherwise, we can fix the truth value of the variable and obtain a smaller equivalent instance).
For each variable , contains a variable gadget .
The gadgets are positioned consecutively, in this order, without overlap.
For each clause , contains a clause gadget .
The gadgets are positioned consecutively, in this order, after the variable gadgets, without overlap.
For every variable , assume appears in clauses and , and appears in and (possibly or ). Then, we extend interval so that it ends between and ; ends between and ; ends between and ; ends between and .
Figure 4 gives an example construction for our reduction.
Let be the union of the disc-codes (i.e. all intervals of type , by Observation 1) of all covering gadgets. Observe that discriminates the points in each covering gadget , and any point covered by from any other point not covered by . It follows that all point-pairs are discriminated by , except the following critical ones:
the pairs among the five points of each variable gadget , and
the point pair of each clause gadget .
Discrete-G-Min-Disc-Code in 1D is NP-complete.
We prove that is satisfiable if and only if has a disc-code of size . In both parts of the proof, we will consider the set defined above. Each variable gadget and clause gadget contains one covering gadget. Thus, .
Consider first some satisfying truth assignment of . We build a solution set as follows. First, we put all intervals of in . Then, for each variable , if is true, we add intervals , and to . Otherwise, we add intervals , and to . Notice that . As observed before, it suffices to show that discriminates the point-pair of each clause gadget , and the points of each variable gadget . (All other pairs are discriminated by .)
Since the assignment is satisfying, each clause contains a true literal . Then, one interval of is in and discriminates and . Furthermore, consider a variable . Point is discriminated from as it is the only one not covered by any of , , , , , and . If is true, is covered by ; is covered by and ; is covered by ; is covered by and . If is false, is covered by ; is covered by and ; is covered by , and ; is covered by and . Thus, in both cases, the five points are discriminated, and is discriminating, as claimed.
For the converse, assume that is a discriminating code of of size . By Observation 1, . Thus there are intervals of that are not in .
First, we show that contains exactly three intervals of each variable gadget . Indeed, it cannot contain fewer than three, since otherwise the points cannot be discriminated. To see this, note that each consecutive pair () must be discriminated, thus must contain one interval with an endpoint between these two points. There are four such consecutive pairs in , thus if contains at most two intervals of , it must contain and . But now, the points and are not discriminated, a contradiction.
Let us now show how to construct a truth assignment of . Notice that at least one of and must belong to , otherwise some points of cannot be discriminated. If but , then necessarily to discriminate and , and to discriminate and . In this case, we set to true. Similarly, if but , then necessarily to discriminate and , and to discriminate and . In this case, we set to false. Finally, if both and belong to , the third interval of in may be any of the four intervals covering . If this third interval is or , we set to true; otherwise, we set it to false.
Observe that when we set to true, none of and belongs to ; likewise, when we set to false, none of and belongs to . Thus, our truth assignment is coherent. Since, for every clause , the point-pair is discriminated by , some interval corresponding to a true literal discriminates it. The obtained assignment is satisfying, completing the proof.
2.2 A 2-approximation algorithm for the general 1D case
We next use the classic algorithm solving the edge-cover problem of an undirected graph to design a 2-factor approximation algorithm for Discrete-G-Min-Disc-Code in 1D.
Edge-Cover Input: An undirected graph . Output: A subset such that every vertex is incident to at least one edge of .
We create a graph , where corresponds to the set of gaps. For each interval , we create an edge if and . See Figure 5 for an example. As we have removed useless and redundant intervals, there are no loops and no multiple edges in . (Recall that a useless interval is one that covers no point, and that an interval is redundant if it and some other interval create the same edge in but the right endpoint of is to the left of the right endpoint of .) Thus, and . The minimum edge-cover (MEC) consists of (i) the edges of a maximum matching in , and (ii) for each unmatched vertex (if any exists), an arbitrary edge incident to that vertex Garey and Johnson . It can be computed in time Micali and Vazirani .
Let be the set of intervals corresponding to the edges of . Clearly, discriminates all consecutive point-pairs of , since for each gap , there is an interval with an endpoint in . Moreover, is an optimal set of intervals discriminating all consecutive point-pairs. Thus, any solution to Discrete-G-Min-Disc-Code for has size at least , since any such solution should in particular discriminate consecutive point-pairs.
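The edge-cover step can be illustrated with the following Python sketch. The vertices are gap indices and each edge is the gap pair of one interval. For brevity the maximum matching is found by exhaustive search, which is an assumption made for this example only; it is not the Micali and Vazirani algorithm cited above and is exponential in the number of edges. The function names are illustrative.

```python
from itertools import combinations


def max_matching_bruteforce(edges):
    """Largest set of pairwise vertex-disjoint edges (exhaustive search)."""
    for k in range(len(edges), 0, -1):
        for cand in combinations(edges, k):
            used = [v for e in cand for v in e]
            if len(used) == len(set(used)):
                return list(cand)
    return []


def min_edge_cover(vertices, edges):
    """Maximum matching plus one arbitrary incident edge per unmatched vertex."""
    matching = max_matching_bruteforce(edges)
    matched = {v for e in matching for v in e}
    cover = list(matching)
    for v in vertices:
        if v in matched:
            continue
        incident = next((e for e in edges if v in e), None)
        if incident is None:
            raise ValueError(f"vertex {v} is isolated, so no edge cover exists")
        cover.append(incident)
    return cover
```

With `k` vertices and a maximum matching of size `m`, the returned cover has `k - m` edges, which is the size of a minimum edge-cover whenever the graph has no isolated vertex.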
A subset will receive unique codes by ,
A subset may not be covered by the intervals of , and hence they will not receive any code. If then the elements in are non-consecutive.
Some subsets of points (of sizes ) of may each receive the same nonempty code by . In that case, the members of each of those subsets are non-consecutive.
Clearly, since discriminates all consecutive point-pairs, for any integer , any two points of cannot be consecutive.
Denote by , the interval starting at the first point of and stopping at the last point of . Then, for any two distinct sets and , either and are disjoint, or one of them (say ) is strictly included between two consecutive points of the other (). In that case, we say that is nested inside .
Suppose that and intersect. Recall that all the points in have the same code by , and all the points in have the same code by . That is, each interval of either contains all points or no point of and , respectively, and there is at least one interval of that contains, say, all points of but no point of . Then, necessarily, is included between two consecutive points of , as claimed.
For a set of size , we denote the points in . We next use Lemma 2 to give a lower bound on the size of .
We have .
Consider the sets (possibly ). We will prove that every interval contains a set of at least intervals of that are included in . Moreover, for every that is nested inside , none of the intervals of are included in .
We proceed by induction on the nested structure of the ’s that follows from Lemma 2. As a base case, assume that has no interval nested inside. Since by Lemma 1, the points of are non-consecutive inside , between each pair of consecutive points of , there is at least one point of . By definition of , is discriminated from all points of by . Hence, there is an interval of that lies completely between and : add it to . Since there are such consecutive pairs, : the base case is proved.
Next, assume by induction that the claim is true for all the intervals that are nested inside . Consider a point of that is not the last point of . Again, between and , there is a point of . Let be the point of that comes just after . The set discriminates the two consecutive points and . However, there cannot be an interval of covering and ending between and , otherwise it would also discriminate and . Thus, there must be an interval of that starts between and . Notice that is not included in any , for nested inside . Thus, we can add to . Repeating this for all points of except the last one, we obtain that , as claimed.
We have thus proved that there are at least distinct intervals of , each of them being included in some . Moreover, there is at least one interval of that is not included in any . Indeed, there must be an interval of that corresponds to an edge of that covers the first gap . This interval has not been counted in the previous argument. Thus, it follows that .
Next, we will choose additional intervals from to discriminate the points in , and add them to . The resulting set, , will form a discriminating code of . Consider some set . We will choose at most new intervals so that all points in are discriminated: call this set . We start with , and we select some interval of that discriminates (since can be assumed to be twin-free, such an interval exists) and add it to . We then proceed by induction: at each step (), we assume that the points are discriminated, and we consider . There is at most one point, say , among whose code is the same as by (since by induction all have different codes). We thus find one interval of that discriminates and add it to . In the end we have .
After repeating this process for every set , all pairs of points of are discriminated by . Finally, we may have to add one additional interval in order to cover one point of , that remains uncovered. Let us call the resulting set: this is a discriminating code of . Moreover, we have added at most additional intervals to , to obtain . By Lemma 3, we thus have .
Hence, denoting by the optimal solution size for , and recalling that , we obtain that . Moreover, the construction of from can be done in linear time. Thus, we have proved the following:
The proposed algorithm produces a 2-factor approximation for Discrete-G-Min-Disc-Code in 1D, and runs in time .
2.3 A PTAS for the 1D unit interval case
The following observation (which was also made in the related setting of identifying codes of unit interval graphs [Foucaud, 2012, Proposition 5.12]) plays an important role in designing our PTAS.
In an instance of Discrete-G-Min-Disc-Code in 1D, if the objects in are intervals of the same length, then discriminating all the pairs of consecutive points in is equivalent to discriminating all the pairs of points in .
Assume that we have a set that covers all points and discriminates all consecutive point-pairs, but two non-consecutive points and () are not discriminated. Since and are covered by some intervals of , and they are covered by the same set of intervals of , some unit interval contains them both, so they must be at a distance at most apart. Now, since they are not consecutive, lies between and . Since discriminates and , there is an interval with an endpoint in the gap . If it is the right endpoint, covers but not , a contradiction. Thus, it must be the left endpoint. But since the distance between and is at most , contains (but not ), again a contradiction.
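The observation can be checked on small instances with a brute-force Python sketch. It enumerates every sub-collection of the given closed intervals and looks for one that covers all points and separates consecutive points but is not a discriminating code. The helper names and the example coordinates are illustrative assumptions, not taken from the paper.

```python
from itertools import combinations


def codes(points, intervals):
    """Codes of the points in sorted order, as sets of interval indices."""
    return [frozenset(i for i, (lo, hi) in enumerate(intervals) if lo <= p <= hi)
            for p in sorted(points)]


def covers_and_separates_consecutive(points, intervals):
    c = codes(points, intervals)
    return all(c) and all(c[i] != c[i + 1] for i in range(len(c) - 1))


def is_disc_code(points, intervals):
    c = codes(points, intervals)
    return all(c) and len(set(c)) == len(c)


def observation_holds(points, intervals):
    """False if some sub-collection separates consecutive pairs but not all pairs."""
    for k in range(1, len(intervals) + 1):
        for subset in combinations(intervals, k):
            if (covers_and_separates_consecutive(points, subset)
                    and not is_disc_code(points, subset)):
                return False
    return True
```

For intervals of arbitrary length the statement fails: with points `[1, 2, 3]` and intervals `[(0.5, 3.5), (1.5, 2.5)]`, consecutive points are separated but the points 1 and 3 receive the same code.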
For a given , we choose points, namely , called the reference points, as follows: is the -th point of from the left, and for each , the number of points in between every consecutive pair is , including and (the number of points to the right of may be less than ). For each reference point , we choose two intervals such that both contain (span) , and the left (resp. right) endpoint of (resp. ) have the minimum -coordinate (resp. maximum -coordinate) among all intervals in that span . Observe that all the points in that lie in the range are covered, where are the -coordinates of the left endpoint of and the right endpoint of , respectively. These ranges will be referred to as group-ranges. Since the endpoints of the intervals are distinct, the span of a group-range is strictly greater than 1.
We now define a block as follows. Observe that the ranges and may or may not overlap. If several consecutive ranges are pairwise overlapping, then the horizontal range forms a block. The region between a pair of consecutive blocks will be referred to as a free region. We use to name the blocks in order, and to name the free regions (from left to right). The points in each block are covered. Here, the remaining tasks are (i) for each block, choose intervals from such that consecutive pairs of points in that block are discriminated, and (ii) for each free region, choose intervals from such that all its points are covered, and the pairs of consecutive points are discriminated. Observe that no interval can contain both a point in and a point in since and are separated by the block . The reason is that if there exists such an interval , then it will contain the reference point just to the right of (that is, the reference point of the leftmost group-range of the block ). This contradicts the choice of for . Thus, the discriminating code for a free region is disjoint from that of its neighboring free region . So, we can process the free regions independently.
Processing of a free region: Let the neighboring group-ranges of a free region be and , respectively. There are at most points lying between the reference points of and . Among these, several points of to the right (resp. left) of the reference point of (resp. ) are inside block (resp. ). Thus, there are at most points in . We collect all the members in that cover at least one point of . Note that, though we have deleted all the redundant intervals of , there may be several intervals in with an endpoint lying in a gap inside that free region, and their other endpoint lies in distinct gaps of the neighboring block. There are some blue intervals which are redundant with respect to the points , but are non-redundant with respect to the whole point set . However, the number of such intervals is at most due to the definition of of the right-most group-range of the neighboring block and left-most group-range of .
Thus, we have . We consider all possible subsets of intervals of , and test each of them for being a discriminating code for the points in . Let be all possible different discriminating codes of the points in , with in the worst case.
Processing of a block: Consider a block ; its neighboring free regions are and . Consider two discriminating codes and . As in Section 2.2, we create a graph whose nodes correspond to the gaps of which are not discriminated by the intervals used in and . Each edge corresponds to an interval in that discriminates pairs of consecutive points corresponding to two different nodes of . Now, we can discriminate each non-discriminated pair of consecutive points in by computing a minimum edge-cover of in time Micali and Vazirani . As mentioned earlier, all the points in are covered. Thus, the discrimination process for the block is over. We will use to denote the size of a minimum edge-cover of using and .
Computing a discriminating code for : We now create a multipartite directed graph . Its -th partite set corresponds to the discriminating codes in , and . Each node has its weight equal to the size of the discriminating code . A directed edge connects two nodes and of two adjacent partite sets, say and , and has its weight equal to . For every pair of partite sets and , we connect every pair of nodes and , where . Every node of is connected to a node with weight 0, and every node of is connected to a node with weight 0.
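Since edges only join adjacent partite sets, a minimum-weight path through this layered graph can be found by a simple dynamic program over the layers. The Python sketch below assumes a hypothetical input format: `node_weights[i][a]` is the weight of the `a`-th node of layer `i` (the size of a candidate code for the `i`-th free region), and `edge_weight(i, a, b)` returns the weight of the edge from node `a` of layer `i` to node `b` of layer `i + 1` (the edge-cover size for the block in between). The zero-weight source and sink are left implicit.

```python
def shortest_layered_path(node_weights, edge_weight):
    """Minimum total node-plus-edge weight of a path visiting one node per layer."""
    # best[a]: cheapest path ending at node a of the current layer.
    best = list(node_weights[0])
    for i in range(1, len(node_weights)):
        best = [
            w + min(best[a] + edge_weight(i - 1, a, b) for a in range(len(best)))
            for b, w in enumerate(node_weights[i])
        ]
    return min(best)
```

The running time is proportional to the total number of edges between adjacent layers, provided each edge weight has already been computed.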
The minimum weight of an - path in is a lower bound on the size of the optimum discriminating code for . (The weight of a path is equal to the sum of the costs of all the vertices and edges on the path.)
Let be the shortest - path in the graph , which corresponds to a set of intervals , the set corresponds to the minimum discriminating code, and . As is a discriminating code, the points of every free region are discriminated by a subset, say . Since we maintain all the discriminating codes in , surely . Let be the set of intervals that span the points of the block . As is a discriminating code, the points in are discriminated by the intervals in . Thus, the set of intervals discriminates the pairs of points of that are not discriminated by . Observe that, for every , we have . Moreover, there exists a path that connects , each edge of which has cost equal to . Thus, we have the contradiction that is a path in having cost less than that of .
Let denote the set of intervals of in a shortest - path in . The intervals in may not form a discriminating code for