# On the Distance Identifying Set meta-problem and applications to the complexity of identifying problems on graphs

Numerous problems consisting in identifying vertices in graphs using distances are useful in domains such as network verification and graph isomorphism. Unifying them into a meta-problem may be of great interest. We introduce here a promising solution named Distance Identifying Set. The model contains Identifying Code (IC), Locating Dominating Set (LD) and their generalizations r-IC and r-LD where the closed neighborhood is considered up to distance r. It also contains Metric Dimension (MD) and its refinement r-MD in which the distance between two vertices is considered as infinite if the real distance exceeds r. Note that while IC = 1-IC and LD = 1-LD, we have MD = ∞-MD; we say that MD is not local.

In this article, we prove computational lower bounds for several problems included in Distance Identifying Set by providing generic reductions from (Planar) Hitting Set to the meta-problem. We mainly focus on two families of problems from the meta-problem: the first one, called bipartite gifted local, contains r-IC, r-LD and r-MD for each positive integer r while the second one, called 1-layered, contains LD, MD and r-MD for each positive integer r. We have:

• the 1-layered problems are NP-hard even in bipartite apex graphs,

• the bipartite gifted local problems are NP-hard even in bipartite planar graphs,

• assuming ETH, none of these problems can be solved in $2^{o(\sqrt{n})}$ time when restricted to bipartite planar or apex graphs, respectively, and they cannot be solved in $2^{o(n)}$ time on bipartite graphs,

• even restricted to bipartite graphs, they do not admit parameterized algorithms running in $2^{O(k)} \cdot n^{O(1)}$ time except if W[0] = W[2]. Here k is the solution size of a relevant identifying set.

In particular, Metric Dimension cannot be solved in $2^{o(n)}$ time under ETH, answering a question of Hartung in 2013.

## 1 Introduction and Corresponding Works

Problems consisting in identifying each element of a combinatorial structure with a hopefully small number of elements have been widely investigated. Here, we study a meta identification problem which generalizes three of the most well-known identification problems in graphs, namely Identifying Code (IC), Locating Dominating Set (LD) and Metric Dimension (MD). These problems are used in network verification [3, 4], fault-detection in networks [22, 28], graph isomorphism, or logical definability of graphs. The versions of these problems in hypergraphs have been studied under different names.

Given a graph $G$ with vertex set $V$, the classical identifying sets are defined as follows:

• IC: Introduced by Karpovsky et al., a set $C$ of vertices of $G$ is said to be an identifying code if none of the sets $N[v] \cap C$ are empty, for $v \in V$, and they are all distinct.

• LD: Introduced by Slater [25, 26], a set $C$ of vertices of $G$ is said to be a locating-dominating set if none of the sets $N[v] \cap C$ are empty, for $v \in V \setminus C$, and they are all distinct. When not considering the dominating property ($N[v] \cap C$ may be empty), these sets have been studied as distinguishing sets and as sieves.

• MD: Introduced independently by Harary et al. and Slater, a set $R$ of vertices of $G$ is said to be a resolving set if $R$ contains one vertex from each connected component of $G$ and, for every distinct vertices $u$ and $v$ of $G$, there exists a vertex $w$ of $R$ such that $d(w,u) \neq d(w,v)$. The metric dimension of $G$ is the minimum size of its resolving sets.
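The three definitions above can be checked mechanically on small graphs. The following is a minimal sketch, assuming the standard formulations (closed neighborhoods $N[v]$ for IC, vertices outside the set for LD, distance vectors for MD); the function names and the adjacency-dictionary representation are illustrative choices, not part of the original paper.

```python
from collections import deque


def bfs_distances(adj, source):
    """Distances (in edges) from source to every reachable vertex."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        x = queue.popleft()
        for y in adj[x]:
            if y not in dist:
                dist[y] = dist[x] + 1
                queue.append(y)
    return dist


def is_identifying_code(adj, code):
    """All sets N[v] & code are non-empty and pairwise distinct."""
    code = set(code)
    signatures = [frozenset((adj[v] | {v}) & code) for v in adj]
    return all(signatures) and len(set(signatures)) == len(signatures)


def is_locating_dominating(adj, code):
    """Same test, restricted to the vertices outside the set."""
    code = set(code)
    signatures = [frozenset(adj[v] & code) for v in adj if v not in code]
    return all(signatures) and len(set(signatures)) == len(signatures)


def is_resolving_set(adj, resolving):
    """Every vertex is reachable from the set and has a unique distance vector."""
    resolving = list(resolving)
    dists = {w: bfs_distances(adj, w) for w in resolving}
    if any(all(v not in dists[w] for w in resolving) for v in adj):
        return False  # some connected component contains no chosen vertex
    vectors = [tuple(dists[w].get(v) for w in resolving) for v in adj]
    return len(set(vectors)) == len(vectors)


# Path on four vertices: 0 - 1 - 2 - 3
path = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}
print(is_resolving_set(path, {0}))           # True: distances 0, 1, 2, 3
print(is_locating_dominating(path, {1, 2}))  # True
print(is_identifying_code(path, {1, 2}))     # False: N[1] and N[2] both meet the set in {1, 2}
```

On the path above, a single endpoint resolves the graph, whereas `{1, 2}` is locating-dominating without being an identifying code, which illustrates how the three notions differ.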

The corresponding minimization problems of the previous identifying sets are defined as follows: given a graph $G$, compute a suitable set of minimal size, if one exists. In this paper, we mainly focus on the computational complexity of these minimization problems.

##### Known results.

A wide collection of NP-hardness results has been proven for the problems.

For IC and LD, the minimization problems are indeed NP-hard [9, 10]. Charon et al. showed the NP-hardness when restricted to bipartite graphs, while Auger showed it for planar graphs with arbitrarily large girth. For trees, there exists a linear-time algorithm.

Metric Dimension is also NP-hard, even when restricted to Gabriel unit disk graphs [16, 20]. Epstein et al. showed that MD is polynomial on several classes such as trees, cycles, cographs, partial wheels, and graphs of bounded cyclomatic number, but it remains NP-hard on split graphs, bipartite graphs, co-bipartite graphs and line graphs of bipartite graphs. Additionally, Diaz et al. proved a quite tight separation: the problem is polynomial on outerplanar graphs whereas it remains NP-hard on bounded degree planar graphs.

In a recent publication, Foucaud et al. also proved the NP-hardness of the three problems restricted to interval graphs and permutation graphs.

These notions may also be considered from the parameterized point of view, in the framework of Fixed Parameter Tractability (FPT). In the following, the parameter $k$ is chosen as the solution size of a suitable set.

For IC and LD, the parameterized problems are clearly FPT since the number of vertices of a positive instance is bounded by $2^k + k$ ($k$ vertices may characterize at most $2^k$ neighbors).

Such complexity is not likely to be achievable in the case of MD, since it would imply W[2] = FPT. Indeed, Hartung et al. [18, 19] showed MD is W[2]-hard for bipartite subcubic graphs. The problem is however FPT on families of graphs with degree $\Delta$ growing with the number of vertices because the size $k$ of a resolving set must satisfy $\log_3(\Delta + 1) \leq k$. Finally, Foucaud et al. provided an FPT algorithm on interval graphs.

##### Our contributions.

In order to unify the previous minimization problems, we introduce the concept of distance identifying functions. Given a distance identifying function $f$ and a value $r$ that is a positive integer or infinity, the Distance Identifying Set meta-problem consists in finding a minimal-sized $r$-dominating set which distinguishes every pair of vertices of an input graph by means of the function $f$. Here, we mainly focus on two natural subfamilies of problems of Distance Identifying Set: local, in which a vertex cannot discern the vertices outside of its $i$-neighborhood, for a fixed positive integer $i$, and 1-layered, where a vertex is able to separate its open neighborhood from the distant vertices.

With this approach, we obtain several computational lower bounds for problems included in Distance Identifying Set by providing generic reductions from (Planar) Hitting Set to the meta-problem. The reductions rely on the set/element-gadget technique, a noteworthy adaptation of the clause/variable-gadget technique from SAT to Hitting Set.

As we provide a 1-layered generic gadget, the 1-layered reductions operate without condition. For local problems, the existence of a local gadget is not always guaranteed. Thus, a local reduction operates only if a local gadget is provided. However, the local planar reduction is slightly more efficient than its 1-layered counterpart: it indeed implies computational lower bounds for planar graphs whereas the 1-layered reduction requires an auxiliary apex, limiting the consequences to apex graphs.

The reductions in general graphs are designed to exploit the W[2]-hardness of Hitting Set parameterized by the solution size $k$ of a hitting set, hereby using:

###### Theorem 1 (folklore).

Let and be the number of elements and sets of an Hitting Set instance, and be its solution size. A parameterized problem with parameter admitting a reduction from Hitting Set verifying does not have a parameterized algorithm running in time except if .

###### Proof.

Given a reduction from Hitting Set to a parameterized problem such that the reduced parameter satisfies and the size of the reduced instance verifies , an algorithm for of running time is actually an algorithm for Hitting Set of running time , meaning that Hitting Set is FPT, a contradiction to its -hardness (otherwise ). ∎

Hence, as each gadget contributes to the resulting solution size of a distance identifying set, we set up a binary compression of the gadgets to limit their number to a logarithmic order. To the best of our knowledge, this technique of merging gadgets has never been employed.

The organization of the paper is as follows. After a short reminder of the computational properties of Hitting Set, Section 2 contains the definitions of distance identifying functions and sets, allowing us to state precisely the computational lower bounds we obtain. Section 3 designs the supports of the reductions as distance identifying graphs and compressed graphs. Finally, the gadgets needed for the reductions to apply are given in Section 4 as well as the proofs of the main theorems.

## 2 Definition of the Meta-Problem and Related Concepts

### 2.1 Preliminaries

##### Notations.

Throughout the paper, we consider simple undirected graphs.

Given a positive integer $n$, the set of positive integers not exceeding $n$ is denoted by $[\![n]\!]$. By extension, we define $[\![\infty]\!]$ as the set of all positive integers. Given two vertices $u$, $v$ of a graph $G$, the distance between $u$ and $v$ corresponds to the number of edges in a shortest path between $u$ and $v$ and is denoted $d(u,v)$. The open neighborhood of $u$ is denoted by $N(u)$, its closed neighborhood is $N[u] = N(u) \cup \{u\}$, and for a positive integer $r$, the $r$-neighborhood of $u$ is $N_r[u]$, that is the set of vertices at distance at most $r$ from $u$. For $r = \infty$, the $\infty$-neighborhood of $u$ is the set of vertices in the same connected component as $u$. We recall that a subset $D$ of $V$ is called an $r$-dominating set of $G$ if for all vertices $u$ of $V$, the set $N_r[u] \cap D$ is non-empty. Thus an $\infty$-dominating set of $G$ contains at least one vertex for each connected component of $G$.

Given two subsets and of , the distance corresponds to the value . For a vertex , we will also use and , defined similarly. The symmetric difference between and is denoted by , and the -combination of a set is denoted

Given two graphs and , is an induced subgraph of if and for all vertices and of , if and only if . We denote and by . Symmetrically, is an induced supergraph of .

##### The (Planar) Hitting Set problem.

Consider a universe of $n$ elements denoted $\Omega$ and a set of $m$ non-empty subsets of $\Omega$ denoted $\mathcal{S}$ such that every element belongs to at least one subset. Then, a subset of $\Omega$ intersecting every set of $\mathcal{S}$ is called a hitting set of $\mathcal{S}$:

Hitting Set
Input: A universe $\Omega$ and a set $\mathcal{S}$ of non-empty subsets of $\Omega$ whose union covers $\Omega$.
Output: A minimal-sized hitting set $P$ of $\mathcal{S}$, i.e. a subset $P$ of $\Omega$ satisfying $P \cap S \neq \emptyset$ for every $S \in \mathcal{S}$.

The parameterized version Hitting Set(k) decides if there exists a hitting set of size $k$.

###### Theorem 2 (R.G. Downey and M.R Fellows ).

Hitting Set cannot be solved in $2^{o(n)}$ time under ETH even if $m = O(n)$. Moreover, Hitting Set(k) is W[2]-hard.

Hitting Set may be translated into a dominating problem on bipartite graphs. Given an instance of Hitting Set, let us define as the bipartite graph of size such that for each , there exists a vertex in , for each , there exists a vertex in , and the edge is present in if and only if the element belongs to the subset . Henceforth, a hitting set of is equivalent to a subset of that dominates . We call the associated graph of .
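The translation into a domination problem can be sketched as follows. This is an illustrative sketch only: the vertex labels `("u", e)` and `("s", j)` and the function names are assumptions made for the example, and the instance is the one used in the caption of Figure 2.

```python
def associated_graph(universe, subsets):
    """Bipartite graph with one vertex per element and one vertex per subset.

    An element vertex ("u", e) is adjacent to a subset vertex ("s", j)
    exactly when e belongs to the j-th subset.
    """
    adj = {("u", e): set() for e in universe}
    for j, subset in enumerate(subsets):
        adj[("s", j)] = set()
        for e in subset:
            adj[("u", e)].add(("s", j))
            adj[("s", j)].add(("u", e))
    return adj


def is_hitting_set(subsets, candidate):
    """Direct definition: the candidate intersects every subset."""
    return all(set(subset) & set(candidate) for subset in subsets)


def dominates_subset_side(adj, candidate):
    """Graph formulation: every subset vertex has a chosen element as neighbor."""
    chosen = {("u", e) for e in candidate}
    return all(adj[v] & chosen for v in adj if v[0] == "s")


universe = [1, 2, 3, 4]
subsets = [{1, 2}, {2, 3, 4}]
graph = associated_graph(universe, subsets)
for candidate in ({2}, {1}, {1, 4}):
    # Both formulations agree on every candidate.
    print(candidate, is_hitting_set(subsets, candidate),
          dominates_subset_side(graph, candidate))
```

The two predicates agree by construction, which is the equivalence the reductions build on.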

Planar Hitting Set
Input: An instance of Hitting Set such that is planar.
Output: A hitting set of of minimal size.

We also consider the parameterized version Planar Hitting Set(k) of the latter problem.

###### Theorem 3 (folklore).

There exists a reduction from SAT to Planar Hitting Set(n) producing associated graphs of quadratic size in the number of variables of the instances of SAT. Thus Planar Hitting Set cannot be solved in $2^{o(\sqrt{n})}$ time under ETH even if $m = O(n)$.

###### Proof.

Consider the set of variables occurring in the set of clauses of an instance of SAT. For each variable, we add to the universe two fresh elements representing the two possible assignments of that variable, and we create a set containing exactly these two elements that we append to the set of subsets. These sets are pairwise disjoint, so a hitting set of size strictly smaller than the number of variables cannot exist. Conversely, a potential hitting set of size exactly the number of variables must define an assignment of the variables. Finally, to determine whether an assignment satisfies the set of clauses, for each clause we append to the set of subsets the set of elements representing each literal present in the clause. The equivalence between the satisfiability of the instance and the existence of a hitting set whose size is the number of variables is immediate by construction. It remains to guarantee the planarity of the associated graph. To do so, we actually apply the reduction on a restriction of SAT named Separate Simple Planar SAT. Adding the sparsification lemma, the reduction produces a graph of size linear in the number of variables, preserving the computational lower bound of Separate Simple Planar SAT. In particular, the latter problem is not solvable in $2^{o(\sqrt{n})}$ time under ETH. ∎
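The non-planar core of this construction can be illustrated on tiny formulas. The sketch below is an assumption-laden illustration: literals are encoded as signed integers (a convention chosen here, not taken from the paper), planarity and the Separate Simple Planar SAT restriction are ignored, and the hitting set search is a plain brute force.

```python
from itertools import combinations


def sat_to_hitting_set(num_vars, clauses):
    """Build the Hitting Set instance described in the proof.

    Variables are 1..num_vars; the literal x is encoded as +x and its
    negation as -x. Each variable contributes the set {x, -x}, and each
    clause contributes the set of its literals.
    """
    universe = [lit for x in range(1, num_vars + 1) for lit in (x, -x)]
    subsets = [{x, -x} for x in range(1, num_vars + 1)]
    subsets += [set(clause) for clause in clauses]
    return universe, subsets


def has_hitting_set_of_size(universe, subsets, k):
    """Brute-force test, only meant for very small instances."""
    return any(
        all(subset & set(candidate) for subset in subsets)
        for candidate in combinations(universe, k)
    )


# (x1 or x2) and (not x1 or x2) and (x1 or not x2): satisfied by x1 = x2 = true
universe, subsets = sat_to_hitting_set(2, [[1, 2], [-1, 2], [1, -2]])
print(has_hitting_set_of_size(universe, subsets, 2))  # True

# (x1) and (not x1): unsatisfiable
universe, subsets = sat_to_hitting_set(1, [[1], [-1]])
print(has_hitting_set_of_size(universe, subsets, 1))  # False
```

A hitting set of size equal to the number of variables picks exactly one literal per variable, so it reads directly as a truth assignment.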

### 2.2 The meta-problem

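Given a graph $G$ with vertex set $V$ and a value $r$ that is a positive integer or infinity, the classical identifying sets may be rewritten:

• $r$-IC: a subset $C$ of $V$ is an $r$-identifying code of $G$ if it is an $r$-dominating set and for every distinct vertices $v$, $w$ of $V$, a vertex $u$ in $C$ verifies $u \in N_r[v] \,\Delta\, N_r[w]$.

• $r$-LD: a subset $C$ of $V$ is an $r$-locating dominating set of $G$ if it is an $r$-dominating set and for every distinct vertices $v$, $w$ of $V$, a vertex $u$ in $C$ verifies $u \in (N_r[v] \,\Delta\, N_r[w]) \cup \{v, w\}$.

• $r$-MD: a subset $C$ of $V$ is an $r$-resolving set of $G$ if it is an $r$-dominating set and for every distinct vertices $v$, $w$ of $V$, a vertex $u$ in $C$ verifies $u \in N_r[v] \cup N_r[w]$ and $d(u,v) \neq d(u,w)$.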
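The common shape of the three rewritten definitions can be made concrete with a single checker parameterized by the criterion. The sketch below assumes the usual readings of the criteria, expressed only through the two distances $d(u,v)$ and $d(u,w)$: exactly one of the two vertices within distance $r$ for $r$-IC, the same test or $u \in \{v,w\}$ for $r$-LD, and distinct distances with the nearer vertex within distance $r$ for $r$-MD. All names are hypothetical.

```python
import math
from collections import deque
from itertools import combinations


def bfs_distances(adj, source):
    """Distances (in edges) from source to every reachable vertex."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        x = queue.popleft()
        for y in adj[x]:
            if y not in dist:
                dist[y] = dist[x] + 1
                queue.append(y)
    return dist


def within(d, r):
    """True when a vertex at distance d lies in the r-neighborhood (r may be math.inf)."""
    return d != math.inf and d <= r


def ic_criterion(d_uv, d_uw, r):
    return within(d_uv, r) != within(d_uw, r)


def ld_criterion(d_uv, d_uw, r):
    return min(d_uv, d_uw) == 0 or ic_criterion(d_uv, d_uw, r)


def md_criterion(d_uv, d_uw, r):
    return d_uv != d_uw and within(min(d_uv, d_uw), r)


def is_identifying(adj, candidate, criterion, r):
    """r-domination plus: every pair of vertices is separated by some chosen vertex."""
    dist = {u: bfs_distances(adj, u) for u in candidate}

    def d(u, x):
        return dist[u].get(x, math.inf)

    dominated = all(any(within(d(u, x), r) for u in candidate) for x in adj)
    distinguished = all(
        any(criterion(d(u, v), d(u, w), r) for u in candidate)
        for v, w in combinations(adj, 2)
    )
    return dominated and distinguished


path = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}
print(is_identifying(path, {0}, md_criterion, math.inf))  # True: MD = infinity-MD
print(is_identifying(path, {0}, md_criterion, 1))         # False: vertex 2 is not 1-dominated
print(is_identifying(path, {1, 2}, ld_criterion, 1))      # True
print(is_identifying(path, {1, 2}, ic_criterion, 1))      # False
```

Only the criterion changes between the three problems, while domination and the quantification over pairs stay identical.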

A pattern clearly appears: the previous identifying sets only deviate on the criterion that the vertex must verify. The pivotal idea is to consider an abstract version of the criterion which does not depend on the input graph. Hence:

###### Definition 1 (identifying function).

A function of type: , is called an identifying function. Given three vertices , and of a graph such that , we write to get the resulting boolean. The notation implies that is symmetric, that is .

We need to require some useful properties on identifying functions to produce generic results. By mimicking the classical identifying sets, the main property we consider is that a vertex cannot distinguish two vertices at the same distance from it. Then:

###### Definition 2 (distance function).

A distance identifying function is an identifying function $f$ such that for every graph $G$ and all vertices $u$, $v$ and $w$ of $G$ with $v \neq w$:

• $f$ returns false for $u$, $v$, $w$ when $d(u,v) = d(u,w)$.

Besides this mandatory criterion, we suggest two paradigms related to the neighborhood of a vertex. Let . First, we may restrain the range of a vertex to its -neighborhood: a vertex should not distinguish two vertices if they do not lie in its -neighborhood but it should always distinguish them whenever exactly one of them lies to that -neighborhood. Reciprocally, we may ensure that a vertex could distinguish the vertices of its -neighborhood: a vertex should distinguish a vertex belonging to its -neighborhood from all the other vertices, assuming the distances are different. Formally, we have:

###### Definition 3 (i-local function).

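For a positive integer $i$, an $i$-local identifying function is an identifying function $f$ such that for every graph $G$ and all vertices $u$, $v$, $w$ of $G$ with $v \neq w$:

1. $f$ returns true for $u$, $v$, $w$ when $d(u,v) \leq i < d(u,w)$ or, symmetrically, $d(u,w) \leq i < d(u,v)$.

2. $f$ returns false for $u$, $v$, $w$ when $i < \min\{d(u,v), d(u,w)\}$.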

###### Definition 4 (i-layered function).

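For a positive integer $i$, an $i$-layered identifying function is an identifying function $f$ such that for every graph $G$ and all vertices $u$, $v$, $w$ of $G$ with $v \neq w$:

1. $f$ returns true for $u$, $v$, $w$ when $\min\{d(u,v), d(u,w)\} \leq i$ and $d(u,v) \neq d(u,w)$.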

In the following, given an identifying function $f$ and three vertices $u$, $v$, $w$ of a graph $G$, we say that $u$ $f$-distinguishes $v$ and $w$ if and only if $f$ returns true for $u$, $v$, $w$. By extension, given three vertex sets $C$, $X$ and $Y$ of $G$, we say that $C$ $f$-distinguishes $X$ and $Y$ if for every $x$ in $X$ and $y$ in $Y$, either $x = y$ or there exists $u$ in $C$ that $f$-distinguishes $x$ and $y$. Finally, a graph $G$ of vertex set $V$ is $f$-distinguished by $C$ when $C$ $f$-distinguishes $V$ and $V$.

We are now ready to define the meta-problem.

###### Definition 5 ((f,r)-distance identifying set).

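For a distance identifying function $f$ and a value $r$ that is a positive integer or infinity, an $(f,r)$-distance identifying set of a graph $G$ is an $r$-dominating set of $G$ that $f$-distinguishes $G$.

Distance Identifying Set
Input: A distance identifying function $f$ and a value $r$ that is a positive integer or infinity. A graph $G$.
Output: An $(f,r)$-distance identifying set of $G$ of minimal size, if one exists.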

Given a distance identifying function and as inputs of the meta-problem, the resulting problem is called - and denoted -DIS. The problem -DIS is said to be -layered when the function is -layered, and it is said to be -local when is -local and . A problem is local if it is -local for an integer . Our local reductions will need a local gadget to operate: the subfamilies of local problems admitting a (bipartite) local gadget is called (bipartite) gifted local. We do not need to define gifted -layered as every -layered problem admits a -layered gadget. We also consider the parameterized version ().

### 2.3 Detailed Computational Lower Bounds

Using the meta-problem, we get the following lower bounds:

###### Theorem 4.

For each 1-layered distance identifying function $f$ and every value $r$ that is a positive integer or infinity, the $(f,r)$-Distance Identifying Set problem restricted to bipartite apex graphs is NP-hard, and does not admit an algorithm running in $2^{o(\sqrt{n})}$ time under ETH.

###### Theorem 5.

The (bipartite) gifted local problems restricted to (bipartite) planar graphs are NP-hard, and do not admit an algorithm running in $2^{o(\sqrt{n})}$ time under ETH.

###### Theorem 6.

For each -local -layered distance identifying function , -DIS restricted to bipartite planar graphs is NP-hard, and cannot be solved in under ETH.

###### Theorem 7.

Let and be distance identifying functions such that is -layered, is -local -layered and is -local and admits a local (bipartite) gadget. Let . The -, - and -DIS problems are NP-hard, and do not admit:

• algorithms running in $2^{o(n)}$ time, except if ETH fails,

• parameterized algorithms running in $2^{O(k)} \cdot n^{O(1)}$ time, except if W[0] = W[2].

The parameter $k$ denotes here the solution size of a relevant distance identifying set.

All bounds still hold in the bipartite case (whenever the gadget associated with is bipartite).

As a side result, the 1-layered general reduction answers a question of Hartung:

###### Corollary 1.

Under ETH, Metric Dimension cannot be solved in $2^{o(n)}$ time.

Finally, notice that the parameterized lower bound from Theorem 7 may be complemented by an elementary upper bound inspired from the kernel of IC and LD of size :

###### Proposition 8.

For every -local distance identifying function , the -Distance Identifying Set problem has a kernel of size where k is the solution size. Therefore, it admits a naive parameterized algorithm running in time.

###### Proof.

The kernel size simply relies on the fact that vertices may characterize at most -neighbors using distances, while the parameterized algorithm just enumerates the set of vertices of the input graph, trying them in . ∎
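The enumeration step of such a naive algorithm can be sketched as below. This is only an illustration under stated assumptions: the validity test is passed as a callback (here a 1-identifying code test written for the example), the kernelization step is omitted, and the names `smallest_valid_set` and `is_ic` are hypothetical.

```python
from itertools import combinations


def smallest_valid_set(vertices, is_valid, k):
    """Try all vertex subsets by increasing size up to k.

    Returns the first (hence a smallest) subset accepted by is_valid,
    or None when no subset of size at most k is accepted.
    """
    ordered = sorted(vertices)
    for size in range(k + 1):
        for candidate in combinations(ordered, size):
            if is_valid(set(candidate)):
                return set(candidate)
    return None


# Example validity test: identifying code on the path 0 - 1 - 2 - 3.
path = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}


def is_ic(code):
    signatures = [frozenset((path[v] | {v}) & code) for v in path]
    return all(signatures) and len(set(signatures)) == len(signatures)


print(smallest_valid_set(path, is_ic, 2))  # None: two vertices give at most 3 non-empty signatures
print(smallest_valid_set(path, is_ic, 3))  # {0, 1, 2}
```

Run on a kernel, whose size depends on the parameter only, the number of candidates examined is bounded by a function of the solution size alone.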

The proofs of Theorems 4 to 7 are given in Section 4.

## 3 The Supports of the Reductions for

### 3.1 The Distance Identifying Graphs

Consider the associated graph as defined in Section 2.1. The differences between the meta-problem and the dominating problem related to associated graphs actually raise two issues for a reduction based on these latter notions to be effective on . First, contrarily to the dominating problem where a vertex may only discern its close neighborhood, the meta-problem may allow a vertex to discern further than its direct neighborhood. In that case, we cannot certify that a vertex does not distinguish a vertex when is not in , the adjacency not remaining a sufficient argument. Secondly, one may object that a vertex formally has to distinguish a vertex from another vertex, but that distinguishing a single vertex is not defined.

To circumvent these problems, we suggest the following fix: rather than producing a single vertex for each , the set may contain two vertices and . Then, the role of would be to distinguish them if and only if . To ensure that the vertex distinguishes and when , we may use the properties and of Definition 3 and 4 for the -local and -layered problems, respectively. Precisely, when , should be at distance to (with in the -layered cases) while should not be in the -neighborhood of . Similarly, to ensure that cannot distinguish and when , we may use properties or of Definitions 2 and 3. Hence, when , should not be in the -neighborhood of , or and should be equal.

That fix fairly indicates how to initiate the transformation of the associated graphs in order to deliver an equivalence between a hitting set formed by elements of and the vertices of a distance identifying set included in . However, it is clearly not sufficient since we also have to distinguish the couples of vertices of for which nothing is required. To solve that problem, we suggest to append to each vertex of the associated graph a copy of some gadget with the intuitive requirement that the gadget is able to distinguish the close neighborhood of its vertices from the whole graph. We introduce the notion of -extension:

###### Definition 6 (B-extension).

Let be a connected graph, and . An induced supergraph is said to be a -extension of if it is connected and for every vertex of , the set is either equal to or .

A vertex of such that is said to be -adjacent. The -extensions of such that contains exactly a -adjacent vertex or two -adjacent vertices but not connected to each other are called the -single-extension and the -twin-extension of , respectively.

Here, the ”border” makes explicit the connections between a copy of a gadget and a vertex outside the copy. In particular, a -single-extension is formed by a gadget with its related vertex , while a -twin-extension contains a gadget with its two related vertices and . Piecing all together, we may adapt the associated graphs to the meta-problem:

###### Definition 7 ((H,B,r)-distance identifying graph).

Let be an instance of Hitting Set. Let be a connected graph, a subset of its vertices, and a positive integer. The -distance identifying graph is as follows.

• for each , the graph contains as induced subgraph a copy of together with a -adjacent vertex , where denotes the copy of .

• similarly, for each , the graph contains a copy of together with two -adjacent vertices and (the latter vertices are not adjacent) and where denotes the copy of .

• finally, for each and each , is connected to by a path of vertices denoted with for each .

When the problem is not local, we prefer the following identifying graph:

###### Definition 8 ((H,b)-apex distance identifying graph).

An -apex distance identifying graph is the union of a -distance identifying graph with an additional vertex called apex such that:

• for each , the apex is -adjacent to , where (resp. ) denotes the copy of (resp. ).

• for each , the apex is adjacent to and .

See Figure 2 for an example of an -distance identifying graph (on the left) and an example of -apex distance identifying graph (on the right).

###### Proposition 9.

Given an instance of Planar Hitting Set where , , the graphs and

• are connected and have size bounded by , (with for ),

• may be built in polynomial time in their size,

• are bipartite if the -single extension of is bipartite,

• are respectively planar and an apex graph if the -twin-extension of is planar.

###### Proof.

The $(H,B,r)$-distance identifying graph is formed by the union of $n$ $B$-single-extensions of $H$, $m$ $B$-twin-extensions of $H$ and all the possible paths of $r-1$ vertices. As the associated graph is bipartite and planar, Euler's formula implies that the number of paths is bounded by $2(n+m)-4$. We conclude that the number of vertices of the graph is bounded by:

$$n(|H|+1) + m(|H|+2) + (r-1)\bigl(2(n+m)-4\bigr) = (|H|+2r)(n+m) - n - 4(r-1)$$
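As a routine check (added here for convenience, not taken from the original proof), the right-hand side follows by expanding the left-hand side and grouping the terms in $n+m$:

```latex
\begin{align*}
n(|H|+1) + m(|H|+2) + (r-1)\bigl(2(n+m)-4\bigr)
  &= |H|(n+m) + n + 2m + 2(r-1)(n+m) - 4(r-1) \\
  &= (|H|+2r)(n+m) + n + 2m - 2(n+m) - 4(r-1) \\
  &= (|H|+2r)(n+m) - n - 4(r-1).
\end{align*}
```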

Furthermore, it is clear that is connected if and only if the associated graph is connected. Additionally, we may consider that is connected since it is a property decidable in polynomial time, and that the instances corresponding to the distinct connected components of may be considered independently.

Finally, all the other items of the proposition are direct by construction. ∎

Figure 2: An (H,B,3)-distance identifying graph and an (H,B)-apex distance identifying graph built on the planar instance formed by Ω={1,2,3,4} and S={{1,2},{2,3,4}}.

Having defined the (apex) distance identifying graphs, the main effort to obtain generic reduction from Planar Hitting Set is done. We now define relevant gadgets:

Let be a distance identifying function and . Let be a connected graph, and be two subsets of . We said that the triple is a -gadget if for every -extension of :

• -distinguishes and .

• -distinguishes and , where is the set of -adjacent vertices of .

• is an -dominating set of .

• For all -distance identifying set of , .

A -gadget is a local gadget, if is a -local identifying function with , and

• for every , there exists such that .

Consistently, we say that a -gadget is bipartite if the -single-extension of is bipartite, and that it is planar if the -twin-extension of is planar.

###### Theorem 10.

Let be an instance of Hitting Set such that , . Let be a -gadget for a -layered identifying function and let be a local -gadget. The following propositions are equivalent:

• there exists a hitting set of of size .

• there exists a -distance identifying set of of size .

• there exists a -distance identifying set of of size .

###### Proof.

We start by focusing on the equivalence between the first and second items.

Suppose first that is a hitting set of of size . By denoting and the copies of associated to the copies and of , we suggest the following set of size as a -distance identifying set of :

$$I = \{v^{\Omega}_i : u_i \in P\} \cup \bigcup_{i \in [\![n]\!]} C^{\Omega}_i \cup \bigcup_{j \in [\![m]\!]} C^{S}_j.$$

Recall that by construction, is a -extension of (respectively -extension of ) for any (respectively ). This directly implies that is an -dominating set of . Indeed, the condition of Definition 9 implies that (respectively ) -dominates plus (respectively of plus , ). The remaining apex is also -dominated by any , as it is -adjacent for every .

We now have to show that -distinguishes . We begin with the vertices of the gadget copies because the condition implies that -distinguishes the vertices of and for every , and -distinguishes the vertices of and for every . Thereby, we only have to study the vertices of the form , , , and the apex (there is no vertex of the form in an apex distance identifying graph). To distinguish them, we use the condition . Recall that . Then, for each distinct , we have:

• a vertex of the form or is neither -adjacent nor -adjacent.

Enumerating the relevant and , we deduce that every couple of vertices is distinguished except when they are both of the form or for . But we may distinguish or for distinct by applying on .

It remains to distinguish and for . We now use the fact that is a hitting set for . By definition of a hitting set, for any set , there exists a vertex such that . We observe that by construction of and that by definition of . Since is -layered, -distinguishes and .

In the other direction, assume that is a distance identifying set of of size . As every set of is not empty, we may define a function such that .

We suggest the following set as an hitting set of of size at most :

$$P = \{u_i \in \Omega \mid v^{\Omega}_i \in I\} \cup \{u_{\varphi(j)} \in \Omega \mid v^{S}_j \in I \text{ or } \bar{v}^{S}_j \in I\}$$

We claim that the only vertices that may -distinguish and are themselves and the vertices such that . To prove so, we apply propriety of Definition 2:

• the apex verifies

• a vertex such that verifies

• a vertex of verifies

• a vertex of with verifies

• both and are -adjacent, so they are at the same distance of any vertex of .

We deduce that and are -distinguished only if either one on them belongs to (in that case ) or there exists such that (and then ).

It remains to show that . By the condition of Definition 9, we know that and for any and , implying

Now, we prove the equivalence between the first and third items. Consider a -local distance identifying function , a local -gadget and an instance of Planar Hitting Set such that , . We denote the copies of as or , the copies of as and , and the copies of as and for any and .

In the first direction, suppose that is a hitting set of of size , the -distance identifying set of is defined identically as in the equivalence of the first and second items of the current theorem:

$$I = \{v^{\Omega}_i : u_i \in P\} \cup \bigcup_{i \in [\![n]\!]} C^{\Omega}_i \cup \bigcup_{j \in [\![m]\!]} C^{S}_j$$

Using conditions and of Definitions 9 and 10, is clearly an -dominating set of . Indeed, by every vertex belonging to a copy of the gadget is -dominated. Additionally, every vertex outside of the copies of the gadgets is at distance at most of a copy by construction, but there exists a vertex (so a relevant copy in ) by .

To prove that -distinguishes , the strategy is differing from the previous equivalence only on the vertices and when distinguishing and as we will see.

Recall that by construction, is a -extension of (respectively -extension of ) for any (respectively ). Distinguishing the vertices of the gadget copies is easy, as the condition implies that -distinguishes the vertices of and for every , and similarly -distinguishes the vertices of and for every .

Thereby, we only have to study the vertices of the form , , , and the vertices .

To distinguish them, we mainly use the condition . We observe that for each distinct (they exist as ) :