# Subexponential-time Algorithms for Maximum Independent Set in P_t-free and Broom-free Graphs

In algorithmic graph theory, a classic open question is to determine the complexity of the Maximum Independent Set problem on P_t-free graphs, that is, on graphs not containing any induced path on t vertices. So far, polynomial-time algorithms are known only for t ≤ 5 [Lokshtanov et al., SODA 2014, 570--581, 2014], and an algorithm for t = 6 was announced recently [Grzesik et al. Arxiv 1707.05491, 2017]. Here we study the existence of subexponential-time algorithms for the problem: we show that for any t ≥ 1, there is an algorithm for Maximum Independent Set on P_t-free graphs whose running time is subexponential in the number of vertices. Even for the weighted version MWIS, the problem is solvable in 2^O(√(tn log n)) time on P_t-free graphs. For approximation of MIS in broom-free graphs, a similar time bound is proved. Scattered Set is the generalization of Maximum Independent Set where the vertices of the solution are required to be at distance at least d from each other. We give a complete characterization of those graphs H for which d-Scattered Set on H-free graphs can be solved in time subexponential in the size of the input (that is, in the number of vertices plus the number of edges): If every component of H is a path, then d-Scattered Set on H-free graphs with n vertices and m edges can be solved in time 2^O(|V(H)|√(n+m) log(n+m)), even if d is part of the input. Otherwise, assuming the Exponential-Time Hypothesis (ETH), there is no 2^o(n+m)-time algorithm for d-Scattered Set for any fixed d ≥ 3 on H-free graphs with n vertices and m edges.


## 1 Introduction

There are some problems in discrete optimization that can be considered fundamental. The Maximum Independent Set problem (MIS, for short) is one of them. It takes a graph $G$ as input, and asks for the maximum number $\alpha(G)$ of mutually nonadjacent (i.e., independent) vertices in $G$. On unrestricted input, it is not only NP-hard (its decision version “Is $\alpha(G) \ge k$?” being NP-complete), but APX-hard as well, and, in fact, not even approximable within $O(n^{1-\epsilon})$ in polynomial time for any $\epsilon > 0$ unless P=NP, as proved by Zuckerman [27]. For this reason, those classes of graphs on which MIS becomes tractable are of definite interest. One direction of this area is to study the complexity of MIS on $H$-free graphs, that is, on graphs not containing any induced subgraph isomorphic to a given graph $H$.

For the majority of the graphs $H$, the complexity question has a negative answer. It is easy to see that if $G'$ is obtained from $G$ by subdividing each edge with $2t$ new vertices, then $\alpha(G') = \alpha(G) + t|E(G)|$ holds. This can be used to show that MIS is NP-hard on $H$-free graphs whenever $H$ is not a forest, and also if $H$ contains a tree component with at least two vertices of degree larger than 2 (first observed in [2], see, e.g., [19]). As MIS is known to be NP-hard on graphs of maximum degree at most 3, the case when $H$ contains a vertex of degree at least 4 is also NP-hard.

The above observations do not cover the case when every component of $H$ is either a path, or a tree with exactly one degree-3 vertex $c$ with three paths of arbitrary lengths starting from $c$. There are no further unsolved classes, but even this collection means infinitely many cases. For decades, only partial results have been obtained on these graphs, proving polynomial-time solvability in some cases. A classical algorithm of Minty [21] and its corrected form by Sbihi [24] solved the problem when $H$ is a claw (3 paths of length 1 in the model above). This happened in 1980. Much later, in 2004, Alekseev [3] generalized this result by an algorithm for $H$ isomorphic to a fork (2 paths of length 1 and one path of length 2).

The seemingly easy case of $P_t$-free graphs is poorly understood (where $P_t$ is the path on $t$ vertices). MIS on $P_t$-free graphs is not known to be NP-hard for any $t$; for all we know, it could be polynomial-time solvable for every fixed $t$. $P_4$-free graphs (also known as cographs) have a very simple structure, which can be used to solve MIS with a linear-time recursion, but this does not generalize to $P_t$-free graphs for larger $t$. In 2010, it was a breakthrough when Randerath and Schiermeyer [22] stated that MIS on $P_5$-free graphs was solvable in subexponential time, more precisely within $O(C^{n^{1-\epsilon}})$ for any constants $C > 1$ and $\epsilon < 1/4$. Designing an algorithm based on deep results, Lokshtanov et al. [19] finally proved that MIS is polynomial-time solvable on $P_5$-free graphs. More recently, a quasipolynomial ($n^{O(\log^2 n)}$-time) algorithm was found for $P_6$-free graphs [18] and finally a polynomial-time algorithm for $P_6$-free graphs was announced [13].

In this work, we explore MIS and some variants on $H$-free graphs from the viewpoint of subexponential-time algorithms. That is, instead of aiming for algorithms with running time $n^{O(1)}$ on $n$-vertex graphs, we ask if $2^{o(n)}$ algorithms are possible. Very recently, Brause [8] and independently the conference version of this paper [4] observed that the subexponential algorithm of Randerath and Schiermeyer [22] can be generalized to arbitrary fixed $t$ with running time roughly $2^{O(n^{1-1/t})}$. Our first result shows a significantly improved subexponential-time algorithm for every $t$.

###### Theorem 1.1.

For every fixed $t \ge 1$, MIS on $n$-vertex $P_t$-free graphs can be solved in subexponential time, namely, it can be solved by a $2^{O(\sqrt{tn \log n})}$-time algorithm.

The algorithm is based on the combination of two ideas. First, we generalize the observation of Randerath and Schiermeyer [22] stating that in a large connected $P_5$-free graph there exists a high-degree vertex. Namely, we prove that such a vertex always exists in a large connected $P_t$-free graph for general $t$ and it can be used for efficient branching. Next we prove the combinatorial result that a $P_t$-free graph of maximum degree $\Delta$ has treewidth $O(t\Delta)$; the proof is inspired by Gyárfás’ proof of the $\chi$-boundedness of $P_t$-free graphs [14]. Thus if the maximum degree drops below a certain threshold during the branching procedure, then we can use standard algorithmic techniques exploiting bounded treewidth.

While our algorithm works for $P_t$-free graphs with arbitrarily large $t$, it does not seem to be extendable to $H$-free graphs where $H$ is the subdivision of a $K_{1,3}$. Hence, the existence of subexponential-time algorithms on such graphs remains an open question. However, we are able to give a subexponential-time constant-factor approximation algorithm for the case when $H$ is a $(d,t)$-broom. A $(d,t)$-broom $B_{d,t}$ is a graph consisting of a path $P_t$ and $d$ additional vertices of degree one, all adjacent to one of the endpoints of the path. In other words, $B_{d,t}$ is a star $K_{1,d+1}$ with one of the edges subdivided to make it a path with $t$ vertices. For $t = 3$, we obtain the generalized forks, and $d = 2$ yields the traditional fork. We prove the following theorem; here $d$ and $t$ are considered constants, hidden in the big-$O$ notation.

###### Theorem 1.2.

Let $d$ and $t$ be fixed integers. One can find a $d$-approximation to Maximum Independent Set on an $n$-vertex $B_{d,t}$-free graph in time $2^{O(n^{3/4} \log n)}$.

Let us remark that on $K_{1,d}$-free graphs, a folklore linear-time (and very simple) $(d-1)$-approximation algorithm exists for Maximum Independent Set; better approximation algorithms also exist [5, 6, 15, 26]. On fork-free graphs, Independent Set can be solved in polynomial time [3]. For general graphs, we do not expect that a constant-factor approximation can be obtained in subexponential time for the problem. Strong evidence for this was given by Chalermsook et al. [9], who showed that the existence of such an algorithm would violate the Exponential-Time Hypothesis (ETH) of Impagliazzo, Paturi, and Zane, which can be informally stated as follows: $n$-variable 3SAT cannot be solved in $2^{o(n)}$ time (see [10, 17, 16]).

Scattered Set (also known under other names such as dispersion or distance-$d$ independent set [20, 25, 1, 23, 7, 11]) is the natural generalization of MIS where the vertices of the solution are required to be at distance at least $d$ from each other; the size of the largest such set will be denoted by $\alpha_d(G)$. We can consider Scattered Set with $d$ being part of the input, or assume that $d$ is a fixed constant, in which case we call the problem $d$-Scattered Set. Clearly, MIS is exactly the same as 2-Scattered Set. Despite its similarity to MIS, the branching algorithm of Theorem 1.1 cannot be generalized: we give evidence that there is no subexponential-time algorithm for 3-Scattered Set on $P_5$-free graphs.

###### Theorem 1.3.

Assuming the ETH, there is no $2^{o(n)}$-time algorithm for $d$-Scattered Set with $d = 3$ on $P_5$-free graphs with $n$ vertices.

In light of the negative result of Theorem 1.3, we slightly change our objective by aiming for an algorithm that is subexponential in the size of the input, that is, in the total number of vertices and edges of the graph $G$. As the number of edges of $G$ can be up to quadratic in the number of vertices, this is a weaker goal: an algorithm that is subexponential in the number of edges is not necessarily subexponential in the number of vertices. We give a complete characterization of when such algorithms are possible for Scattered Set.

###### Theorem 1.4.

For every fixed graph $H$, the following holds.

1. If every component of $H$ is a path, then $d$-Scattered Set on $H$-free graphs with $n$ vertices and $m$ edges can be solved in time $2^{O(|V(H)|\sqrt{n+m}\,\log(n+m))}$, even if $d$ is part of the input.

2. Otherwise, assuming the ETH, there is no $2^{o(n+m)}$-time algorithm for $d$-Scattered Set for any fixed $d \ge 3$ on $H$-free graphs with $n$ vertices and $m$ edges.

The algorithmic side of Theorem 1.4 is based on the combinatorial observation that the treewidth of $P_t$-free graphs is sublinear in the number of edges, which means that standard algorithms on bounded-treewidth graphs can be invoked to solve the problem in time subexponential in the number of edges. It has not escaped our notice that this approach is completely generic and could be used for many other problems (e.g., Hamiltonian Cycle, 3-Coloring, and so on), where algorithms with running time single-exponential, or perhaps slightly superexponential, in the treewidth are known. For the lower-bound part of Theorem 1.4, we need to examine only two cases: claw-free graphs and $C_t$-free graphs (where $C_t$ is the cycle on $t$ vertices); the other cases then follow immediately.

The paper is organized as follows. Section 2 introduces basic notation and contains some technical tools for bounding the running time of recursive algorithms. Section 3 contains the combinatorial results that allow us to bound the treewidth of $P_t$-free graphs. The algorithmic results for Maximum Independent Set (Theorems 1.1 and 1.2) appear in Section 4. The upper and lower bounds for $d$-Scattered Set, which together prove Theorem 1.4, are proved in Section 5.

## 2 Preliminaries

Throughout, we consider simple undirected graphs. The vertex set of a graph $G$ will be denoted by $V(G)$, the edge set by $E(G)$. The notation $\mathrm{dist}_G(x, y)$ for distance and $G[X]$ for the subgraph induced by the vertex set $X$ will have the usual meaning, as will $N_G[X]$ and $N_G(X)$ for the closed and open neighborhood, respectively, of a vertex set $X$ in $G$. $\Delta(G)$ is the maximum degree in $G$. For a vertex set $X$ in $G$, $G - X$ means the induced subgraph $G[V(G) \setminus X]$. $P_t$ ($C_t$) is the chordless path (cycle) on $t$ vertices. Finally, a graph is $H$-free if it does not contain $H$ as an induced subgraph.

A distance-$d$ ($d$-scattered) set in a graph $G$ is a vertex set $S \subseteq V(G)$ such that for every pair of vertices in $S$, the distance between them is at least $d$ in the graph. For $d = 2$, we obtain the traditional notion of independent set (stable set). For $d > d'$, a distance-$d$ set is a distance-$d'$ set as well; for example, for $d \ge 2$, any distance-$d$ set is an independent set.
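The definition can be checked directly with breadth-first search. The following is a minimal sketch, assuming graphs are stored as a dictionary mapping each vertex to the set of its neighbors; the function names `bfs_distances` and `is_scattered` are illustrative and do not come from the paper.

```python
from collections import deque
from itertools import combinations

def bfs_distances(adj, source):
    """Distances from `source` to every vertex reachable from it."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def is_scattered(adj, vertices, d):
    """True if every pair in `vertices` is at distance at least d.

    Vertices in different components count as infinitely far apart.
    """
    for u, v in combinations(vertices, 2):
        if bfs_distances(adj, u).get(v, float("inf")) < d:
            return False
    return True

# A path on six vertices 0-1-2-3-4-5, used only as a toy input.
path6 = {i: {j for j in (i - 1, i + 1) if 0 <= j < 6} for i in range(6)}
```

On this toy path, the set `[0, 2, 4]` is independent (distance-2) but not distance-3, whereas `[0, 3]` is both, matching the remark that a distance-$d$ set is also a distance-$d'$ set for smaller $d'$.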

The algorithmic problem Maximum Weight Independent Set is the problem of maximizing the sum of the weights in an independent set of a graph $G$ with nonnegative vertex weights $w$. The maximum is denoted by $\alpha_w(G)$. For a weight function that has value $1$ everywhere, we obtain the usual problem Maximum Independent Set (MIS) with maximum $\alpha(G)$.

An algorithm $A$ is subexponential in parameter $p$ if the number of steps executed by $A$ is a subexponential function of the parameter $p$. We will use this notion here for graphs, mostly in the following cases: $p$ is the number $n$ of vertices, the number $m$ of edges, or $n + m$ (which is generally considered to be the size of the input). Several different definitions are used in the literature under the name subexponential function. Each of them imposes a condition: the function (with variable $p$, called the parameter) may not be larger than some bound depending on $p$. Here we use two versions, where the bound is of type $2^{o(p)}$ and $2^{p^{1-\epsilon}}$ respectively, with some $\epsilon > 0$. (Clearly, the second one is the stricter.) Throughout the paper, we state our results emphasizing which version we mean. A problem $\Pi$ is subexponential if there exists some subexponential algorithm solving $\Pi$.

### 2.1 Time analysis of recursive algorithms

To formally reason about time complexities, we will need the following technical lemma.

###### Lemma 2.1.

Let be a concave and nondecreasing function with , for every , and for some and every . Let be two nondecreasing functions such that we have , moreover, for some universal constant and and for every :

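$$T(n) \le 2^{cn\log n/\Delta(n)} + \max\Bigl( S(n),\; T(n-1) + T(n - \lceil \Delta(n) \rceil),\; \max_{1 \le k \le \lfloor n/\Delta(n) \rfloor} 2^k \cdot n \cdot T(n - \lceil k\Delta(n) \rceil) \Bigr). \tag{1}$$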

Then, for some constant depending only on and , for every it holds that

$$T(n) \le 2^{c' n\log n/\Delta(n)} \cdot (S(n) + 1).$$

We will use Lemma 2.1 as a shortcut to argue about time complexities of our branching algorithms; let us now briefly explain its intuition. The function $T(n)$ will be the running time bound of the discussed algorithm. The term $2^{cn\log n/\Delta(n)}$ in (1) corresponds to the processing time at a single step of the algorithm; note that this is at least polynomial in $n$ as $\Delta(n) \le n$. The terms in the $\max$ in (1) are different branching options chosen by the algorithm. The first one, $S(n)$, is a subcall to a different procedure, such as a bounded-treewidth subroutine. The second one, $T(n-1) + T(n - \lceil \Delta(n) \rceil)$, corresponds to a two-way branching on a single vertex of degree at least $\Delta(n)$. The last one corresponds to an exhaustive branching on a set $X$ of size $k$, such that every connected component of $G - X$ has at most $n - k\Delta(n)$ vertices.

###### Proof of Lemma 2.1.

For notational convenience, it will be easier to assume that the functions and is defined on the whole half-line with and .

First, let us replace $\max$ with addition in the assumed inequality. After some simplifications, this leads to the following.

$$T(n) \le T(n-1) + S(n) + 2^{cn\log n/\Delta(n)} + 2n \cdot \sum_{k=1}^{\lfloor n/\Delta(n) \rfloor} 2^k \cdot T(n - k\Delta(n)). \tag{2}$$

From the concavity of $\Delta$ it follows that

$$n - i - \Delta(n-i) \le n - \Delta(n).$$

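Furthermore, the assumptions on $\Delta$, namely the fact that $\Delta$ is nondecreasing and concave with $\Delta(0) = 0$, imply that for any $0 \le y < x$ we have

$$\frac{y}{x}\,\Delta(x) \ge \Delta(x) - \Delta(x-y).$$

After simple algebraic manipulation, this is equivalent to

$$\frac{x}{\Delta(x)} \ge \frac{x-y}{\Delta(x-y)}.$$

That is, $x/\Delta(x)$ is a nondecreasing function.

Using the fact that $S$ and $T$ are nondecreasing and the facts above, we iteratively apply (2) $n$ times to the first summand, obtaining the following.

$$T(n) \le n \cdot \Biggl( S(n) + 2^{cn\log n/\Delta(n)} + 2n \cdot \sum_{k=1}^{\lfloor n/\Delta(n) \rfloor} 2^k \cdot T(n - k\Delta(n)) \Biggr). \tag{3}$$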

We now show the following.

###### Claim 2.2.

Consider a sequence and . Then for . Here, the big--notation hides constants depending on .

###### Proof.

By the concavity of we have , thus as long as we have that . Consequently, for some we have . We infer that we obtain at position

$$i = O\!\left( \frac{n}{\Delta(n)} + \frac{n/2}{\Delta(n/2)} + \frac{n/4}{\Delta(n/4)} + \ldots \right).$$

By the assumption that for some constant and every , the sum above can be bounded by a geometric sequence, yielding .

The above claim implies that if we iteratively apply (3) to itself, we obtain

$$T(n) \le (2n)^{O(n/\Delta(n))} \cdot \left( S(n) + 2^{cn\log n/\Delta(n)} \right).$$

This finishes the proof of the lemma. ∎

## 3 Gyárfás’ path-growing argument

The main (technical but useful) result of this section is the following adaptation of Gyárfás’ proof that $P_t$-free graphs are $\chi$-bounded [14].

###### Lemma 3.1.

Let $t$ be an integer, $G$ be a connected graph with a distinguished vertex $v_0$ and maximum degree at most $\Delta$, such that $G$ does not contain an induced path $P_t$ with one endpoint in $v_0$. Then, for every weight function $w$ assigning nonnegative weights to the vertices of $G$, there exists a set $X \subseteq V(G)$ of size at most $(t-1)\Delta + 1$ such that every connected component $C$ of $G - X$ satisfies $w(C) \le w(V(G))/2$. Furthermore, such a set can be found in polynomial time.

###### Proof.

In what follows, a connected component $C$ of an induced subgraph of $G$ is big if $w(C) > w(V(G))/2$. Note that there can be at most one big connected component in any induced subgraph of $G$.

If does not contain a big component, we can set . Otherwise, let and be the big component of . As is connected, every component of is adjacent to , thus holds. We will inductively define vertices such that induce a path in .

Given vertices , we define sets and as follows. We set . If does not contain a big connected component, we stop the construction. Otherwise, we set to be the big connected component of . During the process we maintain the invariant that is the big component of and that . Note that this is true for by the choice of and .

It remains to show how to choose , given vertices and sets and . Note that and , so is the big connected component of . Consequently, we can choose some that satisfies all the desired properties.

Since does not contain an induced with one endpoint in , the aforementioned process stops after defining a set for some , when does not contain a big component. Observe that

$$|A_{i+1}| \le (\Delta + 1) + i \cdot \Delta = (i+1)\Delta + 1 \le (t-1)\Delta + 1.$$

Consequently, the set satisfies the desired properties.

For the algorithmic claim, note that the entire proof can be made algorithmic in a straightforward manner. ∎
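To make the algorithmic claim concrete, here is a minimal sketch of one way the path-growing procedure could be implemented. It is an illustration under stated assumptions rather than the authors' exact construction: the graph is assumed connected and given as a dictionary of neighbor sets, the deleted set is taken to be the closed neighborhood of the path grown so far, and the helper names (`components`, `big_component`, `path_growing_separator`) are hypothetical.

```python
from collections import deque

def components(adj, removed):
    """Connected components of the graph after deleting `removed`."""
    seen, comps = set(removed), []
    for start in adj:
        if start in seen:
            continue
        comp, queue = {start}, deque([start])
        seen.add(start)
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in seen:
                    seen.add(v)
                    comp.add(v)
                    queue.append(v)
        comps.append(comp)
    return comps

def big_component(adj, weight, removed):
    """The component heavier than half the total weight, or None."""
    total = sum(weight.values())
    for comp in components(adj, removed):
        if 2 * sum(weight[v] for v in comp) > total:
            return comp
    return None

def path_growing_separator(adj, weight, v0):
    """Grow an induced path from v0 until no big component remains.

    Returns (separator, path), where separator is {v0} or the closed
    neighborhood of the path. Assumes the graph is connected.
    """
    path, removed = [v0], {v0}
    big = big_component(adj, weight, removed)
    while big is not None:
        last = path[-1]
        removed |= set(adj[last])          # removed is now N[path]
        new_big = big_component(adj, weight, removed)
        if new_big is None:
            break
        # Next path vertex: a neighbor of `last` inside the old big
        # component that still touches the new big component.
        nxt = next(u for u in adj[last]
                   if u in big and any(x in new_big for x in adj[u]))
        path.append(nxt)
        big = new_big
    return removed, path
```

Each new path vertex lies in the previous big component, so it has no neighbor among the earlier path vertices except its predecessor; this is what keeps the path induced and bounds its length in a graph with no induced $P_t$ starting at $v_0$.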

It is well known that if a graph $G$ has, for every weight function $w$, a set $X$ of size $k$ such that every connected component $C$ of $G - X$ satisfies $w(C) \le w(V(G))/2$, then $G$ has treewidth $O(k)$ (see, e.g., [12, Theorem 11.17(2)]). Thus Lemma 3.1 implies a treewidth bound of $O(t\Delta)$. Algorithmically, it is also a standard consequence of Lemma 3.1 that a tree decomposition of width $O(t\Delta)$ can be obtained in polynomial time. What needs to be observed is that standard 4-approximation algorithms for treewidth, which run in time exponential in treewidth, can be made to run in polynomial time if we are given a polynomial-time subroutine for finding the separator as in Lemma 3.1. For completeness, we sketch the proof here.

###### Corollary 3.2.

A $P_t$-free graph with maximum degree $\Delta$ has treewidth $O(t\Delta)$. Furthermore, a tree decomposition of this width can be computed in polynomial time.

###### Proof.

We follow the standard constant-factor approximation algorithm for treewidth, as described in [10, Section 7.6]. This algorithm, given a graph $G$ and an integer $k$, either correctly concludes that $\mathrm{tw}(G) > k$ or computes a tree decomposition of $G$ of width at most $4k + 4$.

Let be a -free graph with maximum degree at most . We may assume that is connected, otherwise we can handle the connected components separately. Let us start by setting so that any application of Lemma 3.1 gives a set of size at most .

The only step of the algorithm that runs in exponential time is the following. We are given an induced subgraph of and a set with the following properties:

1. and ;

2. both and are connected;

3. .

The goal is to compute a set such that and every connected component of is adjacent to at most vertices of .

The construction of is trivial for , as we can take for an arbitrary . The crucial step happens for sets of size exactly . Instead of the exponential search of [10, Section 7.6], we invoke Lemma 3.1 on the graph and a function that puts if and only if . The lemma returns a set of size at most such that every connected component of contains at most vertices of . Since is connected and , we cannot have . Consequently, satisfies all the requirements.

The algorithm of [10, Section 7.6] returns that only if at some step it encounters pair for which it cannot construct the set . However, our method of constructing works for every choice of , and executes in polynomial time. Consequently, the modified algorithm of [10, Section 7.6] always computes a tree decomposition of width at most in polynomial time, as desired. ∎

## 4 Subexponential algorithms based on the path-growing argument

The goal of this section is to use Corollary 3.2 to prove Theorems 1.1 and 1.2 stated in the Introduction.

### 4.1 Independent Set on graphs without long paths

We first prove the following statement, which implies Theorem 1.1.

###### Theorem 4.1.

The Maximum-Weight Independent Set problem on an $n$-vertex $P_t$-free graph can be solved in time $2^{O(\sqrt{tn \log n})}$.

###### Proof.

Let $G$ be an $n$-vertex $P_t$-free graph. We set a threshold $\Delta = \Delta(n)$ of order $\sqrt{n \log n / t}$. If the maximum degree of $G$ is at most $\Delta$, we invoke Corollary 3.2 to obtain a tree decomposition of $G$ of width $O(t\Delta) = O(\sqrt{tn \log n})$. By standard techniques on graphs of bounded treewidth (cf. [10]), we solve Maximum-Weight Independent Set on $G$ in time $2^{O(\sqrt{tn \log n})}$.

Otherwise, $G$ contains a vertex of degree greater than $\Delta$. We choose (arbitrarily) such a vertex $v$ and we branch on $v$: either $v$ is contained in the maximum independent set or not. In the first case we delete $N[v]$ from $G$, in the second we delete only $v$ from $G$. This gives the following recursion for the time complexity of the algorithm.

$$T(n) \le \max\Bigl( T(n-1) + T(n - \lceil \Delta(n) \rceil) + O(n^2),\; 2^{O(\sqrt{tn \log n})} \Bigr). \tag{4}$$

Observe that we have $T(n) = 2^{O(\sqrt{tn \log n})}$ by Lemma 2.1 with $S(n) = 2^{O(\sqrt{tn \log n})}$; it is straightforward to check that $\Delta(n)$ satisfies all the prerequisites of Lemma 2.1. This finishes the proof of the theorem. ∎
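The control flow of this proof can be sketched as follows. This is only an illustration: the low-degree case is handled here by a plain exhaustive routine (`solve_low_degree`, a hypothetical stand-in), not by the tree-decomposition dynamic programming the proof relies on, so the sketch does not achieve the stated running time. Graphs are assumed to be dictionaries of neighbor sets, and `threshold` plays the role of the degree threshold from the proof.

```python
def delete(adj, removed):
    """Induced subgraph after deleting the vertex set `removed`."""
    return {u: nbrs - removed for u, nbrs in adj.items() if u not in removed}

def solve_low_degree(adj, weight):
    """Stand-in for the bounded-treewidth subroutine (exhaustive search)."""
    if not adj:
        return 0
    v = next(iter(adj))
    return max(
        solve_low_degree(delete(adj, {v}), weight),
        weight[v] + solve_low_degree(delete(adj, {v} | adj[v]), weight),
    )

def mwis(adj, weight, threshold):
    """Maximum weight of an independent set via high-degree branching."""
    if not adj:
        return 0
    v = max(adj, key=lambda u: len(adj[u]))
    if len(adj[v]) <= threshold:
        # Maximum degree is small: hand over to the other procedure.
        return solve_low_degree(adj, weight)
    # Two-way branching on a vertex of degree greater than the threshold.
    skip_v = mwis(delete(adj, {v}), weight, threshold)
    take_v = weight[v] + mwis(delete(adj, {v} | adj[v]), weight, threshold)
    return max(skip_v, take_v)

# Toy input: a 5-cycle with unit weights.
cycle5 = {i: {(i - 1) % 5, (i + 1) % 5} for i in range(5)}
unit = {v: 1 for v in cycle5}
```

The branch that takes $v$ removes more than `threshold` vertices at once, which is what drives the recursion (4); the branch that skips $v$ removes only one.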

### 4.2 Approximation on broom-free graphs

We now extend the argumentation in Theorem 4.1 to $(d,t)$-brooms; however, this time we are able to obtain only an approximation algorithm. Recall that a $(d,t)$-broom $B_{d,t}$ is a graph consisting of a path $P_t$ and $d$ additional vertices of degree one, all adjacent to one of the endpoints of the path.
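As a small illustration of the definition, the following sketch builds a broom with a path on `t` vertices and `d` extra leaves as a dictionary of neighbor sets. The vertex labels and the function name `broom` are chosen here for illustration only.

```python
def broom(d, t):
    """Broom: path p0..p(t-1) plus d degree-one vertices attached to p0."""
    path = [("p", i) for i in range(t)]
    leaves = [("leaf", j) for j in range(d)]
    adj = {v: set() for v in path + leaves}
    for a, b in zip(path, path[1:]):
        adj[a].add(b)
        adj[b].add(a)
    for leaf in leaves:
        adj[path[0]].add(leaf)
        adj[leaf].add(path[0])
    return adj

# The fork: two paths of length 1 and one path of length 2 from a center.
fork = broom(2, 3)
fork_degrees = sorted(len(nbrs) for nbrs in fork.values())
```

With `d = 2` and `t = 3` the result has five vertices, one of degree 3, one of degree 2, and three leaves, which is the fork described in the Introduction.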

We now prove Theorem 1.2 from the introduction.

###### Proof of Theorem 1.2.

Let $\Delta(n) = n^{1/4}$; note that such a definition fits the prerequisites on $\Delta$ in Lemma 2.1. In the complexity analysis, we will use Lemma 2.1 with this $\Delta$ and without any function $S$; this will give the promised running time bound. In what follows, whenever we execute a branching step of the algorithm we argue that it fits into one of the subcases of the $\max$ in (1) of Lemma 2.1.

As in the proof of Theorem 4.1, as long as there exists a vertex in $G$ of degree larger than $\Delta(n)$, we can branch on such a vertex $v$: in one subcase, we consider independent sets not containing $v$ (and thus delete $v$ from $G$), in the other subcase, we consider independent sets containing $v$ (and thus delete $N[v]$ from $G$). Such a branching step can be conducted in polynomial time, and fits in the second subcase of the $\max$ in (1). Thus, we can assume henceforth that the maximum degree of $G$ is at most $\Delta(n)$.

We also assume that is connected and , as otherwise we can consider every connected component independently and/or solve the problem by brute-force.

Later, we will also need a more general branching step. If, in the course of the analysis, we identify a set such that every connected component of has size at most , then we can exhaustively branch on all vertices of and independently resolve all connected components of the remaining graph. Such a branching fits into the last case of the in (1), and hence it again leads to the desired time bound by Lemma 2.1.

We start with greedily constructing a set with the following properties: is connected and . We start with being a single arbitrary vertex and, as long as , we add an arbitrary vertex of to and continue. Since is connected, the process ends when ; since the maximum degree of is at most , we have .

Let be the vertex set of the largest connected component of . If , we exhaustively branch on , as is of size at most , but every connected component of is of size at most . Hence, we are left with the case .

Let . Note that is disjoint from . Let be the connected component of that contains . Since , we have that ; in particular, while, as , we have . Furthermore, since and , we have .

Consider now the following case: there exists such that contains an independent set of size . Observe that such a vertex can be found by an exhaustive search in time .

For such a vertex and independent set , define to be the vertex set of the connected component of that contains . Note that as we have , and thus such a component exists. Furthermore, as , contains . In particular, contains , and

$$|D| \ge |(A_1 \cup S) \setminus N(L)| \ge |N[A_1]| - \Delta \cdot |L| \ge n^{1/2} - d\,n^{1/4} \ge \tfrac{1}{2} n^{1/2}.$$

If , then we exhaustively branch on the set , as while every connected component of is of size at most due to being of size at least and at most . Consequently we can assume .

Observe that does not contain a path with one endpoint in , as such a path, together with the set , would induce a in . Consequently, we can apply Lemma 3.1 to the graph with the vertex and uniform weight for every , obtaining a set of size such that every connected component of has size at most . We branch exhaustively on the set : this set is of size at most , while every connected component of is of size at most due to the properties of and the fact that . This finishes the description of the algorithm in the case when there exists and an independent set of size .

We are left with the complementary case, where for every , the maximum independent set in is of size less than . We perform the following operation: by exhaustive search, we find a maximum independent set in and greedily take it to the solution; that is, recurse on and return the union of and the independent set found by the recursive call in . Since , the exhaustive search runs in time, fitting the first summand of the right hand side in (1). As a result, the graph reduces by at least one vertex, and hence the remaining running time of the algorithm fits into the second case of the in (1). This gives the promised running time bound. It remains to argue about the approximation ratio; to this end, it suffices to show the following claim.

###### Claim 4.2.

If is a maximum independent set in and is a maximum independent set in , then .

###### Proof.

Let . Clearly, is an independent set in , and thus . It suffices to show that , that is, .

The maximality of implies that . As is a maximum independent set in , we have that . For every , pick a neighbor . Note that we have . Since for every vertex , the size of the maximum independent set in is less than , we have for every . Consequently,

$$|I \cap N[I_A] \cap B| \le (d-1)\,|I_A \cap S| \le (d-1)\,|I_A|.$$

Together with , we have , as desired.

This finishes the proof of Theorem 1.2. ∎

## 5 Scattered Set

We prove Theorem 1.4 in this section. The algorithm for Scattered Set for $P_t$-free graphs hinges on the following combinatorial bound.

###### Lemma 5.1.

For every $t$ and for every $P_t$-free graph $G$ with $m$ edges, we have that $G$ has treewidth $O(t\sqrt{m})$.

###### Proof.

Let $X$ be the set of vertices of $G$ with degree at least $\sqrt{m}$. The sum of the degrees of the vertices in $X$ is at most $2m$, hence we have $|X| \le 2\sqrt{m}$. By the definition of $X$, the graph $G - X$ has maximum degree less than $\sqrt{m}$. Thus by Corollary 3.2, the treewidth of $G - X$ is $O(t\sqrt{m})$. As removing a vertex can decrease treewidth at most by one, it follows that $G$ has treewidth at most $O(t\sqrt{m}) + |X| = O(t\sqrt{m})$. ∎
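The counting step of this proof is easy to check numerically. Below is a minimal sketch, assuming a graph with at least one edge stored as a dictionary of neighbor sets; the function name `split_by_degree` is illustrative. It returns the number of edges, the set of vertices of degree at least the square root of that number, and the maximum degree that remains after deleting this set.

```python
import math

def split_by_degree(adj):
    """Return (m, X, max degree of G - X) with X = {v : deg(v) >= sqrt(m)}."""
    m = sum(len(nbrs) for nbrs in adj.values()) // 2
    bound = math.sqrt(m)
    heavy = {v for v, nbrs in adj.items() if len(nbrs) >= bound}
    rest_degree = max(
        (len(nbrs - heavy) for v, nbrs in adj.items() if v not in heavy),
        default=0,
    )
    return m, heavy, rest_degree

# Toy input: a star with nine leaves (m = 9, so the degree bound is 3).
star9 = {"c": set(range(9)), **{i: {"c"} for i in range(9)}}
m, heavy, rest_degree = split_by_degree(star9)
```

For any input with at least one edge, the two inequalities used in the proof, `len(heavy) <= 2 * math.sqrt(m)` and `rest_degree < math.sqrt(m)`, should hold.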

It is known that Scattered Set can be solved in time $d^{O(w)} \cdot n^{O(1)}$ on graphs of treewidth $w$ using standard dynamic programming techniques (cf. [25, 20]). By Lemma 5.1, it follows that Scattered Set on $P_t$-free graphs can be solved in time $d^{O(t\sqrt{m})} \cdot n^{O(1)}$. If $d$ is a fixed constant, then this running time can be bounded as $2^{O(t\sqrt{n+m})}$. If $d$ is part of the input, then (taking into account that we may assume $d \le n$) the running time is

$$d^{O(t\sqrt{m})} \cdot n^{O(1)} = 2^{O(t\sqrt{m}\log n) + O(\log n)} = 2^{O(t\sqrt{n+m}\,\log(n+m))}.$$

Observe that if every component of a fixed graph $H$ is a path, then $H$ is an induced subgraph of $P_{2|V(H)|}$, which implies that $H$-free graphs are $P_{2|V(H)|}$-free. Thus the algorithm described here for $P_t$-free graphs implies the first part of Theorem 1.4.

### 5.1 Lower bounds for Scattered Set

A standard consequence of the ETH and the so-called Sparsification Lemma is that there is no subexponential-time algorithm for MIS even on graphs of bounded degree (see, e.g., [10]):

###### Theorem 5.2.

Assuming the ETH, there is no $2^{o(n)}$-time algorithm for MIS on $n$-vertex graphs of maximum degree 3.

A very simple reduction can reduce MIS to 3-Scattered Set for $P_5$-free graphs, showing that, assuming the ETH, there is no algorithm subexponential in the number of vertices for the latter problem. This proves Theorem 1.3 stated in the Introduction.

###### Proof of Theorem 1.3.

Given an $n$-vertex $m$-edge graph $G$ with maximum degree 3 and an integer $k$, we construct a $P_5$-free graph $G'$ with $n + m = O(n)$ vertices such that $\alpha(G) \ge k$ if and only if $\alpha_3(G') \ge k$. This reduction proves that a $2^{o(n)}$-time algorithm for 3-Scattered Set could be used to obtain a $2^{o(n)}$-time algorithm for MIS on graphs of maximum degree 3, and this would violate the ETH by Theorem 5.2.

We may assume that $G$ has no isolated vertices. The graph $G'$ contains one vertex for each vertex of $G$ and additionally one vertex for each edge of $G$. The vertices of $G'$ representing the edges of $G$ form a clique. Moreover, if the endpoints of an edge $e$ are $x$ and $y$, then the vertex of $G'$ representing $e$ is connected with the vertices of $G'$ representing $x$ and $y$. This completes the construction of $G'$. It is easy to see that $G'$ is $P_5$-free: an induced path of $G'$ can contain at most two vertices of the clique corresponding to $E(G)$, and the vertices of $G'$ corresponding to the vertices of $G$ form an independent set.

If $S$ is an independent set of $G$, then we claim that the corresponding vertices of $G'$ are at distance at least 3 from each other. Indeed, no two such vertices have a common neighbor: if $x, y \in S$ and the corresponding two vertices in $G'$ have a common neighbor, then this common neighbor represents an edge $e$ of $G$ whose endpoints are $x$ and $y$, violating the assumption that $S$ is independent. Conversely, suppose that $S'$ is a set of $k$ vertices with pairwise distance at least 3 in $G'$. If $k \ge 2$, then all these vertices represent vertices of $G$: observe that for every edge $e$ of $G$, the vertex of $G'$ representing $e$ is at distance at most 2 from every other (non-isolated) vertex of $G'$. We claim that $S'$ corresponds to an independent set of $G$. Indeed, if $x, y \in S'$ and there is an edge $e$ in $G$ with endpoints $x$ and $y$, then the vertex of $G'$ representing $e$ is a common neighbor of $x$ and $y$, a contradiction. ∎
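The reduction is simple enough to test by brute force on tiny graphs. The sketch below assumes integer vertex labels and a dictionary-of-neighbor-sets representation; the names `build_reduction`, `distances_from`, and `max_scattered` are illustrative, and the exhaustive search is exponential, so it is meant only as a sanity check.

```python
from collections import deque
from itertools import combinations

def build_reduction(adj):
    """One vertex per vertex and per edge of G; edge-vertices form a clique
    and each is joined to the two vertices representing its endpoints."""
    edges = [(u, v) for u in adj for v in adj[u] if u < v]
    new_adj = {("v", u): set() for u in adj}
    new_adj.update({("e", e): set() for e in edges})
    for e in edges:
        new_adj[("e", e)].update(("e", f) for f in edges if f != e)
        for endpoint in e:
            new_adj[("e", e)].add(("v", endpoint))
            new_adj[("v", endpoint)].add(("e", e))
    return new_adj

def distances_from(adj, source):
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def max_scattered(adj, d):
    """Size of a largest distance-d set, by exhaustive search."""
    dist = {s: distances_from(adj, s) for s in adj}
    for size in range(len(adj), 0, -1):
        for cand in combinations(adj, size):
            if all(dist[u].get(v, float("inf")) >= d
                   for u, v in combinations(cand, 2)):
                return size
    return 0

# Toy input: a 5-cycle, which has no isolated vertices and maximum degree 2.
cycle5 = {i: {(i - 1) % 5, (i + 1) % 5} for i in range(5)}
reduced = build_reduction(cycle5)
```

Here `max_scattered(cycle5, 2)` is the independence number of the original graph and `max_scattered(reduced, 3)` is the size of a largest 3-scattered set in the constructed graph; the proof above says the two agree whenever the former is at least 2.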

Next we give negative results on the existence of algorithms for Scattered Set that have running time subexponential in the number of edges. To rule out such algorithms, we construct instances that have bounded degree: then being subexponential in the number of vertices and being subexponential in the number of edges are the same. We consider first claw-free graphs. The key insight here is that Scattered Set with $d = 3$ in line graphs (which are claw-free) is essentially the Induced Matching problem, for which it is easy to prove hardness results.

###### Theorem 5.3.

Assuming the ETH, $d$-Scattered Set does not have a $2^{o(n)}$ algorithm on $n$-vertex claw-free graphs of maximum degree 6 for any fixed $d \ge 3$.

###### Proof.

Given an $n$-vertex graph $G$ with maximum degree 3, we construct a claw-free graph $G'$ with $O(dn)$ vertices and maximum degree 6 such that $\alpha(G) = \alpha_d(G')$. Then by Theorem 5.2, a $2^{o(n)}$-time algorithm for $d$-Scattered Set for $n$-vertex claw-free graphs of maximum degree 6 would violate the ETH.

The construction is slightly different based on the parity of ; let us first consider the case when is odd. Let us construct the graph by attaching a path of edges to each vertex ; let us denote by , , the edges of this path such that is incident with . The graph is defined as the line graph of , that is, each vertex of represents an edge of and two vertices of are adjacent if the corresponding two edges share an endpoint. It is well known that line graphs are claw-free. As has edges and maximum degree 4 (recall that has maximum degree 3), the line graph has maximum degree 6 with vertices and edges. Thus an algorithm for Scattered Set with running time on -vertex claw-free graphs of maximum degree 6 could be used to solve MIS on -vertex graphs with maximum degree 3 in time , contradicting the ETH.

If there is an independent set of size in , then we claim that the set is a -scattered set of size in . To see this, suppose for a contradiction that there are two vertices such that the vertices of representing and are at distance at most from each other. This implies that there is a path in that has at most edges and whose first and last edges are and , respectively. However, such a path would need to contain all the edges of path and all the edges of , hence it can contain at most edges outside these two paths. But and are not adjacent in by assumption, hence more than one edge is needed to complete and to a path, a contradiction.

Conversely, let be a distance- scattered set in , which corresponds to a set of edges in . Observe that for any , at most one edge of can be incident to the vertices of : otherwise, the corresponding two vertices in the line graph would have distance at most . It is easy to see that if contains an edge incident to a vertex of , then we can always replace this edge with , as this can only move it farther away from the other edges of . Thus we may assume that every edge of is of the form . Let us construct the set , which has size exactly . Then is independent in : if are adjacent in , then there is a path of edges in whose first and last edges are and , respectively, hence the vertices of corresponding to them have distance at most .

If is even, then the proof is similar, but we obtain the graph by first subdividing each edge and attaching paths of length to each original vertex. The proof proceeds in a similar way: if and are adjacent in , then has a path of edges whose first and last edges are and , respectively, hence the vertices of corresponding to them have distance at most . ∎
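For the odd case, the line-graph construction can be exercised on small inputs. The sketch below is illustrative only and rests on a stated assumption: the length of the attached paths is not legible in the text above, so the code uses (d - 1) / 2 edges per path, the value for which adjacent input vertices yield last path edges at line-graph distance d - 1 and non-adjacent ones yield distance at least d. The function names, the node labels, and the test graphs are likewise hypothetical.

```python
from collections import deque
from itertools import combinations


def bfs_distances(adj, source):
    """Return hop distances from source to every reachable vertex."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        x = queue.popleft()
        for y in adj[x]:
            if y not in dist:
                dist[y] = dist[x] + 1
                queue.append(y)
    return dist


def max_scattered_size(adj, d):
    """Brute-force size of a largest set with pairwise distance >= d."""
    nodes = list(adj)
    dist = {s: bfs_distances(adj, s) for s in nodes}

    def extend(chosen, start):
        best = len(chosen)
        for i in range(start, len(nodes)):
            cand = nodes[i]
            if all(dist[cand].get(c, float("inf")) >= d for c in chosen):
                best = max(best, extend(chosen + [cand], i + 1))
        return best

    return extend([], 0)


def pendant_path_line_graph(vertices, edges, d):
    """Attach a path to every vertex, then take the line graph (odd d only)."""
    assert d >= 3 and d % 2 == 1
    path_len = (d - 1) // 2  # assumed path length, see the lead-in
    plus_edges = [frozenset((("g", u), ("g", v))) for u, v in edges]
    for v in vertices:
        prev = ("g", v)
        for i in range(1, path_len + 1):
            nxt = ("p", v, i)  # i-th new vertex on the path attached to v
            plus_edges.append(frozenset((prev, nxt)))
            prev = nxt
    # Line graph: one node per edge, adjacent when the edges share an endpoint.
    line_adj = {e: set() for e in plus_edges}
    for e, f in combinations(plus_edges, 2):
        if e & f:
            line_adj[e].add(f)
            line_adj[f].add(e)
    return line_adj


# Hypothetical input: the 5-cycle (maximum degree 2, independence number 2).
cycle_vertices = list(range(5))
cycle_edges = [(i, (i + 1) % 5) for i in range(5)]
line_graph = pendant_path_line_graph(cycle_vertices, cycle_edges, 5)
largest = max_scattered_size(line_graph, 5)
```

For d = 3 the attached paths are single edges and the scattered sets of the line graph are exactly the induced matchings of the augmented graph, which matches the remark preceding Theorem 5.3.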

There is a well-known and easy way of proving hardness of MIS on graphs with large girth: subdividing edges increases girth and the size of the largest independent set changes in a controlled way.

###### Lemma 5.4.

If there is an -time algorithm for MIS on -vertex graphs of maximum degree 3 and girth more than for any fixed , then the ETH fails.

###### Proof.

Let be a fixed constant and let be a simple graph with vertices, edges, and maximum degree 3 (hence ). We construct a graph by subdividing each edge with new vertices. We have that has vertices, maximum degree 3, and girth at least . It is known and easy to show that subdividing the edges this way increases the size of the maximum independent set exactly by . Thus a -time algorithm for -vertex graphs of maximum degree 3 and girth at least could be used to give a -time algorithm for -vertex graphs of maximum degree 3, hence the ETH would fail by Theorem 5.2. ∎
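The effect of subdivision on the independence number can be confirmed experimentally. This is a small sketch under an explicit assumption: every edge receives an even number 2k of new vertices (the exact count used in the proof is not legible above), in which case the standard fact is that the maximum independent set grows by exactly k per edge. The names `subdivide` and `independence_number` and the parameter `k` are hypothetical.

```python
def subdivide(vertices, edges, k):
    """Replace every edge by a path with 2*k new internal vertices."""
    adj = {v: set() for v in vertices}
    for idx, (u, v) in enumerate(edges):
        chain = [u] + [("s", idx, j) for j in range(2 * k)] + [v]
        for a, b in zip(chain, chain[1:]):
            adj.setdefault(a, set()).add(b)
            adj.setdefault(b, set()).add(a)
    return adj


def independence_number(adj):
    """Exhaustive search; intended only for very small graphs."""
    nodes = list(adj)

    def extend(chosen, start):
        best = len(chosen)
        for i in range(start, len(nodes)):
            if not (adj[nodes[i]] & chosen):  # no chosen neighbour
                best = max(best, extend(chosen | {nodes[i]}, i + 1))
        return best

    return extend(frozenset(), 0)


# Hypothetical input: the complete graph on 4 vertices (3-regular, 6 edges).
k4_vertices = [0, 1, 2, 3]
k4_edges = [(a, b) for a in k4_vertices for b in k4_vertices if a < b]
before = independence_number(subdivide(k4_vertices, k4_edges, 0))
after = independence_number(subdivide(k4_vertices, k4_edges, 1))
```

With k = 0 the graph is unchanged, so `before` is the independence number of the input; with k = 1 the value grows by the number of edges, while the maximum degree stays 3 and every cycle becomes three times longer.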

We use the lower bound of Lemma 5.4 to prove lower bounds for Scattered Set on -free graphs.

###### Theorem 5.5.

Assuming the ETH, -Scattered Set does not have a algorithm on -vertex -free graphs with maximum degree 3 for any fixed and .

###### Proof.

Let be an -vertex -edge graph of maximum degree 3 and girth more than . We construct a graph in the following way: we subdivide each edge of with new vertices to create a path of length , and attach a path of length to each of the new vertices created. The resulting graph has maximum degree 3, vertices and edges, and girth more than (hence it is -free). We claim that holds. This means that an -time algorithm for Scattered Set on -vertex -free graphs with maximum degree 3 would give a -time algorithm for -vertex graphs of maximum degree 3 and girth more than and this would violate the ETH by Lemma 5.4.

To see that holds, consider first an independent set of . When constructing , we attached paths of length . Let contain the degree-1 endpoints of these paths, plus the vertices of corresponding to the vertices of . It is easy to see that any two vertices of have distance at least from each other: is an independent set in , hence the corresponding vertices in are at distance at least from each other, while the degree-1 endpoints of the paths of length are at distance at least from every other vertex that can potentially be in . This shows . Conversely, let be a set of vertices in that are at distance at least from each other. The set contains two types of vertices: let be the vertices that correspond to the original vertices of and let be the vertices that come from the new vertices introduced in the construction of . Observe that can be covered by paths of length and each such path can contain at most one vertex of , hence at most vertices of can be in . We claim that can contain at most vertices, as corresponds to an independent set of . Indeed, if and are adjacent vertices of , then the corresponding two vertices of are at distance , hence they cannot be both present in . This shows , completing the proof of the correctness of the reduction. ∎
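The counting argument can be illustrated with a brute-force check. The sketch below fixes concrete parameters as an assumption, since the exact values are not legible in the text above: each edge is subdivided with d - 2 new vertices (giving a path of length d - 1) and a path of length d - 1 is attached to every new vertex. With this choice the argument above predicts that the largest d-scattered set exceeds the independence number of the input by m(d - 2), where m is the number of edges. All function names and node labels are hypothetical.

```python
from collections import deque


def bfs_distances(adj, source):
    """Return hop distances from source to every reachable vertex."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        x = queue.popleft()
        for y in adj[x]:
            if y not in dist:
                dist[y] = dist[x] + 1
                queue.append(y)
    return dist


def max_scattered_size(adj, d):
    """Brute-force size of a largest set with pairwise distance >= d."""
    nodes = list(adj)
    dist = {s: bfs_distances(adj, s) for s in nodes}

    def extend(chosen, start):
        best = len(chosen)
        for i in range(start, len(nodes)):
            cand = nodes[i]
            if all(dist[cand].get(c, float("inf")) >= d for c in chosen):
                best = max(best, extend(chosen + [cand], i + 1))
        return best

    return extend([], 0)


def subdivide_and_attach(vertices, edges, d):
    """Subdivide each edge with d-2 vertices; hang a (d-1)-edge path on each."""
    adj = {("g", v): set() for v in vertices}

    def link(a, b):
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)

    for idx, (u, v) in enumerate(edges):
        inner = [("s", idx, j) for j in range(d - 2)]
        chain = [("g", u)] + inner + [("g", v)]
        for a, b in zip(chain, chain[1:]):
            link(a, b)
        for j, s in enumerate(inner):
            # Attached path: s followed by d-1 new vertices.
            tail = [s] + [("t", idx, j, i) for i in range(1, d)]
            for a, b in zip(tail, tail[1:]):
                link(a, b)
    return adj


# Hypothetical input: the 5-cycle with d = 3 (5 edges, independence number 2).
cycle_vertices = list(range(5))
cycle_edges = [(i, (i + 1) % 5) for i in range(5)]
gadget = subdivide_and_attach(cycle_vertices, cycle_edges, 3)
largest = max_scattered_size(gadget, 3)
```

On the 5-cycle with d = 3 the constructed graph has 20 vertices, and the five pendant endpoints together with two non-adjacent original vertices form a 3-scattered set of size 7, which the exhaustive search confirms to be optimal.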

As the following corollary shows, putting together Theorems 5.3 and 5.5 implies Theorem 1.4(2).

###### Corollary 5.6.

If is a graph having a component that is not a path, then, assuming the ETH,