Detecting and enumerating a fixed subgraph in a given host graph is an important and well-studied graph problem. Even the special cases for small subgraphs have many applications, e.g., in the analysis of protein–protein networks or of social networks.
We focus on the problem variants where has three or four vertices, resulting in 15 problem variants. Out of these 15 candidates for , only three are known to be detectable in linear time: a path on three or four vertices and the complement of a (an edge plus an isolated vertex). For the remaining 12 subgraphs the (theoretically) fastest known algorithms are based on fast matrix multiplication [24, 13] and mostly run in time ( for clique and independent set on four vertices). (Footnote 1: The -notation suppresses polylogarithmic factors. Here is the time to multiply two -matrices; it is known that .) However, fast matrix multiplication is not practical due to its large overhead. We will thus focus on “combinatorial” algorithms. Although this term is not well-defined, it is usually used to denote algorithms without any use of fast matrix multiplication. These algorithms are often more efficient in practice.
Finding a combinatorial algorithm that detects a triangle in for seems so challenging that such an algorithm was conjectured not to exist [1, 23]. To circumvent this difficulty, we follow the spirit of “parameterization for polynomial-time solvable problems” (also referred to as “FPT in P”). Our parameter of choice is the recently defined -closure, which captures a natural property often found in social networks: two vertices with many common neighbors tend to be adjacent. More formally, the -closure of a graph is the smallest integer such that any two non-adjacent vertices have fewer than common neighbors.
An advantage of the -closure is that it is reasonably small in social networks with thousands of vertices. We provide FPT in P algorithms with a small polynomial dependency on ; thus, parameter values that are prohibitively high for exponential-time algorithms are still acceptable in our setting.
Besides induced subgraph detection algorithms we also investigate the enumeration problems. Here, we settle the complexity on -closed graphs for all but four of the 15 subgraphs; see Table 1 for an overview of our results and existing work.
[Table 1: Overview of our results and existing work on detecting and enumerating three- and four-vertex induced subgraphs; entries refer to Sections 3, 4.1, and 4.2.]
Further Related Work.
We refer to Table 1 for an overview on prior results on subgraph detection algorithms for three- and four-vertex subgraphs. As to subgraph enumeration, it is folklore that for a graph on vertices, an algorithm that enumerates all induced copies of takes time (see Section 3). For enumerating triangles, an -time algorithm is provided by Itai and Rodeh .
As to FPT in P, there are a few works on detecting and counting triangles [18, 3, 5]. Kowaluk and Lingas  provided parameterized algorithms for several induced subgraph detection problems where the subgraph has four vertices. Their parameter is the order of the largest clique in the host graph.
Since the -closure is a relatively new parameter, there is not much work on parameterized algorithms exploiting it [8, 15, 16]. All maximal cliques can be enumerated in time. For constant , it was also shown that there are maximal cliques in -closed graphs, which was previously shown for . Dense subgraphs such as -plexes, -defective cliques, and bicliques can be enumerated in time. Moreover, polynomial kernels for several NP-hard graph problems are known.
For , let denote the set . Throughout the paper, we use to denote an undirected graph. Let and be the vertex set and the edge set of , respectively, with and . We will use for the complement of . For a vertex , let and denote its open and closed neighborhood, respectively. The degree of a vertex is . A vertex is universal if . For a vertex set , the notation is used for the subgraph induced by . The path on vertices is denoted by , the complete graph on vertices is denoted by , and the complete bipartite graph with the parts containing and vertices is denoted by .
A graph G is c-closed if |N(u) ∩ N(v)| < c for all pairs of nonadjacent vertices u, v. The c-closure of G is the smallest integer c such that G is c-closed.
Considering the landscape of graph parameters, the -closure is obviously “smaller” than the maximum vertex degree of the graph, i. e., . Other (common) parameters smaller than are minimum degree, degeneracy, acyclic chromatic number, and -index . These parameters are unrelated to -closure: they are all on a (-closure is ) and large on a (-closure is 1). These two examples also show that there are graphs with and graphs with .
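The definition of the closure parameter can be made concrete with a small brute-force sketch. It is not taken from the paper: the function names `make_graph` and `c_closure` and the adjacency-set representation are assumptions for illustration, and the quadratic pair loop is far slower than the algorithms discussed later. A clique has no nonadjacent pair and therefore gets the value 1, in line with the example above.

```python
from itertools import combinations

def make_graph(edges, vertices=()):
    """Build an undirected graph as a dict mapping each vertex to its neighbor set."""
    adj = {v: set() for v in vertices}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj

def c_closure(adj):
    """Smallest positive c such that every nonadjacent pair has fewer than c common neighbors."""
    most_common = 0
    for u, v in combinations(adj, 2):
        if v not in adj[u]:  # only nonadjacent pairs constrain the closure
            most_common = max(most_common, len(adj[u] & adj[v]))
    return most_common + 1

star = make_graph([(0, leaf) for leaf in range(1, 5)])
print(c_closure(star))  # the leaves pairwise share only the center
```

On a star, any two leaves share exactly the center, so the sketch reports 2; on a cycle with four vertices, opposite vertices share two neighbors, so it reports 3.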
3 Three-Vertex Induced Subgraphs
In this section, we consider the three-vertex induced subgraphs. We start with the edgeless subgraph. For constant , it was shown that an independent set on vertices in -closed graphs can be found in time . Thus, a can be detected in time. Enumerating all cannot be done in time, even on -closed graphs, as an edgeless graph on vertices is -closed and contains many ’s. This settles finding and enumerating ’s.
As a side result, we remark that the -time algorithm for detecting a can be used as a subroutine to find stars. However, note that the subsequent result is not useful for finding stars with few leaves, as the existing algorithms for finding a and a claw on general graphs are faster (see Table 1).
There is an -time algorithm to find an induced for constant .
Let . An induced in which the center is of degree at most can be found in time: There are choices for the center and one of its leaves and choices for the other leaves. Hence, it remains to find a where the center is a vertex of degree at least . This can be done by looking for an independent set in . Recall that an independent set of order can be found in time . Since there are vertices of degree at least , the overall running time is . ∎
With the case of being settled, we turn to the remaining three-vertex graphs: , , and . As already mentioned by Williams et al. , one can find a in linear time. As we were unable to find the corresponding algorithm in the literature, we provide one for completeness.
There is an -time algorithm to find an induced .
Assume that the input graph has at least one edge; otherwise there is no . Further assume that there is no isolated vertex; otherwise the isolated vertex and any edge form a . Clearly, these assumptions can be checked in linear time.
Partition the vertex set into two parts and where is the set of vertices of degree less than and . One can check in time whether is an independent set. If not, then with is part of a : Since , there is a vertex . Thus assume that is an independent set.
Next consider the case that there is a vertex with . Thus, there is a vertex . Since and it follows that there is a vertex . Thus forms a . Note that and can be found (if one exists) in linear time. Once and are fixed, one can find with another linear-time scan of the graph.
It remains to consider the case that for each vertex we have . Since is an independent set, it follows that no vertex in can be part of a . Thus, it suffices to look for ’s in . Since for each , it follows that we can compute in time. Moreover, we can find a in in time. (Footnote 2: Note that each connected component that is not a clique contains a , which can be found with a simple BFS from a non-universal vertex.) ∎
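The BFS remark in Footnote 2 can be sketched as follows, assuming the pattern meant there is the induced path on three vertices. The function names and the adjacency-set representation are hypothetical. In each connected component the sketch picks a vertex that is not adjacent to all other vertices of the component and walks to a vertex at distance exactly two.

```python
from collections import deque

def path_to_distance_two(adj, start):
    """BFS from start; return (start, mid, far) where far is at distance two."""
    dist, parent = {start: 0}, {}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        if dist[u] == 2:
            # parent[u] is a neighbor of start, and u is not adjacent to start
            return (start, parent[u], u)
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                parent[w] = u
                queue.append(w)
    return None

def find_induced_p3(adj):
    """Return an induced path on three vertices, or None if every component is a clique."""
    seen = set()
    for s in adj:
        if s in seen:
            continue
        component = [s]
        seen.add(s)
        for u in component:  # the list grows while we scan it
            for w in adj[u]:
                if w not in seen:
                    seen.add(w)
                    component.append(w)
        # a vertex missing at least one neighbor inside its own component
        start = next((v for v in component if len(adj[v]) < len(component) - 1), None)
        if start is not None:
            return path_to_distance_two(adj, start)
    return None
```

Since the start vertex has a non-neighbor in its connected component, some vertex lies at distance exactly two, so the inner search always succeeds when it is called.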
Note that all ’s can be enumerated easily in time: Enumerate all combinations of one edge and one vertex and check whether they induce a . The following observation shows that this running time bound is tight, even for -closed graphs.
Let be a constant. For every subgraph on vertices there is an -vertex graph containing distinct occurrences of . Moreover, if does not contain an induced , then is -closed.
For a fixed on vertices, let be the graph obtained by replacing each vertex with a clique of order . Formally, let and . Clearly, for each combination , the graph is isomorphic to . This shows the first part.
As to the second part, observe that if does not contain an induced , then is a cluster graph. By construction, it follows that is also a cluster graph and thus -closed. ∎
We continue with ’s. As a can be found in linear time (see Footnote 2), there is no need to consider -closed graphs for the detection problem. We thus turn to enumeration. First, observe that a star contains many ’s and is -closed. Thus, the following upper bound on the number of ’s is tight.
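The construction in the proof above can be checked on small instances with the following sketch. All names are hypothetical, and the clique size `t` stands in for the order that was lost from the text. Each vertex of the pattern becomes a clique on `t` vertices, and two cliques are fully joined exactly when their pattern vertices are adjacent.

```python
from itertools import combinations, product

def blow_up(h_vertices, h_edges, t):
    """Replace every pattern vertex by a clique on t vertices."""
    h_edge_set = {frozenset(e) for e in h_edges}
    vertices = [(x, i) for x in h_vertices for i in range(t)]
    adj = {v: set() for v in vertices}
    for a, b in combinations(vertices, 2):
        if a[0] == b[0] or frozenset((a[0], b[0])) in h_edge_set:
            adj[a].add(b)
            adj[b].add(a)
    return adj

def count_transversal_copies(adj, h_vertices, h_edges, t):
    """Count selections of one vertex per clique that induce a copy of the pattern."""
    h_edge_set = {frozenset(e) for e in h_edges}
    count = 0
    for choice in product(range(t), repeat=len(h_vertices)):
        picked = {x: (x, i) for x, i in zip(h_vertices, choice)}
        if all((picked[y] in adj[picked[x]]) == (frozenset((x, y)) in h_edge_set)
               for x, y in combinations(h_vertices, 2)):
            count += 1
    return count

def max_common_neighbors_of_nonadjacent(adj):
    """Largest common neighborhood over all nonadjacent pairs (0 if there is none)."""
    return max((len(adj[u] & adj[v]) for u, v in combinations(adj, 2)
                if v not in adj[u]), default=0)

pattern_vertices, pattern_edges = ["a", "b", "c"], [("a", "b")]  # an edge plus an isolated vertex
g = blow_up(pattern_vertices, pattern_edges, 3)
print(count_transversal_copies(g, pattern_vertices, pattern_edges, 3))
print(max_common_neighbors_of_nonadjacent(g))
```

For a pattern that is itself a cluster graph, nonadjacent vertices of the blown-up graph lie in different components and share no neighbor, which is what the last function reports.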
A -closed graph has induced ’s.
Let be the set of all ’s in and let be the set of all ’s with endpoints and . By definition, . Since is -closed, for each . Thus, we obtain . ∎
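The counting in this proof can be written out as follows. The symbols are assumptions, since the original notation was lost: $\mathcal{P}$ is the set of all induced paths on three vertices, $\mathcal{P}_{uv}$ the subset with endpoints $u$ and $v$, $n$ the number of vertices, and $c$ the closure parameter. Each path in $\mathcal{P}_{uv}$ corresponds to one common neighbor of the nonadjacent pair $u, v$.

```latex
\[
  |\mathcal{P}|
  \;=\; \sum_{\{u,v\} \notin E(G)} |\mathcal{P}_{uv}|
  \;=\; \sum_{\{u,v\} \notin E(G)} |N(u) \cap N(v)|
  \;\le\; (c-1) \binom{n}{2}
  \;\in\; O(c n^2).
\]
```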
The algorithm considers each edge and each vertex incident with . Thus, , , and form either a or a triangle. Since , we obtain the following theorem:
There is an -time algorithm to enumerate all ’s.
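Algorithm 1 itself is not reproduced in this text, so the following is only a sketch of the step described above, under the assumption that the enumerated pattern is the induced path on three vertices. The function name and the adjacency-set representation are hypothetical. For each ordered edge and each further neighbor of its second endpoint, the three vertices form either a triangle or an induced path.

```python
def enumerate_p3s_and_triangles(adj):
    """Return (p3s, triangles); a P3 is stored as (endpoint, center, endpoint)."""
    p3s, triangles = set(), set()
    for u in adj:
        for v in adj[u]:              # ordered edge (u, v)
            for w in adj[v]:          # further neighbor of v
                if w == u:
                    continue
                if w in adj[u]:
                    triangles.add(frozenset((u, v, w)))
                else:
                    # normalize the endpoint order so each path is stored once
                    p3s.add((min(u, w), v, max(u, w)))
    return p3s, triangles

paw = {0: {1, 2, 3}, 1: {0, 2}, 2: {0, 1}, 3: {0}}
paths, tris = enumerate_p3s_and_triangles(paw)
print(sorted(paths), [sorted(t) for t in tris])
```

Stopping at the first triangle turns the same loop into a detection routine, and grouping the stored paths by their endpoint pair yields the common neighborhoods of all nonadjacent pairs.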
Fox et al. showed that a set of cliques containing all maximal cliques can be enumerated in time, where is the time complexity to list all induced ’s. They noted that due to the result of Gasieniec et al. , where and are the matrix multiplication exponent and the dual exponent of matrix multiplication, respectively. Using Section 3 to bound gives the following:
There is an -time algorithm to find a set of cliques containing all maximal cliques.
We can adapt Algorithm 1 to an algorithm for finding a triangle: As mentioned above, we find either a triangle or in Line 4 of Algorithm 1. In order to find a triangle in time, we just terminate the algorithm as soon as one is detected.
There is an -time algorithm to find a triangle.
There is an -time algorithm to find a clique of size .
For each subset of vertices, we check whether there is a triangle in in time by Section 3. ∎
Next, we develop a more efficient algorithm for finding a triangle in sparse graphs.
There is an -time algorithm to find a triangle.
Let . Let be the set of vertices with degree at least and let . Note that . If there is a triangle in , then it can be found in time by Section 3. If there is a triangle containing at least one vertex of , then it can be found in time. ∎
As a side-result, we also show that by enumerating all ’s, one can compute the -closure.
There is an -time algorithm to compute the -closure.
We remark that deciding whether a graph is 2-closed requires time .
4 Four-Vertex Induced Subgraphs
In this section, we consider four-vertex subgraphs. We turn our attention first to the enumeration aspect and then to the detection part.
Recall that if our four-vertex subgraph does not contain an induced , then Section 3 excludes algorithms with running time for any function and any . This applies to five of the eleven subgraphs: co-diamond, co-square, co-claw, and . Interestingly, as we show below, we can have algorithms with running time or better for the other six subgraphs, namely, co-paw, , claw , paw, square , and diamond. This is implied by the next simple but general theorem. Before stating the theorem, we need some more notation: For a graph , let be the minimum size of a vertex set such that each vertex in has at least two nonadjacent neighbors in . For instance, , , and .
Let be a graph. There is an time algorithm to enumerate induced copies of .
We first compute the set for each pair of nonadjacent vertices. We can do so by enumerating all ’s using Algorithm 1 in time. We consider each choice of vertices such that each vertex in has at least two nonadjacent vertices. For , there are choices. ∎
This algorithm can enumerate squares and diamonds in time and co-paws, ’s, claws, and paws in time. Note that we can construct -closed graphs containing co-paws, claws, or paws respectively (see discussion in the second part of this subsection). However, for , square, and diamond we do not have fitting lower bounds (in terms of and ).
For ’s, paws, and squares we found alternative bounds. As we see in the second part of this subsection the running time of the following algorithm for ’s and paws is tight.
There is an -time algorithm to enumerate all induced ’s and all paws.
By considering all combinations of one edge and one vertex, one fixes three vertices in a (a paw). For ’s (paws) assume the non-fixed vertex is one of the two degree-two vertices (the degree-three vertex). It follows from the definition of -closure that there are at most choices for the fourth vertex. Note that these choices can be obtained using Algorithm 1 in time. This results in an -time algorithm. ∎
As we shall see in the second part of this section, there are 3-closed graphs with induced copies of . Thus, the running time of the following algorithm could still be improved slightly.
There is an -time algorithm to enumerate all induced squares.
Let . We call a vertex high-degree if its degree is at least and low-degree otherwise. We consider two cases based on which vertices of the square are high-degree.
Two consecutive vertices are low-degree: First, we consider each edge where both endpoints and are of low degree. Then, we consider each pair of neighbors and of and , respectively. We list the square if . This requires time.
Two opposite vertices are high-degree: We first enumerate all ’s where both endpoints are high-degree in time. We achieve this by adapting Algorithm 1: We consider each edge where at least one endpoint is high-degree in Line 2 instead. Without loss of generality, assume that is high-degree. Moreover, we consider each high-degree neighbor of in Line 3 instead. Then, this algorithm spends time for each triangle or whose endpoints are both high-degree. Since there are triangles and ’s whose endpoints are both high-degree, this adaptation of Algorithm 1 takes time. Thus, we have the set of common neighbors of each pair of nonadjacent high-degree vertices. Now we can enumerate all squares where two opposite vertices are high-degree in time. Overall, all squares are listed in time. ∎
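The last step of the proof, listing squares from the common neighborhoods of nonadjacent pairs, can be sketched in a simplified form. The sketch below deliberately omits the high-degree and low-degree case distinction and therefore does not achieve the stated running time; the function name and the representation are hypothetical. Two nonadjacent vertices together with two nonadjacent common neighbors induce a square.

```python
from itertools import combinations

def enumerate_squares(adj):
    """List all induced 4-cycles as frozensets of their four vertices."""
    squares = set()
    for u, v in combinations(sorted(adj), 2):
        if v in adj[u]:
            continue                       # u and v must be opposite corners
        common = sorted(adj[u] & adj[v])   # fewer than c vertices in a c-closed graph
        for x, y in combinations(common, 2):
            if y not in adj[x]:            # the other diagonal must be missing too
                squares.add(frozenset((u, v, x, y)))
    return squares

k23 = {0: {2, 3, 4}, 1: {2, 3, 4}, 2: {0, 1}, 3: {0, 1}, 4: {0, 1}}
print(len(enumerate_squares(k23)))
```

Every square is discovered once from each of its two diagonals, which is why the result is collected in a set.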
(Tight) Lower bounds.
We now provide (almost) fitting lower bounds. Whenever possible, we replace factors of by factors of (mostly replacing by ). This is done via the following simple observation.
For a graph of constant size, there is an -time algorithm to enumerate all induced copies of , where is the maximum matching size of .
We consider each choice for the set of vertices and the set of edges. Note that there are such choices. Since is of constant size, whether forms an induced can be checked in constant time. ∎
Sections 4.1 and 3 yield matching running time upper and lower bounds even in -closed graphs for the task of enumerating ’s, co-diamonds, co-squares, co-claws, or ’s. The remaining six cases are discussed below (in the order they are listed in Table 1).
We start with co-paws. The upper bounds and follow by simple brute force, selecting edges and vertices (as in Section 4.1). As to the lower bound, consider the disjoint union of an independent set and a star : it is -closed, has edges, and contains co-paws.
As to ’s, observe that again the upper bound follows from Section 4.1. As to the lower bound, consider the graph resulting from making the centers of two ’s adjacent: it is -closed and contains many ’s. Note that this lower bound matches the algorithm in Section 4.1 but leaves a gap to the -time algorithm following from Section 4.1. Interestingly, we can improve the lower bound as stated in the next theorem, but the new lower bound does not match the upper bound either.
There is an infinite family of -closed graphs containing ’s and squares.
Suppose that for an integer and consider a projective plane on points and lines. It fulfills the following properties:
For any pair of points (lines), there is exactly one line incident with both points (points).
Each point (line) is incident with exactly lines (points).
See e.g. Albert and Sandler  for more on projective planes.
Now consider the graph constructed as follows: We introduce vertices for each point of and vertices for each line of . Then, we add an edge for each point and for each line . We also add edges for each pair of point and line that are incident in . The constructed graph is 3-closed: For two distinct points and (lines and ), the vertices and ( and ) for have exactly two common neighbors by the first property of projective planes. For a non-incident pair of a point and a line , the vertices and for have no common neighbor. Moreover, has vertices and edges by the second property of projective planes.
Finally, we count the number of ’s and squares. We begin with ’s. Let and be distinct points. By the properties of projective planes, there is exactly one line on which both and lie and there are exactly lines that are incident with and not with . Let be one of these lines. Observe that is a in . Hence, has ’s. Next, we consider squares. For each pair of distinct points and , there exists an induced on , where is the line incident with both and . Thus, there are squares in . ∎
We remark that a construction similar to the above one was used to show a lower bound on the number of maximal cliques in 2-closed graphs by Eschen et al. .
Continuing with claws, observe that the upper bound follows from Section 4.1. As to the lower bound, consider a star : it is -closed and contains claws.
As to paws, observe that again time follows from Section 4.1. For the lower bound, consider a clique where at one vertex of the clique there are degree-one vertices attached. This results in a -closed graph with paws.
Finally, consider diamonds. Observe that again time follows from Section 4.1. As for lower bounds, consider a graph obtained by making the two high-degree vertices in a adjacent. This graph is -closed and has diamonds: combining the two high-degree vertices with any two independent-set vertices forms a diamond. Note that the algorithm following from Section 4.1 has an additional term in its running time, which means it is not tight.
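Two of the lower-bound examples above, the star for claws and the clique with pendant vertices for paws, can be verified by brute force on small instances. The sketch below is an illustration only: the helper names are hypothetical and the instance sizes (six leaves, a clique on four vertices with three pendants) are arbitrary choices. It identifies claws and paws among four-vertex subsets by their induced degree sequences, which are unique to these two patterns.

```python
from itertools import combinations
from math import comb

CLAW = [1, 1, 1, 3]  # degree sequence of a star with three leaves
PAW = [1, 2, 2, 3]   # degree sequence of a triangle with one pendant vertex

def make_graph(edges):
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj

def count_pattern(adj, degree_sequence):
    """Count four-vertex subsets whose induced degree sequence matches."""
    total = 0
    for quad in combinations(adj, 4):
        chosen = set(quad)
        if sorted(len(adj[v] & chosen) for v in quad) == degree_sequence:
            total += 1
    return total

star = make_graph([(0, leaf) for leaf in range(1, 7)])
clique_edges = list(combinations(range(4), 2))
pendant_edges = [(0, p) for p in (10, 11, 12)]  # three degree-one vertices at vertex 0
clique_with_pendants = make_graph(clique_edges + pendant_edges)

print(count_pattern(star, CLAW), comb(6, 3))
print(count_pattern(clique_with_pendants, PAW), comb(3, 2) * 3)
```

In the star every choice of three leaves gives a claw; in the second graph every triangle through the attachment vertex combines with every pendant vertex to give a paw.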
In this section, we provide efficient algorithms for five out of the eleven induced subgraph detection problems on -closed graphs, namely for the subgraphs co-diamond, co-paw, co-square, paw, and square. Note that a can be found in linear time; thus, there is no room for improvement. For a faster algorithm on -closed graphs is known (see also the first paragraph of Section 3). Hence, for four subgraphs, namely claw, co-claw, diamond, and , the question of fast algorithms on -closed graphs remains open.
Out of the five positive results, detecting a co-diamond and a co-square require new algorithms. For the remaining three subgraphs, the results follow either directly from Section 4.1 (for square) or from known characterizations via induced three-vertex subgraphs and results from Section 3 (for co-paw and paw). We start by briefly discussing the latter (co-paw and paw). Afterwards, we show the algorithms for detecting a co-diamond and a co-square. Finally, we provide a algorithm for detecting a diamond in a gem-free -closed graph. Moreover, we highlight the issue that needs to be resolved in order to remove the gem-free assumption.
Co-paw and paw.
We use the characterization of Olariu : A graph is paw-free if and only if it is triangle-free or -free. Thus, we immediately obtain the following corollary from Section 3 and Sections 3 and 3.
There is an -time algorithm to detect an induced paw.
There is an -time algorithm to detect an induced co-paw.
We next present our algorithm detecting co-diamonds, which is based on the following structural statements.
If there is a maximal clique of order at least in , then either is a clique or contains a co-diamond.
If is a clique, then clearly the graph is co-diamond-free. It remains to show that if is not a clique, then contains a co-diamond. To this end, let for . Since is maximal, there exist vertices such that . By the -closure, we have that and that . Therefore, . For , the four vertices form an induced co-diamond. ∎
In our algorithm, we will use the following statement, which is a small reformulation of Section 4.2.
Let be a graph that cannot be partitioned into two cliques. If there is a clique of order at least in , then contains a co-diamond.
There is an -time algorithm to detect an induced co-diamond.
If , then we can determine whether the input graph has an induced co-diamond in time, using the -time algorithm of Eisenbrand and Grandoni . So assume that .
Then, we determine whether the vertex set can be partitioned into two cliques and . If , then this is impossible (at least one clique needs to be of order ). Thus, assume . Hence, in time we can simply check whether the complement of is bipartite. Suppose that there are two cliques and such that . Then, we can conclude that has no induced co-diamond. Thus, we assume in the following that cannot be partitioned into two cliques (note that this allows us to invoke Section 4.2).
We claim that if there is an edge such that and , then has an induced co-diamond. Since , we have . If is a clique (which can be checked in time), then by Section 4.2 there is a co-diamond in . Hence, assume there exist nonadjacent vertices . However, then forms an induced co-diamond. Note that such an induced co-diamond can be found in time.
Next, consider the case that there is a vertex such that . We claim that has an induced co-diamond in this case. Note that . Hence, if is a clique, then by Section 4.2 there is a co-diamond in . Otherwise, there exist nonadjacent vertices . Moreover, there exists a vertex that is adjacent to neither nor : The -closure of yields that . Thus, we find an induced co-diamond . Hence, we assume in the following that each vertex has a degree of at most or at least .
It remains to consider the case that each edge contains a vertex of degree at least . We iterate over all these high-degree vertices; let be such a vertex of degree at least . To find a co-diamond where is one of its degree-one vertices, we simply check whether there is a nonadjacent pair of vertices in . Since , we can find and (if existing) in time. If there is no such pair, then we can conclude that has no induced co-diamond containing . Otherwise, there is an induced co-diamond for . Note that by the same argument above. Since we spend time for each vertex of degree at least , this step requires time. ∎
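The test used early in the proof above, whether the vertex set can be partitioned into two cliques, amounts to checking that the complement graph is bipartite. A minimal sketch, assuming an adjacency-set representation and a hypothetical function name, is given below. It two-colors the complement by BFS without building it explicitly and takes quadratic time in the number of vertices, which may differ from the bound intended in the proof.

```python
from collections import deque

def two_clique_partition(adj):
    """Return two cliques (A, B) partitioning the vertex set, or None if impossible."""
    vertices = list(adj)
    color = {}
    for s in vertices:
        if s in color:
            continue
        color[s] = 0
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for w in vertices:
                if w == u or w in adj[u]:
                    continue              # w is not a neighbor of u in the complement
                if w not in color:
                    color[w] = 1 - color[u]
                    queue.append(w)
                elif color[w] == color[u]:
                    return None           # odd cycle in the complement
    side_a = {v for v in vertices if color[v] == 0}
    side_b = {v for v in vertices if color[v] == 1}
    return side_a, side_b
```

Two vertices with the same color are never adjacent in the complement, so each color class is a clique in the original graph.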
We now consider co-squares. The next lemma plays an important role in our co-square detection algorithms.
Suppose that there are vertices such that , , and . Then, contains an induced co-square.
Since is -closed, we have . It follows that . Let be an arbitrary vertex in . Since , we have by the -closure of . Thus, . For an arbitrary vertex , the vertices form an induced co-square. ∎
We say that a connected component is trivial if it consists of one vertex.
There is an -time randomized algorithm to detect an induced co-square.
Let be the set of vertices of degree at least . If is not a clique, then contains an induced co-square by Section 4.2. So assume that is a clique. Let be the connected components of . If all components are trivial, then there is no induced co-square. Moreover, if there is more than one non-trivial connected component, then we find an induced co-square. Thus, we assume that there is exactly one connected component with at least one edge.
If there is no co-square in , then the diameter of is at most three. Since every vertex in has degree at most , we can assume that . Furthermore, if , then there is an induced co-square. For each vertex , there exists a vertex in that is not adjacent to , because and has at most neighbors. Thus, the -closure of yields that each vertex has at most neighbors in . Consequently, for an edge , there are two vertices that are not adjacent to or , which form a co-square along with and . Hence, we can also assume that .
Let be the set of isolated vertices in . Now we describe an -time algorithm to find a co-square containing a vertex of . For each vertex , we check whether has a neighbor in . This can be done in time. Then, for each vertex with at least one neighbor in , we check whether there is an edge such that . If there is such an edge , then is a co-square, where is a vertex adjacent to . Otherwise, we can conclude that there is no co-square containing a vertex of . Note that this procedure takes time, because and .
Finally, it remains to find a co-square in . Using the -time algorithm of Williams et al. , this can be done in . ∎