# Improved approximation algorithms for path vertex covers in regular graphs

Given a simple graph $G = (V, E)$ and a constant integer $k \ge 2$, the $k$-path vertex cover problem (P$k$VC) asks for a minimum subset $F \subseteq V$ of vertices such that the induced subgraph $G[V - F]$ does not contain any path of order $k$. When $k = 2$, this turns out to be the classic vertex cover (VC) problem, which admits a $(2 - \Theta(1/\log|V|))$-approximation. The general P$k$VC admits a trivial $k$-approximation; when $k = 3$ and $k = 4$, the best known approximation results for P3VC and P4VC are a $2$-approximation and a $3$-approximation, respectively. On $d$-regular graphs, the approximation ratios can be reduced to $\min\{2 - \frac{5}{d+3} + \epsilon,\ 2 - (2 - o(1))\frac{\ln\ln d}{\ln d}\}$ for VC (i.e., P2VC), $2 - \frac{1}{d} + \frac{4d-2}{3d|V|}$ for P3VC, $\frac{\lfloor d/2\rfloor(2d-2)}{(\lfloor d/2\rfloor+1)(d-2)}$ for P4VC, and $\frac{2d-k+2}{d-k+2}$ for P$k$VC when $1 < k-2 < d < 2(k-2)$. By utilizing an existing algorithm for graph defective coloring, we first present a $\frac{\lfloor d/2\rfloor(2d-k+2)}{(\lfloor d/2\rfloor+1)(d-k+2)}$-approximation for P$k$VC on $d$-regular graphs when $1 < k-2 < d$. This beats all the best known approximation results for P$k$VC on $d$-regular graphs for $k > 3$, except for P4VC, where it ties with the best prior work; in particular, both achieve ratio $2$ on cubic graphs and on $4$-regular graphs. We then propose a $1.875$-approximation and a $1.852$-approximation for P4VC on cubic graphs and $4$-regular graphs, respectively. We also present a better approximation algorithm for P4VC on $d$-regular bipartite graphs.


## 1 Introduction

We investigate a vertex deletion problem called the minimum $k$-path vertex cover problem, denoted as P$k$VC, which is a generalization of the classic minimum vertex cover (VC) problem [14]. The P$k$VC problem has been studied for more than three decades in the literature, and it has applications in wireless sensor networks, such as constructing optimal connectivity paths, and in network security, such as monitoring message traffic and detecting malicious attacks [1].

Given a simple graph $G = (V, E)$ and a constant integer $k \ge 2$, a $k$-path (or, a path of order $k$) is a simple path containing $k$ vertices; the P$k$VC problem asks for a minimum subset $F \subseteq V$ of vertices such that the induced subgraph $G[V - F]$ (set minus operation) does not contain any $k$-path [7, 17, 16]. When $k = 2$, this turns out to be VC. In the literature, a $k$-path vertex cover is also called a vertex $k$-path cover [6], or a vertex cover $P_k$ [26, 25], or a $P_k$ vertex cover [10], or a $k$-observer [1, 24]. Also, when $F$ is a ($2$-path) vertex cover, $V - F$ is an independent set, and when $F$ is a $3$-path vertex cover, $V - F$ is a dissociation set. The maximum independent set (MIS) problem is another classic NP-hard problem [14]; the maximum dissociation set problem, also classic, was introduced more than three decades ago by Yannakakis [27] and is NP-hard even on bipartite graphs.

The concept of $k$-path vertex covers, and many related ones, form a line of research in graph theory. The minimum cardinality of a $k$-path vertex cover, for $k \ge 2$, in the graph $G$ is denoted by $\psi_k(G)$ [6]. Clearly, $\psi_2(G) = |V| - \alpha(G)$, where $\alpha(G)$ is the independence number, that is, the maximum cardinality of an independent set in $G$. The maximum cardinality of a dissociation set in $G$, also known as the $1$-dependence number [12, 13], is denoted as $diss(G)$ [23], and we have $\psi_3(G) = |V| - diss(G)$. Let $P_n$, $C_n$ and $K_n$ denote a simple path, a simple cycle and a complete graph with $n$ vertices, respectively; then $\psi_k(P_n) = \lfloor n/k \rfloor$, $\psi_k(C_n) = \lceil n/k \rceil$ and $\psi_k(K_n) = n - k + 1$ [1, 6]. When $G$ is some special graph [7, 6, 17, 5, 16], the exact value of $\psi_k(G)$, for some values of $k$, can also be computed in polynomial time. For the cases where the exact value of $\psi_k(G)$ cannot be computed in polynomial time, there are works proving several lower and/or upper bounds on $\psi_k(G)$; to name a few, Brešar et al. [7] proved that and , where is the degree of in ; Brešar et al. [6] showed that $\psi_k(G) \ge \frac{d-k+2}{2d-k+2}|V|$ for $d$-regular graphs when $k - 2 < d$.

The P$k$VC problem is NP-hard for every $k \ge 2$ [7, 1]. From the inapproximability (hardness of approximation) perspective, the VC (i.e., P2VC) problem is APX-complete even on cubic graphs [2]; it cannot be approximated within $1.3606$ unless P = NP [11] and it cannot be approximated within any constant factor less than $2$ [20] under the unique games conjecture [19]. Brešar et al. [7] proved that for any $k \ge 2$, a $\rho$-approximation for the P$k$VC problem implies a $\rho$-approximation for the VC problem. It follows that the P$k$VC problem, for every $k \ge 3$, cannot be approximated within $1.3606$ either, unless P = NP. Recall that the maximum dissociation set problem is NP-hard even on (some sub-classes of) planar bipartite graphs [27, 4, 23]; the P3VC problem is shown NP-hard on cubic planar graphs with girth $3$ [25]. The P4VC problem is proven APX-complete on cubic bipartite graphs and on $K_{1,4}$-free graphs [10].

From the approximation algorithm perspective, the simple greedy algorithm, which iteratively takes all the vertices from a $k$-path in the remaining graph until there is no $k$-path left, is a $k$-approximation for the P$k$VC problem, for every $k \ge 2$: the selected $k$-paths are vertex-disjoint, and every $k$-path vertex cover must contain at least one vertex from each of them. Furthermore, the VC (i.e., P2VC) problem admits a $(2 - \Theta(1/\log|V|))$-approximation [18]; the P3VC problem admits a primal-dual $2$-approximation [26]; the P4VC problem admits a primal-dual $3$-approximation [8].
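The greedy $k$-approximation described above can be sketched as follows. This is a minimal illustration, not code from the paper: the adjacency-set representation and the names `find_k_path` and `greedy_k_path_cover` are assumptions made for this example, and the path search is a plain backtracking search, which is acceptable only because $k$ is a constant.

```python
from typing import Dict, List, Optional, Set

Graph = Dict[int, Set[int]]  # adjacency sets of a simple undirected graph


def find_k_path(adj: Graph, k: int, removed: Set[int]) -> Optional[List[int]]:
    """Return some simple path on k vertices avoiding `removed`, or None."""

    def extend(path: List[int], on_path: Set[int]) -> Optional[List[int]]:
        if len(path) == k:
            return path
        for w in adj[path[-1]]:
            if w not in removed and w not in on_path:
                on_path.add(w)
                found = extend(path + [w], on_path)
                on_path.discard(w)
                if found is not None:
                    return found
        return None

    for v in adj:
        if v not in removed:
            found = extend([v], {v})
            if found is not None:
                return found
    return None


def greedy_k_path_cover(adj: Graph, k: int) -> Set[int]:
    """Repeatedly delete all k vertices of some k-path in the remaining graph."""
    cover: Set[int] = set()
    while True:
        path = find_k_path(adj, k, cover)
        if path is None:
            return cover
        cover.update(path)
```

The returned set is a union of vertex-disjoint $k$-paths, which is exactly what the ratio-$k$ argument needs.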

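On $d$-regular graphs, where , the P$k$VC problem can be approximated better, for every . The approximation ratio for the VC (i.e., P2VC) problem can be reduced to $2 - \frac{5}{d+3} + \epsilon$ [3] and $2 - (2 - o(1))\frac{\ln\ln d}{\ln d}$ [15]; Ries et al. [24] gave a $\left(2 - \frac{1}{d} + \frac{4d-2}{3d|V|}\right)$-approximation for P3VC and Devi et al. [10] presented a greedy $\frac{\lfloor d/2\rfloor(2d-2)}{(\lfloor d/2\rfloor+1)(d-2)}$-approximation for P4VC (using the lower bound given by Brešar et al. [6]). Ries et al. [24] also proposed a $\frac{2d-k+2}{d-k+2}$-approximation for P$k$VC when $1 < k-2 < d < 2(k-2)$.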

More specifically on cubic graphs, Tu and Yang [25] gave a $1.57$-approximation for the P3VC problem; Ries et al. [24] claimed that the ratio of their approximation algorithm for the P3VC problem can be reduced to . Table 1 summarizes the best approximation ratios prior to this work for the P$k$VC problem, for .

In this paper, we aim to design improved approximation algorithms for the P$k$VC problem, for $k \ge 4$, on $d$-regular graphs. To this purpose, in Section 2 we first employ an existing polynomial time graph defective coloring algorithm to design a simple yet effective approximation algorithm for the P$k$VC problem, and we are able to show that its approximation ratio is $\frac{\lfloor d/2\rfloor(2d-k+2)}{(\lfloor d/2\rfloor+1)(d-k+2)}$, when $1 < k-2 < d$. This beats all the best prior approximation results except for P4VC, where it ties with the approximation by Devi et al. [10], in particular at $2$ on cubic graphs and on $4$-regular graphs. In Section 3, we first prove a lower bound on $\psi_4(G)$ when $G$ is $d$-regular, then integrate the graph defective coloring algorithm and the current best approximation algorithm for the MIS problem on degree-bounded graphs, to design a $1.875$-approximation for P4VC on cubic graphs. When $d \ge 4$ is even, we show in Section 4 how to compute $d/2$ $4$-path vertex covers in $G$; selecting the minimum one gives a $\frac{(3d-2)(2d-2)}{(3d+4)(d-2)}$-approximation for P4VC. This turns out to be a $1.875$-approximation for P4VC on $4$-regular graphs. Also in Section 4, we are able to provide a better analysis to show that the algorithm is actually a $1.852$-approximation on $4$-regular graphs, and construct an instance to show that this ratio is almost tight for Approx2 on $4$-regular graphs. Lastly, in Section 5, we propose a -approximation algorithm for P4VC on $d$-regular bipartite graphs. We conclude the paper in Section 6 with some remarks.

## 2 An approximation for PkVC on d-regular graphs

We consider the P$k$VC problem for $k \ge 4$, for which the best approximation algorithms prior to our work are summarized in Table 1.

As a consequence of Lovász's graph decomposition [22], Cowen and Jesurum [9] have proved the following result on defective coloring for any graph of maximum degree $\Delta$, where a defective $(p, q)$-coloring colors the vertices of the graph using $p$ colors such that each vertex is adjacent to at most $q$ neighbors of its own color. A $(p, 0)$-coloring is the classic vertex coloring using $p$ colors.

###### Theorem 1

[9] Any graph of maximum degree $\Delta$ can be $(p, \lfloor \Delta/p \rfloor)$-colored in $O(\Delta|E|)$ time.

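Let $G = (V, E)$ be a $d$-regular graph. Using Theorem 1 by setting $p = \lfloor d/2 \rfloor + 1$, we have a defective $(p, 1)$-coloring for the graph $G$, since $\lfloor d/p \rfloor = 1$. In this defective $(p, 1)$-coloring, suppose these $p$ colors are $1, 2, \ldots, p$; let $V_i$ denote the subset of the vertices colored $i$. Then clearly,

1. $\{V_1, V_2, \ldots, V_p\}$ is a partition of the vertex set $V$, and

2. the subgraph induced on $V_i$, $G[V_i]$, does not contain any $3$-path, suggesting $V - V_i$ is a $3$-path vertex cover (and thus it is also a $k$-path vertex cover for any $k \ge 3$), for every $i = 1, 2, \ldots, p$.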

It follows that the minimum among $V - V_1, V - V_2, \ldots, V - V_p$, denoted as $F_{\min}$, has size

$$|F_{\min}| \le \left(1 - \frac{1}{p}\right)|V|. \tag{1}$$

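Recall that when $k - 2 < d$, Brešar et al. [6] have proved the following lower bound on $\psi_k(G)$:

$$\psi_k(G) \ge \frac{d-k+2}{2d-k+2}|V|. \tag{2}$$

Therefore, $F_{\min}$ turns out to be an approximate solution within ratio

$$\frac{|F_{\min}|}{\psi_k(G)} \le \frac{(p-1)(2d-k+2)}{p(d-k+2)} = \frac{\lfloor d/2\rfloor(2d-k+2)}{(\lfloor d/2\rfloor+1)(d-k+2)}.$$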

That is, using the existing graph defective coloring algorithm, we can design an algorithm, denoted as DC, to first compute in $O(d^2|V|)$ time a defective $(p, 1)$-coloring for the input $d$-regular graph $G$, then in $O(|V|)$ time find the color with the most vertices, and last return $F_{\min}$ containing all the vertices not colored with it. See Figure 1 for a high-level description of the algorithm DC. We conclude with Theorem 2.
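The algorithm DC can be sketched as below. This is an illustration under stated assumptions rather than the procedure of [9]: the coloring step is a simple local search in the spirit of Lovász's argument (move a vertex with too many same-colored neighbors to its least-represented color, which strictly reduces the number of monochromatic edges), and the names `defective_coloring` and `dc_cover` as well as the adjacency-set input are hypothetical.

```python
from typing import Dict, List, Set

Graph = Dict[int, Set[int]]  # adjacency sets of a simple undirected graph


def defective_coloring(adj: Graph, p: int) -> Dict[int, int]:
    """Local search for a coloring with colors 0..p-1 in which every vertex has
    at most floor(max_degree / p) neighbors of its own color."""
    max_deg = max((len(nbrs) for nbrs in adj.values()), default=0)
    defect = max_deg // p
    color = {v: 0 for v in adj}
    changed = True
    while changed:
        changed = False
        for v in adj:
            counts = [0] * p
            for w in adj[v]:
                counts[color[w]] += 1
            if counts[color[v]] > defect:
                # Pigeonhole: some color is used by at most floor(deg(v)/p)
                # neighbors, so this move removes at least one monochromatic edge.
                color[v] = counts.index(min(counts))
                changed = True
    return color


def dc_cover(adj: Graph) -> Set[int]:
    """Sketch of DC on a d-regular graph: keep the largest color class."""
    d = len(next(iter(adj.values())))
    p = d // 2 + 1
    color = defective_coloring(adj, p)
    classes: List[Set[int]] = [set() for _ in range(p)]
    for v, c in color.items():
        classes[c].add(v)
    largest = max(classes, key=len)
    return set(adj) - largest
```

Because the largest class has at least $|V|/p$ vertices, the returned set has size at most $(1 - 1/p)|V|$, matching Eq. (1); the local search makes no claim to the running time stated for [9].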

###### Theorem 2

The algorithm DC for the P$k$VC problem on $d$-regular graphs, where $1 < k - 2 < d$, is an $O(d^2|V|)$-time $\frac{\lfloor d/2\rfloor(2d-k+2)}{(\lfloor d/2\rfloor+1)(d-k+2)}$-approximation.

On $d$-regular graphs, our algorithm DC for the P$k$VC problem for $k \ge 4$ beats all the best prior approximation results except for P4VC, where ours ties with the approximation by Devi et al. [10]. Table 2 summarizes the improvement over the corresponding entries in Table 1. From this table, we see that currently there is no published approximation result for P$k$VC on $d$-regular graphs such that (and ). For P4VC, the approximation ratios are constants strictly less than $2$ for all $d \ge 5$, while they are $2$ for both $d = 3$ and $d = 4$. In the next two sections, we design a $1.875$-approximation for P4VC on cubic graphs and a $1.852$-approximation for P4VC on $4$-regular graphs, respectively.
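The claim about the P4VC ratios can be checked with exact arithmetic. The helper name `dc_ratio` below is an assumption for this example; it simply evaluates the ratio of Theorem 2.

```python
from fractions import Fraction


def dc_ratio(d: int, k: int) -> Fraction:
    """floor(d/2)(2d-k+2) / ((floor(d/2)+1)(d-k+2)), defined for k - 2 < d."""
    if not k - 2 < d:
        raise ValueError("the bound requires k - 2 < d")
    half = d // 2
    return Fraction(half * (2 * d - k + 2), (half + 1) * (d - k + 2))


if __name__ == "__main__":
    # Print the P4VC ratio for small degrees as an exact fraction and a float.
    for d in range(3, 9):
        print(d, dc_ratio(d, 4), float(dc_ratio(d, 4)))
```

For $k = 4$ the ratio equals $2$ exactly at $d = 3$ and $d = 4$ and drops below $2$ from $d = 5$ on, which is why the cubic and $4$-regular cases are treated separately.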

## 3 P4VC on cubic graphs

In this and the next sections, we will design improved approximation algorithms for P4VC on cubic graphs and on $4$-regular graphs, with performance ratios $1.875$ and $1.852$, respectively. With them, all the approximation ratios for P4VC on $d$-regular graphs become strictly less than $2$.

Let $G = (V, E)$ denote the input $d$-regular graph (we set $d = 3$ later after we develop a lower bound for general $d$). We first examine some structural properties associated with the optimal $4$-path vertex covers in $G$.

###### Lemma 1

Let $G = (V, E)$ be a $d$-regular graph. Then, $\psi_4(G) \ge \frac{d-1}{2d}|V| - \frac{1}{d^2}f$, where $f$ denotes the total number of vertices on the $3$-cycles in $G[V - F^*]$ and $F^*$ is an optimal $4$-path vertex cover in $G$.

Proof. Let $F^*$ be an optimal $4$-path vertex cover in $G$; then the subgraph $G[V - F^*]$ does not contain any $4$-path. In other words, all the connected components of $G[V - F^*]$ can be classified into the following five kinds:

1. a $1$-path (also called a singleton),

2. a $2$-path,

3. a $3$-path,

4. a claw $K_{1,\ell}$, for $3 \le \ell \le d$, and

5. a $3$-cycle (also called a triangle);

let $a, b, c, e, f$ denote the total numbers of vertices in these five kinds of components, respectively. It follows that

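$$|V| = |F^*| + a + b + c + e + f. \tag{3}$$

Because $G$ is $d$-regular, the number of edges connecting a vertex of $F^*$ and a vertex of $V - F^*$ is at least

$$da + (d-1)b + \left(d - \frac{4}{3}\right)c + \left(\min_{3 \le \ell \le d}\left\{d - 2 + \frac{2}{\ell+1}\right\}\right)e + (d-2)f.$$

Given that each vertex of $F^*$ can be incident with at most $d$ such edges, and that $d - 2 + \frac{2}{\ell+1}$ achieves its minimum at $\ell = d$, we have

$$d|F^*| \ge \left(d - 2 + \frac{2}{d+1}\right)(a + b + c + e) + (d-2)f = \frac{d(d-1)}{d+1}\left(|V| - |F^*|\right) - \frac{2}{d+1}f.$$

It follows from Eq. (3) that, when $G$ is $d$-regular,

$$\psi_4(G) = |F^*| \ge \frac{d-1}{2d}|V| - \frac{1}{d^2}f. \tag{4}$$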

This proves the lemma.

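We now consider only $d = 3$, that is, $G$ is cubic.

In this case, Lemma 1 (or Eq. (4)) states that $\psi_4(G) \ge \frac{1}{9}(3|V| - f)$. Therefore, one sees that when the number of triangles in $G[V - F^*]$ is small, we can expect this new lower bound to be more effective, compared against the lower bound stated in Eq. (2). For example, when $f \le \frac{3}{5}|V|$, we have

$$\psi_4(G) \ge \frac{1}{9}\left(3 - \frac{3}{5}\right)|V| = \frac{4}{15}|V|.$$

It follows that the defective $(2, 1)$-coloring for $G$ gives a $4$-path vertex cover $F_{\min}$, see Eq. (1), satisfying

$$\frac{|F_{\min}|}{\psi_4(G)} \le \frac{1}{2} \times \frac{15}{4} = \frac{15}{8}. \tag{5}$$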

On the other hand, when $f > \frac{3}{5}|V|$, that is, there are a considerable number of triangle components in $G[V - F^*]$, we will construct a new graph denoted as $G_\triangle$ from the input graph $G$ as follows. For every triangle in $G$, we create a distinct vertex in $G_\triangle$; two vertices of $G_\triangle$ are adjacent if and only if the corresponding two triangles of $G$ share a common edge or they are connected by an edge in $G$. (We remark that two distinct triangles of the cubic graph $G$ either share exactly one edge, or have no vertex in common.) One may easily verify that the graph $G_\triangle$ can be constructed in time, , and the maximum degree of the vertices of $G_\triangle$ is at most $3$. Moreover, a subgraph of $G$ that is a collection of triangle components one-to-one corresponds to an independent set of $G_\triangle$.

Recall the best approximation algorithm for the MIS problem on degree-$3$ graphs by Berman and Fujito [3] has a performance ratio $\frac{6}{5} + \epsilon$ for any small positive $\epsilon$. We next run this approximation algorithm on $G_\triangle$ to obtain an independent set $I$ of $G_\triangle$, and therefore (roughly, by ignoring $\epsilon$)

$$|I| \ge \frac{5}{6}\alpha(G_\triangle) \ge \frac{5}{18}f,$$

where $\alpha(G_\triangle)$ is the independence number of $G_\triangle$, which corresponds to the maximum number of non-adjacent triangles in $G$; the second inequality holds because the $f/3$ triangle components of $G[V - F^*]$ correspond to an independent set of $G_\triangle$. Let $F$ denote the set of the vertices of $G$ not on the triangles of $I$; then $F$ is a solution to the P4VC problem on $G$, and its cardinality is

$$|F| \le |V| - \frac{5}{6}f = \frac{5}{8}|V| + \frac{3}{8}|V| - \frac{5}{6}f \le \frac{5}{8}|V| - \frac{5}{24}f, \tag{6}$$

where the last inequality is due to $f > \frac{3}{5}|V|$.

Combining Eq. (6) and the lower bound in Eq. (4), we have

$$\frac{|F|}{\psi_4(G)} \le \frac{5}{8} \times 3 = \frac{15}{8}. \tag{7}$$

From Eqs. (5) and (7), we can design an algorithm, denoted as Approx1, to first compute in $O(|V|)$ time a defective $(2, 1)$-coloring for the input cubic graph $G$, then in $O(|V|)$ time find the color with fewer vertices and set $F_{\min}$ to contain all these vertices. It also constructs the triangle graph $G_\triangle$ from $G$ and applies the best approximation algorithm for MIS to compute an independent set $I$ in $G_\triangle$, then it sets $F$ to contain all the vertices of $G$ not on the triangles of $I$. Lastly, it returns the smaller one between $F_{\min}$ and $F$ as the final solution. See Figure 2 for a high-level description of the algorithm Approx1, of which the running time is dominated by the running time of the best approximation algorithm for the MIS problem on degree-$3$ graphs.
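The triangle branch of Approx1 can be sketched as follows. Two things here are assumptions for illustration only: the function names and the adjacency-set representation are hypothetical, and the independent set is computed by a minimum-degree greedy rule that stands in for the algorithm of Berman and Fujito [3], so the $\frac{5}{6}$ guarantee used in the analysis does not carry over to this sketch. The full algorithm would also compute the defective-coloring cover and return the smaller of the two.

```python
from itertools import combinations
from typing import Dict, FrozenSet, List, Set

Graph = Dict[int, Set[int]]  # adjacency sets of a simple cubic graph


def triangles(adj: Graph) -> List[FrozenSet[int]]:
    """All 3-cycles of the graph, each as a frozenset of its three vertices."""
    found = set()
    for v in adj:
        for u, w in combinations(sorted(adj[v]), 2):
            if w in adj[u]:
                found.add(frozenset((u, v, w)))
    return sorted(found, key=sorted)


def triangle_graph(adj: Graph, tris: List[FrozenSet[int]]) -> Dict[int, Set[int]]:
    """One node per triangle; two nodes are adjacent if the triangles intersect
    (in a cubic graph this means sharing an edge) or are joined by an edge."""
    tg: Dict[int, Set[int]] = {i: set() for i in range(len(tris))}
    for i, j in combinations(range(len(tris)), 2):
        share = bool(tris[i] & tris[j])
        joined = any(adj[v] & tris[j] for v in tris[i])
        if share or joined:
            tg[i].add(j)
            tg[j].add(i)
    return tg


def greedy_independent_set(g: Dict[int, Set[int]]) -> Set[int]:
    """Minimum-degree greedy; only a stand-in for the algorithm of [3]."""
    alive = set(g)
    chosen: Set[int] = set()
    while alive:
        v = min(alive, key=lambda x: (len(g[x] & alive), x))
        chosen.add(v)
        alive -= g[v] | {v}
    return chosen


def triangle_branch_cover(adj: Graph) -> Set[int]:
    """Keep the vertices of pairwise non-adjacent triangles; delete the rest."""
    tris = triangles(adj)
    picked = greedy_independent_set(triangle_graph(adj, tris))
    kept = set().union(*(tris[i] for i in picked))
    return set(adj) - kept
```

The kept triangles share no vertex and no connecting edge, so every remaining component is a triangle and the output is a feasible $4$-path vertex cover.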

We thus conclude with Theorem 3.

###### Theorem 3

The algorithm Approx1 is a $1.875$-approximation for the P4VC problem on cubic graphs.

## 4 P4VC on 4-regular graphs

The design ideas in the above algorithm Approx1 for cubic graphs do not trivially extend to $4$-regular graphs. One of the most important reasons is that there are many more configurations for two triangles being adjacent (due to degree $4$), and the maximum degree of the similarly constructed triangle graph can be as high as . Such a high maximum degree voids the effectiveness of the best approximation algorithm for MIS on degree-$\Delta$ graphs.

We present next an approximation algorithm, denoted as Approx2, for P4VC on $d$-regular graphs when $d \ge 4$ is even, and show that its performance ratio is $\frac{(3d-2)(2d-2)}{(3d+4)(d-2)}$. When $d = 4$, the ratio is $1.875$. We are able to provide a better but slightly more complex analysis for $d = 4$ to show that the approximation ratio is actually no greater than $1.852$; we also use an instance to show that the ratio is almost tight for Approx2 when $d = 4$.

### 4.1 An approximation algorithm when d≥4 is even

One of the design ideas in our algorithm is borrowed from Devi et al. [10]. Let $G = (V, E)$ denote the input $d$-regular graph, where $d \ge 4$ is even.

In the algorithm Approx2, we first compute a subset $V_1$ of vertices by iteratively adding to it a degree-$d$ vertex until no more degree-$d$ vertex exists in the remaining graph; then similarly and sequentially compute a subset $V_i$ of vertices by iteratively adding to it a degree-$(d-i+1)$ vertex until no more degree-$(d-i+1)$ vertex exists in the remaining graph, for $i = 2, 3, \ldots, d-2$. Denote

$$V_{d-1} = V - \bigcup_{i=1}^{d-2} V_i.$$

The last remaining graph is $G[V_{d-1}]$, which has maximum degree $2$ and thus an optimal (i.e., minimum) $4$-path vertex cover, denoted as $U_{d-1}$, can be computed in $O(|V|)$ time.

We will prove in Theorem 4 that $U_{d-1} \cup \bigcup_{i=1}^{d-2} V_i$ is a $4$-path vertex cover of the input graph $G$, and so is $V - (V_{2i-1} \cup V_{2i})$, for each $i = 1, 2, \ldots, d/2 - 1$. The algorithm Approx2 outputs the smallest among these $d/2$ covers as the final solution. A high-level description of Approx2 is depicted in Figure 3, and we prove in Theorem 4 that Approx2 is an $O(d^2|V|)$-time $\frac{(3d-2)(2d-2)}{(3d+4)(d-2)}$-approximation for P4VC on $d$-regular graphs, when $d \ge 4$ is even.
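The steps of Approx2 described above can be sketched as follows. The function names and the adjacency-set representation are assumptions for this example, and the peeling step is written for clarity rather than for the running time claimed in Theorem 4.

```python
from typing import Dict, List, Set

Graph = Dict[int, Set[int]]  # adjacency sets of a simple d-regular graph


def peel(adj: Graph, alive: Set[int], degree: int) -> Set[int]:
    """Repeatedly move a vertex whose degree in the remaining graph equals
    `degree` out of `alive`; return the set of moved vertices."""
    picked: Set[int] = set()
    while True:
        v = next((u for u in sorted(alive) if len(adj[u] & alive) == degree), None)
        if v is None:
            return picked
        picked.add(v)
        alive.remove(v)


def cover_max_degree_two(adj: Graph, alive: Set[int]) -> Set[int]:
    """Optimal 4-path vertex cover when every component is a path or a cycle."""
    cover: Set[int] = set()
    seen: Set[int] = set()
    # Visit vertices of degree at most 1 first, so each path is walked end to end.
    starts = sorted(alive, key=lambda u: (len(adj[u] & alive) == 2, u))
    for s in starts:
        if s in seen:
            continue
        order: List[int] = [s]
        seen.add(s)
        while True:
            nxt = [w for w in adj[order[-1]] & alive if w not in seen]
            if not nxt:
                break
            order.append(min(nxt))
            seen.add(order[-1])
        is_cycle = len(order) >= 3 and order[0] in adj[order[-1]]
        if is_cycle:
            if len(order) >= 4:  # a triangle contains no 4-path
                cover.update(order[0::4])  # ceil(n/4) vertices
        else:
            cover.update(order[3::4])  # floor(n/4) vertices
    return cover


def approx2(adj: Graph) -> Set[int]:
    """Sketch of Approx2 for a d-regular graph with d >= 4 even."""
    d = len(next(iter(adj.values())))
    alive = set(adj)
    layers = [peel(adj, alive, deg) for deg in range(d, 2, -1)]  # V_1 .. V_{d-2}
    u_last = cover_max_degree_two(adj, alive)  # U_{d-1} on G[V_{d-1}]
    candidates = [u_last.union(*layers)]
    for i in range(0, d - 2, 2):
        candidates.append(set(adj) - layers[i] - layers[i + 1])
    return min(candidates, key=len)
```

On the complete graph $K_5$, for instance, the peeling yields two singletons and a remaining triangle, so the candidate $V_1 \cup V_2 \cup U_3$ has two vertices, which matches $\psi_4(K_5) = 2$.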

###### Theorem 4

For the P4VC problem on $d$-regular graphs, when $d \ge 4$ is even, the algorithm Approx2 is an $O(d^2|V|)$-time $\frac{(3d-2)(2d-2)}{(3d+4)(d-2)}$-approximation.

Proof. First of all, since $G$ is $d$-regular, we have $|E| = d|V|/2$; therefore, $V_i$ is computed in $O(d|V|)$ time, for each $i = 1, 2, \ldots, d-2$. Computing $U_{d-1}$ needs only $O(|V|)$ time since each connected component of $G[V_{d-1}]$ is either a simple cycle or a simple path. That is, the running time of Approx2 is in $O(d^2|V|)$.

Next, we conclude that the vertices of $V_i$, for each $i = 1, 2, \ldots, d-2$, are pairwise non-adjacent. Also, in the induced subgraph $G[V - \bigcup_{j=1}^{2i-2} V_j]$, which has maximum degree $d - 2i + 2$, since every vertex of the computed subset $V_{2i-1}$ has degree exactly $d - 2i + 2$, a vertex of $V_{2i}$ can be adjacent to at most one vertex of $V_{2i-1}$. These two properties suggest that the longest path in the subgraph induced on $V_{2i-1} \cup V_{2i}$, $G[V_{2i-1} \cup V_{2i}]$, contains at most three vertices (two of $V_{2i}$ and one of $V_{2i-1}$), that is, $V - (V_{2i-1} \cup V_{2i})$ is a $4$-path vertex cover in $G$, for each $i = 1, 2, \ldots, d/2 - 1$. On the other hand, since $U_{d-1}$ is a $4$-path vertex cover in $G[V_{d-1}]$, $U_{d-1} \cup \bigcup_{i=1}^{d-2} V_i$ is a $4$-path vertex cover in $G$. This proves that the solution returned by Approx2 is feasible.

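Recall that each connected component of $G[V_{d-1}]$ is either a simple cycle or a simple path. The optimal $4$-path vertex cover $U_{d-1}$ in $G[V_{d-1}]$ contains exactly $\lfloor n/4 \rfloor$ vertices from each $n$-path, and contains exactly $\lceil n/4 \rceil$ vertices from each $n$-cycle [1, 6]. From the fact that $\lfloor n/4 \rfloor \le \frac{n}{4}$ and $\lceil n/4 \rceil \le \frac{2n}{5}$ for all $n \ge 3$, we conclude that

$$|U_{d-1}| \le \frac{2}{5}|V_{d-1}|.$$

Consequently, the size of $U_{d-1} \cup \bigcup_{i=1}^{d-2} V_i$ is

$$|U_{d-1}| + \sum_{i=1}^{d-2}|V_i| \le \frac{2}{5}|V| + \frac{3}{5}\sum_{i=1}^{d-2}|V_i|. \tag{8}$$

It follows that the minimum cardinality of these $d/2$ $4$-path vertex covers is at most

$$\frac{6}{3d+4}\sum_{i=1}^{d/2-1}\left(|V| - |V_{2i-1}| - |V_{2i}|\right) + \frac{10}{3d+4}\left(\frac{2}{5}|V| + \frac{3}{5}\sum_{i=1}^{d/2-1}\left(|V_{2i-1}| + |V_{2i}|\right)\right) = \frac{3d-2}{3d+4}|V|.$$

From Eq. (2) we obtain the lower bound of $\frac{d-2}{2d-2}|V|$ on $\psi_4(G)$ using $k = 4$ [6], and therefore we prove that the algorithm Approx2 has an approximation ratio of $\frac{(3d-2)(2d-2)}{(3d+4)(d-2)}$. Note that due to $\frac{3d-2}{3d+4} < \frac{d}{d+2}$, the above ratio is strictly less than that stated in Theorem 2 (or the one by Devi et al. [10]).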

###### Corollary 1

For the P4VC problem on $4$-regular graphs, the algorithm Approx2 is an $O(|V|)$-time $1.875$-approximation.
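The arithmetic behind Theorem 4 and Corollary 1 can be replayed exactly. The helper names below are assumptions for this example; they evaluate the convex-combination weights used in the proof, the resulting ratio, and the Theorem 2 ratio for $k = 4$.

```python
from fractions import Fraction


def weights_sum(d: int) -> Fraction:
    """Total weight of the d/2 covers in the averaging step of Theorem 4."""
    return (d // 2 - 1) * Fraction(6, 3 * d + 4) + Fraction(10, 3 * d + 4)


def approx2_ratio(d: int) -> Fraction:
    """Upper bound (3d-2)/(3d+4) divided by the lower bound (d-2)/(2d-2)."""
    upper = Fraction(3 * d - 2, 3 * d + 4)
    lower = Fraction(d - 2, 2 * d - 2)
    return upper / lower


def dc_ratio_p4(d: int) -> Fraction:
    """Ratio of Theorem 2 specialised to k = 4."""
    half = d // 2
    return Fraction(half * (2 * d - 2), (half + 1) * (d - 2))
```

For $d = 4$ the ratio evaluates to $\frac{15}{8} = 1.875$, and for every even $d \ge 4$ it stays strictly below the Theorem 2 ratio.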

### 4.2 Approx2 is a 1.852-approximation

In this section, we present a better analysis for our algorithm Approx2 than what is done in the proof of Theorem 4. Theorem 4 leads to the conclusion in Corollary 1 that Approx2 is a $1.875$-approximation for P4VC on $4$-regular graphs. Our better analysis shows that the performance ratio of Approx2 is actually at most $1.852$.

Recall that when $d = 4$, our algorithm Approx2 (see Figure 3) computes $V_1$ and $V_2$, and computes an optimal $4$-path vertex cover $U_3$ in $G[V_3]$, where $V_3 = V - (V_1 \cup V_2)$. Both $V_3$ and $V_1 \cup V_2 \cup U_3$ are feasible $4$-path vertex covers in the input graph $G$. Approx2 returns the smaller one between $V_3$ and $V_1 \cup V_2 \cup U_3$, denoted as . In the following we denote as .

#### 4.2.1 An outline of the analysis

Throughout the analysis, we fix an arbitrary optimal $4$-path vertex cover $B$ in $G$ for discussion.

We color the vertices of $B$ black and color the other vertices of $G$ (that is, $V - B$) white. An edge of $G$ is black (respectively, white) if both of its endpoints are black (respectively, white); an edge of $G$ that is neither black nor white is bicolor. Using this coloring scheme, a bicolor -edge has its endpoint in white and its endpoint in black. Bicolor -edges, -edges, and -edges are defined similarly.

Recall that each connected component of $G[V - B]$ is one of the following five kinds:

1. a $1$-path,

2. a $2$-path,

3. a $3$-path,

4. a claw $K_{1,\ell}$, for $\ell = 3, 4$, and

5. a triangle (i.e., a $3$-cycle).

We merge the first four kinds and name them uniformly a star. The center vertex of a star is the one with the maximum degree (tie broken arbitrarily), and all the other vertices (there can be $0$, $1$, $2$, $3$, or $4$ of them) are referred to as the satellites of the star. This way, each connected component of $G[V - B]$ is either a triangle or a star.

Our goal is to show that . Let . Since for any coefficient , it suffices to show that there is a constant such that . In the remainder of this section, we show that satisfies this inequality. To reach our goal, we first make two important observations summarized in the next two lemmas, respectively.

###### Lemma 2

Let $b_e$ be the number of black edges in $G$, and $s_c$ be the number of star components in $G[V - B]$. Then, $|B| = \frac{1}{3}|V| + \frac{1}{3}b_e + \frac{1}{3}s_c$.

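Proof. We apply a similar counting as in the proof of Lemma 1. Let $x_i$ denote the number of star components in $G[V - B]$ in which the star has $i$ satellites, for $i = 0, 1, \ldots, 4$; and $y$ denote the number of triangle components. It follows that

$$|V| = |B| + \sum_{i=0}^{4}(i+1)x_i + 3y.$$

Because $G$ is $4$-regular and there are $b_e$ black edges, the number of bicolor edges (each connecting a vertex of $B$ and a vertex of $V - B$) is exactly

$$4|B| - 2b_e = \sum_{i=0}^{4}2(i+2)x_i + 6y.$$

By eliminating $y$ from the above two equalities, we have

$$2|V| - 2|B| - \sum_{i=0}^{4}2(i+1)x_i = 4|B| - 2b_e - \sum_{i=0}^{4}2(i+2)x_i.$$

Using $s_c = \sum_{i=0}^{4}x_i$, we have $6|B| = 2|V| + 2b_e + 2s_c$, that is, $|B| = \frac{1}{3}|V| + \frac{1}{3}b_e + \frac{1}{3}s_c$. This proves the lemma.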
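The identity of Lemma 2 can be checked by brute force on a small $4$-regular graph. The sketch below is only an illustration with hypothetical helper names; it relies on the observation that the counting argument uses nothing beyond the component structure of $G[V - B]$, so it tests the identity for every feasible $4$-path vertex cover that it is given, optimal or not.

```python
from itertools import combinations
from typing import Dict, List, Set

Graph = Dict[int, Set[int]]  # adjacency sets of a simple 4-regular graph


def has_path_on(adj: Graph, alive: Set[int], k: int) -> bool:
    """True if the subgraph induced on `alive` has a simple path on k vertices."""

    def extend(v: int, used: Set[int]) -> bool:
        if len(used) == k:
            return True
        return any(extend(w, used | {w}) for w in adj[v] & alive if w not in used)

    return any(extend(v, {v}) for v in alive)


def components(adj: Graph, alive: Set[int]) -> List[Set[int]]:
    """Connected components of the subgraph induced on `alive`."""
    comps: List[Set[int]] = []
    seen: Set[int] = set()
    for s in alive:
        if s in seen:
            continue
        comp, stack = {s}, [s]
        while stack:
            v = stack.pop()
            for w in adj[v] & alive:
                if w not in comp:
                    comp.add(w)
                    stack.append(w)
        seen |= comp
        comps.append(comp)
    return comps


def lemma2_holds(adj: Graph, black: Set[int]) -> bool:
    """Check 3|B| = |V| + b_e + s_c for a 4-path vertex cover B."""
    white = set(adj) - black
    assert not has_path_on(adj, white, 4), "B must be a 4-path vertex cover"
    b_e = sum(1 for u, v in combinations(sorted(black), 2) if v in adj[u])
    s_c = 0
    for comp in components(adj, white):
        edges = sum(len(adj[v] & comp) for v in comp) // 2
        is_triangle = len(comp) == 3 and edges == 3
        s_c += 0 if is_triangle else 1  # every non-triangle component is a star
    return 3 * len(black) == len(adj) + b_e + s_c
```

On the octahedron, for example, a smallest cover consists of three mutually adjacent vertices, leaving one white triangle, so $b_e = 3$, $s_c = 0$, and $3 \cdot 3 = 6 + 3 + 0$.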

###### Lemma 3

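Let $p_{3\downarrow}$ (respectively, $p_{4\uparrow}$) be the total number of vertices in those connected components of $G[V_3]$ each of which is a path of order at most $3$ (respectively, at least $4$). Let $c_3$ (respectively, $c_{4,6\uparrow}$) be the total number of vertices in those connected components of $G[V_3]$ each of which is a cycle of order exactly $3$ (respectively, exactly $4$ or at least $6$). Then, $|U_3| \le \frac{2}{5}|V_3| - \frac{2}{5}(p_{3\downarrow} + c_3) - \frac{3}{20}p_{4\uparrow} - \frac{1}{15}c_{4,6\uparrow}$.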

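Proof. It is known that $\psi_4(P_n) = \lfloor n/4 \rfloor$ and $\psi_4(C_n) = \lceil n/4 \rceil$, where $P_n$ and $C_n$ are a simple path and a simple cycle of order $n$, respectively [1, 6]. Let $c_5$ be the total number of vertices in those connected components of $G[V_3]$ each of which is a cycle of order exactly $5$. Therefore,

$$|U_3| \le \frac{1}{4}p_{4\uparrow} + \frac{1}{3}c_{4,6\uparrow} + \frac{2}{5}c_5.$$

Using $|V_3| = p_{3\downarrow} + p_{4\uparrow} + c_3 + c_{4,6\uparrow} + c_5$ to cancel out $c_5$, we achieve the inequality stated in the lemma.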
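The per-component coefficients $\frac{1}{4}$, $\frac{1}{3}$ and $\frac{2}{5}$ in the proof of Lemma 3 can be checked numerically. The helper names are assumptions for this example, and the triangle is treated separately because it contains no $4$-path at all.

```python
from fractions import Fraction


def path_density(n: int) -> Fraction:
    """Fraction of an n-vertex path taken by an optimal 4-path vertex cover."""
    return Fraction(n // 4, n)


def cycle_density(n: int) -> Fraction:
    """Same for an n-cycle; a triangle has no 4-path, so it needs no vertex."""
    if n == 3:
        return Fraction(0)
    return Fraction(-(-n // 4), n)  # ceil(n/4) / n
```

The $5$-cycle is the only component type that reaches density $\frac{2}{5}$, which is why its vertex count $c_5$ is the term eliminated in the last step of the proof.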

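By the above Lemmas 2 and 3, it remains to show the following inequality:

$$\frac{\alpha|V_1| + \alpha|V_2| + \left(1 - \frac{3\alpha}{5}\right)|V_3| - \frac{2\alpha}{5}(p_{3\downarrow} + c_3) - \frac{3\alpha}{20}p_{4\uparrow} - \frac{\alpha}{15}c_{4,6\uparrow}}{\frac{1}{3}|V| + \frac{1}{3}b_e + \frac{1}{3}s_c} \le 1.852. \tag{9}$$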

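In $\frac{1}{3}|V| + \frac{1}{3}b_e + \frac{1}{3}s_c$ (the denominator in Eq. (9)), we call $\frac{1}{3}|V|$ the basic lower bound on $|B|$ and call $\frac{1}{3}b_e + \frac{1}{3}s_c$ the extra lower bound on $|B|$.

Similarly, in the numerator in Eq. (9), we call $\alpha|V_1| + \alpha|V_2| + \left(1 - \frac{3\alpha}{5}\right)|V_3|$ the basic upper bound on the size of the solution returned by Approx2 and call $\frac{2\alpha}{5}(p_{3\downarrow} + c_3) + \frac{3\alpha}{20}p_{4\uparrow} + \frac{\alpha}{15}c_{4,6\uparrow}$ the saving on it.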

Roughly speaking, we used only the basic lower bound on $|B|$ (as in Eq. (2)) and the basic upper bound (with $\alpha = \frac{5}{8}$, as in Eq. (8)) in the proof of Theorem 4 (when $d = 4$). In other words, Lemma 2 gives a better lower bound than Eq. (2) when $b_e + s_c > 0$, and Lemma 3 gives a better estimation than Eq. (8) when $p_{3\downarrow} + c_3 + p_{4\uparrow} + c_{4,6\uparrow} > 0$. The extra lower bound and the saving will help us get a better analysis.

It seems difficult to verify Eq. (9) if we consider the graph $G$ as a whole. So, to ease the proof of Eq. (9), we consider the following two kinds of subgraphs of $G$ and verify Eq. (9) on each of these subgraphs.

• Type-1: The subgraphs of this type one-to-one correspond to the connected components of $G[V - B]$, and they are constructed as follows. Consider a connected component $C$ of $G[V - B]$, which is either a triangle or a star, and all its vertices are white. Let $H_C$ be the subgraph of $G$ induced by the vertices of $C$ and their black neighbors in $G$. Let $H'_C$ be the graph obtained from $H_C$ by deleting all black edges. Then, $H'_C$ is the type-1 subgraph corresponding to $C$.

• Type-2: The subgraphs of this type one-to-one correspond to the black edges in $G$. That is, the subgraph corresponding to a black edge $e$ consists of only $e$ and its two black endpoints.

Let $\mathcal{H}_1$ (respectively, $\mathcal{H}_2$) be the collection of type-1 (respectively, type-2) subgraphs in $G$. Obviously, each white vertex of $G$ appears in exactly one subgraph in $\mathcal{H}_1 \cup \mathcal{H}_2$. In contrast, a black vertex of $G$ can appear in one or more subgraphs in $\mathcal{H}_1 \cup \mathcal{H}_2$. Nevertheless, each edge of $G$ appears in exactly one subgraph in $\mathcal{H}_1 \cup \mathcal{H}_2$.

To prove Eq. (9), we proceed as follows:

Step 1: Distribute the numerator and the denominator of the left hand side of the inequality to the subgraphs in $\mathcal{H}_1 \cup \mathcal{H}_2$.

Step 2: Prove that for each subgraph $H$ in $\mathcal{H}_1 \cup \mathcal{H}_2$, $\frac{N(H)}{D(H)} \le 1.852$, where $N(H)$ (respectively, $D(H)$) is the portion of the numerator (respectively, denominator) of the left hand side of the inequality distributed to $H$.

The next three subsections are devoted to detailing the above two steps, respectively. After these two steps, we are done because for any sequence of positive numbers $a_1, b_1, \ldots, a_m, b_m$, $\frac{a_i}{b_i} \le \rho$ for all $i$ implies $\frac{\sum_{i=1}^{m} a_i}{\sum_{i=1}^{m} b_i} \le \rho$.

### 4.3 Distributing the denominator

Initially, we distribute the basic lower bound (namely, $\frac{1}{3}|V|$) evenly to the edges in $G$ so that each edge holds a basic lower bound of $\frac{1}{6}$ (recall that $|E| = 2|V|$ since $G$ is $4$-regular); we further distribute the extra lower bound (namely, $\frac{1}{3}b_e + \frac{1}{3}s_c$) evenly to the black edges in $G$ and the star components of $G[V - B]$ so that each black edge holds an extra lower bound of $\frac{1}{3}$ and so does each star component of $G[V - B]$.

If a $5$-cycle in $G[V_3]$ has at least one black edge, then it is good; otherwise, it is bad. Consider a bad $5$-cycle $C$ in $G[V_3]$. Since $B$ is a solution (i.e., a $4$-path vertex cover), $C$ must have exactly two black vertices and a unique white edge. Let $e$ be the white edge in $C$, and $v$ be the white vertex of $C$ that is not an endpoint of $e$. We call $v$ the independent white vertex in $C$. If $e$ or $v$ appears in a star component of $G[V - B]$ or at least one vertex of $C$ is incident to a black edge in $G$, then $C$ is slightly bad; otherwise, $C$ is very bad. A simple but important observation is that no star component of $G[V - B]$ can contain both $e$ and $v$. A -edge of $G$ is good if its black endpoint either is an endpoint of a black edge in $G$ or appears in a good or slightly bad $5$-cycle of $G[V_3]$.

First, consider a connected component of that has at least one black edge. Let be the number of black edges in , and be the number of good -edges in whose black endpoints appear in . We collect the extra lower bounds held by the black edges of ; the total is obviously . From this total, we distribute evenly to the good -edges so that each of them receives , and then distribute the remaining (namely, ) to the black edge so that each of them receives . Since , each black edge in still holds an extra lower bound of .

Next, consider a slightly bad -cycle in . Let be the white edge in , and be the independent white vertex of . We transfer a portion of the extra lower bound (namely, ) held by as follows (three possible cases):

• Suppose that appears in a star component of . Then, we say that is of type-1. Among the extra lower bound (namely, ) held by , we transfer to each good -edge whose black endpoint appears in . Obviously, is the unique bad -cycle whose white edge appears in . Moreover, there are exactly good -edges whose black endpoints appear in . Thus, the extra lower bound still held by is .

• Suppose that appears in a star component of but does not. Then, we say that is of type-2. Among the extra lower bound (namely, ) held by , we transfer to each good -edge whose black endpoint appears in . Obviously, there are at most bad -cycles whose independent white vertices appear in . Moreover, each bad -cycle contains the black endpoints of exactly good -edges. Thus, the extra lower bound still held by is