On r-Simple k-Path and Related Problems Parameterized by k/r

06/24/2018 ∙ by Gregory Gutin, et al. ∙ Royal Holloway, University of London ∙ Ben-Gurion University of the Negev

Abasi et al. (2014) and Gabizon et al. (2015) studied the following problems. In the r-Simple k-Path problem, given a digraph G on n vertices and integers r,k, decide whether G has an r-simple k-path, which is a walk where every vertex occurs at most r times and the total number of vertex occurrences is k. In the (r,k)-Monomial Detection problem, given an arithmetic circuit that encodes some polynomial P on n variables and integers k,r, decide whether P has a monomial of degree k where the degree of each variable is at most r. In the p-Set (r,q)-Packing problem, given a universe V, positive integers p,q,r, and a collection H of sets of size p whose elements belong to V, decide whether there exists a subcollection H' of H of size q where each element occurs in at most r sets of H'. Abasi et al. and Gabizon et al. proved that the three problems are single-exponentially fixed-parameter tractable (FPT) when parameterized by (k/r) log r, where k=pq for p-Set (r,q)-Packing, and asked whether the log r factor in the exponent can be avoided. We consider their question from a wider perspective: are the above problems FPT when parameterized by k/r only? We resolve the wider question by (a) obtaining a 2^O((k/r)^2 log(k/r)) (n + log k)^O(1)-time algorithm for r-Simple k-Path on digraphs and a 2^O(k/r) (n + log k)^O(1)-time algorithm for r-Simple k-Path on undirected graphs (i.e., for undirected graphs we answer the original question in the affirmative), (b) showing that p-Set (r,q)-Packing is FPT, and (c) proving that (r,k)-Monomial Detection is para-NP-hard. For p-Set (r,q)-Packing, we obtain a polynomial kernel for any fixed p, which resolves a question posed by Gabizon et al. regarding the existence of polynomial kernels for problems with relaxed disjointness constraints.


1 Introduction

Abasi et al. [1] introduced the following extension of the Directed $k$-Path problem called the Directed $r$-Simple $k$-Path problem: given an $n$-vertex digraph $G$ and positive integers $r,k$ (footnote 1: note that $k$ can be substantially larger than $n$), decide whether $G$ has an $r$-simple $k$-path, that is, a walk where every vertex occurs at most $r$ times and the total number of vertex occurrences is $k$. At first glance, one may think that the time complexity of any algorithm for solving Directed $r$-Simple $k$-Path is an increasing function in $r$. However, Abasi et al. showed that this is not the case by designing a randomized algorithm whose running time is $2^{O((k/r)\log r)}$ times a polynomial in the input size. Their algorithm was obtained by a simple reduction to the $(r,k)$-Monomial Detection problem in which the input consists of an arithmetic circuit $C$ that succinctly encodes some $n$-variable polynomial $P$, and positive integers $k,r$. The goal is to decide whether $P$ has a monomial of total degree $k$ where the degree of each variable is at most $r$. Abasi et al. proved that $(r,k)$-Monomial Detection can be solved by a randomized algorithm with a time complexity of the same form. Gabizon et al. [24] derandomized these two randomized algorithms, though at the expense of increasing the constant factor in the exponent and restricting the input of the $(r,k)$-Monomial Detection problem to non-canceling circuits (footnote 2: non-defined terms can be found in the next section). Both algorithms of Gabizon et al. also have running times of this form. Gabizon et al. [24] also studied the $p$-Set $(r,q)$-Packing problem in which the input consists of an $n$-element universe $V$, positive integers $p,q,r$, and a collection $H$ of sets of size $p$ whose elements belong to $V$. The goal is to decide whether there exists a subcollection $H'$ of $H$ of size $q$ where each element occurs in at most $r$ sets of $H'$. Gabizon et al. designed an algorithm for $p$-Set $(r,q)$-Packing whose running time is again of this form, where $k=pq$. In other words, the above results show that the three problems are single-exponentially fixed-parameter tractable (FPT) when parameterized by the product of two parameters, $k/r$ and $\log r$.

The motivation behind the relaxation of disjointness constraints is to enable finding substantially better (larger) solutions at the expense of allowing elements to be used multiple (but bounded by ) times. For example, for any choice of , Abasi et al. [1] presented digraphs that have at least one -simple -path but do not have even a single (simple) path on vertices. Thus, even if we allow each vertex to be visited at most twice rather than once, already we can gain an exponential increase in the size of the output solution. The same result holds also for undirected graphs.333Undirected -Simple -Path can be viewed as the special case of Directed -Simple -Path where every pair of vertices has either no arc or arcs in both directions. In addition, Abasi et al. [1] showed that the relaxation does not make the problem easy: both Undirected -Simple -Path and Directed -Simple -Path are shown to be NP-hard with . From this, we observe that NP-hardness holds for a wide variety of choices of , ranging for being any fixed constant to being super-exponential in (e.g., for any fixed constant ). In addition, NP-hardness holds when as well as when for any fixed constant .

As an open problem, both Abasi et al. and Gabizon et al. asked whether it is possible to avoid an exponential dependency on . In other words, they asked whether the above problems are single-exponentially FPT when parameterized by alone.444The interpretation of is a tight lower bound on the number of distinct elements any solution must use. To answer this question for -Monomial Detection, Bonamy et al. [14] proved that the running time of the algorithms of Abasi et al. [1] and of Gabizon et al. [24] for -Monomial Detection are optimal under the Exponential Time Hypothesis (ETH) in the following sense. Unless ETH fails there is no -time algorithm for -Monomial Detection even if for any The question remains open for Directed -Simple -Path and -Set -Packing.

We consider the question from a wider perspective of parameterized complexity: are the above problems FPT when parameterized by $k/r$ only, i.e., does there exist a computable function $f$ such that each of the problems admits an algorithm whose running time is $f(k/r)$ times a polynomial in the input size?

Note that the above algorithms by Abasi et al. and Gabizon et al. are not even XP-algorithms in the parameter because (encoded in binary) can be much larger than the size of the problem instance under consideration. In particular, even when , these algorithms can run in time exponential in the input size. In addition, note that all three problems are easily seen to be FPT when parameterized by and simultaneously, since algorithms that run in time immediately follow by simple modifications of known algorithms for the corresponding non-relaxed versions. When is large enough, the running times of of the algorithms by Abasi et al. and Gabizon et al. are superior. Here, the factor in the exponent naturally arises, and seems to be perhaps unavoidable. To see this, first consider the very special case where the input contains only distinct elements. Then, we can store counters that keep track of how many times each element is used. Our array of counters would have possible configurations, hence a running time of is trivial. However, counters are completely prohibited when dependence on is forbidden, which already renders this extreme special case non-obvious. In fact, a running time of not only disallows using such an array of counters, but it forbids the usage of even a single counter. Thus, in advance, it might seem more natural to vote for W[1]-hardness over FPT for all three problems with respect to .

Our Contribution.

We resolve the parameterized complexity of all three problems, namely Directed $r$-Simple $k$-Path, $p$-Set $(r,q)$-Packing and $(r,k)$-Monomial Detection, with respect to the parameter $k/r$. Our main contribution consists of a $2^{O((k/r)^2\log(k/r))}\cdot(n+\log k)^{O(1)}$-time algorithm for Directed $r$-Simple $k$-Path and a $2^{O(k/r)}\cdot(n+\log k)^{O(1)}$-time algorithm for Undirected $r$-Simple $k$-Path (footnote 5: recall that $n$ is the number of vertices in the input (di)graph). For Undirected $r$-Simple $k$-Path, this answers the question posed by Abasi et al. [1] and Gabizon et al. [24], and reiterated by Bonamy et al. [14] and Socala [41]. (As also noted in previous works, it is easily seen that even when is polynomial in , none of the three problems can be solved in time unless the ETH fails.) In addition, we show that $p$-Set $(r,q)$-Packing is FPT based on the representative set method in parameterized algorithmics. On the way to designing this algorithm, we obtain a polynomial kernel for any fixed $p$, which resolves another question posed by Gabizon et al. regarding the existence of polynomial kernels for problems with relaxed disjointness constraints whose sizes are decreasing functions of $r$. We remark that all of our algorithms are deterministic, and are based on ideas completely different from those of Abasi et al. [1] and of Gabizon et al. [24].

Next, we introduce an extension of -Set -Packing to multisets called the -Multiset -Packing problem. In -Multiset -Packing, consists of multisets and in no element of has more than occurrences in total (i.e., if a multiset in contains copies of element , all other multisets of can have at most occurrences of in total). We prove that -Multiset -Packing parameterized by is W[1]-hard. Using this result, we also prove that -Monomial Detection parameterized by is W[1]-hard even if (i) is polynomially bounded in the input length, (ii) the number of distinct variables is , and (iii) the circuit is non-canceling. Moreover, we show that -Monomial Detection is para-NP-hard even if only two distinct variables are in polynomial and the circuit is non-canceling.

The most technical parts of the paper deal with the Directed -Simple -Path and Undirected -Simple -Path problems. We prove that Directed -Simple -Path can be solved in time and polynomial space using a chain of reductions from Directed -Simple -Path that includes three auxiliary problems. The first of these problems is the Directed -Simple Long -Path problem, where we are given a strongly connected digraph , positive integers , and vertices . The objective is to either (i) determine that has an -simple -path or (ii) output the largest integer such that has an -simple -path of size . It is not hard to see that we may assume that has neither a path of size at least nor a cycle of length at least . The key result on Directed -Simple Long -Path is that under the assumption above, there is always, as a solution, an -simple path with fewer than distinct arcs.666In addition, we show that this bound is essentially tight.

For reductions using the other two problems we apply several parameterized algorithms approaches (including color coding and integer linear programming parameterized by the number of variables) and new structural insights. Here, we often alternate between the view of the solution as an $r$-simple $k$-path and the view of the solution as an Eulerian digraph with degree constraints.

Our proof that Undirected -Simple -Path can be solved in time , initially uses an approach similar to that applied for Directed -Simple -Path. Using the fact that the input graph is undirected, we are able to show that the bound above can be improved to . However, this result in itself is only sufficient to show the existence of an -time algorithm for Undirected -Simple -Path using the reductions applied for Directed -Simple -Path. Thus, we have to take a different route based on a deeper understanding of the structure of the solution. Our approach is partially inspired by an idea from the recent work of Berger et al. [8] and involves a special decomposition of the multigraph induced by a solution for Undirected -Simple -Path into two multigraphs. In our case, one of the multigraphs, , has treewidth at most 2, and all vertices of are of even degree and different color (in a special coloring), i.e.  is colorful. The second multigraph corresponds to an -simple path which visits each component of (which ensures the connectivity of the generated solution), and vertices of the same color are visited by in total a prescribed number of times. The existence of the decomposition above is verified by a two-level dynamic programming algorithm. This algorithm is followed by a way to bound . Here, we identify that when is large enough compared to , then the vertex cover number of the graph can be bounded. The decomposition is modified accordingly to enable the use of a flow network to handle its second multigraph.

Related Work.

Agrawal et al. [2] showed the power of relaxed disjointness conditions in the context of a problem that otherwise admits no polynomial kernel. Specifically, Agrawal et al. studied the Disjoint Cycle Packing problem: given a graph and integer , decide whether has vertex-disjoint cycles. It is known that this problem does not admit a polynomial kernel unless NP  coNP/poly [13]. The main result by Agrawal et al. concerns a relaxation of Disjoint Cycle Packing where every vertex can belong to at most cycles (rather than at most one cycle). Agrawal et al. showed that this relaxation reveals a spectrum of upper and lower bounds. In particular, they obtained a (non-polynomial) kernel of size when . Note that the size of the kernel depends on .

Prior to the work by Gabizon et al. [24], packing problems with relaxed disjointness conditions have already been considered from the viewpoint of parameterized complexity (see, e.g., [33, 19, 38, 39]). Roughly speaking, these papers do not exhibit behaviors where relaxed disjointness conditions substantially (or at all) simplify the problem at hand, but rather provide parameterized algorithms and kernels with respect to . Here, the work most relevant to us is that by Fernau et al. [19], who studied the -Set -Packing problem. In particular, for any , Fernau et al. proved that several very restricted versions of -Set -Packing with are already NP-hard. Moreover, they obtained a kernel with vertices.

In addition, we note that Gabizon et al. [24] also studied the Degree-Bounded Spanning Tree problem: given a graph and an integer , decide whether has a spanning tree of maximum degree at most . This problem demonstrates a limitation of the derandomization of Gabizon et al. as the arithmetic circuit required is not non-canceling. Thus, only randomized -time algorithm was obtained and designing a deterministic algorithm of such a running time remains an open problem.

Finally, let us remark that -Path (on both directed and undirected graph) and -Set -Packing are both among the most extensively studied problems in Parameterized Complexity. In particular, after a long sequence of works during the past three decades, the current best known parameterized algorithms for -Path have running times (randomized, undirected only) [10, 9] (extended in [11]), (randomized) [43] and (deterministic) [44, 20, 40]. In addition, -Path is known not to admit any polynomial kernel unless NP  coNP/poly [12].

This paper is organized as follows. The next section contains preliminaries. Section 3 describes reductions leading to our main result for Directed $r$-Simple $k$-Path. Our proof of the main result for Undirected $r$-Simple $k$-Path is given in Section 4. We show that $p$-Set $(r,q)$-Packing parameterized by $k/r$ is FPT in Section 5. In Section 6, we prove that $(r,k)$-Monomial Detection is para-NP-hard. Our W[1]-hardness results for $p$-Multiset $(r,q)$-Packing and $(r,k)$-Monomial Detection are shown in Section 7. The last section of the paper discusses some open problems.

2 Preliminaries

Given a multiset and an element , stands for copies of . The size of a multiset is

Graph Notation.

For a directed or undirected graph $G$, the vertex set of $G$ is denoted by $V(G)$. If $G$ is undirected, its edge set is denoted by $E(G)$, and if $G$ is directed, its arc set is denoted by $A(G)$. Given a subset $U \subseteq V(G)$, the subgraph of $G$ induced by $U$ is denoted by $G[U]$, and the subgraph of $G$ obtained by deleting the vertices in $U$ and the edges/arcs incident to them is denoted by $G - U$. Given a subset $F$ of edges/arcs in $G$, the subgraph of $G$ obtained by deleting the edges/arcs in $F$ is denoted by $G - F$. For a digraph $G$ and a vertex $v \in V(G)$, the out-degree and in-degree of $v$ in $G$ are denoted by $d^+_G(v)$ and $d^-_G(v)$, respectively.

A strongly connected digraph is a digraph such that for any pair of distinct vertices, has a path from to . A directed acyclic graph (DAG) is a digraph with no directed cycle. For any positive integer , an -colored (di)graph is a vertex-colored (di)graph where each vertex is colored by exactly one color from . For an undirected graph , a vertex cover of is a subset of vertices such that every edge in is incident to at least one vertex in , and a matching in is a subset of edges such that no two edges in have a common endpoint. A matching is maximal if there does not exist such that is a matching. The vertex cover number of is the minimum size of a vertex cover of .

Paths, Walks and Trails.

For an undirected multigraph a walk is an alternating sequence such that is an edge between and for all . For a directed multigraph the definition of a walk is the same, but we require that is an arc from to

When is a graph, i.e., has no multiple edges/arcs, then will be denoted by For any , is called a vertex occurrence or a vertex visit, and for all , (resp. ) an edge occurrence (resp. arc occurrence) or an edge visit (resp. arc visit). The length of a walk is the number of edges/arcs visits on the walk, that is, , and the size of a walk is the number of vertex visits on the walk, that is, . If the first and last vertex visits of a walk are equal, then the walk is said to be closed. For a walk , the multisets of vertex visits and edge (resp. arc) visits are denoted by and (resp. ), respectively.

An $r$-simple path is a walk where every vertex occurs at most $r$ times. Moreover, an $r$-simple $k$-path is an $r$-simple path of size $k$. Note that a $1$-simple path is just a path. A $2$-simple path that is a closed walk where every vertex occurs at most once, except for the last and first vertex which occur twice, is just a cycle. Note that by this definition, the first and last vertex of a cycle are well defined. Given vertices $s,t$, an $(s,t)$-path is a path that starts at $s$ and ends at $t$. Similarly, an $(s,t)$-cycle is a cycle that starts at $s$ and ends at $t$, in which case $s=t$. To avoid writing some explanations twice, we refer to an $(s,t)$-cycle also as an $(s,t)$-path. More generally, an $r$-simple $(s,t)$-path is an $r$-simple path that starts at $s$ and ends at $t$.
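To make the definition concrete, the following minimal sketch checks whether a given vertex sequence is an $r$-simple $k$-path in a digraph. The function name `is_r_simple_k_path` and the representation of the digraph as a set of arcs are illustrative assumptions, not notation from the paper.

```python
from collections import Counter

def is_r_simple_k_path(arcs, walk, r, k):
    """Check whether `walk` (a list of vertices) is an r-simple k-path.

    `arcs` is a set of ordered pairs (u, v); this representation is an
    assumption made for the example only.
    """
    # The size of a walk is its number of vertex visits.
    if len(walk) != k:
        return False
    # Consecutive visits must be joined by an arc of the digraph.
    if any((u, v) not in arcs for u, v in zip(walk, walk[1:])):
        return False
    # Every vertex may occur at most r times.
    return all(count <= r for count in Counter(walk).values())

# A directed triangle: going around it twice uses each vertex twice.
arcs = {("a", "b"), ("b", "c"), ("c", "a")}
walk = ["a", "b", "c", "a", "b", "c"]
```

Here `walk` is a $2$-simple $6$-path but not a $1$-simple path, which illustrates how relaxing $r$ admits walks that are longer than any simple path in the same digraph.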

Given a directed or undirected multigraph $G$ and vertices $s,t \in V(G)$, a walk $W$ in $G$ is called an Euler $(s,t)$-trail if $W$ visits every edge/arc in $G$ exactly once, and starts at $s$ and ends at $t$. An undirected multigraph $G$ is called even if for every $v \in V(G)$, the degree of $v$ in $G$ is even.

Perfect Hash Families.

The construction of a perfect hash family is a basic tool to derandomize parameterized algorithms. Formally, perfect hash families are defined as follows.

Definition 2.1.

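Let $n,k \in \mathbb{N}$. An $(n,k)$-perfect hash family is a family of functions $f:\{1,\dots,n\} \to \{1,\dots,k\}$ such that for any subset $S \subseteq \{1,\dots,n\}$ of size $k$, there exists a function in the family that is injective on $S$.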
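As an illustration of the definition, the sketch below verifies by brute force whether a small family of functions is an $(n,k)$-perfect hash family. It assumes the standard reading in which functions map $\{1,\dots,n\}$ to $\{1,\dots,k\}$ and the subsets considered have size exactly $k$; the name `is_perfect_hash_family` and the dictionary encoding of functions are hypothetical. The check is exponential in $k$ and is only meant for tiny examples, not as a construction.

```python
from itertools import combinations

def is_perfect_hash_family(family, n, k):
    """Brute-force check of the (n, k)-perfect hash family property.

    `family` is a list of dicts, each mapping every element of
    {1, ..., n} to a value in {1, ..., k} (an encoding assumed here).
    """
    for subset in combinations(range(1, n + 1), k):
        # Some function must take k distinct values on the subset.
        if not any(len({f[x] for x in subset}) == k for f in family):
            return False
    return True

# Two functions from {1, 2, 3} to {1, 2}.
f1 = {1: 1, 2: 2, 3: 1}
f2 = {1: 1, 2: 1, 3: 2}
```

With these values, `[f1, f2]` separates every pair of elements, while `[f1]` alone fails on the pair `{1, 3}`.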

The following proposition asserts that small perfect hash families can be constructed efficiently.

Theorem 2.1 ([5, 26]).

Let . An -perfect hash family of size can be constructed in time . Moreover, the functions in the family can be enumerated with polynomial space and polynomial delay.

Treewidth.

Tree decompositions and treewidth are defined as follows.

Definition 2.2.

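A tree decomposition of a graph $G$ is a pair $(T,\beta)$, where $T$ is a rooted tree and $\beta: V(T) \to 2^{V(G)}$ is a mapping that satisfies the following conditions.

  1. For each vertex $v \in V(G)$, the set $\{x \in V(T) : v \in \beta(x)\}$ induces a nonempty (connected) subtree of $T$.

  2. For each edge $\{u,v\} \in E(G)$, there exists $x \in V(T)$ such that $\{u,v\} \subseteq \beta(x)$.

The width of $(T,\beta)$ is $\max_{x \in V(T)} |\beta(x)| - 1$. The treewidth of $G$ is the minimum width over all tree decompositions of $G$.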

The vertices of are called nodes. A set for is called the bag at . For any node , denote .

A nice tree decomposition is a tree decomposition of a form that simplifies the design of dynamic programming (DP) algorithms. Formally,

Definition 2.3.

A tree decomposition of a graph is nice if each node is of one of the following types.

  • Leaf: is a leaf in and .

  • Forget: has exactly one child , and there exists a vertex such that .

  • Introduce: has exactly one child , and there exists a vertex such that .

  • Join: has exactly two children, and , and .

It is well-known that a graph of treewidth admits a nice tree decomposition of width (see, e.g., [29, 15]).

Theorem 2.2 ([29]).

Let be a graph of treewidth . Then, admits a nice tree decomposition of width .

Integer Linear Programming (ILP).

The Feasibility Linear Programming problem (Feasibility LP) is given by a set of variables and a system of linear equations and inequalities with real-valued coefficients and variables from , and the aim is to decide whether all the linear equations and inequalities (called linear constraints) can be satisfied by an assignment of non-negative reals to variables in . If only integral values are allowed in , then the problem is called the Feasibility integer Linear Programming problem (Feasibility ILP). The Linear Programming problem (LP) is given by a set of variables, a system of linear constraints with real-valued coefficients and variables from and a linear function with real-valued coefficients and variables from , and the aim is to find an assignment of non-negative reals to variables in that satisfies all linear constraints and minimizes/maximizes over all such (feasible) assignments. If only integral values are allowed in , then the problem is called the Integer Linear Programming problem (ILP).

The following well-known result (cf. Section 6.2 in [15]) will be used in this paper.

Theorem 2.3 ([32, 27, 23]).

ILP (Feasibility ILP, resp.) of size $L$ with $p$ variables can be solved using $O(p^{2.5p+o(p)} \cdot (L + \log M_x) \log(M_x M_c))$ ($O(p^{2.5p+o(p)} \cdot L)$, resp.) arithmetic operations and space polynomial in $L + \log M_x$ (in $L$, resp.), where $M_x$ is an upper bound on the absolute value a variable can take in a solution, and $M_c$ is the largest absolute value of a coefficient in the cost vector $c$.

Flow Networks.

A flow network is a digraph with two special vertices and called a source and sink, respectively, and three functions and . For an arc , and are the upper capacity, lower capacity, and cost of A flow in is a function such that for every and for every . The value of is and the cost of is It is well-known that [7]. A flow is integral if is an integer for every .

Arithmetic Circuits.

Let $M$ be a monomial in a polynomial $P$. The degree of $M$ is the sum of degrees of the variables of $M$. An arithmetic circuit $C$ over the field $\mathbb{F}$ and the set of variables $X$ is a DAG as follows. Every vertex in $C$ with in-degree zero is called an input gate and is labeled by either a variable in $X$ or a field element in $\mathbb{F}$. Every other gate is labeled by either $+$ or $\times$ and called a sum gate and a product gate, respectively. The size of $C$ is the number of gates in $C$, and the depth of $C$ is the length of the longest directed path in $C$. A circuit $C$ computes a polynomial in the following natural way. An input gate computes the polynomial it is labeled by. A sum (product, resp.) gate computes the sum (product, resp.) of the polynomials computed by its in-neighbors in $C$. A circuit is called non-canceling if its input gates are labeled only by variables (no labeling by field elements).
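The way a circuit computes a polynomial can be illustrated with a small sketch that expands a non-canceling circuit gate by gate and then looks for a monomial of total degree $k$ in which every variable has degree at most $r$. The gate encoding, the function names `circuit_polynomial` and `has_rk_monomial`, and the explicit expansion are assumptions made for illustration; the expansion can be exponentially large, so this is not one of the algorithms discussed in the paper.

```python
from collections import Counter

def circuit_polynomial(gates, output):
    """Expand the polynomial computed at gate `output`.

    `gates` maps a gate name to ("var", x), ("+", inputs) or ("*", inputs).
    The result maps each monomial, stored as a sorted tuple of
    (variable, degree) pairs, to its coefficient.
    """
    memo = {}

    def multiply(p, q):
        out = Counter()
        for m1, c1 in p.items():
            for m2, c2 in q.items():
                degrees = Counter(dict(m1))
                degrees.update(dict(m2))  # add up variable degrees
                out[tuple(sorted(degrees.items()))] += c1 * c2
        return out

    def evaluate(g):
        if g not in memo:
            kind, arg = gates[g]
            if kind == "var":
                result = Counter({((arg, 1),): 1})
            elif kind == "+":
                result = Counter()
                for h in arg:
                    result.update(evaluate(h))
            else:  # product gate
                result = Counter({(): 1})
                for h in arg:
                    result = multiply(result, evaluate(h))
            memo[g] = result
        return memo[g]

    return evaluate(output)

def has_rk_monomial(poly, r, k):
    """True if some monomial has total degree k and every degree <= r."""
    return any(
        sum(d for _, d in m) == k and all(d <= r for _, d in m)
        for m in poly
    )

# The circuit (x + y) * (x + y), which computes x^2 + 2xy + y^2.
gates = {
    "x": ("var", "x"),
    "y": ("var", "y"),
    "s": ("+", ["x", "y"]),
    "p": ("*", ["s", "s"]),
}
poly = circuit_polynomial(gates, "p")
```

Because the circuit is non-canceling, all coefficients are positive and no monomial can vanish, which is why checking the expanded support suffices in this toy setting.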

Parameterized Complexity.

A parameterized problem $\Pi$ can be considered as a set of pairs $(I,k)$ where $I$ is the problem instance and $k$ (usually a nonnegative integer) is the parameter. $\Pi$ is called fixed-parameter tractable (FPT) if membership of $(I,k)$ in $\Pi$ can be decided by an algorithm of runtime $O(f(k)|I|^{c})$, where $|I|$ is the size of $I$, $f(k)$ is a computable function of the parameter $k$ only, and $c$ is a constant independent from $I$ and $k$. Such an algorithm is called an FPT algorithm. Let $\Pi$ and $\Pi'$ be parameterized problems with parameters $k$ and $k'$, respectively. An FPT-reduction $R$ from $\Pi$ to $\Pi'$ is a many-to-one transformation from $\Pi$ to $\Pi'$, mapping each instance $(I,k)$ to an output $(I',k')$ such that (i) $(I,k) \in \Pi$ if and only if $(I',k') \in \Pi'$ with $k' \le g(k)$ for a fixed computable function $g$, and (ii) $R$ is of complexity $O(f(k)|I|^{c})$.

When the decision time is replaced by the much more powerful $|I|^{O(f(k))}$, we obtain the class XP, where each problem is polynomial-time solvable for any fixed value of $k$. There is a number of parameterized complexity classes between FPT and XP (for each integer $t \ge 1$, there is a class W[$t$]) and they form the following tower: FPT $\subseteq$ W[1] $\subseteq$ W[2] $\subseteq \dots \subseteq$ W[P] $\subseteq$ XP.

For the definition of classes W[$t$], see, e.g., [15]. Due to a number of results obtained, it is widely believed that FPT $\neq$ W[1], i.e., no W[1]-hard problem admits an FPT algorithm [15].

A parameterized problem $\Pi$ is in para-NP if membership of $(I,k)$ in $\Pi$ can be decided in nondeterministic time $O(f(k)|I|^{c})$, where $|I|$ is the size of $I$, $f(k)$ is a computable function of the parameter $k$ only, and $c$ is a constant independent from $I$ and $k$. Here, nondeterministic time means that we can use a nondeterministic Turing machine. A parameterized problem $\Pi'$ is para-NP-complete if it is in para-NP and for any parameterized problem $\Pi$ in para-NP there is an FPT-reduction from $\Pi$ to $\Pi'$.

For parameterized problems $\Pi$ and $\Pi'$, a generalized kernelization from $\Pi$ to $\Pi'$ is a polynomial-time algorithm that maps an instance $(I,k)$ to an instance $(I',k')$ (the generalized kernel) such that (i) $(I,k) \in \Pi$ if and only if $(I',k') \in \Pi'$, (ii) $k' \le f(k)$, and (iii) $|I'| \le g(k)$ for some computable functions $f$ and $g$. The function $g$ is called the size of the generalized kernel. If $\Pi = \Pi'$, the algorithm is a kernelization and $(I',k')$ is a kernel.

3 Directed $r$-Simple $k$-Path: FPT

In this section, we focus on the proof of the following theorem.

Theorem 3.1.

Directed $r$-Simple $k$-Path is FPT parameterized by $k/r$. In particular, Directed $r$-Simple $k$-Path is solvable in time $2^{O((k/r)^2\log(k/r))}\cdot(n+\log k)^{O(1)}$ and polynomial space.

We remark that by polynomial space, we mean polynomial in $n+\log k$.

3.1 Reduction to a Simpler Problem

In order to prove Theorem 3.1, we begin with two simple claims that reduce the Directed $r$-Simple $k$-Path problem to a related problem that is defined as follows. In the Directed $r$-Simple Long $(s,t)$-Path problem, we are given a strongly connected digraph $G$, positive integers $k,r$, and vertices $s,t \in V(G)$. The objective is to either (i) determine that $G$ has an $r$-simple $k$-path or (ii) output the largest integer $i$ such that $G$ has an $r$-simple $(s,t)$-path of size $i$. We observe that Directed $r$-Simple $k$-Path can be reduced to the special case of Directed $r$-Simple Long $(s,t)$-Path where the input digraph is strongly connected.

Lemma 3.1.

Suppose that Directed -Simple Long -Path on strongly connected digraphs can be solved in time and polynomial space. Then, Directed -Simple -Path can be solved in time and polynomial space.

Proof.

Let be an algorithm that solves Directed -Simple Long -Path on strongly connected digraphs in time and polynomial space. In what follows, we describe how to solve Directed -Simple -Path. To this end, let be an instance of Directed -Simple -Path. Let be the set of strongly connected components of . For every component , and vertices , we perform the following computation. We call with as input. If concludes that has an -simple -path, then we correctly conclude that is a Yes-instance. Else, we denote by the integer that outputs. Then, is the largest integer such that has an -simple -path of size . So far, the time spent is at most and the space used is polynomial.

Let be an ordering of the components in with the property that for all , and , it holds that . Now, we solve Directed -Simple -Path by dynamic programming (DP) as follows.

Let be a DP vector with an entry for every . This entry will store the largest integer such that has an -simple -path that ends at . At Step 1, we set for every . At Step , where , we set

for every .

It is straightforward to verify that the DP computation is correct and can be executed using polynomial time and space. After this computation is terminated, we correctly conclude that is a Yes-instance if and only if there exist such that . This completes the proof. ∎
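The dynamic programming step of this proof can be sketched as follows. The sketch assumes that the strongly connected components are already listed in an order where arcs only go from earlier to later components, and that a table `t` holds, for every pair of vertices `u, v` in the same component, the largest size of an $r$-simple $(u,v)$-path inside that component (the values the assumed algorithm for strongly connected digraphs would return). The recurrence used here is a plausible reading of the proof rather than a transcription, and all names are hypothetical.

```python
def longest_end_sizes(components, arcs, t):
    """For every vertex v, compute the largest size of an r-simple path
    that ends at v and uses only v's component and earlier ones.

    components: list of vertex lists in topological order of the SCCs.
    arcs: set of ordered pairs (u, v).
    t: dict mapping (u, v), with u and v in the same component, to the
       largest size of an r-simple (u, v)-path inside that component.
    """
    index = {v: i for i, comp in enumerate(components) for v in comp}
    best = {}
    for i, comp in enumerate(components):
        # Best size of a path that enters u from an earlier component.
        arrive = {}
        for u in comp:
            sizes = [best[w] for (w, x) in arcs if x == u and index[w] < i]
            arrive[u] = max(sizes, default=0)
        # Either start inside the component or continue an earlier path.
        for v in comp:
            best[v] = max(arrive[u] + t[(u, v)] for u in comp)
    return best

def is_yes_instance(best, k):
    # A prefix of an r-simple path is r-simple, so size >= k suffices.
    return max(best.values()) >= k

# Two components {a, b} and {c}; the table is filled by hand for r = 2.
components = [["a", "b"], ["c"]]
arcs = {("a", "b"), ("b", "a"), ("b", "c")}
t = {("a", "a"): 3, ("a", "b"): 4, ("b", "a"): 4, ("b", "b"): 3,
     ("c", "c"): 1}
best = longest_end_sizes(components, arcs, t)
```

In this example the walk `a, b, a, b, c` witnesses `best["c"] == 5`. The design relies on the fact that a walk never returns to a strongly connected component after leaving it, so a solution decomposes into one segment per component.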

From now on, we focus on the Directed $r$-Simple Long $(s,t)$-Path problem on strongly connected digraphs. Our second claim shows that the existence of a “long” path or a “long” cycle in the input graph implies that it has an $r$-simple $k$-path.

Lemma 3.2.

Let $G$ be a strongly connected digraph. If any of the following two conditions is satisfied, then $G$ has an $r$-simple $k$-path.

  • The graph $G$ has a cycle of length at least $k/r$.

  • The graph has a path with at least vertices.

Proof.

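First, suppose that $G$ has a cycle $C$ of length at least $k/r$. Then, $C$ is a sequence of distinct vertices, besides the first and last vertex, of the form $v_1 v_2 \cdots v_\ell v_1$ for some integer $\ell \ge k/r$. In this case, the sequence $v_1 v_2 \cdots v_\ell$ repeated exactly $r$ times is an $r$-simple $k'$-path for $k' = r\ell \ge k$. Thus, since every prefix of an $r$-simple path is again an $r$-simple path, $G$ has an $r$-simple $k$-path.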

Second, suppose that has a path with at least vertices. Then, is a sequence of distinct vertices for some integer . Since is strongly connected, it has at least one path from the last vertex of to the first vertex of . Let denote any such path. Moreover, let denote the subsequence of where the first and last vertex visits are omitted. If , denote where is duplicated exactly times, and otherwise denote where is duplicated exactly times. Since every vertex occurs at most twice in , we have that every vertex occurs at most times in . Moreover, if , then the size of is , and otherwise the size of is . Therefore, is an -simple -path for some integer , which means that has an -simple -path. ∎
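The cycle case of the argument can be illustrated with a short sketch: given the vertices of a cycle of length at least $k/r$, walking around the cycle and stopping after $k$ vertex visits yields an $r$-simple $k$-path. The function name `r_simple_k_path_from_cycle` and the list representation of the cycle are assumptions made for this example.

```python
from collections import Counter
from itertools import cycle, islice

def r_simple_k_path_from_cycle(cycle_vertices, r, k):
    """Build an r-simple k-path by going around a cycle.

    `cycle_vertices` lists the distinct vertices v_1, ..., v_l of a cycle
    in order (the arc from v_l back to v_1 is implied).
    """
    length = len(cycle_vertices)
    if length * r < k:
        raise ValueError("the cycle is too short: need length >= k / r")
    # Take the first k visits; each vertex occurs at most ceil(k / length)
    # times, which is at most r because length >= k / r.
    return list(islice(cycle(cycle_vertices), k))

walk = r_simple_k_path_from_cycle([1, 2, 3], r=2, k=5)
occurrences = Counter(walk)
```

For the triangle above with $r=2$ and $k=5$, the resulting walk is `[1, 2, 3, 1, 2]`, in which no vertex occurs more than twice.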

The following known proposition asserts that we can efficiently determine whether the input digraph has a long path or a long cycle.

Theorem 3.2 ([22, 45]).

There exists a deterministic algorithm that given a digraph , vertices , and , determines in time and polynomial space whether has a path from to on at least vertices.

Thus, from now on, we may assume not only that the input digraph is strongly connected, but that it also has neither a path of size at least vertices nor a cycle of length at least . Accordingly, we say that an instance of Directed -Simple Long -Path is nice if is strongly connected and it has neither a path with at least vertices nor a cycle of length at least . Moreover, we say that is positive if has an -simple -path, and otherwise we say that it is negative.

3.2 Bounding the Number of Distinct Arcs

Having established the two simple claims above, the second part of our proof concerns the establishment of an upper bound on the number of distinct arcs in at least one $r$-simple $k$-path (if at least one such walk exists) or at least one $r$-simple $(s,t)$-path of maximum size. The main definition in this part of the proof is the following one.

Definition 3.1.

Let be an instance of Directed -Simple Long -Path. Let be an -simple path in .

  • Let be the subgraph of that consists of the vertices and edges in that are visited at least once by , and let be the directed multigraph obtained from by replacing each arc by its copies, where is the number of times is visited on

  • Let be the set that contains and every vertex that occurs times in , and .

  • For any two (not necessarily distinct) vertices , denote . (In case , it holds that .)

Before we begin our analysis, we relate our problem to the notion of an Euler trail by a well-known proposition, to which we will repeatedly refer later.

Theorem 3.3 ([7, 17]).

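Let $G$ be a directed multigraph whose underlying undirected graph is connected. Let $s,t \in V(G)$.

  • If $s \neq t$, then there exists an Euler $(s,t)$-trail in $G$ if and only if the out-degree of $s$ exceeds its in-degree by exactly one, the in-degree of $t$ exceeds its out-degree by exactly one, and the out-degree and in-degree of any other vertex in $G$ are equal.

  • If $s = t$, then there exists an Euler $(s,t)$-trail in $G$ if and only if the out-degree and in-degree of every vertex in $G$ are equal.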
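The degree conditions of this proposition are easy to test directly. The following sketch takes a directed multigraph as a list of arcs (repeated pairs represent parallel arcs), checks that the underlying undirected graph is connected, and then checks the balance conditions for the given endpoints. The name `has_euler_trail` and the input encoding are assumptions made for illustration.

```python
from collections import Counter, defaultdict

def has_euler_trail(arcs, s, t):
    """Decide whether a directed multigraph has an Euler (s, t)-trail.

    `arcs` is a list of ordered pairs; a pair listed twice stands for two
    parallel arcs. Vertices are those incident to arcs, plus s and t.
    """
    out_deg, in_deg = Counter(), Counter()
    neighbours = defaultdict(set)
    for u, v in arcs:
        out_deg[u] += 1
        in_deg[v] += 1
        neighbours[u].add(v)
        neighbours[v].add(u)
    vertices = set(neighbours) | {s, t}

    # The underlying undirected graph must be connected.
    seen, stack = {s}, [s]
    while stack:
        x = stack.pop()
        for y in neighbours[x]:
            if y not in seen:
                seen.add(y)
                stack.append(y)
    if seen != vertices:
        return False

    # Balance conditions: s has one extra outgoing arc and t one extra
    # incoming arc when s != t; every other vertex is balanced.
    for v in vertices:
        expected = 0
        if s != t:
            expected = 1 if v == s else (-1 if v == t else 0)
        if out_deg[v] - in_deg[v] != expected:
            return False
    return True

triangle = [("a", "b"), ("b", "c"), ("c", "a")]
with_extra_arc = triangle + [("a", "b")]
```

A closed walk around `triangle` is an Euler trail with `s == t`, while adding a second copy of the arc from `a` to `b` forces the trail to start at `a` and end at `b`.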

Our argument will modify a given walk in a manner that might increase its length to keep certain conditions satisfied. To ensure that we never need to handle a walk that is too long, we utilize the following lemma.

Lemma 3.3.

Let be a nice instance of Directed -Simple Long -Path. Let be an -simple -path in for some integer . Then, has an -simple -path , for some integer , such that is a subgraph of that is not equal to .

Proof.

First, observe that since has neither a path of size at least nor a cycle of length at least , it holds that contains at least one cycle. We choose such a cycle arbitrarily. In what follows, we use the cycle to modify the walk in order to obtain a walk that has the desired property. To this end, let be the minimum number of times an arc of occurs in . Let be the directed multigraph obtained from by removing copies of every arc in as well as isolated vertices. In addition, let be the set of maximal components of whose underlying undirected multigraphs are connected in the underlying undirected multigraph of . Let and denote the first and last (not necessarily distinct) vertices visited by . We consider two subcases depending on .

  1. First, suppose that . Then, the underlying undirected graph of is connected. Since has a -path that visits every arc (that is, the path ), by Theorem 3.3, every vertex in has out-degree equal to its in-degree, except for and in case ; in that case, and . By the definition of , every vertex in has either both its out-degree and in-degree in equal to those in or both its out-degree and in-degree in smaller by compared to those in . Thus, every vertex in has out-degree equal to its in-degree, except for and in case ; in that case, and . Hence, by Theorem 3.3, has an Euler trail . Moreover, since is nice, and therefore . Lastly, since the out- and in-degrees of at least one vertex of were reduced from in to in , it holds that . Thus, is an -simple -path, for some integer , such that is a subgraph of that is not equal to .

  2. Now, suppose that . Let be a component in that has the minimum number of arcs. Then, . Let be the directed multigraph obtained from by removing all the arcs in as well as isolated vertices. Since has a -path that visits every arc (that is, the path ), by Theorem 3.3, every vertex in has out-degree equal to its in-degree, except for and in case ; in that case, and . As in the previous case, every vertex in has either both its out-degree and in-degree in equal to those in or both its out-degree and in-degree in smaller by compared to those in . If , this means that either both or both . Indeed, as in any directed multigraph, in , the sum of in-degrees of all vertices equals the sum of out-degrees of all vertices. Thus, if then as well.

    However, this means that every vertex in has out-degree equal to its in-degree, except for and in case both and —then, and . Moreover, the underlying undirected graph of is connected (because consists of a collection of components in together with the arcs in that connect their underlying undirected graphs). Thus, by Theorem 3.3, has an Euler trail . Moreover, . In addition, since there is at least one vertex of that is present in but not in . Thus, is an -simple -path, for some integer , such that is a subgraph of that is not equal to .

In both cases, we constructed a walk with the desired property, hence the proof is complete. ∎
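The first case of the proof can be sketched in code: remove from the multigraph of the walk as many copies of each cycle arc as the least-used cycle arc has, and then read off an Euler trail of what remains. The sketch below uses Hierholzer's algorithm for the last step, which is a standard choice and not one named in the text. It covers only the situation where the remaining arcs still form a connected multigraph. All function names and the list-based representations are assumptions of this example.

```python
from collections import Counter, defaultdict


def remove_cycle_copies(walk, cycle):
    """Remove x copies of every cycle arc from the multigraph of `walk`.

    `cycle` is a closed vertex list such as ["a", "b", "c", "a"], assumed
    to be a simple cycle whose arcs all occur on the walk; x is the minimum
    number of times an arc of the cycle occurs on the walk.
    Returns the remaining arcs as a list (parallel arcs repeated).
    """
    mult = Counter(zip(walk, walk[1:]))
    cycle_arcs = list(zip(cycle, cycle[1:]))
    x = min(mult[a] for a in cycle_arcs)
    for a in cycle_arcs:
        mult[a] -= x
    # Counter.elements() skips arcs whose multiplicity dropped to zero.
    return list(mult.elements())


def euler_trail(arcs, start):
    """Hierholzer's algorithm: return an Euler trail as a vertex list.

    The caller is responsible for ensuring that an Euler trail starting
    at `start` exists (connectivity and the degree conditions).
    """
    out = defaultdict(list)
    for u, v in arcs:
        out[u].append(v)
    stack, trail = [start], []
    while stack:
        v = stack[-1]
        if out[v]:
            stack.append(out[v].pop())   # follow an unused arc
        else:
            trail.append(stack.pop())    # dead end: emit the vertex
    trail.reverse()
    return trail


walk = ["a", "b", "c", "a", "b", "d"]
remaining = remove_cycle_copies(walk, ["a", "b", "c", "a"])
shorter = euler_trail(remaining, "a")
```

In this toy example the resulting walk keeps the endpoints of the original one, uses a proper sub-multiset of its arcs, and no vertex occurs more often than before.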

Repeated application of Lemma 3.3 yields the following corollary.

Corollary 3.1.

Let be a nice instance of Directed -Simple Long -Path. Let be an -simple -path in for some integer . Then, has an -simple -path , for some integer , such that is a subgraph of that is not equal to .

We now prove that if is a positive instance of Directed -Simple Long -Path, then has an -simple -path for some such that and satisfy three properties regarding their structure. In addition, we prove that if is a negative instance of Directed -Simple Long -Path, then at least one -simple -path in of maximum size satisfies these three properties as well.

Lemma 3.4.

Let be a nice instance of Directed -Simple Long -Path. If is a positive instance, then has an -simple -path for some that satisfies the following three properties.

  1. is an acyclic digraph.

  2. For any (not necessarily distinct) , has at most one -path. (Recall that if , by a -path we mean a -cycle.)

  3. .

Otherwise (if is a negative instance), has an -simple -path of maximum size that satisfies these three properties.

Proof.

We define a collection of walks as follows: if is a positive instance, then is the set of all -simple -paths in where ; otherwise, is the set of all -simple -paths in of maximum size. In both cases, . For any and -simple path of size , can contain at most vertices. Therefore, in the first case, since , every walk in satisfies Property 3. In the second case, every walk contains fewer than vertices (since the instance is negative), and therefore satisfies Property 3. Thus, it suffices to show that there exists a walk in that satisfies Properties 1 and 2.

Let be the set of walks with the minimum number of arcs in . Moreover, let be the set of walks that maximize .

We claim that every walk in satisfies Properties 1 and 2. For this purpose, we consider an arbitrary walk . Let and denote the first and last (not necessarily distinct) vertices visited by . (If is a negative instance, then and .) Suppose, by way of contradiction, that does not satisfy Property 1. Then, has a directed cycle . Let be the maximum out-degree in of a vertex in . Note that because . Let be the directed multigraph obtained from by adding copies of every arc in . Since has a -path that visits every arc (that is the path ), by Theorem 3.3, every vertex in has out-degree equal to its in-degree, except for and in case —then, and . By our construction of , it has the same property. Indeed, every vertex in has either both its out-degree and in-degree in equal to those in or both its out-degree and in-degree in larger by compared to those in . Thus, by Theorem 3.3, has an Euler trail