Hypergraph cyclicity has been identified as a key factor for the computational complexity of multiple fundamental database and reasoning problems. While various natural notions of hypergraph acyclicity exist, the two most general ones, $\alpha$- and $\beta$-acyclicity, have proven to be the most relevant in the study of the complexity of reasoning. Important problems that are NP-hard in general often become tractable when restricted to acyclic instances. An example from databases is the evaluation of conjunctive queries (CQs), which is NP-hard in general but becomes tractable when the underlying hypergraph structure of the query is $\alpha$-acyclic (Yannakakis, 1981a). The restriction to $\beta$-acyclic instances yields tractable classes for a variety of fundamental problems, including SAT (Ordyniak et al., 2013), #SAT (Brault-Baron et al., 2015), and the evaluation of CQs with negation (Brault-Baron, 2012; Ngo et al., 2014). Notably, these problems remain NP-hard when restricted to $\alpha$-acyclic instances, or #P-hard in the case of #SAT (see (Capelli et al., 2014)).
However, while both types of acyclicity have proven to be interesting, the generalization of $\alpha$-acyclicity has received significantly more attention. There, a rich hierarchy of width measures has been developed over the last two decades. The tree-likeness of $\alpha$-acyclic hypergraphs has been successfully generalized by hypertree width (Gottlob et al., 2002) and even further to fractional hypertree width (Grohe and Marx, 2014). What makes these generalizations particularly interesting is that they remain sufficient conditions for tractable CQ evaluation, emphasizing the deep connection between cyclicity and the complexity of CQs. The yet more general submodular width (Marx, 2013) characterizes the fixed-parameter tractability of CQ evaluation on the hypergraph level. Moreover, related research has revealed notable parallels to other fields, e.g., to game theory (Gottlob et al., 2005) and information theory (Atserias et al., 2013; Khamis et al., 2017).
Despite the unquestionable success of the generalization of $\alpha$-acyclicity, the generalization of $\beta$-acyclicity has received little attention so far. In the most prominent approach, Gottlob and Pichler (Gottlob and Pichler, 2001) introduced $\beta$-hypertree width ($\beta$-$\mathit{hw}$) as an analogue to hypertree width. In particular, they define $\beta$-$\mathit{hw}$ as the maximum hypertree width over all subhypergraphs, mirroring a characterization of $\beta$-acyclicity in terms of every subhypergraph being $\alpha$-acyclic. However, it is difficult to exploit low $\beta$-$\mathit{hw}$ algorithmically. An inherent problem with $\beta$-$\mathit{hw}$ is that a witness for low $\beta$-$\mathit{hw}$ would need to include a hypertree decomposition for each of the exponentially many subhypergraphs. Furthermore, none of the problems listed above as tractable on $\beta$-acyclic instances are known to be tractable for bounded $\beta$-$\mathit{hw}$ (beyond those that are tractable for the more general bounded hypertree width).
In recent work, Carbonnel, Romero, and Živný introduced point-decompositions and the accompanying point-width ($\mathit{pw}$) (Carbonnel et al., 2019), which generalizes both $\beta$-acyclicity and MIM-width (Sæther et al., 2015). They show that, given a point-decomposition of bounded point-width and polynomial size, Max-CSP can be decided in polynomial time. However, just as with $\beta$-$\mathit{hw}$, it is not known if $\mathit{pw} \leq k$ can be decided in polynomial time, even for constant $k$.
In this paper, we propose a new generalization of $\beta$-acyclicity which we call nest-set width ($\mathit{nsw}$). In contrast to $\beta$-$\mathit{hw}$ and $\mathit{pw}$, it is not based on decompositions but instead generalizes a characterization of $\beta$-acyclicity by the existence of certain kinds of elimination orders. Nest-set width has several attractive properties that suggest it to be a natural extension of $\beta$-acyclicity. Importantly, $\mathit{nsw} \leq k$ can be decided in fixed-parameter tractable time when parameterized by $k$. Furthermore, we show that bounded $\mathit{nsw}$ yields new islands of tractability for SAT and the evaluation of CQs with negation. The full contributions of this paper are summarized as follows:
We introduce a new hypergraph width notion – nest-set width – that generalizes the existence of nest point elimination orders.
We establish the relationship of $\mathit{nsw}$ to other related widths. In particular, we show that bounded $\mathit{nsw}$ is a special case of bounded $\beta$-$\mathit{hw}$ and incomparable to other prevalent width measures such as bounded clique width and treewidth.
It is shown that deciding $\mathit{nsw} \leq k$ is NP-complete when $k$ is part of the input but fixed-parameter tractable when parameterized by $k$.
Building on work by Brault-Baron (Brault-Baron, 2012) for the $\beta$-acyclic case, we show the tractability of evaluation of boolean CQs with negation for classes with bounded $\mathit{nsw}$.
Finally, we demonstrate how to derive the fixed-parameter tractability of SAT parameterized by $\mathit{nsw}$ from our main result.
The rest of the paper is structured as follows. Section 2 introduces necessary notation and preliminaries. We define nest-set width and establish some basic properties in Section 3. We move on to establish the relationship between $\mathit{nsw}$ and other width measures, most importantly $\beta$-$\mathit{hw}$, in Section 4. The complexity of checking $\mathit{nsw}$ is discussed in Section 5. The tractability of evaluating CQs with negation under bounded $\mathit{nsw}$ is shown in Section 6. Concluding remarks and a discussion of future work in Section 7 complete the paper.
For positive integers $n$ we will use $[n]$ as a shorthand for the set $\{1, 2, \dots, n\}$. When $X$ is a set of sets we sometimes write $\bigcup X$ for $\bigcup_{x \in X} x$. The same applies analogously to intersections.
Linear orders will play an important role throughout this paper. Recall that a binary relation $R$ is a linear order if it is antisymmetric, transitive and connex (either $a R b$ or $b R a$ for all $a$ and $b$). We will be particularly interested in whether the subset relation $\subseteq$ is a linear order on some domain. If $\subseteq$ is a linear order for some set $X$, we say $X$ is linearly ordered by $\subseteq$. Note that $\subseteq$ is inherently transitive and antisymmetric, so we can limit our arguments to connexity.
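To make the notion concrete, the following is a minimal Python sketch of a connexity test for a family of sets under the subset relation. The function names are illustrative and not taken from the paper. The second variant relies on the observation that a family is a chain exactly when, after removing duplicates and sorting by cardinality, each set is contained in its successor.

```python
from itertools import combinations


def is_linearly_ordered_by_subset(family):
    """Pairwise test: every two sets must be comparable by inclusion."""
    sets = [frozenset(x) for x in family]
    return all(a <= b or b <= a for a, b in combinations(sets, 2))


def is_chain(family):
    """Equivalent test: sort distinct sets by size, compare neighbours only."""
    sets = sorted({frozenset(x) for x in family}, key=len)
    return all(a <= b for a, b in zip(sets, sets[1:]))
```

Both functions agree on every input; the second one needs only a linear number of inclusion tests after sorting.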
We use standard notions from (parameterized) computational complexity theory such as reductions and the classes P and NP. We refer to (Papadimitriou, 2007) and (Cygan et al., 2015) for comprehensive overviews of computational complexity and parameterized complexity, respectively. Furthermore, we assume the reader to be familiar with propositional logic.
2.1. Hypergraphs, Acyclicity & Width
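A hypergraph $H$ is a pair $(V(H), E(H))$ where $V(H)$ is a set of vertices and $E(H) \subseteq 2^{V(H)}$ is a set of hyperedges. For hypergraph $H$ and vertex $v$, we denote the set of incident edges of $v$ as $I(v,H) = \{e \in E(H) \mid v \in e\}$. The notation is extended to sets of vertices $s \subseteq V(H)$ as $I(s,H) = \bigcup_{v \in s} I(v,H)$. We say an edge $e$ is incident to the set $s$ if $e \in I(s,H)$. If $H$ is clear from the context we drop $H$ in the argument and write only $I(s)$.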
A subhypergraph of is a hypergraph with and . The vertex induced subhypergraph of is the hypergraph with and . For a set of vertices we write as shorthand for the vertex induced subhypergraph .
All common notions of hypergraph acyclicity have numerous equivalent definitions (see, e.g., (Fagin, 1983; Brault-Baron, 2016)). Here we recall only those definitions that are necessary to present the results of this paper.
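A join tree of $H$ is a pair $(T, \epsilon)$ where $T$ is a tree and $\epsilon$ is a bijection from the nodes in $T$ to the edges of $H$ such that the following holds: for every $v \in V(H)$ the set $\{u \in T \mid v \in \epsilon(u)\}$ is a subtree of $T$. If $H$ has a join tree, then we say that $H$ is $\alpha$-acyclic.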
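A (weak) $\beta$-cycle is a sequence $(e_1, v_1, e_2, v_2, \dots, e_n, v_n, e_{n+1})$ with $n \geq 3$ where $e_1, \dots, e_n$ are distinct hyperedges, $e_{n+1} = e_1$, and $v_1, \dots, v_n$ are distinct vertices. Moreover, for all $i \in [n]$, $v_i$ is in $e_i$ and $e_{i+1}$ and not in any other edge of the sequence. A hypergraph is $\beta$-acyclic if it has no $\beta$-cycle.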
An alternative (equivalent) definition of $\beta$-acyclicity is that $H$ is $\beta$-acyclic if and only if all subhypergraphs of $H$ are $\alpha$-acyclic. In this paper, a third characterization of $\beta$-acyclicity will be important. We call a vertex $v$ of $H$ a nest-point if the set $I(v,H)$ of edges incident to $v$ is linearly ordered by $\subseteq$. We can then characterize $\beta$-acyclicity by a kind of elimination order for nest-points (this will be made more precise for a more general case in Definition 3.2).
Proposition 2.1 ((Duris, 2012)).
A hypergraph $H$ is $\beta$-acyclic if and only if the empty hypergraph can be reached by successive removal of nest-points and empty edges from $H$.
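The elimination procedure behind Proposition 2.1 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: a hypergraph is represented by its edge set alone (so every vertex occurs in some edge), removing a vertex deletes it from every edge, and empty edges are dropped. The helper names are hypothetical.

```python
def is_chain(sets):
    """True if the given sets are pairwise comparable by inclusion."""
    ordered = sorted(set(sets), key=len)
    return all(a <= b for a, b in zip(ordered, ordered[1:]))


def nest_points(edges):
    """All vertices whose incident edges are linearly ordered by inclusion."""
    vertices = frozenset().union(*edges)
    return [v for v in vertices if is_chain(e for e in edges if v in e)]


def is_beta_acyclic(edges):
    """Repeatedly remove a nest-point (and empty edges) until nothing is left."""
    edges = {frozenset(e) for e in edges if e}
    while edges:
        candidates = nest_points(edges)
        if not candidates:
            return False  # no nest-point left, so elimination is stuck
        v = candidates[0]
        edges = {e - {v} for e in edges} - {frozenset()}
    return True
```

Picking an arbitrary nest-point in each round is sufficient here, since removing vertices never destroys the nest-point property of the remaining vertices.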
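Join trees have been successfully generalized to hypertree decompositions. A hypertree decomposition (Gottlob et al., 2002) of a hypergraph $H$ is a tuple $\langle T, (B_u)_{u \in T}, (\lambda_u)_{u \in T} \rangle$, where $T$ is a rooted tree, for every node $u$ of the tree, $B_u \subseteq V(H)$ is called the bag of node $u$, and $\lambda_u \subseteq E(H)$ is the cover of $u$. Furthermore, the decomposition must satisfy the following properties.
The subgraph $T_v = \{u \in T \mid v \in B_u\}$ for vertex $v \in V(H)$ is a tree.
For every $e \in E(H)$ there exists a $u \in T$ such that $e \subseteq B_u$.
For every node $u$ in $T$ it holds that $B_u \subseteq \bigcup \lambda_u$.
Let $T_u$ be the subtree of $T$ rooted at node $u$ and let $B(T_u)$ be the union of all bags of nodes in $T_u$. For every node $u$ in $T$ it holds that $\bigcup \lambda_u \cap B(T_u) \subseteq B_u$.
The first property is commonly referred to as the connectedness condition and the fourth property is called the special condition. The hypertree width ($\mathit{hw}$) of a hypertree decomposition is $\max_{u \in T} |\lambda_u|$ and the hypertree width of $H$ ($\mathit{hw}(H)$) is the minimal width of all hypertree decompositions of $H$.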
If we exclude the special condition in the above list of properties, we obtain the definition of a generalized hypertree decomposition. The generalized hypertree width of hypergraph $H$ ($\mathit{ghw}(H)$) is defined analogously to before as the minimal width of all generalized hypertree decompositions of $H$.
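It is known that $\mathit{hw}(H) = 1$ if and only if $H$ is $\alpha$-acyclic (Gottlob et al., 2002). Analogous to the definition of $\beta$-acyclicity in terms of every subhypergraph being $\alpha$-acyclic, Gottlob and Pichler (Gottlob and Pichler, 2001) introduced $\beta$-hypertree width $\beta$-$\mathit{hw}(H) = \max \{\mathit{hw}(H') \mid H' \text{ is a subhypergraph of } H\}$. Note that we therefore also have $\beta$-$\mathit{hw}(H) = 1$ if and only if $H$ is $\beta$-acyclic.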
We will make some comparisons to some further well-known width notions of hypergraphs: treewidth and clique width of the primal and incidence graph. The technical details of these concepts are of no importance in this paper and we refer to (Gottlob and Pichler, 2001) for full definitions.
2.2. Conjunctive Queries
A signature is a finite set of relation symbols with associated arities. We write for the arity of relation symbol . A database (over signature ) consists of a finite domain and a relation for each relation symbol in the signature.
A conjunctive query with negation (over signature ) is a set of literals. A literal is of the form where are variables and is either or for any -ary relation symbol in . If is of the form we call the literal positive, otherwise, if it is of the form we say that the literal is negative. We commonly refer to a simply as query. We write for the set of all variables that occur in the literals of query . We sometimes denote queries like logical formulas, i.e., with the understanding that the query is simply the set of all conjuncts.
Let be a query and a database over the same signature. We call a function an assignment for . For a set of variables we write for the assignment with domain restricted to . In a slight abuse of notation we also write for the tuple where is a sequence of variables. An extension of an assignment is an assignment with and for every variable .
We say that the assignment satisfies a positive literal if . Similarly, satisfies a negative literal if . An assignment satisfies a query (over database ) if it satisfies all literals of . We write for the set of all satisfying assignments for over . We can now define the central decision problem of this paper.
Instance: A conjunctive query with negation $q$ and a database $D$
Question: $q(D) \neq \emptyset$?
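For intuition, the semantics above can be spelled out as a brute-force check that enumerates all assignments over the domain. The sketch below is exponential in the number of variables and is not the algorithm developed in this paper. The encoding is an assumption made for illustration: a literal is a triple of relation name, variable tuple and polarity, and a database is a dictionary from relation names to sets of tuples.

```python
from itertools import product


def satisfying_assignments(query, database, domain):
    """Yield every assignment (as a dict) that satisfies all literals.

    query:    iterable of (relation_name, variables, is_positive)
    database: dict mapping relation_name -> set of tuples
    domain:   iterable of domain values
    """
    query = list(query)
    variables = sorted({v for _, vs, _ in query for v in vs})
    for values in product(sorted(domain), repeat=len(variables)):
        a = dict(zip(variables, values))
        # A positive literal needs its tuple in the relation,
        # a negative literal needs its tuple to be absent.
        if all((tuple(a[v] for v in vs) in database[rel]) == positive
               for rel, vs, positive in query):
            yield a


def is_satisfiable(query, database, domain):
    """Decide whether the set of satisfying assignments is non-empty."""
    return next(satisfying_assignments(query, database, domain), None) is not None
```

For example, the query consisting of a positive literal over `R(x, y)` and a negative literal over `S(y)` is satisfiable exactly when some tuple of `R` has a second component outside `S`.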
A query has an associated hypergraph . The vertices of are the variables of . Furthermore, has an edge if and only if there exists a literal or in .
To simplify later arguments we will assume that every relation symbol occurs only once in a query. We will therefore sometimes write the relation symbol, without the variables, to identify a literal. Note that every instance of can be made to satisfy this property, by copying and renaming relations, in linear time.
Finally, for our algorithmic considerations we assume a reasonable representation of queries and databases. In particular we assume that a relation has a representation of size . Accordingly, we assume the size of a database as and the size of a query as . Finally, we refer to the cardinality of the largest relation in as . When the database is clear from the context we write just .
3. Nest-Set Width
In this section we introduce nest-set width and establish some of its basic properties. The crucial difference between $\mathit{nsw}$ and $\beta$-$\mathit{hw}$ is that the generalization is based on a different characterization of $\beta$-acyclicity. While $\beta$-$\mathit{hw}$ generalizes the condition of every subhypergraph having a join tree, nest-set width instead builds on the characterization via nest point elimination from Proposition 2.1. We start by generalizing nest points to nest-sets:
Definition 3.1 (Nest-Set).
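Let $H$ be a hypergraph. A non-empty set $s \subseteq V(H)$ of vertices is called a nest-set in $H$ if the set
$$\{\, e \setminus s \mid e \in E(H),\ e \cap s \neq \emptyset \,\}$$
is linearly ordered by $\subseteq$.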
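As the comparability by $\subseteq$ of sets minus a nest-set will appear frequently, we introduce explicit notation for it. Let $H$ be a hypergraph and $s \subseteq V(H)$. For two sets of vertices $e, f \subseteq V(H)$, we write $e \subseteq_s f$ for $e \setminus s \subseteq f \setminus s$. We could thus alternatively define nest-sets as those sets $s$ for which the set of edges incident to $s$ is linearly ordered by $\subseteq_s$.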
In later sections, the maximal elements with respect to $\subseteq_s$ will play an important role. For a nest-set $s$ we will refer to a maximum edge among the edges incident to $s$ w.r.t. $\subseteq_s$ as a guard of $s$. Note that there may be multiple guards. However, in all of the following usage it will make no difference which guard is used and we will implicitly always use the lexicographically first one (and thus refer to the guard).
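The definition of nest-sets and guards translates directly into a short test. The following Python sketch assumes that edges are given as frozensets and that the candidate set contains only vertices of the hypergraph; the function names are illustrative. For simplicity the guard is taken to be the first incident edge of maximum size after removing the candidate set, rather than the lexicographically first one.

```python
def is_nest_set(edges, s):
    """True if s is non-empty and the incident edges minus s form a chain."""
    s = frozenset(s)
    if not s:
        return False
    # Edges incident to s, each with the vertices of s removed.
    reduced = sorted({e - s for e in edges if e & s}, key=len)
    return all(a <= b for a, b in zip(reduced, reduced[1:]))


def guard(edges, s):
    """A maximum incident edge w.r.t. inclusion after removing s.

    Only meaningful when s is a nest-set: the reduced edges then form a
    chain, so the largest one is a maximum.
    """
    s = frozenset(s)
    return max((e for e in edges if e & s), key=lambda e: len(e - s))
```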
Like for nest points, we want to investigate how a hypergraph can be reduced to the empty hypergraph by successive removal of nest sets. We formalize this notion in the form of nest-set elimination orderings.
Definition 3.2 (Nest-Set Elimination Ordering).
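Let $H$ be a hypergraph and let $\mathcal{O} = (s_1, \dots, s_q)$ be a sequence of sets of vertices. Define $H_0 = H$ and $H_i = H_{i-1} - s_i$. We call $\mathcal{O}$ a nest-set elimination ordering (NEO) if, for each $i \in [q]$, $s_i$ is a nest-set of $H_{i-1}$ and $H_q$ is the empty hypergraph.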
Note that an elimination ordering is made up of at most $|V(H)|$ nest-sets. We are particularly interested in how large the nest-sets have to be for a NEO to exist. Hence, we introduce notation for restricted-size nest-sets and NEOs:
If $s$ is a nest-set of $H$ with at most $k$ elements then we call $s$ a $k$-nest-set.
A nest-set elimination ordering that consists of only $k$-nest-sets is a $k$-nest-set elimination ordering ($k$-NEO).
Finally, the nest-set width $\mathit{nsw}(H)$ of a hypergraph $H$ is the lowest $k$ for which there exists a $k$-NEO.
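Definition 3.2 can be checked mechanically for a given sequence of vertex sets. The sketch below is an illustration only and assumes that a hypergraph is represented by its set of non-empty edges, so that eliminating a set removes its vertices from every edge and discards edges that become empty.

```python
def is_nest_set(edges, s):
    """True if s is non-empty and the incident edges minus s form a chain."""
    reduced = sorted({e - s for e in edges if e & s}, key=len)
    return bool(s) and all(a <= b for a, b in zip(reduced, reduced[1:]))


def eliminate(edges, s):
    """Remove the vertices in s from every edge and drop empty edges."""
    return {e - s for e in edges} - {frozenset()}


def is_neo(edges, order, k=None):
    """Check that `order` is a NEO (a k-NEO if k is given) of the hypergraph."""
    edges = {frozenset(e) for e in edges if e}
    for s in map(frozenset, order):
        vertices = frozenset().union(*edges)
        if k is not None and len(s) > k:
            return False
        if not s <= vertices or not is_nest_set(edges, s):
            return False
        edges = eliminate(edges, s)
    return not edges  # the empty hypergraph must be reached
```

On the cycle graph with four vertices, for instance, the sequence consisting of three of the vertices followed by the remaining one is a NEO, but it is not a $2$-NEO.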
It is easy to see that a hypergraph $H$ has a $1$-nest-set $\{v\}$ if and only if $v$ is a nest point. Therefore, a $1$-NEO corresponds directly to a sequence of nest point deletions that eventually result in the empty hypergraph. As this is exactly the characterization of $\beta$-acyclicity from Proposition 2.1, we see that $\mathit{nsw}$ generalizes $\beta$-acyclicity.
Corollary 3.3.
A hypergraph $H$ has $\mathit{nsw}(H) = 1$ if and only if $H$ is $\beta$-acyclic.
Example 3.4.
Let be the hypergraph with edges , , , , and . Figure 1 illustrates the step-wise elimination of according to the -NEO .
For the first nest-set we see that , and . To verify that is a nest-set of we observe that . Note that is also a nest-set of whereas is not since and are both in and clearly neither nor holds.
In the second step of the elimination process we then consider and the nest-set . It is again straightforward to verify that is linearly ordered by . This is in fact the only nest-set of . The third nest-set in the NEO, only becomes a nest-set after elimination of : observe that which is not linearly ordered by .
In the final step, only has two vertices left. The set of all vertices of a hypergraph is trivially a nest-set since is always . Thus, the set is a nest-set of . The hypergraph has no -NEO (it has a -cycle) and therefore .
An important difference between $\alpha$- and $\beta$-acyclicity is that only the latter is hereditary, i.e., if hypergraph $H$ is $\beta$-acyclic then so is every subhypergraph of $H$. Nest-set width, just like $\beta$-acyclicity and $\beta$-hypertree width, is indeed also a hereditary property. In the following two simple but important lemmas, we first establish that NEOs remain valid when vertices are removed from the hypergraph (and the NEO) and then show that this also applies to removing edges.
Note that the construction in the following lemma, and Lemma 3.6 below, can technically create empty sets in the resulting NEOs. Formally speaking this is not allowed (recall that nest-sets are non-empty). Whenever this occurs the implicit meaning is that all the empty sets are removed from the NEO.
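Lemma 3.5.
Let $H$ be a hypergraph with $k$-NEO $\mathcal{O} = (s_1, \dots, s_q)$ and let $r \subseteq V(H)$. Then the sequence $\mathcal{O}' = (s_1 \setminus r, \dots, s_q \setminus r)$ is a $k$-NEO of $H - r$.
We first show that for any nest-set $s$ of $H$ we have that $s \setminus r$ is either the empty set or a nest-set of $H - r$.
Suppose $s \setminus r$ is not empty and not a nest-set of $H - r$, then there are edges $e', f'$ of $H - r$ incident to $s \setminus r$ that are not comparable by $\subseteq_{s \setminus r}$. It is easy to see that there exist edges $e, f$ of $H$ incident to $s$ such that $e' = e \setminus r$ and $f' = f \setminus r$. Since $s$ is a nest-set in $H$, w.l.o.g., $e \setminus s \subseteq f \setminus s$ and therefore also
$$e' \setminus (s \setminus r) = (e \setminus s) \setminus r \subseteq (f \setminus s) \setminus r = f' \setminus (s \setminus r)$$
and we arrive at a contradiction.
It follows that $s_1 \setminus r$ is a $k$-nest-set of $H - r$. Since $\mathcal{O}$ is a NEO, $s_2$ must be a nest-set of $H - s_1$. Now, to verify $\mathcal{O}'$ we need to show that $s_2 \setminus r$ is a $k$-nest-set of $(H - r) - (s_1 \setminus r)$. However, this is clearly the same hypergraph as $(H - s_1) - r$ and the above observation applies again. We can repeat this argument for all $i$ until $q$ and thus $\mathcal{O}'$ is a $k$-NEO. ∎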
Lemma 3.6.
Let be a hypergraph with -NEO . Let be a connected subhypergraph of and the set of vertices no longer present in the subhypergraph. Then the sequence is a -NEO of .
From the argument at the beginning of the proof of Lemma 3.5 we know that is empty or a nest-set of . Therefore, has a linear order under . Now, since does not contain any vertices from and is a subhypergraph of we have and thus . Therefore can be linearly ordered by and thus is a nest-set. This observation can again be iterated along the NEO in the same fashion as in the proof of Lemma 3.5 to prove the statement. ∎
4. Nest-Set Width vs. $\beta$-Hypertree Width
A wide variety of hypergraph width measures have been studied in the literature. To provide some context for the later algorithmic results, we will first investigate how $\mathit{nsw}$ relates to a number of prominent width notions from the literature. In particular, in this section we show that $\mathit{nsw}$ is a specialization of $\beta$-hypertree width and incomparable to primal and incidence clique width and treewidth. The relationship to $\beta$-hypertree width is of particular interest since bounded $\beta$-$\mathit{hw}$ also generalizes $\beta$-acyclicity. The section is structured around proving the following theorem.
Theorem 4.1.
Bounded $\mathit{nsw}$ is a strictly less general property than bounded $\beta$-$\mathit{hw}$. In particular, the following two statements hold:
For every hypergraph we have -hw.
There exists a class of hypergraphs with bounded $\beta$-$\mathit{hw}$ and unbounded $\mathit{nsw}$.
We begin by establishing a useful technical lemma that will eventually lead us to the second statement of Theorem 4.1. An important consequence of the following Lemma 4.2 is that the length (minus 1) of the longest $\beta$-cycle of $H$ is a lower bound of $\mathit{nsw}(H)$ since any vertex in a cycle has to be removed at some point in any NEO.
Lemma 4.2.
Let $C = (e_1, v_1, \dots, e_n, v_n, e_{n+1})$ be a $\beta$-cycle in a hypergraph $H$. For every nest-set $s$ of $H$ we have that $|s \cap \{v_1, \dots, v_n\}|$ is either $0$ or at least $n - 1$.
Suppose vertex . Then and are in and therefore must be comparable by . Since is a -cycle, we have that is in but not in . The opposite situation holds for , which is in but cannot be in . Hence, either or must also be in to make these edges comparable by .
Then, if , we can repeat the argument on the comparability of and . If , we repeat the argument for and . We can continue this process until the two edges that are compared share a vertex, i.e., when the cycle is complete. ∎
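The bound of Lemma 4.2 can be observed experimentally on small cycle graphs. The following brute-force Python sketch (illustrative names, exponential running time) computes the size of a smallest nest-set of the cycle graph on $n$ vertices, which by the lemma cannot be below $n - 1$.

```python
from itertools import combinations


def is_nest_set(edges, s):
    """True if s is non-empty and the incident edges minus s form a chain."""
    s = frozenset(s)
    reduced = sorted({e - s for e in edges if e & s}, key=len)
    return bool(s) and all(a <= b for a, b in zip(reduced, reduced[1:]))


def cycle_graph(n):
    """Edges of the cycle on vertices 0, ..., n-1."""
    return [frozenset({i, (i + 1) % n}) for i in range(n)]


def smallest_nest_set_size(edges):
    """Try all vertex subsets in order of increasing size."""
    vertices = sorted(frozenset().union(*edges))
    for size in range(1, len(vertices) + 1):
        if any(is_nest_set(edges, c) for c in combinations(vertices, size)):
            return size
    return None
```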
Lemma 4.2 further emphasizes the aforementioned distinction between generalizing acyclicity in the sense of tree-likeness and our approach. Any cycle graph $C_n$ has hypertree width $2$ whereas the lemma shows us that $\mathit{nsw}(C_n) \geq n - 1$: any nest-set will contain at least one vertex of the cycle, so it must contain at least $n - 1$ of them. Furthermore, cycle graphs have clique width at most 4 (Courcelle and Olariu, 2000) and treewidth at most 2. We therefore arrive at the following lemma.
Lemma 4.3.
There exists a class of hypergraphs that has bounded $\beta$-$\mathit{hw}$, treewidth, and clique width and unbounded $\mathit{nsw}$.
The lemma establishes the second statement of Theorem 4.1. We can derive some further results by combining Lemma 4.3 with results from (Gottlob and Pichler, 2001). There it was shown that there exist classes of $\beta$-acyclic hypergraphs that have unbounded clique width and treewidth. In combination with the previous lemma this demonstrates that bounded clique width and bounded treewidth are incomparable to bounded $\mathit{nsw}$. The results in (Gottlob and Pichler, 2001) also apply to incidence clique width and incidence treewidth and since the incidence graph of a cycle graph is also a cycle graph, so does Lemma 4.3. Thus, bounded $\mathit{nsw}$ is also incomparable to bounded incidence clique width and bounded incidence treewidth. The resulting hierarchy is summarized in Figure 2 at the end of this section.
We move on to show that . We will give a procedure to construct a generalized hypertree decomposition of width from a -NEO. Since -NEOs are hereditary, every subhypergraph of will also have a generalized hypertree decomposition of width . By a result of Adler, Grohe, and Gottlob in (Adler et al., 2007) we have that . From there we can then derive our bound of . In particular, we make use of the observation that a nest-set is connected to the rest of the hypergraph only via its guard. The necessary details of this observation are captured by the following two definitions and the key Lemma 4.6 below. The following construction is inspired by the hinge decompositions of Gyssens, Jeavons, and Cohen (Gyssens et al., 1994).
Definition 4.4 (Exhaustive Subhypergraphs).
Let be a hypergraph and . Let be the edges covered by . Then we call the subhypergraph with the exhaustive -subhypergraph of .
We use the term connected exhaustive -subhypergraphs of to refer to the connected components of (considering each component as an individual hypergraph).
We use exhaustive subhypergraphs to express that, when we remove a set of edges from , then we also want to remove the edges that are covered by . The following construction of a hypertree decomposition from a NEO will use sets of the form as its bags. This means that the respective bag also covers all edges in . We are therefore interested in the components resulting from removing all of instead of just from .
In particular, we want to remove sets of edges in such a way that the exhaustive -subhypergraphs are all connected to via a single edge. This will allow us to bring together the decompositions of the subhypergraphs in a way that preserves all properties of hypertree decompositions.
Definition 4.5 (Exhaustive Hinges).
Let be a hypergraph, and the connected exhaustive -subhypergraphs of . For an we say that is an exhaustive -hinge if for every we have that .
Lemma 4.6.
Let be a -nest-set of hypergraph and let be the guard of . Then there exists an exhaustive -hinge with the following properties:
Let and be as in the statement. Let be a minimal edge cover of . Observe that as is incident to and therefore . We now claim that is the required hinge. Clearly we have . For the first property, recall that for every we have and thus also . It is then easy to see from the definition of that and the property follows.
What is left to show is that is in fact an exhaustive -hinge. Let be one of the connected exhaustive -subhypergraphs of and partition the set in two parts: and .
First we argue that . It was already established that , thus every edge incident to is removed in the exhaustive -subhypergraph. It is therefore impossible for a vertex of to be in .
Second, observe that by construction every edge in is incident to and by definition of the guard of we thus have . It follows immediately that and the statement holds. ∎
Lemma 4.6 is the key lemma for our construction procedure. It tells us that we can always find a small exhaustive hinge in a hypergraph if it has a -NEO. By the first property from the lemma, the exhaustive subhypergraph no longer contains the vertices . From the connected exhaustive -subhypergraphs we can construct subhypergraphs of that connect to via a single edge and have shorter -NEOs than . Since the subhypergraphs are connected to via a single edge, it is straightforward to combine individual hypertree decompositions for every subhypergraph into a new decomposition for . This step can then be applied inductively on the length of the -NEO to construct a hypertree decomposition of width for .
Lemma 4.7.
For any hypergraph it holds that .
We show by induction on that if a hypergraph has a -NEO of length then it has a generalized hypertree decomposition of width at most . For the base case, , the NEO consists of a single nest-set with . The base case then follows from the straightforward observation that .
Suppose the statement holds for . We show that it also holds for every -NEO of length . Let be a -NEO of . Let be the guard of and let be the exhaustive -hinge from Lemma 4.6 and let be the connected exhaustive -subhypergraphs. Finally, for each , we add to to obtain the hypergraph .
Therefore, has a -NEO of length at most and we can apply the induction hypothesis to get a generalized hypertree decomposition with of . Observe that the hypergraph has an edge which has to be covered completely by some node in .
Let be a fresh node with and . For each we now change the root of to be and attach the tree as a child of . A cover of a node in can contain an edge that is not in because the vertices are removed. Such an is always a subedge of an edge in and can therefore be replaced by an edge in in a way that the bag is still covered. We claim that this newly built decomposition is a generalized hypertree decomposition of with . It is not difficult to verify that this new structure indeed satisfies all properties of a generalized hypertree decomposition. Detailed arguments for the claim are given in the full proof that is available in the appendix. ∎
5. The Complexity of Checking Nest-Set Width
For the existing generalizations of $\beta$-acyclicity, namely $\beta$-$\mathit{hw}$ and $\mathit{pw}$, it is not known whether one can decide in polynomial time if a structure has width $\leq k$, even when $k$ is a constant. This then also means that no efficient algorithm is known to compute the respective witnessing structures. In these situations, tractability results are inherently limited. One must either assume that the witnesses are given as an input or that a tractable algorithm does not use the witness at all. In comparison, deciding whether treewidth is at most $k$ is fixed-parameter tractable when parameterized by $k$ (Bodlaender and de Fluiter, 1996) and checking whether hypertree width is at most $k$ is tractable when $k$ is constant (Gottlob et al., 2002).
When $k$ is considered constant, it is straightforward to find a $k$-NEO in polynomial time, if one exists. We can simply check for all combinations of up to $k$ vertices whether they represent a nest-set. If so, eliminate the nest-set and repeat from the beginning on the new hypergraph until it becomes empty. By Lemma 3.5, this greedy approach of always using the first found $k$-nest-set will result in a sound and complete procedure.
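The greedy procedure just described can be sketched as follows. This is a direct, unoptimized Python rendering under the assumptions that a hypergraph is given by its non-empty edges and that vertex labels are sortable; it is not the fixed-parameter algorithm of Section 5.2, and the function names are illustrative.

```python
from itertools import combinations


def is_nest_set(edges, s):
    """True if s is non-empty and the incident edges minus s form a chain."""
    reduced = sorted({e - s for e in edges if e & s}, key=len)
    return bool(s) and all(a <= b for a, b in zip(reduced, reduced[1:]))


def find_k_neo(edges, k):
    """Greedily build a k-NEO; return it as a list of sets, or None."""
    edges = {frozenset(e) for e in edges if e}
    order = []
    while edges:
        vertices = sorted(frozenset().union(*edges))
        # First nest-set among all vertex subsets of size 1, ..., k.
        found = next((frozenset(c)
                      for size in range(1, k + 1)
                      for c in combinations(vertices, size)
                      if is_nest_set(edges, frozenset(c))), None)
        if found is None:
            return None  # by Lemma 3.5 no k-NEO exists at all
        order.append(found)
        edges = {e - found for e in edges} - {frozenset()}
    return order


def nest_set_width(edges):
    """Smallest k admitting a k-NEO (0 for the empty hypergraph)."""
    n = len(frozenset().union(*map(frozenset, edges)))
    return next((k for k in range(1, n + 1) if find_k_neo(edges, k) is not None), 0)
```

Each round inspects on the order of $|V(H)|^k$ candidate sets, which is polynomial only for constant $k$.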
However, we can improve on this straightforward case by analyzing the following decision problem where $k$ is part of the input.
Instance: A hypergraph $H$, integer $k$
Question: $\mathit{nsw}(H) \leq k$?
We first observe that this problem is NP-complete in Section 5.1. In more positive news, we are able to show that it is fixed-parameter tractable when parameterized by $k$ in Section 5.2. Importantly, the fixed-parameter algorithm explicitly constructs a $k$-NEO as a witness, if one exists, and can therefore serve as a basis for the algorithmic results in the following sections.
We show NP-hardness by reduction from Vertex Cover, a classical NP-complete problem (Karp, 1972). Due to space limitations we provide only an outline of the most important ideas for the reduction here. A full proof is available in the appendix.
Theorem 5.1.
Finding a vertex cover of a graph $G$ can be seen as making a sequence of choices on the edges of $G$. For every edge $\{u, v\}$ of $G$ either $u$, or $v$, or both must be in the vertex cover. We can intuitively encode this choice into finding a nest-set by two edges (the choice edges) $\{u\} \cup X$ and $\{v\} \cup X$ where $X$ is some set of vertices that contains neither $u$ nor $v$. Observe that $\{u\} \cup X$ and $\{v\} \cup X$ become comparable by $\subseteq_s$ exactly when $u$, or $v$, or both vertices are removed. Hence, by enforcing that both choice edges are incident to every nest-set $s$, we can encode the choice of how to cover $\{u, v\}$ into finding a nest-set.
When we encode the covering choices for each edge this way we need to be careful to not introduce any additional choices as artifacts of our construction. We therefore construct our choices in layers such that every layer corresponds to the vertex cover choice for an edge in $G$. We let both of the edges that encode the choice at layer $j$ contain all vertices that occur in all lower layers. Then, all edges on lower layers are already subsets of the edges at layer $j$, even without any vertex removal. If one removes vertices such that all the choices in the individual layers are resolved, then all edges of the construction are linearly ordered.
Such layers of choice edges make up the core of the reduction. The initial nest-set of the constructed hypergraph corresponds to a vertex cover for $G$. After this first crucial nest-set is eliminated, the remaining hypergraph is $\beta$-acyclic. Hence, the $\mathit{nsw}$ of the constructed hypergraph depends only on the size of the first nest-set. It follows that deciding whether a hypergraph has a $k$-nest-set is also NP-complete. ∎
On a final note, one may notice similarities between finding nest-sets and an important work by Yannakakis (Yannakakis, 1981b) on vertex-deletion problems in bipartite graphs. Yannakakis gives a complexity characterization for vertex-deletion problems on bipartite graphs that extends to hypergraphs via their incidence graph. Furthermore, the specific problem of finding a vertex-deletion such that the edges of the hypergraph become linearly ordered by $\subseteq$ is stated to be polynomial. While this strongly resembles the nest-set problem, the results of Yannakakis are not applicable here since we are not interested in a global property of the hypergraph but only in the orderability of the edges that are incident to the deleted vertices.
5.2. Fixed-parameter Tractability
Recall that every nest-set $s$ has a maximal edge with respect to $\subseteq_s$, the guard of $s$. The main idea behind the algorithm presented in this section is to always fix an edge $e_g$ and check if there exists a nest-set that specifically has $e_g$ as its guard. This will allow us to incrementally build a nest-set relative to the guard $e_g$. We first demonstrate this principle in the following example.
Example 5.2.
We consider a hypergraph with three edges , , and . We want to find a nest-set with guard . The hypergraph with highlighted is shown in Figure 5.2. To start, if is a nest-set with guard , then at least one vertex of must be in . For this example let .
Since we also have that . For to be a nest-set with guard it must then hold that . Since is in but not in we can deduce that also . More generally, any vertex that occurs in an edge from but not in must be part of the nest-set . Now, since it follows that and therefore, by the previous observation, also .
At this point we have deduced that if is in , then so are and . We now have the situation that for every edge we have . However, as illustrated in Figure 5.2, is not linearly ordered by . A nest-set must therefore contain further vertices. In this case it is easy to see that either removing or is enough. In conclusion we have shown that if , then there are two -nest-sets and that have guard .
What makes the problem difficult is that there can be many possible ways of making edges linearly ordered by vertex deletion. In Example 5.2 both choices, removing either or , lead to a -nest-set. However, suppose there were an additional edge . Then, choosing would also imply and . Choosing would lead to a smaller nest-set.
In general, this type of complication can occur repeatedly and it is therefore necessary to continue this expansion procedure for all possible (minimal) ways of ordering the known incident edges of $s$. We will therefore first establish an upper bound on these possible expansions.
Intuitively, when we have edges and , the only way they become comparable by is if either or is removed. The existence of a linear order over all the edges thus requires resolving all such conflicts. By encoding these conflicts in a kind of conflict graph we can see that the problem is equivalent to finding a vertex cover in the conflict graph.
Definition 5.3 (-conflict graph).
Let be a hypergraph. We define the -conflict graph of as the graph obtained by the following construction (with ): For every two distinct edges , if and , then add an edge to . We say that and have a -conflict in .
Lemma 5.4 ().
Let be a hypergraph and let . Then is linearly ordered by if and only if is a vertex cover of the -conflict graph of .
Let be the -conflict graph of . We first show the implication from right to left. Let be a vertex cover for and suppose that is not linearly ordered by . Hence, there are two edges that are incomparable, i.e., there exist vertices and and neither nor is in the vertex cover . A conflict cannot be introduced by removing the vertices of and therefore it was already present in . Therefore, there must be an edge in that is not covered by , contradicting that is a vertex cover.
For the other direction let such that is linearly ordered by . Then for every -conflict, i.e., every pair of vertices where there are with and , at least one of must be in . All edges of are exactly between such pairs of vertices, hence contains at least one vertex of each edge in . Therefore is also a vertex cover of . ∎
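The correspondence of Lemma 5.4 can be checked mechanically on small instances. The following is a minimal sketch, assuming that a hypergraph is given as a list of Python sets and reading Definition 5.3 as: two vertices are adjacent whenever each of them occurs in one of two distinct edges but not in the other. The function names are ours and purely illustrative.

```python
from itertools import combinations


def conflict_graph(edges):
    """Conflict pairs {u, v} of a hypergraph given as a list of vertex sets."""
    conflicts = set()
    for e1, e2 in combinations(edges, 2):
        # v occurs only in e1 and u occurs only in e2: the two edges are
        # incomparable unless u or v is deleted.
        for v in e1 - e2:
            for u in e2 - e1:
                conflicts.add(frozenset((u, v)))
    return conflicts


def linearly_ordered_after_removal(edges, s):
    """True if the edges form a chain under inclusion once s is deleted."""
    reduced = [frozenset(e) - frozenset(s) for e in edges]
    return all(a <= b or b <= a for a, b in combinations(reduced, 2))


def is_vertex_cover(conflicts, s):
    """True if every conflict pair contains at least one vertex of s."""
    return all(pair & set(s) for pair in conflicts)
```

On any small hypergraph, the two predicates should agree for every vertex set, which is exactly the statement of the lemma.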
This correspondence allows us to make use of the following classical result by Fernau (Fernau, 2002) on the enumeration of all minimal vertex covers. A vertex cover is called a minimal vertex cover if none of its subsets is a vertex cover.
Proposition 5.5 ((Fernau, 2002)).
Let be a graph with vertices. There exist at most minimal vertex covers with size and they can be fully enumerated in time.
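To illustrate the kind of enumeration used here, the following is a minimal sketch of a plain bounded search tree that lists all minimal vertex covers with at most k vertices. It is not Fernau's algorithm and makes no claim to match the bounds of Proposition 5.5; it only branches on the two endpoints of an uncovered edge and filters out non-minimal covers afterwards. The name `minimal_vertex_covers` is hypothetical.

```python
def minimal_vertex_covers(graph_edges, k):
    """Enumerate all minimal vertex covers with at most k vertices.

    graph_edges: iterable of 2-element collections of vertices.
    """
    edges = [frozenset(e) for e in graph_edges]
    candidates = set()

    def branch(cover):
        # Pick any edge that is not yet covered.
        uncovered = next((e for e in edges if not (e & cover)), None)
        if uncovered is None:
            candidates.add(cover)
            return
        if len(cover) == k:
            return  # budget exhausted, this branch fails
        for v in uncovered:
            branch(cover | {v})

    branch(frozenset())

    def is_cover(c):
        return all(e & c for e in edges)

    # A cover is minimal iff dropping any single vertex breaks it.
    return {c for c in candidates if not any(is_cover(c - {v}) for v in c)}
```

Every minimal cover of size at most k is reached by the branch that always picks an endpoint belonging to that cover, so the filter at the end only removes candidates, never misses one.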
In combination with Lemma 5.4 we therefore also have an upper bound on computing all minimal vertex deletions that resolve all -conflicts. With this we are now ready to state Algorithm 1 which implements the intuition described at the beginning of this section. The algorithm is given a hypergraph and an edge to use as guard and tries to find a -nest-set with guard by exhaustively following the steps described in Example 5.2. We are able to show that this indeed leads to a correct procedure for finding -nest-sets with a specific guard. See the appendix for a full proof of this statement.
Lemma 5.6 ().
Algorithm 1 is sound and complete.
For the sake of simplicity, Algorithm 1 is stated as a decision procedure. Even so, it is easy to see that a -nest-set with the appropriate guard has been constructed at any accepting state. It is then straightforward to use Algorithm 1 to decide in fixed-parameter polynomial time if a hypergraph has any -nest-set, and if so output one. In the following we use for the size of hypergraph .
Theorem 5.7 ().
There exists a time algorithm that takes as input hypergraph and integer and returns a -nest-set of if one exists, or rejects otherwise.
We simply call Algorithm 1 once for each edge of as the guard. Since every nest-set has a guard Lemma 5.6 implies that this will find an appropriate nest-set if one exists. If all calls reject, then there can be no nest-set with at most elements as it is not guarded by any edge of .
What is left to show is that Algorithm 1 terminates in time. Calling the procedure times clearly preserves this bound. First, observe that every recursive call of NestExpand increases the cardinality of by at least one. The call tree of the recursion therefore has maximum depth . Furthermore, by Proposition 5.5 every node in the call tree has at most children if , or exactly one when . Hence, at most calls to NestExpand are made in one execution of Algorithm 1.
Once we can find individual -nest-sets, finding -NEOs becomes simple. Recall from Lemma 3.5 that vertex removal preserves -NEOs. Thus straightforward greedy removal of -nest-sets is a sound and complete algorithm for finding -NEOs. Since at most nest-set removals are required to reach the empty hypergraph, using the procedure from Theorem 5.7 to find the -nest-sets yields a time algorithm for .
Corollary 5.8 ().
parameterized by is fixed-parameter tractable.
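The greedy removal strategy can be sketched as follows. This is only an illustration under stated assumptions: nest-sets are found by brute force over all vertex subsets of size at most k rather than by Algorithm 1, so the sketch does not achieve the fixed-parameter running time. We assume the usual reading of the definition, namely that a non-empty vertex set is a nest-set if the edges incident to it form a chain under inclusion after its vertices are deleted. All function names are hypothetical.

```python
from itertools import combinations


def is_nest_set(edges, s):
    """Non-empty s whose incident edges, minus s, form a chain under inclusion."""
    s = frozenset(s)
    incident = [e - s for e in edges if e & s]
    return bool(s) and all(a <= b or b <= a for a, b in combinations(incident, 2))


def find_nest_set(edges, k):
    """Brute-force search for a nest-set with at most k vertices (not Algorithm 1)."""
    vertices = sorted(set().union(*edges))
    for size in range(1, k + 1):
        for s in combinations(vertices, size):
            if is_nest_set(edges, s):
                return frozenset(s)
    return None


def greedy_neo(edges, k):
    """Repeatedly remove a nest-set of size at most k; None if the process gets stuck."""
    edges = [frozenset(e) for e in edges]
    order = []
    while any(edges):
        s = find_nest_set(edges, k)
        if s is None:
            return None
        order.append(s)
        # Vertex removal: delete s from every edge and drop emptied edges.
        edges = [e - s for e in edges if e - s]
    return order
```

For a triangle, `greedy_neo` fails with k = 1 and succeeds with k = 2, in line with the observation in Example 6.3 that any two vertices of a triangle form a nest-set.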
6. Nest-Set Width & Conjunctive Queries with Negation
We move on to prove our main algorithmic result. Recall that a query has an associated hypergraph . We define the nest-set width of the query as . We say that a class of instances has bounded if there exists a constant , such that every query in has .
Theorem 6.1 ().
For every class of instances with bounded , is decidable in polynomial time.
As usual, the result can be extended to unions of conjunctive queries with negation when the nest-set width of a union is defined to be the maximum nest-set width of its parts.
While the complexity of CQs without negation has been extensively studied and is well understood, few results extend to the case where negation is permitted. When there are only positive literals, then the satisfying assignments for each literal are explicitly present in the database. Finding a solution for the whole query thus becomes a question of finding a consistent combination of these explicitly listed partial assignments. However, with negative literals it is possible to implicitly express a large number of satisfying assignments. Recovering an explicit list of satisfying assignments for a negative literal may require exponential time and space.
This additional expressiveness of negative literals has important implications for the study of structural parameters. While evaluation of CQs is -complete with and without negation, permitting negation allows for expressing problems as queries with a simpler hypergraph structure. Such a change in expressiveness relative to structural complexity must also be reflected in structural parameters that capture tractable classes of the problem.
As an example, consider for propositional formulas in conjunctive normal form (CNF). Recall that for a formula in CNF, the corresponding hypergraph has as its vertices the variables of the formula and every edge is the set of variables of some clause in the formula. A clause has satisfying assignments to the variables of the clause. Thus, a corresponding positive literal in a CQ, that contains all the satisfying assignments, will be of exponential size (unless the size of clauses is considered bounded). On the other hand, there is a single assignment to that does not satisfy . It is therefore possible to compactly encode by having a negative literal for each clause that excludes the respective non-satisfying assignment. Since this reduction preserves the hypergraph structure of the formula it follows that structural restrictions can only describe a tractable fragment of if they also make tractable. For example, is -hard when restricted to -acyclic formulas (Ordyniak et al., 2013), and thus so is evaluation. In contrast, evaluation of CQs without negation is tractable for -acyclic queries (Yannakakis, 1981a).
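The encoding described above can be sketched in a few lines. The representation is our own assumption: clauses are lists of non-zero integers in the DIMACS style (3 stands for a variable, -3 for its negation), no clause contains a variable together with its negation, and a negative literal is a pair of a variable tuple and a set of excluded tuples. The brute-force evaluator is included only to check the encoding and is of course exponential.

```python
from itertools import product


def cnf_to_negative_literals(clauses):
    """One negative literal per clause, each excluding a single tuple."""
    literals = []
    for clause in clauses:
        variables = tuple(sorted({abs(l) for l in clause}))
        polarity = {abs(l): l > 0 for l in clause}
        # The unique falsifying assignment sets every literal of the clause to false.
        falsifying = tuple(0 if polarity[v] else 1 for v in variables)
        literals.append((variables, {falsifying}))
    return literals


def query_has_solution(literals):
    """Brute-force check over the boolean domain; for illustration only."""
    all_vars = sorted({v for vs, _ in literals for v in vs})
    for values in product((0, 1), repeat=len(all_vars)):
        a = dict(zip(all_vars, values))
        if all(tuple(a[v] for v in vs) not in rel for vs, rel in literals):
            return True
    return False
```

Each relation holds exactly one tuple regardless of clause length, whereas a positive literal for the same clause would need to list all of its satisfying assignments.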
Theorem 6.2 (Implicit in (Ordyniak et al., 2013)).
is -hard even when restricted to -acyclic queries.
To simplify the presentation we make the following assumptions on the instances of . First we assume that queries in instances are always safe, i.e., no variable occurs only in negative literals. An unsafe query can always be made safe: If a variable occurs only in negative literals, we simply add a new literal with to the query. The resulting query is clearly equivalent to the unsafe one on the given domain. Importantly, the additional unary literals do not change the nest-set width of the query. At some points in the algorithm we operate on (sub)queries that are not safe. The assumption of safety is made for the starting point of the procedure.
Our second assumption is that the size of the domain is exactly a power of 2, i.e., for some integer . Since we already assume safe queries, increasing the size of the domain has no effect on the solutions since the newly introduced constants cannot be part of any solution. Furthermore, this assumption increases the size of the domain at most by a constant factor less than .
6.1. Relation to Previous Work
The algorithm presented here builds on the work of Brault-Baron (Brault-Baron, 2012) for the -acyclic case. While we can reuse some of the main ideas, the overall approach used there does not generalize to our setting. There the tractability is first shown for boolean domains, i.e., the domain is restricted to only two values. over arbitrary domains is reduced to the problem over the boolean domain by blowing up each variable in such a way as to encode the full domain using boolean variables. This naturally requires every variable in the original query to be replaced by many new variables. While this operation preserves -acyclicity, it can increase by a factor .
Example 6.3 ().
Consider the following query and a domain with elements.
The reduction to a query over the boolean domain will then replace every variable by three variables , resulting in the equivalent query over the boolean domain
It is easy to see that has because any combination of two variables is a nest-set of . However, while is a nest-set of , this does not translate to the existence of a -nest-set in . It is easy to verify that any for is not a nest-set. Indeed, applying the ideas from Section 5.2 it is easy to see that in general, for such a triangle query, .
A subtle but key observation must be made here. While the previous example shows that the variable blowup from the binary encoding affects the nest-set width in general, this does not happen when . Consider a nest-set of some hypergraph . The edges incident to are linearly ordered by . If we add a new vertex in all the edges that contain , then clearly the edges incident to are the same as those of and therefore also linearly ordered by .
Lemma 6.4 ().
Let be a hypergraph with a nest-set . Let be a hypergraph obtained by adding a new variable to that occurs exactly in the same edges as . Then and are both nest-sets of .
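Lemma 6.4 is easy to observe on a concrete instance. The sketch below assumes hypergraphs are lists of sets and uses the same reading of nest-sets as before (incident edges form a chain under inclusion once the set is deleted); `duplicate_vertex` is a hypothetical helper that adds a copy of a vertex to exactly the edges containing it.

```python
from itertools import combinations


def is_nest_set(edges, s):
    """Non-empty s whose incident edges, minus s, form a chain under inclusion."""
    s = frozenset(s)
    reduced = [frozenset(e) - s for e in edges if frozenset(e) & s]
    return bool(s) and all(a <= b or b <= a for a, b in combinations(reduced, 2))


def duplicate_vertex(edges, v, copy):
    """Add a new vertex `copy` to exactly those edges that contain v."""
    return [frozenset(e) | {copy} if v in e else frozenset(e) for e in edges]


edges = [{"a", "b"}, {"a", "b", "c"}, {"b", "c"}]
blown_up = duplicate_vertex(edges, "a", "a2")
# Both the original vertex and its copy remain singleton nest-sets.
print(is_nest_set(edges, {"a"}), is_nest_set(blown_up, {"a"}), is_nest_set(blown_up, {"a2"}))
```

The copy occurs in every edge incident to the original vertex, so deleting either one leaves the other in all of those edges and does not disturb the inclusion order.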
This subtle difference between -nest-sets and larger nest-sets will be central to the following section.
6.2. Eliminating Variables
The algorithm in the following Section 6.3 will be based around successive elimination of variables from the query. This elimination will be guided by a nest-set elimination ordering where we eliminate all variables of a nest-set at once. This elimination of a nest-set is performed in three steps.
1. Eliminate all occurrences of variables from in positive literals.
2. Extend the negative literals incident to in such a way that they form a -acyclic subquery.
3. Eliminate the variables of from the -acyclic subquery.
In this section we introduce the mechanisms used for these steps. For steps 1 and 2 we need to extend literals in such a way that their variables include all variables from some set . We do this in a straightforward way by simply extending the relation by all possible tuples for the new variables. It is then easy to see that such extensions are equivalent with respect to their set of satisfying solutions.
Definition 6.5 ().
Consider a literal where is either or and the respective relation . Let be a set of variables and let be the variables in that are not used in the literal. We call the literal with the new relation the -extension of (where represents the -ary Cartesian power of the set and we use the relational algebra semantics of the product ).
Lemma 6.6 ().
Let be the -extension of . Then an assignment satisfies if and only if every extension of to satisfies .
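The extension of Definition 6.5 amounts to a Cartesian product with the domain, once per missing variable. A minimal sketch, assuming relations are sets of tuples aligned with a tuple of variable names, and with `extend_relation` as a hypothetical name:

```python
from itertools import product


def extend_relation(variables, relation, s, domain):
    """Extend a relation over `variables` to also cover the variables of s.

    Every tuple is padded with all possible values for the variables of s
    that the literal does not already use.
    """
    new_vars = [v for v in sorted(s) if v not in variables]
    ext_vars = tuple(variables) + tuple(new_vars)
    ext_rel = {
        t + pad
        for t in relation
        for pad in product(domain, repeat=len(new_vars))
    }
    return ext_vars, ext_rel
```

Since the padding ranges over all values, membership of a padded tuple depends only on its original part. This is why the extension behaves identically for positive and negative literals, as stated in Lemma 6.6.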
The process for positive elimination is simple. Straightforward projection is used to create a positive literal without the variables from . A new negative literal then restricts the extensions of satisfying assignments for the new positive literal to exactly those that satisfy the old positive literal. A slightly simpler form of this method was already used in (Brault-Baron, 2012).
Lemma 6.7 ().
Let be a positive literal. Define new literals with and with where is the -extension of . Then an assignment satisfies if and only if satisfies .
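The construction of Lemma 6.7 can be sketched as follows. The sketch reflects our reading of the lemma and is an assumption: the new positive relation is the projection of the original relation onto the variables outside the eliminated set, and the new negative relation consists of the tuples in the extension of that projection that are missing from the original relation. Names and the tuple-based representation are hypothetical.

```python
from itertools import product


def eliminate_positive(variables, relation, s, domain):
    """Split a positive literal into a projected positive and a negative literal."""
    keep = [i for i, v in enumerate(variables) if v not in s]
    drop = [i for i, v in enumerate(variables) if v in s]
    kept_vars = tuple(variables[i] for i in keep)
    # Positive part: projection that forgets the variables of s.
    projected = {tuple(t[i] for i in keep) for t in relation}
    # Extension of the projection, reassembled in the original variable order.
    extension = set()
    for p in projected:
        for pad in product(domain, repeat=len(drop)):
            row = [None] * len(variables)
            for i, val in zip(keep, p):
                row[i] = val
            for i, val in zip(drop, pad):
                row[i] = val
            extension.add(tuple(row))
    # Negative part: extended tuples that the original relation does not contain.
    excluded = extension - set(relation)
    return (kept_vars, projected), (tuple(variables), excluded)
```

An assignment then satisfies the original literal exactly when it satisfies the projected positive literal and avoids the excluded tuples, so the variables of s occur only negatively afterwards.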
For the elimination of variables that occur only negatively we build upon a key idea from (Brault-Baron, 2012). There, a method for variable elimination is given for the case where the domain is specifically . We repeat parts of the argument here to highlight some important details. Consider a query . The main observation is that the satisfiability of the negative literals with variables is equivalent to satisfiability of the formula