Throughout the paper, we denote by the set of Boolean variables. We will refer to the members of as positive and to their negations as negative literals, respectively. A Boolean function is a mapping . A conjunctive normal form (CNF) is the conjunction of clauses, where each clause is a disjunction of literals. The CNF is also viewed as a set of clauses .
A CNF is called Horn if each of its clauses contains at most one positive literal, and pure Horn if every clause contains exactly one positive literal. A Boolean function is called (pure) Horn if it has a (pure) Horn CNF representation. Note that every CNF defines a Boolean function, but a Boolean function may have many different CNF representations. For instance, given the pure Horn CNF on variables , we can also represent the same Boolean function by the pure Horn CNF . Note that a pure Horn clause can also be viewed as an implication. For instance, is equivalent with the implication . Thus, we can view a pure Horn CNF as an implication system, e.g., we shall write equivalently, as . For an implication we call the body and the head. We say that is an implicate of the Horn function if any assignment that falsifies also falsifies . In particular, if is represented by a pure Horn CNF then the clauses of this CNF are all implicates of .
The concept of Horn functions has been widely studied under different names, such as directed hypergraphs in graph theory and combinatorics, as implication systems in machine learning [1, 2] and database theory [3, 18], and as lattices and closure systems in algebra and concept lattice analysis [8, 15]. Horn functions form a fundamental subclass of Boolean functions endowed with interesting structural and computational properties. The satisfiability problem can be solved for Horn functions in linear time [13], and the equivalence of such formulas can be decided in polynomial time. Horn functions are strongly related to relational databases, and many interesting algorithmic problems arise from that context. Given a database, we associate the set of Boolean variables to the set of attributes of the database. For and we write if the knowledge of the attribute values in uniquely determines the value of (in the database records). Such a relation is called a functional dependency in the database. The set of all functional dependencies defines a unique pure Horn function associated to the given database [3, 18]. One of the important notions that arise from databases is the notion of a key. A key in a relational database is a set of attributes whose values uniquely determine the values of all other attributes. Accordingly, a subset of the variables is a key of a Horn function if is an implicate of for all .
We call a pure Horn function key Horn if the body of any of its implicates is a key of the function. Key Horn functions generalize the well-studied class of hydra functions introduced in [22], where all the bodies are of size 2. Finding the shortest CNF representation of a given Horn function with respect to multiple relevant measures (number of clauses, number of literals, etc.) is an outstanding hard problem [4, 18, 16]. For general pure Horn functions not even non-trivial approximation algorithms are known. For hydra functions a 2-approximation algorithm was given in [22], while [17] proved that the minimization remains NP-hard even in this special case. In [6], the authors provided logarithmic factor approximation algorithms for general key Horn functions with respect to all of the above mentioned measures.
The present paper focuses on the structure of the set of minimal keys of a pure Horn function. In particular, we are interested in finding Sperner hypergraphs that form the set of minimal keys of a unique pure Horn function . We call such a a unique key hypergraph, and the corresponding Horn function a unique key Horn function.
Section 2 gives a characterization of unique key hypergraphs and unique key Horn functions. In particular, we show that cuts of a matroid form a unique key hypergraph. The special case when every hyperedge has size two is discussed in Section 3, where we show that recognizing unique key graphs is co-NP-complete. Subsequently, we identify several classes of graphs for which the recognition problem can be decided in polynomial time. Section 4 provides an algorithm that generates all minimal keys of a pure Horn function with polynomial delay. Furthermore, we show that the problems of finding a minimum key of a pure Horn function and of finding a minimum target set of a graph are closely related. Using this connection, our algorithm can be used to generate all minimal target sets with polynomial delay when the thresholds are bounded by a constant.
2 Unique Key Horn Functions
The purpose of this section is to give an understanding of the structure of pure Horn functions that have the same set of keys and in particular the structure of unique key Horn functions.
We start with additional definitions and notation. We view the set of variables as a ground set. A hypergraph is called a Sperner hypergraph if none of its hyperedges contains another one. Given a Sperner hypergraph , we say that is a transversal of , if for all . We say that is an independent set of if is a transversal of . We denote by the set of minimal transversals of , and by the family of its independent sets.
For a hypergraph and subset we denote by the subhypergraph of induced by . In particular, if then . Furthermore, we denote by the projection of to where denotes the family consisting of the inclusionwise minimal members of . Clearly, if is not a transversal of then we have . We introduce the notation to denote the union of the hyperedges of , i.e. . We will use the following well-known lemma.
Lemma 1 (Seymour [21]).
For a Sperner hypergraph and subset we have and
For a Boolean function , we denote by
the set of true vectors of, i.e., . For two functions and we write if for all we have , in other words, if . We say that a clause is an implicate of if for all . For a subset we define the forward chaining closure of by . Note that if , then , since any implicate of is also an implicate of . For a CNF we use the same terminology and notation as it defines a unique Boolean function. For example, implies .
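As an illustration, the forward chaining closure with respect to a pure Horn CNF can be sketched in Python as follows. This is a minimal sketch under our own assumptions: clauses are stored as (body, head) pairs, the names `forward_closure` and `example` are hypothetical, and the simple fixed-point loop is not the linear-time procedure.

```python
from typing import FrozenSet, Iterable, List, Set, Tuple

# A pure Horn clause viewed as an implication body -> head.
Clause = Tuple[FrozenSet[str], str]


def forward_closure(clauses: Iterable[Clause], start: Iterable[str]) -> Set[str]:
    """Return every variable reachable from `start` by repeatedly firing clauses."""
    clause_list: List[Clause] = list(clauses)
    closure: Set[str] = set(start)
    changed = True
    while changed:
        changed = False
        for body, head in clause_list:
            # A clause fires once its whole body has been derived.
            if head not in closure and body <= closure:
                closure.add(head)
                changed = True
    return closure


# Hypothetical pure Horn CNF: a -> b, b -> c, {c, d} -> e
example: List[Clause] = [
    (frozenset({"a"}), "b"),
    (frozenset({"b"}), "c"),
    (frozenset({"c", "d"}), "e"),
]
```

In this example the set {a, d} is a key of the hypothetical CNF, since its closure is the whole variable set, while {a} alone is not.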
Keys of a pure Horn function clearly form an upward monotone system. We denote by the set of minimal keys of . To a Sperner hypergraph we associate the pure Horn CNF
Note that we have . For a Sperner family we call a key Horn CNF. Consequently, a pure Horn function is key Horn if and only if it has such a CNF representation. Let us observe that for a Sperner hypergraph and pure Horn function , implies that .
Let us also note that there may be several pure Horn functions with the same family of keys. As an example, consider the hypergraph over the ground set , and the pure Horn CNFs , , and . It is easy to verify now that the CNFs , and , define four pairwise distinct pure Horn functions and each has as its set of minimal keys.
Lemma 2. Let be a Sperner hypergraph and be a pure Horn function such that . Then if and only if there exists an implicate of and a minimal transversal such that and .
Since , any is a key of . Thus implies because is a Sperner hypergraph.
Assume first that , that is, there exists a minimal key . Since the sets of are keys of and is a minimal key of , we must have . Let denote a maximal independent set which contains as a subset. It follows that is a minimal transversal which is disjoint from . Let be an arbitrary node in . Then is an implicate of because is a key. Thus, choosing proves one direction of our claim.
For the reverse direction, let us assume that there exists an implicate of and a minimal transversal such that and . Since is a minimal transversal of , there exists such that . This implies that . Because we have by our assumptions, follows. Therefore there exists a minimal key of . Finally, implies , from which follows as claimed. ∎
Lemma 3. Let be a Sperner hypergraph and be a pure Horn function such that . Then if and only if for all implicates of with we have .
Let us first note that for any subset that has a disjoint minimal transversal we must have . Thus, by Lemma 2, we have if and only if for all implicates of for which and for all minimal transversals with we have . Since by Lemma 1 we have and for all Sperner hypergraphs the equality holds, we have , implying the claim. ∎
Lemma 4. Let be a Sperner hypergraph and define . Let be a set of clauses of the form that are not implicates of . Then if and only if .
The claim follows by Lemma 3. ∎
Now we are ready to characterize unique key hypergraphs.
Theorem 5. For a Sperner hypergraph the pure Horn function is the only one with if and only if for all and there exists such that and .
For any pure Horn function with we have .
For the only if direction, take an arbitrary and , and let . By definition of , we have that . Since is a transversal, we have that and that is not an implicate of . If , then by Lemma 4 we have which is a contradiction with the assumption that is the only Horn function with this property. It follows that and altogether we get that . In particular, this means that there exists a with being minimal and . Since is a transversal of , we have . Consider an element . By the minimality of , for every different from either or . This means that is a transversal of .
For the opposite direction, take an arbitrary and . Then , hence there exists disjoint from . By the assumption, there exists such that is also a minimal transversal of . Therefore there exists for which . As , there exists such that and . This implies , contradicting being a transversal. This shows that the set in Lemma 4 is empty, proving the uniqueness of . ∎
The cuts of a loopless matroid form a unique key hypergraph.
If is the set of cuts of a matroid, then is the set of bases. If the matroid is loopless, then . The basis exchange axiom implies the necessary and sufficient condition of Theorem 5. ∎
The following example shows that not all unique key hypergraphs are related to matroids. Let , where . Then and satisfies the conditions of Theorem 5, hence is unique key. Clearly, is not the set of bases of a matroid.
3 Unique Key Graphs
Let us now consider Sperner hypergraphs such that for all (i.e., graphs). For the sake of simplicity, we use to denote such a hypergraph . We say that is a unique key graph if is a unique key hypergraph. Following standard graph theory notation, we denote by the set of neighbors of vertex . For a subset we denote by the set of neighbors of .
3.1 Complexity of Recognizing Unique Key Graphs
Given a graph and a maximal independent set we say that is an individual neighbor of if .
Theorem 8. A graph is unique key if and only if for every maximal independent set and vertex there exists a vertex that is an individual neighbor of .
The minimal transversals of are exactly the complements of the maximal independent sets of , that is the minimal vertex covers of . For a maximal independent set with and , the set is an independent set if and only if is an individual neighbor of . If this is the case, then can be extended to a maximal independent set of not containing . Thus the statement follows from Theorem 5. ∎
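The condition of the theorem can be checked by brute force on small graphs. The sketch below assumes the definition is read as follows: for a maximal independent set I and a vertex v in I, a vertex u outside I is an individual neighbor of v if the neighbors of u inside I are exactly {v}. The function names and the adjacency-dictionary representation are our own, and the enumeration is exponential, so this is only meant for experimenting with small examples.

```python
from itertools import combinations
from typing import Dict, FrozenSet, Iterator, Set

Graph = Dict[int, Set[int]]  # adjacency sets of an undirected simple graph


def maximal_independent_sets(adj: Graph) -> Iterator[FrozenSet[int]]:
    """Brute force: yield every maximal independent set (exponential time)."""
    vertices = sorted(adj)
    for size in range(len(vertices) + 1):
        for cand in combinations(vertices, size):
            chosen = set(cand)
            independent = all(adj[v].isdisjoint(chosen) for v in chosen)
            # Maximal: every vertex outside has at least one neighbor inside.
            maximal = all(adj[u] & chosen for u in vertices if u not in chosen)
            if independent and maximal:
                yield frozenset(chosen)


def is_unique_key_graph(adj: Graph) -> bool:
    """Check that each vertex of each maximal independent set has an individual neighbor."""
    for ind in maximal_independent_sets(adj):
        for v in ind:
            outside = (u for u in adj if u not in ind)
            if not any(adj[u] & ind == {v} for u in outside):
                return False
    return True


# Small hypothetical test graphs.
matching = {0: {1}, 1: {0}, 2: {3}, 3: {2}}
path3 = {0: {1}, 1: {0, 2}, 2: {1}}
triangle = {0: {1, 2}, 1: {0, 2}, 2: {0, 1}}
```

Under this reading, the perfect matching and the triangle pass the test, while the path on three vertices fails because its middle vertex sees both end vertices of the maximal independent set formed by the two ends.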
Our next goal is to show that recognizing if is the set of minimal keys of a unique key function is difficult already for hypergraphs of dimension two. Let us consider a CNF over Boolean variables , . Let us associate a graph to as follows. The set of vertices is . The edges are defined as follows: vertices , and form a triangle for all . Vertices , and form a clique. Finally, all vertices are connected to the literals they include (see Figure 1).
Theorem 9. A CNF is not satisfiable if and only if the graph is unique key.
We derive this claim using Theorem 8.
Let us note first that every maximal independent set has exactly points, one from each of the following cliques: , , and . This is because an independent set can contain at most one vertex from each of these cliques, and if it is disjoint from , then is also independent. Similarly, if , then is also independent. We now verify the conditions of Theorem 8.
Let be a maximal independent set, and assume that or . In both cases is an individual neighbor of . Note next that the sets and are disjoint, and therefore any independent set is disjoint from at least one of these sets. Thus, if , then either or (or both) is an individual neighbor of . If , then is an individual neighbor of .
Thus, the only claim left to show is that is satisfiable if and only if there exists a maximal independent set of containing vertex such that does not have an individual neighbor. To see this let us first assume that is satisfiable. Consider the set that contains the literals that are true in a satisfying assignment and vertex . Since every clause is satisfied, it has a neighbor in other than , and thus does not have an individual neighbor. For the other direction let us assume that is a maximal independent set, containing such that does not have an individual neighbor. Therefore, every clause must have a neighbor in , which must be a literal. Since is an edge of for all , cannot contain a complementary pair of literals, and thus the literals in can be set to true simultaneously, satisfying . ∎
Deciding if a hypergraph is unique key is co-NP-complete already for hypergraphs of dimension 2.
It is easy to see that the problem belongs to co-NP, and thus the statement follows by Theorem 9. ∎
3.2 Bipartite Graphs
A bipartite graph without isolated vertices is unique key if and only if is a perfect matching.
If forms a perfect matching on , then every maximal independent set contains exactly one end vertex of every edge in . For any vertex , the other end vertex of the matching edge incident to is an individual neighbor of , thus is unique key by Theorem 8.
For the other direction, let and denote the color classes of , that is, . By the assumption that there are no isolated vertices in , both and are maximal independent sets. By Theorem 8, every vertex has an individual neighbor in the opposite color class, that is, a neighbor of degree exactly one. This implies that is a matching as stated. ∎
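For bipartite graphs without isolated vertices, the characterization above reduces recognition to a degree check. A minimal sketch, assuming the input is already known to be bipartite and is given as symmetric adjacency sets (the function name is hypothetical):

```python
from typing import Dict, Set


def is_perfect_matching(adj: Dict[int, Set[int]]) -> bool:
    """True iff every vertex has exactly one neighbor and adjacency is symmetric."""
    return all(
        len(nbrs) == 1 and all(v in adj[u] for u in nbrs)
        for v, nbrs in adj.items()
    )


# Hypothetical inputs: two disjoint edges, and a path on three vertices.
two_edges = {0: {1}, 1: {0}, 2: {3}, 3: {2}}
path3 = {0: {1}, 1: {0, 2}, 2: {1}}
```

The check runs in time linear in the size of the adjacency lists, in contrast with the general case treated in Section 3.1.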
3.3 Bounded Treewidth Graphs
For graphs of bounded treewidth, it is possible to decide in linear time if a graph is a unique key graph.
We will formulate the problem in monadic second order logic (MSO), the result then follows by Courcelle’s theorem [11]. Assume that a graph is described with a set of vertices and an adjacency relation which represents the set of edges. The unique key property can then be described as the predicate
where is a predicate satisfied if is an independent set of and is satisfied if and is its individual neighbor. These predicates can be defined in the following way.
Since the formulation of uses only quantification over a set of vertices and not over any set of edges, we can use it to show the following corollary.
For graphs of bounded clique-width, it is possible to decide in linear time if a graph is a unique key graph.
3.4 Graphs With Small Induced Matchings
Let be a graph, and assume that the size of the largest induced matching of is bounded by a constant. Then there is an efficient algorithm to decide if is a unique key graph.
If then is the family of minimal vertex covers that are exactly the complements of maximal independent sets. It is known that if the largest induced matching in has size at most , then it has at most maximal independent sets [5]. Thus if is a constant, then all of them can be generated in polynomial time [23]. This in turn implies that the conditions of Theorem 8 can be checked in polynomial time. ∎
4 Generating Minimal Keys
We shift the focus from unique key hypergraphs to the problem of generating all possible minimal keys of a given pure Horn function. The proposed approach can be applied for various problems, for example for generating all minimal target sets of a graph. Note that the number of minimal keys can be exponential in the size of the input CNF, hence the efficiency of generating them is measured by the time spent between outputting two of them. A generation algorithm outputs the objects in question one by one without repetition. Such a procedure is called polynomial delay if the computing time between any two consecutive outputs is bounded by a polynomial of the input size.
Given a pure Horn CNF , we associate to it a directed graph as follows. For a minimal key , an arbitrary variable , and a clause , we define the set . Note that is a key of , hence there exists with . We find such a using a greedy procedure by dropping variables from one-by-one, and checking at each step if the remaining set is a key by using forward chaining with respect to . We include the directed edge into for all possible choices and . For some we might not have a clause in which case we do not generate the corresponding . Note that every vertex in has at most outgoing edges. Let us remark that the final graph is not uniquely defined as its edge set depends on the choices of the sets in the above procedure.
Lemma 15. The directed graph associated to the pure Horn CNF as above is strongly connected.
First we introduce a measure between minimal keys. Let be two minimal keys. We know that the forward chaining closure of is equal to . Let us partition into layers where , define , and is the largest index such that . Let where for .
We claim that there exists an out-neighbor of in such that is strictly smaller in the reverse lexicographic order than . To see this, let be the largest index such that , and let be in . Since , there exists an such that . For the set we have that and for . Thus the out-neighbor satisfies the claim. By induction in the reverse lexicographic order of the possible vectors, there exists a directed path in from to . As , the same holds for , thus finishing the proof of the lemma. ∎
Given a pure Horn CNF , we can generate all minimal keys of with polynomial delay.
Consider the directed graph . Our algorithm will generate all out-neighbors of the minimal keys that are already generated, starting from a minimal key which we generate by greedily leaving out elements from . As is strongly connected according to Lemma 15, all minimal keys are obtained this way.
The set of minimal keys that are already generated is kept in a last-in-first-out queue. Before outputting the top element of the queue, we generate all its out-neighbors and add the new ones to the queue. Since the out-neighbors can be generated in time polynomial in the numbers of variables and clauses, this procedure has polynomial delay. ∎
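The generation procedure can be sketched in Python as follows. The sketch assumes that the out-neighbors of a minimal key K are obtained, for each clause with body B whose head v lies in K, by greedily shrinking the key (K without v) together with B to a minimal key. All function names and the (body, head) clause representation are our own assumptions, and the closure routine is a simple fixed-point loop rather than a tuned implementation.

```python
from typing import FrozenSet, Iterable, Iterator, List, Set, Tuple

Clause = Tuple[FrozenSet[str], str]  # implication body -> head


def forward_closure(clauses: List[Clause], start: Iterable[str]) -> Set[str]:
    closure = set(start)
    changed = True
    while changed:
        changed = False
        for body, head in clauses:
            if head not in closure and body <= closure:
                closure.add(head)
                changed = True
    return closure


def is_key(clauses: List[Clause], variables: Iterable[str], cand: Iterable[str]) -> bool:
    return set(variables) <= forward_closure(clauses, cand)


def shrink(clauses: List[Clause], variables: Iterable[str], key: Iterable[str]) -> FrozenSet[str]:
    """Greedily drop variables one by one while the remaining set stays a key."""
    current = set(key)
    for v in sorted(current):
        if is_key(clauses, variables, current - {v}):
            current.discard(v)
    return frozenset(current)


def minimal_keys(clauses: Iterable[Clause], variables: Iterable[str]) -> Iterator[FrozenSet[str]]:
    cnf = list(clauses)
    ground = frozenset(variables)
    first = shrink(cnf, ground, ground)
    seen = {first}
    stack = [first]  # last-in-first-out container of generated keys
    while stack:
        key = stack.pop()
        # Generate all out-neighbors before outputting the current key.
        for body, head in cnf:
            if head in key:
                nxt = shrink(cnf, ground, (key - {head}) | body)
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        yield key


# Hypothetical example: a -> b, b -> c, c -> a has minimal keys {a}, {b}, {c}.
cycle_cnf: List[Clause] = [
    (frozenset({"a"}), "b"),
    (frozenset({"b"}), "c"),
    (frozenset({"c"}), "a"),
]
```

Because one pass of `shrink` suffices (the family of keys is upward monotone), the work done between two consecutive outputs is bounded by a polynomial in the numbers of variables and clauses.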
4.1 Minimum Target Set Selection
In the Minimum Target Set Selection problem, we are given an undirected graph and a threshold function . As a starting step, we can activate a subset of vertices. In every subsequent round, a vertex becomes activated if at least of its neighbors are already active. The goal is to find a minimum sized initial set of active nodes (called a target set) so that the activation spreads to the entire graph.
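The activation process described above can be simulated directly. The following is a minimal sketch with synchronous rounds; the function names and the dictionary-based representation of the graph and of the thresholds are our own assumptions.

```python
from typing import Dict, Iterable, Set


def activate(adj: Dict[int, Set[int]], threshold: Dict[int, int], seed: Iterable[int]) -> Set[int]:
    """Run the activation rounds until no further vertex becomes active."""
    active = set(seed)
    while True:
        newly = {
            v for v in adj
            if v not in active and len(adj[v] & active) >= threshold[v]
        }
        if not newly:
            return active
        active |= newly


def is_target_set(adj: Dict[int, Set[int]], threshold: Dict[int, int], seed: Iterable[int]) -> bool:
    return activate(adj, threshold, seed) == set(adj)


# Hypothetical instance: a cycle on four vertices with all thresholds equal to 2.
cycle4 = {0: {1, 3}, 1: {0, 2}, 2: {1, 3}, 3: {0, 2}}
thr2 = {v: 2 for v in cycle4}
```

On this instance a single vertex never spreads, while two opposite vertices activate the remaining two in one round.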
Finding a minimum sized target set is rather difficult. Chen [10] showed that the problem is difficult to approximate within a factor already when all thresholds are 2 and the graph has a constant degree. Charikar et al. [9] proved that, assuming the Planted Dense Subgraph conjecture, Minimum Target Set Selection is in fact difficult to approximate within a factor of for every even for constant thresholds.
The aim of this section is to show that the problems of finding a minimum target set in a graph (Min-TSS) and of finding a minimum key of a pure Horn function (Min-Key) are closely related.
The Min-TSS problem with constant thresholds is polynomial-time reducible to the Min-Key problem.
Let , be an instance of the Min-TSS problem. For a vertex , we denote the set of its neighbors by . We construct a Horn CNF as follows (see Figure 2):
Note that can be determined in polynomial time as the thresholds are assumed to be constants. By the definition of , the activation process in is equivalent to the forward chaining process in . This means that is a target set of if and only if it is a key of , concluding the proof of the theorem. ∎
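The reduction can be sketched in Python under the following assumption on the construction: for every vertex v and every set S of exactly t(v) neighbors of v, the CNF contains the clause with body S and head v. The function names are hypothetical, and the number of clauses per vertex is polynomial only when the thresholds are bounded by a constant.

```python
from itertools import combinations
from typing import Dict, FrozenSet, Iterable, List, Set, Tuple

Clause = Tuple[FrozenSet[int], int]  # implication body -> head


def tss_to_horn(adj: Dict[int, Set[int]], threshold: Dict[int, int]) -> List[Clause]:
    """Assumed construction: one clause S -> v per t(v)-subset S of the neighbors of v."""
    clauses: List[Clause] = []
    for v, nbrs in adj.items():
        for body in combinations(sorted(nbrs), threshold[v]):
            clauses.append((frozenset(body), v))
    return clauses


def forward_closure(clauses: List[Clause], start: Iterable[int]) -> Set[int]:
    closure = set(start)
    changed = True
    while changed:
        changed = False
        for body, head in clauses:
            if head not in closure and body <= closure:
                closure.add(head)
                changed = True
    return closure


def is_key(clauses: List[Clause], variables: Iterable[int], cand: Iterable[int]) -> bool:
    return set(variables) <= forward_closure(clauses, cand)


# Hypothetical instance: a cycle on four vertices with all thresholds equal to 2.
cycle4 = {0: {1, 3}, 1: {0, 2}, 2: {1, 3}, 3: {0, 2}}
thr2 = {v: 2 for v in cycle4}
cnf = tss_to_horn(cycle4, thr2)
```

With this construction, firing a clause corresponds to activating its head once enough neighbors are active, so a vertex set spreads to the whole graph exactly when its forward chaining closure is the whole variable set.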
Based on a construction previously used in  for establishing a connection between the directed and undirected variants of the target set selection problem, we show that Min-TSS includes Min-Key as a special case.
The Min-Key problem is polynomial-time reducible to the Min-TSS problem.
Let be a pure Horn CNF on variables . We construct a graph together with a threshold function such that every key of is a target set of , while every target set of can be transformed to a key of without increasing the size of the set.
We add the set of variables to the vertices of , and define for . For every clause of , we construct a gadget as follows. We add a vertex that corresponds to the clause and set . For every variable , we add four new vertices and with thresholds and , together with the edges and . Finally, we add four new vertices and with thresholds and , together with the edges and (see Figure 3).
If is a key of , then the same set of vertices in form a target set. Indeed, when the forward chaining procedure uses a clause to reach a variable , then this step corresponds to the activation of through the gadget associated to in .
Now assume that is a target set of . We cannot directly say that is a key of as might contain vertices from . However, it is not difficult to see that
is a key of with , concluding the proof of the theorem. ∎
Given a graph and constant thresholds , we can generate all minimal target sets of with polynomial delay.
In this paper we defined unique key hypergraphs as Sperner hypergraphs that form the set of minimal keys of a unique pure Horn function. We gave a characterization of such hypergraphs, and showed that cuts of a matroid form a natural example. We proved that the recognition of unique key hypergraphs is co-NP-complete already when every hyperedge has size two. We identified several classes of graphs for which the recognition problem can be decided in polynomial time. We gave an algorithm for generating all minimal keys of a pure Horn function with polynomial delay. By showing that the problems of finding a minimum key of a pure Horn function and of finding a minimum target set of a graph are closely related, we extended our algorithm to generate all minimal target sets of a graph with polynomial delay when the thresholds are bounded by a constant. It remains an open question whether all minimal target sets can be generated with polynomial delay when the thresholds are unbounded.
Kristóf Bérczi was supported by the János Bolyai Research Fellowship of the Hungarian Academy of Sciences and by the ÚNKP-19-4 New National Excellence Program of the Ministry for Innovation and Technology. Ondřej Čepek and Petr Kučera gratefully acknowledge support from the Czech Science Foundation (Grant 19-19463S). Projects no. NKFI-128673 and no. ED_18-1-2019-0030 (Application-specific highly reliable IT solutions) have been implemented with the support provided from the National Research, Development and Innovation Fund of Hungary, financed under the FK_18 and the Thematic Excellence Programme funding schemes, respectively. This work was supported by the Research Institute for Mathematical Sciences, an International Joint Usage/Research Center located in Kyoto University.
- [1] M. Arias and J. L. Balcázar. Canonical Horn representations and query learning. In International Conference on Algorithmic Learning Theory, pages 156–170. Springer, 2009.
- [2] M. Arias and J. L. Balcázar. Construction and learnability of canonical Horn formulas. Machine Learning, 85(3):273–297, 2011.
- [3] W. W. Armstrong. Dependency structures of database relationships. Proc. IFIP 74. North Holland, Amsterdam, pp. 580–583, 1974.
- [4] G. Ausiello, A. D’Atri, and D. Sacca. Minimal representation of directed hypergraphs. SIAM Journal on Computing, 15(2):418–431, 1986.
- [5] E. Balas and C. S. Yu. On graphs with polynomially solvable maximum-weight clique problem. Networks, 19(2):247–253, 1989.
- [6] K. Bérczi, E. Boros, O. Čepek, P. Kučera, and K. Makino. Approximating minimum representations of key Horn functions. ArXiv e-prints, Nov. 2018.
- [7] E. Boros, Y. Crama, and P. L. Hammer. Polynomial-time inference of all valid implications for Horn and related formulae. Annals of Mathematics and Artificial Intelligence, 1(1-4):21–32, 1990.
- [8] N. Caspard and B. Monjardet. The lattices of closure systems, closure operators, and implicational systems on a finite set: a survey. Discrete Applied Mathematics, 127(2):241–269, 2003. Ordinal and Symbolic Data Analysis (OSDA ’98), Univ. of Massachusetts, Amherst, Sept. 28-30, 1998.
- [9] M. Charikar, Y. Naamad, and A. Wirth. On approximating target set selection. In Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2016). Schloss Dagstuhl-Leibniz-Zentrum fuer Informatik, 2016.
- [10] N. Chen. On the approximability of influence in social networks. SIAM Journal on Discrete Mathematics, 23(3):1400–1415, 2009.
- [11] B. Courcelle. The monadic second-order logic of graphs. I. Recognizable sets of finite graphs. Information and Computation, 85(1):12–75, 1990.
- [12] B. Courcelle, J. A. Makowsky, and U. Rotics. Linear time solvable optimization problems on graphs of bounded clique-width. Theory of Computing Systems, 33(2):125–150, Apr 2000.
- [13] W. F. Dowling and J. H. Gallier. Linear-time algorithms for testing the satisfiability of propositional Horn formulae. The Journal of Logic Programming, 1(3):267–284, 1984.
- [14] T. Eiter and K. Makino. On computing all abductive explanations from a propositional Horn theory. Journal of the ACM (JACM), 54(5):24–es, 2007.
- [15] J.-L. Guigues and V. Duquenne. Familles minimales d’implications informatives résultant d’un tableau de données binaires. Mathématiques et Sciences humaines, 95:5–18, 1986.
- [16] P. L. Hammer and A. Kogan. Optimal compression of propositional Horn knowledge bases: complexity and approximation. Artificial Intelligence, 64(1):131–145, 1993.
- [17] P. Kučera. Hydras: Complexity on general graphs and a subclass of trees. Theor. Comput. Sci., 658:399–416, 2014.
- [18] D. Maier. Minimum covers in the relational database model. In Proceedings of the eleventh annual ACM symposium on Theory of computing, pages 330–337. ACM, 1979.
- [19] K. Makino and T. Ibaraki. The maximum latency and identification of positive Boolean functions. SIAM Journal on Computing, 26(5):1363–1383, 1997.
- [20] H. Nishimura and S. Kuroda. A Lost Mathematician, Takeo Nakasawa: The Forgotten Father of Matroid Theory. Springer Science & Business Media, 2009.
- [21] P. Seymour. On incomparable families of sets. Mathematica, 20:208–209, 1973.
- [22] R. H. Sloan, D. Stasi, and G. Turán. Hydras: Directed hypergraphs and Horn formulas. Theor. Comput. Sci., 658:417–428, 2017.
- [23] S. Tsukiyama, M. Ide, H. Ariyoshi, and I. Shirakawa. A new algorithm for generating all the maximal independent sets. SIAM Journal on Computing, 6(3):505–517, 1977.
- [24] H. Whitney. On the abstract properties of linear dependence. In Hassler Whitney Collected Papers, pages 147–171. Springer, 1992.