Familial Graph Compression (FGC) is a problem introduced in . The problem entails determining whether it is possible to convert a given graph to a target graph via a series of “compressions” based on the presence of certain sub-graphs in , specified in a set . A complete definition is given in the next section. A single instance of FGC involves , , and as input. This problem was proven to be NP-complete in :
Theorem 1.1.
The FGC problem is NP-complete when:
(1) is a simple graph on nodes, is the single node graph, and the family contains a single motif, i.e. a cycle on nodes.
(2) is a simple graph on nodes, is the single node graph, and contains a single motif with disjoint triangles.
(3) is a simple graph, is a forest of isolated nodes, and is a family of graphlets.
In this work, we provide an easier proof for the third setting.
2 Notation and Terminology
We adopt the same notation and terminology as in . The relevant preliminaries are restated below.
A graph is a collection of nodes and edges, i.e. pairwise interactions between nodes. For a node , its neighborhood is defined as the set of all nodes such that there exists an edge in . The degree is defined as the size of the neighborhood of a node . is undirected and unweighted, i.e. for , an edge is the same as the edge . For a fixed graph , a given is called a motif of if is isomorphic to a sub-graph in , i.e. is a motif if there exists and a function such that for all edges there is an edge . Similarly, is called a graphlet of if is isomorphic to an induced sub-graph in , i.e. is a graphlet if there exists and a function such that for all edges if and only if there is an edge . We will use the term motif (and similarly graphlet) for both and any of its isomorphic copies in .
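The difference between a motif and a graphlet can be illustrated with a small brute-force check. The following is a minimal Python sketch, not taken from the original work: the function name `occurs` and the edge-list representation are hypothetical choices, and the sketch assumes the mapping between node sets is injective. It is only practical for very small graphs.

```python
from itertools import permutations

def occurs(p_nodes, p_edges, g_nodes, g_edges, induced):
    """Brute-force test of whether a pattern occurs in a graph.

    induced=False tests the motif condition (every pattern edge is present).
    induced=True tests the graphlet condition (edges and non-edges both match).
    Assumption: the node mapping is injective.
    """
    p_set = {frozenset(e) for e in p_edges}  # undirected, unweighted edges
    g_set = {frozenset(e) for e in g_edges}
    p_nodes = list(p_nodes)
    # Try every injective assignment of pattern nodes to graph nodes.
    for image in permutations(g_nodes, len(p_nodes)):
        f = dict(zip(p_nodes, image))
        ok = True
        for i, u in enumerate(p_nodes):
            for v in p_nodes[i + 1:]:
                in_p = frozenset((u, v)) in p_set
                in_g = frozenset((f[u], f[v])) in g_set
                if in_p and not in_g:
                    ok = False  # a pattern edge is missing in the graph
                if induced and in_g and not in_p:
                    ok = False  # an extra edge breaks the induced condition
        if ok:
            return True
    return False

# A path on three nodes inside a triangle.
path = ([0, 1, 2], [(0, 1), (1, 2)])
triangle = (["a", "b", "c"], [("a", "b"), ("b", "c"), ("a", "c")])
is_motif = occurs(*path, *triangle, induced=False)
is_graphlet = occurs(*path, *triangle, induced=True)
```

In this example the path is a motif of the triangle but not a graphlet of it, because every three nodes of the triangle induce all three edges.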
For a given equivalence relation on the node set of a graph , the quotient graph, denoted by , is a graph whose node set is the set of equivalence classes defined by , and there is an edge between a pair of nodes (classes) if and only if there is an edge between some pair of nodes of the two corresponding classes in . Intuitively, in quotient graphs, prescribed subsets of nodes are merged and the incidence is preserved without creating multi-edges . We will repeatedly deal with graphs with names , and ; their node and edge sets will, respectively, be denoted by , and . Finally, for a set and a positive integer , is defined as the set of all subsets of with exactly elements.
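As an illustration of the quotient construction, here is a minimal Python sketch. The function name `quotient_graph` and the representation of the equivalence relation as a list of disjoint blocks are hypothetical. The sketch also assumes that edges inside a single class are dropped, so that the quotient stays a simple graph without self-loops.

```python
def quotient_graph(edges, classes):
    """Build the quotient of a graph under a partition of its nodes.

    edges:   iterable of (u, v) pairs of an undirected simple graph
    classes: list of disjoint node sets that together cover all nodes
    Returns (q_nodes, q_edges), where classes are named by their list index.
    """
    class_of = {}
    for idx, block in enumerate(classes):
        for v in block:
            class_of[v] = idx
    q_nodes = set(range(len(classes)))
    q_edges = set()
    for u, v in edges:
        cu, cv = class_of[u], class_of[v]
        if cu != cv:  # assumption: edges inside a class are dropped
            # A set of frozensets cannot hold duplicates, so no multi-edges.
            q_edges.add(frozenset((cu, cv)))
    return q_nodes, q_edges

# A 4-cycle a-b-c-d-a in which the opposite nodes a and c are merged.
cycle_edges = [("a", "b"), ("b", "c"), ("c", "d"), ("d", "a")]
q_nodes, q_edges = quotient_graph(cycle_edges, [{"a", "c"}, {"b"}, {"d"}])
```

Here the four edges of the cycle collapse to two edges in the quotient: both `a-b` and `b-c` map to the same pair of classes, and likewise `c-d` and `d-a`.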
2.2 Familial Graph Compression
We start by defining an equivalence relation on the node set of based on a motif (or a graphlet) . Consider the relation where node is related to whenever both and lie in a sub-graph of isomorphic to . We define to be the transitive closure of . Intuitively, if two motifs (resp. graphlets) share a common node in , then all nodes in both motifs (resp. graphlets) are related in . Clearly, is an equivalence relation on . Then, an -compression step (referred to as a compression step when is clear from the context) is defined as computing the quotient graph . Recall that a quotient graph is a graph on classes in the partition , where two classes are adjacent if some pair of nodes in the corresponding classes are adjacent in the graph . The familial compression of a graph for a family is the process of repeatedly applying -compression steps on , where after each step is replaced by the quotient graph of the previous step. Thus, we say that a graph can be constructed by a -compression of if there exists a sequence of graphs: where and i.e. is the result of an -compression on the graph for some . Note that a graph may be constructed in several different ways via different compression steps. To avoid trivial compressions, we require that each contain at least three nodes. The following is the FGC problem:
Problem 2.1 (Familial Graph Compression).
Given simple graphs, , and , and a family of motifs (or graphlets) , can be constructed from a -compression of ?
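A single compression step, as defined above, can be sketched in Python as follows. This is a minimal brute-force illustration under stated assumptions, not the authors' implementation: the name `compression_step` is hypothetical, copies of the pattern are found by exhaustive search over injective node maps, the transitive closure is computed with a union-find structure, and each class of the quotient is named by its union-find representative.

```python
from itertools import permutations

def compression_step(nodes, edges, f_nodes, f_edges, induced=False):
    """One compression step for a pattern (motif if induced=False, else graphlet)."""
    nodes = list(nodes)
    g_set = {frozenset(e) for e in edges}
    p_set = {frozenset(e) for e in f_edges}
    p_nodes = list(f_nodes)
    parent = {v: v for v in nodes}  # union-find forest over the nodes

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving
            v = parent[v]
        return v

    def is_copy(f):
        for i, u in enumerate(p_nodes):
            for v in p_nodes[i + 1:]:
                in_p = frozenset((u, v)) in p_set
                in_g = frozenset((f[u], f[v])) in g_set
                if (in_p and not in_g) or (induced and in_g and not in_p):
                    return False
        return True

    # Relate all nodes that lie in a common copy of the pattern; merging the
    # union-find classes yields the transitive closure of that relation.
    for image in permutations(nodes, len(p_nodes)):
        if is_copy(dict(zip(p_nodes, image))):
            root = find(image[0])
            for v in image[1:]:
                parent[find(v)] = root

    # Quotient graph: one node per class, no self-loops, no multi-edges.
    q_nodes = {find(v) for v in nodes}
    q_edges = {frozenset((find(u), find(v)))
               for u, v in edges if find(u) != find(v)}
    return q_nodes, q_edges

# A triangle 0-1-2 with a tail 2-3-4, compressed by the triangle pattern.
g_nodes = [0, 1, 2, 3, 4]
g_edges = [(0, 1), (1, 2), (0, 2), (2, 3), (3, 4)]
tri = ([0, 1, 2], [(0, 1), (1, 2), (0, 2)])
q_nodes, q_edges = compression_step(g_nodes, g_edges, *tri)
```

In this example the triangle collapses to a single node while the tail is untouched, so the quotient is a path on three nodes. Deciding FGC would additionally require searching over sequences of such steps, which this sketch does not attempt.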
In the original proof of Theorem 1.1-(3), a reduction is provided from a variant of the 3-SAT problem to FGC. In this section we establish the same result via a reduction from Exact Cover by Three Sets (XC3), defined below.
Problem 3.1 (Exact Cover by Three Sets ).
Let , and let be a collection of 3-element subsets of , in which no element in appears in more than three subsets. For . The problem consists of determining whether has an exact cover for , i.e. a such that every element in occurs in exactly one member of .
This problem was proven to be NP-complete in . Note that for our reduction, the fact that “each element appears in no more than three subsets” is inconsequential.
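To make the XC3 problem concrete, the following minimal Python sketch decides small instances by exhaustive search. The function name `exact_cover` is hypothetical, the sketch assumes every triple is a 3-element subset of the ground set, and the running time is exponential, as expected for a brute-force approach to an NP-complete problem.

```python
from itertools import combinations

def exact_cover(universe, triples):
    """Return a list of triples forming an exact cover, or None if none exists.

    Assumption: every triple is a 3-element subset of `universe`.
    """
    universe = set(universe)
    q, remainder = divmod(len(universe), 3)
    if remainder:
        return None  # an exact cover by triples needs 3q elements
    for choice in combinations(triples, q):
        # q triples of size 3 that cover all 3q elements must be pairwise
        # disjoint, so checking coverage alone is sufficient.
        if set().union(*choice) == universe:
            return list(choice)
    return None

ground = {1, 2, 3, 4, 5, 6}
solvable = exact_cover(ground, [{1, 2, 3}, {3, 4, 5}, {4, 5, 6}, {1, 4, 6}])
unsolvable = exact_cover(ground, [{1, 2, 3}, {3, 4, 5}, {1, 4, 6}])
```

The first instance has the exact cover consisting of `{1, 2, 3}` and `{4, 5, 6}`; in the second instance no two triples are disjoint, so no exact cover exists.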
Suppose we are given an instance of XC3, i.e. the sets and . We show how to construct graphs , and , and a family for an FGC instance that is solvable if and only if the given XC3 instance is solvable.
Let denote a cycle on vertices. Let for . The graph is the union of disjoint cycles: . For each , we define a graph which is the union of three disjoint cycles: . The family contains for each : . Finally, the target graph is a graph on isolated vertices, i.e. , and .
Intuitively, when a is compressed in , it corresponds to selecting a to form an exact cover for . Observe that FGC would not allow the same element to be covered by different ’s, since the cycles corresponding to the covered elements no longer exist in the quotient graph and therefore cannot be compressed (selected) again. We get isolated vertices if and only if disjoint 3-element subsets form an exact cover of . Clearly, the reduction can be performed in polynomial time. ∎
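The construction can be sketched in Python under explicit assumptions. The exact cycle lengths used in the proof are not recoverable from the text here, so the sketch assigns the i-th element (in sorted order, counting from zero) a cycle of length i + 3; this choice is an assumption made only so that the cycles have distinct lengths of at least three. The names `build_instance` and `compressible_to_target` are hypothetical. Rather than running general graph compression, the search tracks only which cycles remain, relying on the observation from the proof that compressing a family member collapses its three cycles into one isolated vertex.

```python
def build_instance(universe, triples):
    """Sketch of the reduction from XC3 to FGC.

    Assumption: element number i (sorted order, from 0) gets a cycle of
    length i + 3. Returns the element-to-length map, the edge list of the
    disjoint union of cycles, the family (each member described by the set
    of its three cycle lengths), and q, the number of isolated target nodes.
    """
    elems = sorted(universe)
    cycle_len = {x: i + 3 for i, x in enumerate(elems)}
    g_edges = []
    for x, k in cycle_len.items():
        # Nodes of the cycle for element x are the pairs (x, 0), ..., (x, k - 1).
        g_edges += [((x, t), (x, (t + 1) % k)) for t in range(k)]
    family = [frozenset(cycle_len[x] for x in c) for c in triples]
    return cycle_len, g_edges, family, len(elems) // 3

def compressible_to_target(universe, triples):
    """Search over compression sequences at the level of remaining cycles."""
    cycle_len, _, family, q = build_instance(universe, triples)

    def search(remaining, isolated):
        if not remaining:
            return isolated == q  # only isolated vertices are left
        # A family member can be compressed only if all three of its cycles
        # are still present; doing so replaces them with one isolated vertex.
        return any(search(remaining - f, isolated + 1)
                   for f in family if f <= remaining)

    return search(frozenset(cycle_len.values()), 0)

ground = {1, 2, 3, 4, 5, 6}
yes_instance = compressible_to_target(
    ground, [{1, 2, 3}, {3, 4, 5}, {4, 5, 6}, {1, 4, 6}])
no_instance = compressible_to_target(
    ground, [{1, 2, 3}, {3, 4, 5}, {1, 4, 6}])
```

On these two small instances the search succeeds exactly when the triples admit an exact cover, which matches the equivalence argued in the proof. The search itself is exponential; only `build_instance` corresponds to the polynomial-time reduction.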
Observe that the , , and used in Theorem 3.1 are exactly as described in Theorem 1.1-(3). We note that this reduction holds even when is a family of motifs. We also observe that some simple changes to the provided reduction can be made to show the following:
FGC is NP-complete when is a connected, simple graph, H is the single node graph, and is a family of graphlets or motifs.
- (2020) Interpretable multi-scale graph descriptors via structural compression. Information Sciences.
- (2006) Graph theory, combinatorics and algorithms: interdisciplinary applications. Vol. 34, Springer Science & Business Media.
- (1985) Clustering to minimize the maximum intercluster distance. Theoretical Computer Science 38, pp. 293–306.