Robust Factorizations and Colorings of Tensor Graphs

Since the seminal result of Karger, Motwani, and Sudan, algorithms for approximate 3-coloring have primarily centered around SDP-based rounding. However, it is likely that important combinatorial or algebraic insights are needed in order to break the $n^{o(1)}$ threshold. One way to develop new understanding in graph coloring is to study special subclasses of graphs. For instance, Blum studied the 3-coloring of random graphs, and Arora and Ge studied the 3-coloring of graphs with low threshold-rank. In this work, we study graphs which arise from a tensor product, which appear to be novel instances of the 3-coloring problem. We consider graphs of the form $H = (V,E)$ with $V = V(K_3 \times G)$ and $E = E(K_3 \times G) \setminus E'$, where $E' \subseteq E(K_3 \times G)$ is any edge set such that no vertex has more than an $\epsilon$ fraction of its edges in $E'$. We show that one can construct $\widetilde{H} = K_3 \times \widetilde{G}$ with $V(\widetilde{H}) = V(H)$ that is close to $H$. For arbitrary $G$, $\widetilde{H}$ satisfies $|E(H) \Delta E(\widetilde{H})| \le O(\epsilon |E(H)|)$. Additionally, when $G$ is a mild expander, we provide a 3-coloring for $H$ in polynomial time. These results partially generalize an exact tensor factorization algorithm of Imrich. On the other hand, without any assumptions on $G$, we show that it is NP-hard to 3-color $H$.


1 Introduction

The 3-coloring problem is one of the most classical problems in theoretical computer science [Kar75]. Although it is NP-hard, much effort has been made to understand the approximate 3-coloring problem: given a 3-colorable graph on $n$ vertices as input, what is the fewest number of colors with which one can efficiently color the graph? (In this paper, all graphs are undirected, simple, i.e., with no double edges, and loopless, i.e., with no edge from a vertex to itself.) Initially, combinatorial algorithms were dominant in approximate 3-coloring, bringing us Wigderson’s famous $O(\sqrt{n})$-approximation [Wig83], as well as Blum’s $\tilde{O}(n^{3/8})$-approximation and Blum’s 3-coloring algorithm for many random 3-colorable graphs [Blu94]. Then, SDP algorithms took center stage, beginning with the celebrated work of Karger, Motwani, and Sudan [KMS98]. (The one exception to this trend is Kawarabayashi and Thorup’s $\tilde{O}(n^{4/11})$-approximation [KT12].) With a lot of extra work, clever observations were made to augment their algorithm or combine it with combinatorial algorithms and obtain better approximation results [AC06, Chl07, KT17]. We give a more complete history of the 3-coloring problem in Section 1.2.

It is unclear how much further success can be obtained by combining SDP algorithms with fancier combinatorial techniques, and it is likely that completely new ideas are needed. One way to continue building new insight on 3-coloring is to focus on interesting subclasses of 3-colorable graphs. Indeed, Blum did just this in targeting random 3-colorable graphs [Blu94], and Arora and Ge generalized this result by studying low threshold rank 3-colorable graphs [AG11]. Additionally, the improvement from Kawarabayashi and Thorup comes from focusing on graphs with high degree [KT17]. Overall, we do not fully understand what properties make graphs easy or hard to color, and our work is a further exploration of this.

In this work, we propose a new class of 3-colorable graphs that are interesting for the 3-coloring problem: graphs that are close to the tensor graph $K_3 \times G$. For undirected graphs $G_1$ and $G_2$, their tensor product $G_1 \times G_2$ (also known as the cardinal, direct, or Kronecker product, among other names) is a graph on the vertex set $V(G_1) \times V(G_2)$, where vertices $(u_1, u_2)$ and $(v_1, v_2)$ are adjacent if and only if $(u_1, v_1) \in E(G_1)$ and $(u_2, v_2) \in E(G_2)$. Observe that if $G$ is connected, we can set $H = K_3 \times G$, and the graph $H$ is easy to 3-color. In particular, we first locate (say by brute force) one triple of the form $\{(i, v) : i \in V(K_3)\}$ for some $v$ in $V(G)$, which we call a core triple. We color the core triple with three distinct colors, and then observe that the colors of the neighbors in the graph are forced. That is, for any $u$ in $v$’s neighborhood, the core triples of $u$ and $v$ have 6 edges between them such that if $v$’s copy of $K_3$ is colored with three distinct colors, there is only one valid coloring for $u$’s copy. This coloring propagates throughout $H$. See Figure 1 for an illustration. On the other hand, suppose we delete edges from $K_3 \times G$ to form a graph $H$. If the number of deletions is large enough that a coloring is not immediately forced from fixing the colors on one core triple, then it is not obvious how to 3-color $H$.
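As a rough illustration of the propagation argument above, the following Python sketch builds a tensor product from two edge lists and pushes colors outward from one core triple. The function names `tensor_product` and `propagate_coloring`, the labels 0, 1, 2 for the vertices of the triangle, and the 5-cycle used as the second factor are assumptions made for this example only. The sketch covers the exact tensor case with no deleted edges; it is not the paper's algorithm for the deletion model.

```python
def tensor_product(edges_a, edges_b):
    """Adjacency sets of the tensor product of two undirected simple graphs.

    Each graph is given as a list of undirected edges (pairs of vertices).
    """
    adj = {}
    for (u, u2) in edges_a:
        for (v, v2) in edges_b:
            # One edge from each factor yields two edges in the product.
            for x, y in (((u, v), (u2, v2)), ((u, v2), (u2, v))):
                adj.setdefault(x, set()).add(y)
                adj.setdefault(y, set()).add(x)
    return adj


def propagate_coloring(adj, seed):
    """Extend a partial 3-coloring by repeatedly applying forced choices.

    A vertex is forced when its already-colored neighbors use exactly two of
    the three colors. Returns None if some vertex sees all three colors.
    The result may be partial if propagation gets stuck.
    """
    color = dict(seed)
    changed = True
    while changed:
        changed = False
        for x in adj:
            if x in color:
                continue
            seen = {color[y] for y in adj[x] if y in color}
            if len(seen) == 3:
                return None  # no color left for x
            if len(seen) == 2:
                color[x] = ({0, 1, 2} - seen).pop()
                changed = True
    return color


# Example: triangle tensored with a 5-cycle, seeded at the core triple of vertex 0.
K3 = [(0, 1), (1, 2), (0, 2)]
C5 = [(i, (i + 1) % 5) for i in range(5)]
H = tensor_product(K3, C5)
seed = {(0, 0): 0, (1, 0): 1, (2, 0): 2}
coloring = propagate_coloring(H, seed)
```

In this example every vertex ends up colored, because each core triple adjacent to an already-colored core triple sees exactly two colors on each of its vertices. Once edges are deleted, a vertex may see fewer than two colors and the loop can stall, which is the difficulty the text describes.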

Another way to view this is via LP hierarchies. In particular, consider a -level Sherali-Adams lift on the basic 3-coloring LP. This simple coloring algorithm that was successful for 3-coloring is also successful for 3-coloring exactly when the lifted LP variables directly provide the core triples. In fact, we prove this algorithm is successful when is an expander for our deletion model in Theorem 2. Instead of arguing that a lifted LP/ SDP rounding procedure could succeed through properties of the LP solution– we do not know whether such a proof is possible in our setting– we study its combinatorial (or topological!) analog. We believe this is a more promising avenue to complement the existing 3-coloring work, and overall lead to more progress on the problem.

Figure 1: To color $K_3 \times G$, fix a triple of vertices with 3 colors. Then, see what colors this forces on the rest of the graph. Some initial triple will induce a valid 3-coloring.

However, this seemingly simple family of graphs does not seem to be properly captured by the current literature on coloring algorithms. In particular, to the best of our knowledge, there is no guarantee that previous analyses of coloring algorithms would work well on . First, can look far from random, which prohibits a guaranteed success by Blum’s coloring tools [Blu94]. Additionally, the threshold rank of is uncontrollable. If has high threshold rank, then the tensor has high threshold rank, and the deletion of edges to form has varied effects on the spectrum. Overall, this means we have no guarantee for a polynomial time -coloring by the algorithm from Arora and Ge [AG11] 444It is possible that there exists some reweighting of the edges of such that the reweighted graph has at most polylog threshold rank. This leaves open the possibility that a combination of combinatorial techniques together with the algorithm by Arora and Ge [AG11] could produce a quasi-polynomial time algorithm using many colors.. To the best of our knowledge, current analyses of gaussian cap rounding procedures would not color near tensors in constant colors. In response, we ask the following question:


Suppose edges are deleted from $K_3 \times G$ to build $H$. When can one 3-color $H$ in polynomial time?


If no edges are deleted from the tensor product, polynomial time factorization is possible. We will describe an algorithm presented by Imrich [Imr98] in Section 1.3.1, which shows that if and are connected non-bipartite, and prime with respect to the tensor product 555 I.e. there is no such graphs on more than 1 vertex where , and same for . one can reduce to a factoring problem for the Cartesian product.666The Cartesian product of two graphs is a graph on such that in if and or and . Note that unless and contain loops, and have disjoint sets of edges. In particular, Imrich constructs a Cartesian product graph (without knowing what and are), where and . Then, a similarity metric by Winkler [Win87] (see also [Hoc92, IP07, FHS85]) can be used on to identify components, one of which will build the graph and the other . Until then, one important thing to note about these procedures is that they are very brittle to any deviations from a tensor graph. In other words, these procedures cannot be run on graphs which are very close to a tensor, e.g. adding a small number of edges to would turn it into a tensor. Tensor product graphs (and graphs close to a tensor product) are in a wide range of applications, including image processing [DM84], network design [LCK10], complex datasets [SM14], dynamic location theory [HIK11], and chemical graph theory [DEG01]. We note that graphs near tensor products are especially important in modeling social networks [SM14]. Therefore, an interesting combinatorial question is


When can one approximately factor a graph that is close to a tensor product in polynomial time?


We now present some models and results on these questions.

1.1 Problem and theorem statements

Due to both the motivation from 3-coloring and our interest in robust tensor graph factoring algorithms, we study the following question. Let $K_3$ be the clique on three vertices, $G$ be a graph on $n$ vertices, and $K_3 \times G$ be their tensor product. We consider graphs of the form $H = (V, E)$ with $V = V(K_3 \times G)$ and $E = E(K_3 \times G) \setminus E'$, for an edge set $E' \subseteq E(K_3 \times G)$ such that no vertex in $K_3 \times G$ has more than an $\epsilon$ fraction of its incident edges in $E'$. We say that such an $H$ is $\epsilon$-near the triangle tensor product $K_3 \times G$. Our deletion model is very general, as the deleted edges could be adversarially chosen in such a way that nodes with substantially different looking neighborhoods in $K_3 \times G$ look the same in $H$. This model appears to be novel, as approximate graph products have mainly been studied for the Cartesian product, not for the tensor product; we discuss the related work further in Subsection 1.2.

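Suppose we are given $H$ as above. Our primary reconstruction goal is as follows. We construct $\widetilde{H} = K_3 \times \widetilde{G}$ with $V(\widetilde{H}) = V(H)$. For the reconstruction goal (as the name suggests, there are other natural reconstruction goals; see Section 5 for more details), we reconstruct $\widetilde{H}$ so that $|E(H) \Delta E(\widetilde{H})| \le O(\epsilon |E(H)|)$.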

We prove the following algorithmic results in Section 3. Our first theorem holds for any .

Theorem 1.

Assume . Let be -near . Then, there is an algorithm running in time that constructs a tensor with achieving the reconstruction goal.

Our second theorem holds when is an expander. For any vertex , let denote the neighbors of . We will call an -edge-expander if for every , the following holds

Theorem 2.

Fix . Let be -near , where graph is a -edge-expander. Then, there is an algorithm running in time that extracts a valid 3-coloring of .

As we shall see, this theorem is possible as the expansion of allows for partial three-coloring found in Theorem 1 to be converted into a globally consistent three-coloring. We can replace the condition in Theorem 2 that is a -edge-expander with a small-set expansion condition on . See Section 3.6 for more details.

We complement our algorithmic results by showing that 3-coloring is hard for general .

Theorem 3.

Given as input an graph which is -near for some , it is NP-hard to find a -coloring of .

Our algorithms are combinatorial, although they seem to draw on some topological properties of graph tensors (see the technical overview).

1.2 Related work

As our work bridges the rich theory of graph factoring algorithms with the world of approximate graph coloring, there are a number of prior works which relate to our investigation.

Approximate coloring algorithms.

We refer the reader to the introduction in [KT17] for a detailed recap of 3-coloring progress over the past several decades. A simple algorithm by Wigderson [Wig83] uses $O(\sqrt{n})$ colors and has remained an important subroutine in coloring graphs with high degree [KMS98]. Blum’s work broke the barrier of by using more intricate combinatorial techniques, and again his framework has inspired further coloring work, in particular those of Kawarabayashi and Thorup [KT12, KT17]. Since the seminal result of Karger, Motwani, and Sudan [KMS98], algorithmic results in approximate graph coloring have focused on SDP based algorithms [AC06, Chl07, AG11, KT17]. Several integrality gap instances for the original 3-coloring SDP formulation presented by Karger, Motwani, and Sudan [KMS98] have been found by Karger, Motwani, and Sudan [KMS98]; Frankl et al. [FR87, AG11]; and Feige et al. [FLS04]. These graphs have valid solutions to the SDP, but have chromatic number at least , for small constant and the size of the vertex set. On the other hand, if a 3-colorable graph has threshold rank , i.e. once its eigenvalues are scaled to be in , at most of them are less than , then one can color the graph in colors in time [AG11].

Hardness of approximate graph coloring.

The hardness results for approximate graph coloring are quite far from the algorithmic results. For a 3-colorable graph it is known that it is NP-hard to color with 5 colors [BBKO21], beating a long-standing record of 4 colors [KLS93, GK04, BG16]. For -colorable graphs, it is NP-hard to color with colors [WŽ20]. Under a variety of conditional assumptions, it is hard to color a 3-colorable graph with superconstant colors (up to roughly ), see [DMR09, DS10, WŽ20, GS20].

Graph factoring algorithms.

Given a graph product, one can efficiently factor a graph with respect to the product into prime graphs, i.e., graphs that cannot be further non-trivially factored according to the product. For the Cartesian product, any finite, simple, connected graph can be factored in polynomial time [FHS85, Win87, Fed92, IP07]. Moreover, Imrich and Peterin [IP07] present an algorithm that performs this factorization in time and space linear in the number of edges. Similarly, for the strong product, whose edge set is the union of the edge sets of the Cartesian product and the tensor product, a polynomial time factorization was found by Feigenbaum and Schäffer [FS92] for finite, simple, connected graphs. To factor a graph with respect to the tensor product, one can combine algorithms of Imrich and Winkler [Imr98, Win87]; we detail these algorithms, and how they can be combined to decompose a tensor product, at the end of Section 1.3.1. For both the Cartesian product and the strong product, prime factorizations (i.e., decompositions into irreducible factors) are unique for finite, connected, simple graphs [HIK11]. In the case of the tensor product, we are required to make the additional assumption that the graph is non-bipartite; with this assumption, a unique factorization can also be found in polynomial time [Imr98].

Approximate graph products.

The theory of approximate graph products has also been studied before, but in different settings and with different goals from ours. For a graph $G$ that is close to a product graph $H$, i.e., it takes a small number of edge deletions or insertions to transform $G$ to the product graph $H$, previous works study how to find $H$ [ZZ01, ZZ07, HIKS09a, HIKS09b, HIK13]. Feigenbaum and Haddad [FH89] showed that for the Cartesian product, obtaining such an $H$ with the fewest possible edge insertions or the fewest possible edge deletions is NP-hard. Overall, the Cartesian product is the most well studied graph product, and both approximate Cartesian product and strong product graphs have connections to theoretical biology, as they model evolutionary relationships of observable characteristics [HIKS09a, HIKS09b]. To the best of our knowledge, prior to our work, approximate graph products have only been studied for the Cartesian product and the strong product, and not with respect to the tensor product. The closest problem is the Nearest Kronecker Product problem, which given a matrix $A$ seeks to find matrices $B$ and $C$ such that $\|A - B \otimes C\|_F$ is minimized, where $B \otimes C$ is the Kronecker product of $B$ and $C$ and $\|\cdot\|_F$ is the Frobenius norm [VLP93, VL00]. Note that the relation between this problem and the approximate tensor product problem lies in the fact that the adjacency matrix of the tensor product of two graphs is the Kronecker product of the underlying adjacency matrices. Another similar problem is the closest separable state problem in quantum systems, which approximates the entanglement of a system by measuring how far it is from a composite of separable states [MPS10, WPSW20]. More on product graphs, their factorizations, and approximate graph products can be found in the book by Hammack, Imrich, and Klavžar [HIK11].

Learning theory.

A related line of work to ours is tensor decomposition in the learning theory community. Tensors (not necessarily tensor graphs, just tensors) represent higher order information from data. A common goal is to uncover the latent (hidden) variables underlying some data in order to understand it in a lower dimensional form [Moi14]. One popular way to achieve this is the CP (CANDECOMP/PARAFAC) decomposition, which writes a tensor as a sum of rank 1 tensors [Rou21, JGK19]. Other related decompositions are PCA, Tensor Robust PCA, and the Tucker decomposition [Rou21, JGK19, LFC16]. A similar research topic is reconstructing a partially observed tensor (e.g., [ZSCL18]).

1.3 Technical overview

We now present overviews for the proofs of Theorems 1, 2, and 3. The full proofs for Theorems 1 and 2 are in Section 3, and the full proof for Theorem 3 is in Section 4.

1.3.1 Overview of reconstructing a tensor graph

To give intuition for our results, we first summarize Imrich’s algorithm for efficiently factoring a tensor . The first goal is to find a graph on the same vertex set as that is isomorphic to a Cartesian product . The procedure Imrich uses to construct was inspired by an algorithm of Feigenbaum and Schäffer [FS92] for the strong product.

The key to constructing is the following observation on intersections of neighborhoods in tensor graphs. For any vertices in , let . Since is a tensor graph, one can show that , where is the projection of onto , etc. In particular, if is maximal (as a set) among , then it must be that either or . In particular, adding all such maximal edges to will keep consistent with a Cartesian product , where is on the same vertices as and is on the same vertices as . If is not a connected graph, we repeat this procedure, where the maximal ’s are found for ’s which are not in the same connected component as .888There are other considerations which Imrich carefully handles, such as if there is a third vertex with and .
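The neighborhood identity behind this observation can be written out explicitly. The factor names $A$ and $B$ and the vertex names below are notation assumed for this sketch, and need not match the paper's own symbols.

```latex
% Neighborhoods in a tensor product factor coordinate-wise:
\[
  N_{A \times B}\big((a,b)\big) = N_A(a) \times N_B(b),
\]
% so common neighborhoods factor as well:
\[
  N_{A \times B}\big((a,b)\big) \cap N_{A \times B}\big((a',b')\big)
  = \big(N_A(a) \cap N_A(a')\big) \times \big(N_B(b) \cap N_B(b')\big).
\]
% Special case K_3 x G: take distinct i, j in V(K_3), let k be the third
% vertex of K_3, and let v be any vertex of G. Since N_{K_3}(i) = {j, k}:
\[
  N\big((i,v)\big) \cap N\big((j,v)\big) = \{k\} \times N_G(v),
  \qquad
  \big|N\big((i,v)\big) \cap N\big((j,v)\big)\big|
  = \deg_G(v) = \tfrac{1}{2}\,\deg_{K_3 \times G}\big((i,v)\big).
\]
```

In the special case, two vertices of the same core triple share exactly half of their neighbors when no edges have been deleted.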

Figure 2: An illustration of Imrich’s algorithm. The algorithm is given an unknown connected, nonbipartite tensor . Imrich’s algorithm builds , with and . Then, the algorithm factors to find and , and projects onto and to find and .

Once is constructed, factoring the Cartesian product can be done with a variety of algorithms, including the one of Winkler [Win87].999Winkler’s algorithm is rather elegant. Let , the length of the shortest path between and in . Define to be similar if While this similarity relation may not be transitive, we can perform a depth first search to find all components that are transitively similar. These components correspond precisely to the Cartesian product. Once this algorithm is run, assume101010If not, then and thus can be decomposed into more than two factors. that we find Cartesian factors and of such that and . From this factorization, we can extract and from with the following projection trick. Given and with , we add to and to . See Figure 2 for an illustration. This completes the factoring algorithm.

1.3.2 Reconstruction Algorithm

To approximately factor a graph $H$, we need to approximate, for each vertex $v$ of $H$, the vertex of $K_3$ and the vertex of $G$ corresponding to $v$, which we call the “color class” and “$G$-class” of $v$, respectively. Since $G$ is unknown, we first try to group the vertices of $H$ into triangles to work out the color class of each vertex, and then use the edges between these triangles to estimate $G$.

Candidate edge graph.

For Imrich, it sufficed to connect pairs of nodes in the surrogate Cartesian product graph whose neighborhoods’ intersection in satisfied some maximality criteria. To make this maximality criteria more robust, we define what is known as the candidate edge graph on . Informally, two vertices form an edge of if the intersection of their neighborhoods has size approximately half their degrees (see Section 2.2). Note that the edges of and the edges of are qualitatively quite different (they can even be disjoint). However, a key property of is that for every vertex , the three vertices of corresponding to form a triangle in . We call this a core triangle.
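The informal rule above (an edge whenever the common neighborhood is about half the degree) can be sketched in Python as follows. The helper names `triangle_tensor` and `candidate_edges`, the single tolerance parameter `tol`, and the way it is applied to both the degree comparison and the intersection size are assumptions for illustration; the paper's precise thresholds are given in Section 2.2 and are not reproduced here.

```python
from itertools import combinations


def triangle_tensor(edges_g):
    """Adjacency sets of the tensor product of a triangle (labels 0, 1, 2) with G."""
    adj = {}
    for u, v in edges_g:
        for a in range(3):
            for b in range(3):
                if a != b:
                    adj.setdefault((a, u), set()).add((b, v))
                    adj.setdefault((b, v), set()).add((a, u))
    return adj


def candidate_edges(adj, tol=0.1):
    """Pairs with similar degree whose common neighborhood is about half the degree.

    `tol` is a hypothetical slack parameter standing in for the paper's thresholds.
    """
    cand = set()
    for x, y in combinations(adj, 2):
        dx, dy = len(adj[x]), len(adj[y])
        big = max(dx, dy)
        if min(dx, dy) < (1 - tol) * big:
            continue  # degrees are not similar
        common = len(adj[x] & adj[y])
        if abs(common - big / 2) <= tol * big:
            cand.add(frozenset((x, y)))
    return cand


# Example on the exact tensor of a triangle with a 5-cycle (no deletions).
C5 = [(i, (i + 1) % 5) for i in range(5)]
adj = triangle_tensor(C5)
cand = candidate_edges(adj)
```

On this example the pair `(0, 0), (1, 0)` from one core triple is a candidate edge, and so is the same-color pair `(0, 0), (0, 2)`, whose second coordinates share a single neighbor in the 5-cycle. Pairs of the second kind are the reason triangles with a single color class must be handled alongside core triangles.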

We show the triangles of , which we call , have a very particular form. Such a triple is one of two types: it is either (1) “close” to some core triangle (which we call quasi-core) or (2) contains vertices whose color classes are all the same and whose classes have very structured pairwise intersection (which we call monochrome, see Lemma 2). Because the color classes of are hidden information, we cannot directly determine if any triangle of is quasi-core or monochrome.

Triangle components.

Instead, we separate these two types of triangles topologically. We say that two triangles of are compatible if the subgraph of on the six vertices of these triangles is isomorphic to . In other words, it is consistent that these two triangles correspond to and for some . This compatible relation divides into connected components. Perhaps the most crucial (although easy to prove) technical lemma of this paper is that each component of consists only of quasi-core triangles (which we call a core component) or only of monochrome triangles (see Lemma 3).

Coloring algorithm.

Assume we pick an arbitrary component of . Let be the vertices of covered by . By guessing the colors of one of the triangles of and then performing a depth-first search, we can efficiently color all the vertices of (i.e., the subgraph of induced by ); or, if it fails, we can deduce that is not a core component (see Proposition 6). If is a sufficiently good expander, then we can show that has enough edges to force there to be only a single core component (although there can be a large number of monochrome components). Thus, by looping over all possible ’s (of which there are clearly at most ), our coloring algorithm will succeed on one of them, proving Theorem 2.

Matching algorithm to build triangles.

By finding the valid 3-coloring of , we can augment this by finding an approximate tensor factorization of . This is the heart of Lemma 6. The key idea is we take the three color classes of , which we call , and perform a tripartite matching algorithm on them. More formally, one can build a weighted tripartite graph on , where the bipartite subgraphs between and , as well as and , are complete, with each edge having weight corresponding to their pairwise intersection, normalized by the degrees of the vertices. By finding a max-min matching, that is, a matching which maximizes the weight of the minimum weight edge (which can be done in polynomial time), on this tripartite graph, we find a chosen set of triples on the vertices in . In particular, assuming is a core component, each of the triples found will be approximately quasi-core triangles. We can then build a tensor graph corresponding to by having each triple found by the matching correspond to a vertex of the reconstructed graph , and have each edge of correspond to any pair of triples sharing at least one edge in . Analyzing this factorization accurately requires showing that every error in the reconstruction can be “charged” to a mismatch in the neighborhoods of the matched triples. The max-min guarantee implies that the number of such mismatches is of edge density, allowing us to achieve the reconstruction goal for this subgraph.
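A max-min (bottleneck) matching of the kind mentioned above can be computed by searching over weight thresholds and testing each threshold for a perfect matching. The Python sketch below does this for a single complete bipartite graph with a square weight matrix; the function name `max_min_matching`, the matrix representation, and the use of simple augmenting paths are assumptions for illustration. The paper's construction is tripartite, and this sketch only covers one bipartite side of it.

```python
def max_min_matching(weights):
    """Perfect matching maximizing the minimum edge weight.

    weights[i][j] is the weight between left vertex i and right vertex j
    (square matrix, complete bipartite graph). Returns (bottleneck, match)
    with match[i] = j.
    """
    n = len(weights)

    def perfect_matching(threshold):
        # Augmenting-path matching using only edges of weight >= threshold.
        match_right = [-1] * n

        def augment(i, seen):
            for j in range(n):
                if weights[i][j] >= threshold and j not in seen:
                    seen.add(j)
                    if match_right[j] == -1 or augment(match_right[j], seen):
                        match_right[j] = i
                        return True
            return False

        for i in range(n):
            if not augment(i, set()):
                return None
        match = [0] * n
        for j, i in enumerate(match_right):
            match[i] = j
        return match

    # Feasibility is monotone in the threshold, so binary search applies.
    thresholds = sorted({w for row in weights for w in row})
    best_t = thresholds[0]
    best = perfect_matching(best_t)  # the smallest threshold keeps every edge
    lo, hi = 0, len(thresholds) - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        m = perfect_matching(thresholds[mid])
        if m is not None:
            best_t, best = thresholds[mid], m
            lo = mid + 1
        else:
            hi = mid - 1
    return best_t, best


# Hypothetical normalized intersection weights between two color classes.
W = [[0.9, 0.2, 0.1],
     [0.8, 0.7, 0.3],
     [0.1, 0.6, 0.5]]
bottleneck, match = max_min_matching(W)
```

For the matrix `W`, the identity matching has minimum weight 0.5, and no perfect matching avoids every edge of weight below 0.6, so the search settles on 0.5.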

Finishing the factorization.

To factor the whole graph, we recursively apply Lemma 6. In particular, we loop through the triangle components of . If the lemma successfully factors the subgraph of induced by that component and the cut between that subgraph and the rest of has sufficiently few edges, we recurse on the remainder of . Our tensor graph is then the disjoint union of the tensor products found for each subgraph. By combining the reconstruction guarantees for each connected component, we see that the factorization achieves the reconstruction goal, proving Theorem 1.

Note that we cannot easily achieve a 3-coloring through such a recursive algorithm, as even though each component is correctly 3-colored, the sparse edges between the components make finding a globally consistent coloring intractable. We formalize this hardness in Theorem 3.

1.3.3 Hardness

The proof of Theorem 3 is ultimately a reduction from 3-coloring, albeit in a roundabout manner. Given an instance of the 3-coloring problem, we replace each vertex of with a “base” gadget that is isomorphic to , which is tensored with itself 3 times. By a result of Greenwell and Lovász [GL74], there are three types of 3-colorings of , arising from the three different triangles in the tensor product. For each edge , we add edges between their respective base gadgets such that any valid -colorings of and must have different types. One can show that is 3-colorable if and only if the graph produced by the gadget reduction is. Many similar gadget reductions have been performed in the hardness of approximation literature, such as in [DMR09].

One can show that if is 3-colorable, then the graph resulting from this reduction is a subset of from some graph . However, the reduction may not be -close to this tensor. To circumvent this, we use a replication trick–for each base gadget , we produce a “cloud” of ’s we label , where is the degree of divided by . For each in the cloud, we add a copy of between it and . This forces each cloud copy of to have the same coloring type.

One can show that if is 3-colorable, then this graph is -near a tensor. However, if is not 3-colorable, then the resulting graph is not even 3-colorable. This is enough to establish Theorem 3, for if we had a polynomial-time algorithm for 3-coloring these graphs -near a tensor, then we could solve the general 3-coloring problem.

1.4 Paper outline

In Section 2, we define the necessary terms and concepts needed to prove our algorithmic results. Then, in Section 3 we prove the main algorithmic results: Theorem 1 and Theorem 2. Section 4 contains the proof of our main hardness theorem: Theorem 3. We conclude in Section 5 with directions for future work. Finally, in Appendix A, we include some proofs omitted in Section 3.

2 Algorithm Preliminaries

In this section, we give the key definitions and other concepts needed to prove the algorithmic results.

2.1 Important definitions and propositions

We let denote the tensor product. As in the problem statement, the graph is -near a triangle tensor, as the following definition details.

Definition 1.

Graph is -near a triangle tensor if there exists some product with , for a set of edges where no vertex in has more than an fraction of its incident edges in .

Let be an arbitrary graph. We denote the neighborhood of a vertex of by , where we specify the graph in which we take the neighborhood in the subscript. Additionally, for every subset , define

Recall that Theorem 2 considers graphs whose tensor component is an -edge-expander (see [Chu97] for the spectral implications of this definition of an expander). That is, for every , the following holds

Additionally, in our arguments, we need to argue about vertices of and with similar degrees. We choose the following definition

Definition 2.

For any graph , vertices have -similar degree when , for .

Frequently, we refer to the intersection of two vertices. Let the set of vertices in the intersection of and in graph be denoted by , and similarly let the intersection of in graph be . Sometimes we are interested in intersections or neighborhoods for nodes in and , and we will indicate the graph of interest in the subscript.

We denote the vertices of as and often refer to these as colors. To refer to an arbitrary color in , we use the variable . We refer to the triple with the shorthand . Similarly, we use the shorthand to refer to the set of vertices .

For every vertex in , there is a hidden pair of labels associated with from . Formally, corresponds to a node from the tensor , whose vertices are tuples . We call the unknown color class of and the unknown -class of . We use the same notation as in our informal problem statement, and say that the function extracts the unknown color class of , and the function extracts the unknown class of . Then, we can define the complete hidden label function for as .

Definition 3.

For , the functions and map to its unknown, underlying color class and -class of , respectively. Together, they give the hidden label .

We refer to the hidden label function’s inverse, , when discussing the unknown location of in the graph . Instead of writing out for a set , we use the shorthand , with similar notation for . Note that is a bijection.

Next, we define a set of triples that will be important for our reconstruction algorithm.

Definition 4.

In the graph , the core triangles are the triples of the form

One might wish to find the core triangles of all vertices in in order to prove our reconstruction goal, but such a task is impossible. In particular, nodes in may have such similar neighborhoods that it is impossible to distinguish vertices in with their -classes. This motivates our next definition.

Definition 5.

Vertices are -confusable when

Since the -confusable vertices of make finding the core triangles impossible, we are content to reconstruct quasi-core triangles.

Definition 6.

A triple of vertices is a quasi-core triangle if the color classes of all three vertices are different and the classes of all three vertices are all -confusable with each other, i.e. for , for , are all -confusable with each other.

Note that core triangles are quasi-core. Finding a set of disjoint quasi-core triangles that cover all nodes in is challenging because of triangles that are not quasi-core, but may look similar to quasi-core triangles (see Proposition 4). We call such triangles monochrome–as their nodes have the same color class–in order to distinguish them from quasi-core triangles.

Definition 7.

A triple of vertices is a monochrome triangle if

for all .

In fact, the only triples of vertices from that will be relevant to us are the quasi-core (which includes core) and monochrome triangles (see Lemma 2).

The following two propositions will be used very frequently in proving Theorem 1. The first, Proposition 1, implies that is very close to , and it follows from the definition of the tensor structure, plus the constraint that no node has more than an fraction of its edges removed to form from . Similarly, the second, Proposition 2, says that for with , is very close to .

Proposition 1.

For with , let and . Then for ,

Proof.

From the definition of the tensor structure, we have that

Further, deleting edges can never increase the size of an intersection in the graph, so . The upper bound follows by combining the two equations. Since no node has more than an fraction of the edges are deleted from to form , we see that

The lower bound follows since , and for . ∎

Proposition 2.

For vertex in with ,

(1)
Proof.

The lower bound on follows from the observation that On the other hand, since the number of deletions from each vertex is bounded,

2.2 Important graphs

We define several graphs here that will be used throughout the proof section.

The candidate edge graph, .

First, we construct a graph on , whose edges contain those from the core triangles (see Proposition 3). Since our eventual goal is to find quasi-core triangles and the edges in our graph are candidates for edges in the core triangles, we will call this graph the candidate edge graph, . We define the graph on such that distinct form an edge if and only if

  1. and have -similar degree in , and

  2. (2)

The components of will be denoted by . For a single component, we denote it for .
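The construction above can be sketched in code as follows. Because the thresholds in conditions 1 and 2 are not reproduced here, both tests are passed in as caller-supplied predicates; `similar_degree` and `in_range` are hypothetical placeholders, and the sketch assumes that condition 2 is a test on the size of the common neighborhood of the two vertices. The names `candidate_edge_graph` and `components` are likewise illustrative.

```python
from itertools import combinations

def candidate_edge_graph(adj, similar_degree, in_range):
    """Graph on the vertices of adj; u and v are adjacent iff both tests pass.

    similar_degree(du, dv) stands in for condition 1 and
    in_range(common, du, dv) for condition 2. Their thresholds are
    placeholders supplied by the caller, NOT values from the paper.
    """
    gc = {v: set() for v in adj}
    for u, v in combinations(adj, 2):
        du, dv = len(adj[u]), len(adj[v])
        if similar_degree(du, dv) and in_range(len(adj[u] & adj[v]), du, dv):
            gc[u].add(v)
            gc[v].add(u)
    return gc

def components(graph):
    """Connected components of an adjacency-set graph, via depth-first search."""
    seen, comps = set(), []
    for start in graph:
        if start in seen:
            continue
        seen.add(start)
        comp, stack = set(), [start]
        while stack:
            x = stack.pop()
            comp.add(x)
            for y in graph[x]:
                if y not in seen:
                    seen.add(y)
                    stack.append(y)
        comps.append(comp)
    return comps
```

Passing the tests as functions keeps the sketch independent of the exact constants, so any concrete choice of thresholds can be plugged in without changing the construction.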

Figure 3: An illustration of how the neighborhoods of three vertices in must look in order to form a triangle in . The dashed lines are edges deleted in building from .

The triangle graph, .

Next, we construct a graph , whose vertices are all sets of triples from such that for , for all with . See Figure 3 for an illustration. Note that by definition, since there are no self-loops in , are all distinct. We will refer to the vertices of as triangles. An important property of is that it includes all the core triangles (see Proposition 3). We say the triangles are adjacent if and only if they are compatible in the following sense.

Definition 8.

Triangles are compatible exactly when

  1. and are on disjoint sets of vertices of , and

  2. there is an indexing of and of such that if and only if .

The compatibility condition is defined so that triangles and corresponding to core triangles of and in , i.e. and , are adjacent in if and none of the edges between and are deleted in building from .
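A minimal sketch of the compatibility test follows. The exact form of condition 2 is not reproduced above, so the code assumes it reads "the i-th vertex of one triangle is adjacent to the j-th vertex of the other if and only if i ≠ j", which matches the six edges between compatible triangles used later in the proof of Lemma 1. The function name `compatible` and the adjacency-set representation are illustrative.

```python
from itertools import permutations

def compatible(adj, t1, t2):
    """Check Definition 8 under the ASSUMED rule: a_i ~ b_j iff i != j.

    adj maps each vertex to its neighbor set; t1 and t2 are triples of vertices.
    """
    # Condition 1: the two triangles use disjoint vertex sets.
    if set(t1) & set(t2):
        return False
    a = tuple(t1)
    # Condition 2: fixing the order of t1 loses no generality, since
    # re-indexing both triangles together does not change the condition.
    for b in permutations(t2):
        if all((b[j] in adj[a[i]]) == (i != j)
               for i in range(3) for j in range(3)):
            return True
    return False
```

For example, in the undeleted product of K_3 with a single edge uv, the core triangles of u and v pass this test, and removing any one of the six edges between them makes it fail.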

We say that two triangles of are connected if there is a path of compatible triangles from one to the other. The connected components of will be denoted by . We denote a single component as , for . Note that a component of might intersect many components of , because the triangles of are connected via edges of , which may cross different components of . We will show in Lemma 5 that for any and any , either contains every vertex in or none of them.

3 Algorithmic Results

In this section, we provide formal algorithm statements and proofs for our algorithmic results. Throughout, we take .

3.1 Properties of , the Graph of Candidate Edges

We use the graph to identify the core triangles, or triangles that are close to the core triangles. The following proposition shows that the edges within core triangles are kept in .

Proposition 3.

Fix For all , .

Proof.

Fix distinct with and . By rewriting the equation in Proposition 2, both and have neighborhood sizes satisfying

Therefore, and have -similar degree in .

Now, we show falls in the specified range for to be an edge in . Using Proposition 1, set and to see that

We can further lower bound by using the fact that from Proposition 2, so

(3)

On the other hand, from the lower bound in Proposition 2. Using this as an upper bound on , we see that . Comparing with Equation 2, it follows that is an edge in . ∎

Next, Proposition 4 proves that edges in either come from (1) vertices whose unknown color classes are the same and whose classes have intersection size roughly or (2) vertices whose unknown color classes are different and whose classes have almost identical intersection. The edges from core triangles are of type (1).

Proposition 4.

Fix For , let and , where . Either (1) and

or (2) and

Proof.

If , then by Proposition 1

and recall from Equation 2 that

Crossing the upper and lower bounds from the inequalities, we see that

The same argument holds when , but in the first inequality the factor from is now a 1 instead of a 2, so we obtain

Then we see that the statement holds, since for

Next, we consider certain triples in with the graph . Recall that the triples in have an identifiable structure that is found in the core triangles.

3.2 Properties of , the Graph of Triangles of

Recall the components of are . We will show that one can use the partitioning induced by the components of to partition the core triangles. Further, if and are in the same component of , then they remain in the same component of , i.e., and are in the same component in , as shown in the following lemma.

Lemma 1.

Fix If and are in the same component of , then and are in the same component in . Moreover, for each component of , there exists a component of such that each core triangle of is contained in .

Proof.

Suppose for now that and are adjacent in component of . We want to show that there is some where is compatible with both and . Such a guarantees that triangle is adjacent to both triangles and in , proving that they are in the same component of . We have a lower bound on from Proposition 4, so it remains to argue that not too many compatibility relations are destroyed when deleting edges from to form .

Recall that for any node in , . Using this and the fact that at most an ϵ fraction of edges is deleted from any vertex, there are at most edges deleted from vertices that would connect to core triangles. This implies that out of the possible core triangles for , at least of them still maintain all 6 edges between and that are necessary for the triangles to be compatible. The same holds for : at least of them still maintain all 6 edges between and .

Then from Proposition 4, we have that

Since for