 # Breaking the 2^n barrier for 5-coloring and 6-coloring

The coloring problem (i.e., computing the chromatic number of a graph) can be solved in O^*(2^n) time, as shown by Björklund, Husfeldt and Koivisto in 2009. For k=3,4, better algorithms are known for the k-coloring problem. 3-coloring can be solved in O(1.33^n) time (Beigel and Eppstein, 2005) and 4-coloring can be solved in O(1.73^n) time (Fomin, Gaspers and Saurabh, 2007). Surprisingly, for k>4 no improvements over the general O^*(2^n) are known. We show that both 5-coloring and 6-coloring can also be solved in O((2-ε)^n) time for some ε>0. Moreover, we obtain an exponential improvement for k-coloring for any constant k for a very large family of graphs. In particular, for any constants k,Δ,α>0, k-coloring for graphs with at least α·n vertices of degree at most Δ can be solved in O((2-ε)^n) time, for some ε = ε_{k,Δ,α} > 0. As a consequence, for any constant k we can solve k-coloring exponentially faster than O^*(2^n) for sparse graphs.


## 1 Introduction

The problem of $k$-coloring a graph, or determining the chromatic number of a graph (i.e., finding the smallest $k$ for which the graph is $k$-colorable), is one of the most classic and well-studied NP-complete problems. Computing the chromatic number is listed as one of the first NP-complete problems in Karp's paper from 1972. In a similar fashion to $k$-SAT, the problem of $2$-coloring is polynomial, yet $k$-coloring is NP-complete for every $k \ge 3$ (proven independently by Lovász and Stockmeyer). An algorithm solving $3$-coloring in sub-exponential time would imply, via the mentioned reductions, that $3$-SAT can also be solved in sub-exponential time. It is strongly believed that this is not possible (as stated in a widely believed conjecture called the Exponential Time Hypothesis), and thus it is believed that exact algorithms solving $k$-coloring must be exponential.

There is a substantial and ever-growing body of work exploring exponential-time worst-case algorithms for NP-complete problems. A 2003 survey of Woeginger covers and refers to dozens of papers exploring such algorithms for many problems including satisfiability, graph coloring, knapsack, TSP, maximum independent sets and more. A subsequent book of Fomin and Kaski further covers the topic of exact exponential-time algorithms.

For satisfiability (i.e., SAT), the running time of the trivial algorithm enumerating over all possible assignments is $O^*(2^n)$. No algorithms solving SAT in $O((2-\varepsilon)^n)$ time for any $\varepsilon > 0$ are known, and a popular conjecture called the Strong Exponential Time Hypothesis states that no such algorithm exists. On the other hand, it is known that for every fixed $k$ there exists a constant $\varepsilon_k > 0$ such that $k$-SAT can be solved in $O((2-\varepsilon_k)^n)$ time. A result of this type was first published by Monien and Speckenmeyer in 1985. A long list of improvements for the values of $\varepsilon_k$ has been published since, including the celebrated 1998 PPSZ algorithm of Paturi, Pudlák, Saks and Zane and the recent improvement over it by Hansen, Kaplan, Zamir and Zwick.

For coloring, on the other hand, the situation is less understood. The trivial algorithm solving $k$-coloring by enumerating over all possible colorings takes $O^*(k^n)$ time. Thus, it is not even immediately clear that computing the chromatic number of a graph can be done in $O(c^n)$ time for a constant $c$ independent of $k$. In 1976, Lawler introduced the idea of using dynamic programming to find the minimal number of independent sets covering the graph. The trivial implementation of this idea results in an $O^*(3^n)$ algorithm. More sophisticated bounds on the number of maximal independent sets in a graph and fast algorithms to enumerate over them (Moon and Moser, Paull and Unger) resulted in an $O(2.4423^n)$ algorithm. This was improved several times (including Eppstein and Byskov), until finally an algorithm computing the chromatic number in $O^*(2^n)$ time was devised by Björklund, Husfeldt and Koivisto in 2009. This settled an open problem of Woeginger. A relatively recent survey of Husfeldt covers the progress on graph coloring algorithms.

For $k = 3, 4$, better algorithms are known for the $k$-coloring problem. Schiermeyer showed that $3$-coloring can be solved in $O(1.415^n)$ time. Beigel and Eppstein gave algorithms solving $3$-coloring in $O(1.3289^n)$ time and $4$-coloring in $O(1.8072^n)$ time in 2005. Fomin, Gaspers and Saurabh improved the running time of $4$-coloring to $O(1.7272^n)$ in 2007. Unlike the situation in $k$-SAT, for every $k > 4$ the best known running time for $k$-coloring is $O^*(2^n)$, the same as computing the chromatic number. Thus, a very fundamental question was left wide open.

###### Open Problem 1.

Can $5$-coloring be solved in $O((2-\varepsilon)^n)$ time, for some $\varepsilon > 0$?

and more generally,

###### Open Problem 2.

Can $k$-coloring be solved in $O((2-\varepsilon_k)^n)$ time, for some $\varepsilon_k > 0$, for every $k$?

In our work, we answer Problem 1 affirmatively, and the answer extends to $6$-coloring as well. We also make major steps towards settling Problem 2.

The main technical theorem of our paper follows.

###### Definition 1.1.

For $0 \le \alpha \le 1$ and $\Delta > 0$ we say that a graph $G$ is $(\alpha,\Delta)$-bounded if it contains at least $\alpha \cdot n$ vertices of degree at most $\Delta$.

###### Theorem 1.2.

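For every $k, \Delta, \alpha > 0$ there exists $\varepsilon_{k,\Delta,\alpha} > 0$ such that we can solve $k$-coloring for $(\alpha,\Delta)$-bounded graphs in $O((2-\varepsilon_{k,\Delta,\alpha})^n)$ time.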

In other words, we can answer Problem 2 affirmatively unless the graph has almost only vertices of super-constant degrees. In particular, we get faster algorithms for sparse graphs.

###### Corollary 1.3 (of Theorem 1.2).

For every $k, C > 0$ there exists $\varepsilon_{k,C} > 0$ such that we can solve $k$-coloring for graphs with $|E| \le C \cdot n$ in $O((2-\varepsilon_{k,C})^n)$ time.

Improvements for exponential algorithms solving partitioning problems in the case of bounded average degree have appeared before. Cygan and Pilipczuk obtained an improvement for the running time required for the Traveling Salesman Problem for graphs with bounded average degree. In their paper, they state the problem of doing the same for the chromatic number as an open problem, which we settle in this paper. A similar result for the more restricted case of bounded degree graphs was obtained by Björklund et al.

It is important to stress that Theorem 1.2 is much stronger than Corollary 1.3. In particular, we use it to construct the following reductions.

###### Theorem 1.4.

Given an algorithm solving $(k-1)$-list-coloring in $O((2-\varepsilon)^n)$ time for some constant $\varepsilon > 0$, we can construct an algorithm solving $k$-coloring in $O((2-\varepsilon')^n)$ time for some (other) constant $\varepsilon' > 0$. Furthermore, the reduction is deterministic.

###### Theorem 1.5.

Given an algorithm solving $(k-2)$-list-coloring in $O((2-\varepsilon)^n)$ time for some constant $\varepsilon > 0$, we can construct an algorithm solving $k$-coloring with high probability in $O((2-\varepsilon')^n)$ time for some (other) constant $\varepsilon' > 0$.

From which we finally conclude the following, answering Problem 1 affirmatively.

###### Theorem 1.6.

$5$-coloring can be solved in $O((2-\varepsilon)^n)$ time for some constant $\varepsilon > 0$.

###### Theorem 1.7.

$6$-coloring can be solved with high probability in $O((2-\varepsilon)^n)$ time for some constant $\varepsilon > 0$.

We note that our $5$-coloring algorithm is deterministic, while our $6$-coloring algorithm is randomized with an exponentially small one-sided error probability.

As part of our work, we develop a new removal lemma for small subsets. This could be of independent interest. Very roughly, it states that every collection of small sets must have a large sub-collection that can be made pairwise-disjoint by the removal of a small subset of the universe. The exact statement follows.

###### Theorem 1.8.

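Let $\mathcal{F}$ be a collection of subsets of a universe $U$ such that every set $F \in \mathcal{F}$ is of size $|F| \le \Delta$. Let $C > 0$ be any constant. Then, there exist subsets $\mathcal{F}' \subseteq \mathcal{F}$ and $U' \subseteq U$, such that

• $|\mathcal{F}'| > \rho(\Delta, C) \cdot |\mathcal{F}| + C \cdot |U'|$, where $\rho(\Delta, C) > 0$ depends only on $\Delta, C$.

• The sets in $\mathcal{F}'$ are disjoint when restricted to $U \setminus U'$, i.e., for every $F_1, F_2 \in \mathcal{F}'$ we have $F_1 \cap F_2 \subseteq U'$.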

In Appendix A.1 we present an upper bound for the function $\rho(\Delta, C)$ appearing in Theorem 1.8. This upper bound implies that the constant $\varepsilon$ we can obtain using our technique must be very small.

### 1.1 Organization of the Paper

The rest of the paper is organized as follows. In Section 2 we go over the preliminary tools that we use in the paper. In Section 3 we further elaborate on the algorithm of Björklund, Husfeldt and Koivisto for computing the chromatic number of a graph.

The main algorithmic contribution of the paper appears in Section 4, in which we prove Theorem 1.2. The section is partitioned into two main parts. In Section 4.1 we present our ideas in a simpler manner and get a result limited to bounded degree graphs. Then, in Section 4.2, which is more technically involved, we complete the proof of Theorem 1.2.

As part of Section 4.2, we use Theorem 1.8, a combinatorial result independent of the algorithmic tools of Section 4. The proof of Theorem 1.8 appears in Section 5.

In Section 6 we use Theorem 1.2 as a main ingredient in a reduction from $k$-coloring to $(k-1)$-list-coloring. In this section, we prove Theorems 1.4 and 1.6. In Section 7 we refine the ideas used in Section 6 and construct a reduction from $k$-coloring to $(k-2)$-list-coloring. In this section, we prove Theorems 1.5 and 1.7.

We finally conclude the paper and present a few open problems in Section 8.

## 2 Preliminaries

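The terminology used throughout the paper is standard. For a graph $G$ we denote by $V(G)$ and $E(G)$ its vertex-set and edge-set, respectively. For a subset $V' \subseteq V(G)$ we denote by $G[V']$ the sub-graph of $G$ induced by $V'$. For $v \in V(G)$ we denote by $\deg(v)$ the degree of $v$ in $G$, by $N(v)$ the set of neighbours of $v$, and by $N[v] := N(v) \cup \{v\}$ its closed neighbourhood.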

For $0 \le \alpha \le 1$ and $\Delta > 0$ we say that a graph $G$ is $(\alpha,\Delta)$-bounded if it contains at least $\alpha \cdot n$ vertices of degree at most $\Delta$. Note that if $\alpha = 1$ this definition coincides with the standard definition of a bounded degree graph.

In the $k$-coloring problem, we are given a graph $G$ and need to decide whether there exists a coloring $c : V(G) \to \{0, \ldots, k-1\}$ of $G$, such that for every $(u,v) \in E(G)$ we have $c(u) \neq c(v)$. If a graph has a $k$-coloring, we say that it is $k$-colorable. In the chromatic number problem, we are given a graph $G$ and need to compute $\chi(G)$, the minimal integer $k$ for which $G$ is $k$-colorable.

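In the $k$-list-coloring problem, we are given a graph $G$ and a set $C_v \subseteq U$ of size $|C_v| \le k$ for every $v \in V(G)$, where $U$ is some arbitrary universe. We need to decide whether there exists a coloring $c : V(G) \to U$ such that for every $v \in V(G)$ we have $c(v) \in C_v$ and for every $(u,v) \in E(G)$ we have $c(u) \neq c(v)$.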

In a general $(a,b)$-CSP (Constraint Satisfaction Problem) we are given a list of constraints on the values of subsets of size $b$ of $a$-ary variables, and need to decide whether there exists an assignment of values to the variables for which all constraints are satisfied. Here, a general constraint on a set of $b$ $a$-ary variables is a subset of the possible assignments in $[a]^b$; the constraint is satisfied by an assignment, possibly on more variables, if the restriction of the assignment to these $b$ variables belongs to that subset. $k$-coloring and $k$-list-coloring are examples of $(k,2)$-CSP problems. $k$-SAT is an example of a $(2,k)$-CSP problem.

### 2.1 Inverse Möbius Transform

Let $U$ be an $n$-element set. The Inverse Möbius Transform (sometimes also called the Zeta transform) maps a function $f$ defined on the power-set $\mathcal{P}(U)$ of $U$ into another function $\hat{f}$ on $\mathcal{P}(U)$ defined as

$$\hat{f}(X) = \sum_{Y \subseteq X} f(Y).$$

Naively, $\hat{f}(X)$ is computed using $2^{|X|}$ additions. Thus, we can compute all values of $\hat{f}$ in a straightforward manner with $O(3^n)$ operations. Yates' method from 1937 improves on the above and computes all values of $\hat{f}$ using just $O(n \cdot 2^n)$ operations. The resulting algorithm is usually called the fast Möbius transform or the fast zeta transform. The fast Inverse Möbius Transform has been used to devise algorithms for combinatorial optimization problems such as computing the chromatic and the domatic numbers of a graph. The algorithm of Björklund, Husfeldt and Koivisto for the chromatic number is summarized in Section 3.

A description of Yates’ method follows.

###### Lemma 2.1.

The Inverse Möbius Transform of a function $f$ on $\mathcal{P}(U)$ can be computed in $O(n \cdot 2^n)$ time, where $n = |U|$.

###### Proof.

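Denote by $u_1, \ldots, u_n$ some enumeration of $U$'s elements. Denote $f_0 := f$. We perform $n$ iterations for $i = 1, \ldots, n$, in which we compute all values of the function $f_i$ defined using $f_{i-1}$ as follows.

$$f_i(X) = \begin{cases} f_{i-1}(X) + f_{i-1}(X \setminus \{u_i\}) & \text{if } u_i \in X, \\ f_{i-1}(X) & \text{otherwise.} \end{cases}$$

Namely, in the $i$-th iteration we add the values the function gets in the sub-cube defined by $u_i \notin X$ to the corresponding values in the sub-cube defined by $u_i \in X$.

A simple induction on $i$ shows that $f_i(X) = \sum_{Y \in S_i(X)} f(Y)$, where $S_i(X)$ is the set of all subsets $Y \subseteq X$ such that

$$\{u_j \in Y \mid j > i\} = \{u_j \in X \mid j > i\}.$$

In particular, by the end of the algorithm $f_n = \hat{f}$. ∎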
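The iteration in the proof can be sketched in Python as follows. This is a minimal illustration, assuming subsets of $U$ are encoded as $n$-bit masks and $f$ is given as a list of length $2^n$; the function names are ours and do not appear in the paper.

```python
def zeta_transform(f, n):
    """Return g with g[X] = sum of f[Y] over all Y subset of X (X, Y are n-bit masks)."""
    g = list(f)                       # f_0 := f
    for i in range(n):                # iteration i handles element u_i
        bit = 1 << i
        for X in range(1 << n):
            if X & bit:               # u_i in X: add the value at X \ {u_i}
                g[X] += g[X ^ bit]    # g[X ^ bit] is not modified in this iteration
    return g


def zeta_naive(f, n):
    """Direct evaluation of the definition, for comparison only."""
    return [sum(f[Y] for Y in range(1 << n) if Y & X == Y)
            for X in range(1 << n)]
```

The in-place update is safe because, within iteration $i$, only entries containing $u_i$ are written and only entries not containing $u_i$ are read.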

### 2.2 Decision versus Search

The $k$-coloring problem can be stated in two natural ways. In the first, given a graph, decide whether it can be colored using $k$ colors. In the second, given a graph, return a $k$-coloring for it if one exists, or say that no such coloring exists. A few folklore reductions show that the two problems have the same running time up to polynomial factors. We state one for completeness.

###### Lemma 2.2.

Let $\mathcal{A}$ be an algorithm deciding whether a graph $G$ is $k$-colorable in $T(n)$ time. Then, there exists an algorithm $\mathcal{A}'$ that finds a $k$-coloring for $G$, if one exists, in $\mathrm{poly}(n) \cdot T(n)$ time.

###### Proof.

We describe $\mathcal{A}'$. First, use $\mathcal{A}$ to decide whether $G$ is $k$-colorable; if it returns False, we return that no $k$-coloring exists. Otherwise, repeat the following iterative process. For every pair of distinct vertices $u, v \in V(G)$ that is not an edge of $G$, use $\mathcal{A}$ to check whether $G$ stays $k$-colorable after adding $(u,v)$ as an edge. If it does, add $(u,v)$ to $G$. We stop when no such pair exists.

The reader can verify that the resulting graph must be a complement of disjoint cliques, and thus we can easily construct a $k$-coloring. ∎
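The reduction can be sketched as follows. The decision oracle here is a brute-force stand-in (an assumption made only so the example runs), and the names `is_k_colorable` and `find_k_coloring` are hypothetical.

```python
from itertools import combinations, product


def is_k_colorable(n, edges, k):
    """Brute-force stand-in for the decision algorithm (illustration only)."""
    return any(all(c[u] != c[v] for u, v in edges)
               for c in product(range(k), repeat=n))


def find_k_coloring(n, edges, k, decide=is_k_colorable):
    """Return a list of colors for vertices 0..n-1, or None if no k-coloring exists."""
    edges = {tuple(sorted(e)) for e in edges}
    if not decide(n, edges, k):
        return None
    # Add every non-edge whose addition keeps the graph k-colorable.
    for u, v in combinations(range(n), 2):
        if (u, v) not in edges and decide(n, edges | {(u, v)}, k):
            edges.add((u, v))
    # Non-adjacency is now an equivalence relation; its classes are the color classes.
    color = [None] * n
    next_color = 0
    for v in range(n):
        if color[v] is None:
            for w in range(v, n):
                if w == v or (v, w) not in edges:
                    color[w] = next_color
            next_color += 1
    return color
```

A single pass over the pairs suffices here: a pair rejected once stays rejected, since later additions only make the graph harder to color.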

A problem comes up while trying to use this type of reduction in the setting of this paper. The aforementioned reduction adds edges to the graph, and therefore increases the degrees of vertices. In particular, we cannot use it (or other similar reductions) in a black-box manner for statements like Theorem 1.2. The algorithm of Björklund et al. solves the decision version of $k$-coloring for bounded degree graphs, and cannot be trivially converted into an algorithm that finds a coloring. The algorithms presented in this paper, on the other hand, can be easily converted into algorithms that find a $k$-coloring. This is briefly discussed later in Section 4.3.

## 3 Overview of the $O^*(2^n)$ algorithm

In this section we present a summary of Björklund, Husfeldt and Koivisto's algorithm. We present a concise variant of their work that applies specifically to the coloring problem. The original paper covers a larger variety of set partitioning problems and thus the description in this section is simpler.

We begin by making the following very simple observation, yielding an equivalent phrasing of the coloring problem.

###### Observation 1.

A graph $G$ is $k$-colorable if and only if its vertex set $V(G)$ can be covered by $k$ independent sets.

A short outline of the algorithm follows; complete details appear below. We need to decide whether $V(G)$ can be covered by $k$ independent sets. In order to do so, we compute the number of independent sets in every induced sub-graph and then use a simple inclusion-exclusion argument in order to compute the number of (ordered) covers of $V(G)$ by $k$ independent sets. We are interested in whether this number is positive.

###### Definition 3.1.

For a subset $V' \subseteq V(G)$ of vertices, let $i(G[V'])$ denote the number of independent sets in the induced sub-graph $G[V']$.

We next show that using dynamic programming, we can quickly compute these values.

###### Lemma 3.2.

We can compute the values of $i(G[V'])$ for all $V' \subseteq V(G)$ in $O^*(2^n)$ time.

###### Proof.

Let $v$ be an arbitrary vertex contained in $V'$. The number of independent sets in $G[V']$ that do not contain $v$ is exactly $i(G[V' \setminus \{v\}])$. On the other hand, the number of independent sets in $G[V']$ that do contain $v$ is exactly $i(G[V' \setminus N[v]])$. Thus, we have

$$i(G[V']) = i(G[V' \setminus \{v\}]) + i(G[V' \setminus N[v]]).$$

We note that both $V' \setminus \{v\}$ and $V' \setminus N[v]$ are of size strictly less than $|V'|$. Thus, we can compute all values of $i(G[V'])$ using dynamic programming, processing the sets in non-decreasing order of size. ∎
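A minimal sketch of this dynamic program, assuming vertex sets are encoded as bitmasks and `adj[v]` is the bitmask of neighbours of `v` (both encoding choices are ours). Processing masks in increasing numeric order is enough, since every proper subset of a mask has a smaller numeric value.

```python
def count_independent_sets(n, adj):
    """Return a list i with i[mask] = number of independent sets
    (including the empty set) in the sub-graph induced by mask."""
    i = [0] * (1 << n)
    i[0] = 1                                       # only the empty set
    for mask in range(1, 1 << n):
        v = mask.bit_length() - 1                  # an arbitrary vertex of mask (the highest)
        without_v = mask & ~(1 << v)               # V' \ {v}
        without_closed_nbhd = without_v & ~adj[v]  # V' \ N[v]
        i[mask] = i[without_v] + i[without_closed_nbhd]
    return i
```

For the path on vertices 0, 1, 2 the independent sets are the empty set, the three singletons and $\{0, 2\}$, so the full mask gets the value 5.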

Consider the expression

$$F(G) = \sum_{V' \subseteq V(G)} (-1)^{|V(G)| - |V'|} \cdot i(G[V'])^k.$$

Using the values of $i(G[V'])$ computed in Lemma 3.2, we can easily compute the value of $F(G)$ by directly evaluating the above expression in $O^*(2^n)$ time.

###### Lemma 3.3.

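Let $S_1 \subseteq S_2$ be sets. It holds that

$$\sum_{S_1 \subseteq S \subseteq S_2} (-1)^{|S|} = \begin{cases} 0 & \text{if } S_1 \neq S_2, \\ (-1)^{|S_2|} & \text{if } S_1 = S_2. \end{cases}$$

###### Proof.

If $S_1 \neq S_2$ then there exists an element $v \in S_2 \setminus S_1$. We can pair each set $S$ with $S \triangle \{v\}$, its symmetric difference with $\{v\}$. Clearly, in each pair of sets one is of odd size and one is of even size, and thus their signs cancel each other. Therefore, the sum is zero. In the second case, the claim is straightforward. ∎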

###### Lemma 3.4.

$F(G)$ equals the number of $k$-tuples $(I_0, \ldots, I_{k-1})$ of independent sets in $G$ such that $I_0 \cup \ldots \cup I_{k-1} = V(G)$.

###### Proof.

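As $i(G[V'])$ counts the number of independent sets in $G[V']$, raising it to the $k$-th power (namely, $i(G[V'])^k$) counts the number of $k$-tuples of independent sets in $G[V']$.

Let $(I_0, \ldots, I_{k-1})$ be a $k$-tuple of independent sets in $G$. It appears exactly in the terms of the sum corresponding to sets $V'$ such that $I_0 \cup \ldots \cup I_{k-1} \subseteq V' \subseteq V(G)$. Each time this $k$-tuple is counted, it is counted with a sign determined by the parity of $|V(G)| - |V'|$. By Lemma 3.3, the sum of the signs corresponding to these sets $V'$ is zero if $I_0 \cup \ldots \cup I_{k-1} \neq V(G)$ and one if $I_0 \cup \ldots \cup I_{k-1} = V(G)$. ∎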

We conclude with

###### Corollary 3.5.

$F(G)$ can be computed in time $O^*(2^n)$, and $G$ is $k$-colorable if and only if $F(G) > 0$.
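Putting Lemma 3.2 and the expression for $F(G)$ together gives the following sketch. It uses the same bitmask encoding as an illustrative assumption, and the function name is hypothetical.

```python
def coloring_cover_count(n, edges, k):
    """Return F(G): the number of ordered covers of V(G) by k independent sets."""
    adj = [0] * n
    for u, v in edges:
        adj[u] |= 1 << v
        adj[v] |= 1 << u
    # ind[mask] = number of independent sets in G[mask] (Lemma 3.2).
    ind = [1] + [0] * ((1 << n) - 1)
    for mask in range(1, 1 << n):
        v = mask.bit_length() - 1
        rest = mask & ~(1 << v)
        ind[mask] = ind[rest] + ind[rest & ~adj[v]]
    # Inclusion-exclusion over all vertex subsets.
    return sum((-1) ** (n - bin(mask).count("1")) * ind[mask] ** k
               for mask in range(1 << n))
```

On a triangle, the only covers by three independent sets are the six orderings of the three singletons, while no cover by two independent sets exists; the graph is $k$-colorable exactly when the returned value is positive.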

## 4 Faster k-Coloring Algorithms for (α,Δ)-bounded Graphs

The main purpose of this section is proving Theorem 1.2.

We first outline our approach. Let $G$ be a graph with a constant chromatic number $k$. It is well known that $G$ must contain a large independent set. Let $S$ be an independent set in $G$. We think of $|S|$ as a constant fraction of $|V(G)|$, when we consider $k$ as a constant. Let $c$ be a $k$-coloring of the induced sub-graph $G[V(G) \setminus S]$. We say that $c$ can be extended to a $k$-coloring of $G$ if there exists a proper $k$-coloring $c'$ of $G$ such that $c'|_{V(G) \setminus S} = c$. For a subset $V' \subseteq V(G) \setminus S$ of vertices, we say that $c$ does not use the full palette on $V'$ if $|c(V')| < k$, namely, if $c$ does not use all $k$ colors on the vertices of $V'$. Clearly, a proper $k$-coloring $c$ of $G[V(G) \setminus S]$ can be extended to a proper $k$-coloring of $G$ if and only if $c$ does not use the full palette on $N(s)$ for every $s \in S$. Our approach, on a high level, is to construct an algorithm that finds an extendable $k$-coloring of $G[V(G) \setminus S]$. We aim to do so in $O((2-\varepsilon)^n)$ time.

In Section 4.1 we consider a restricted version of the problem in which the independent set $S$ has the following two additional properties. First, we assume that every vertex $s \in S$ is of degree $\deg(s) \le \Delta$, where $\Delta$ is some constant. Second, we assume that no pair of vertices $s_1, s_2 \in S$ share a neighbor in $G$. Equivalently, the neighborhoods $N(s)$ for every $s \in S$ are all disjoint. Under these conditions, we present an algorithm that runs in $O((2-\varepsilon)^n)$ time, where $\varepsilon$ depends only on $\Delta$. As $\varepsilon$ does not depend on $k$, we can in fact compute the chromatic number of $G$ exponentially faster than $O^*(2^n)$ if $G$ contains an independent set with these properties. We also observe that if $G$ is of maximum degree $\Delta$ then it contains a large such independent set $S$. Our algorithm is based on methods that generalize Section 3, and on a simple approach to implicitly compute values of the Inverse Möbius Transform.

In Section 4.2 we modify the algorithm of Section 4.1 and remove the second assumption on $S$. Namely, we now only assume that $S$ is an independent set and that for every $s \in S$ we have $\deg(s) \le \Delta$. Our algorithm still runs in $O((2-\varepsilon)^n)$ time, yet now $\varepsilon$ depends on both $\Delta$ and $k$. A main ingredient in the modification is a strong new removal lemma for small subsets. The proof of this combinatorial lemma is given in Section 5 and its statement is used in a black-box manner in this section.

### 4.1 k-coloring bounded-degree graphs

In this subsection we begin illustrating the ideas leading towards proving Theorem 1.2. We also prove the following (much) weaker statement.

###### Theorem 4.1.

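For every $\Delta$ there exists $\varepsilon_\Delta > 0$ such that we can solve $k$-coloring for graphs with maximum degree $\Delta$ in $O((2-\varepsilon_\Delta)^n)$ time.

In fact, as a graph with maximum degree $\Delta$ has chromatic number at most $\Delta + 1$, we can compute the chromatic number of a graph with degrees bounded by $\Delta$ in time $O((2-\varepsilon_\Delta)^n)$.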

As outlined in the beginning of this section, our approach begins by finding a large independent set $S$ with some additional properties. We show that a graph with bounded degrees must contain a very large independent set $S$ such that the distance between each pair of vertices in $S$ is at least three. In other words, $S$ is an independent set, and no pair of vertices in $S$ share a neighbor. In particular, the neighborhoods $N(s)$ for $s \in S$ are all disjoint. The core theorem of this subsection is

###### Theorem 4.2.

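Let $G$ be a graph and $S \subseteq V(G)$ a set of vertices such that the distance between each two vertices in $S$ is at least three and the degree of each vertex in $S$ is at most $\Delta$. For any $k$, we can solve $k$-coloring for $G$ in $O^*\left(2^{|V(G) \setminus S|} \cdot (2 - 2^{-\Delta})^{|S|}\right)$ time.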

It is important to note that the existence of such a set $S$ is our sole use of the bound on the graph degrees. Note that the bound of Theorem 4.2 does not depend on $k$. Thus, we get an exponential improvement for computing the chromatic number of a graph that contains a large enough set with the stated properties.

Before proving Theorem 4.2, we describe a simple algorithm for finding a set with the required properties in bounded-degree graphs.

###### Lemma 4.3.

Let $G$ be a graph with maximum degree at most $\Delta$. There exists a set $S$ of at least $\frac{n}{1+\Delta^2}$ vertices such that the distance between every distinct pair $s_1, s_2 \in S$ is at least three. Furthermore, we can find such $S$ efficiently.

###### Proof.

We construct $S$ in a greedy manner. We begin with $S = \emptyset$ and $V' = V(G)$. As long as $V'$ is not empty we pick an arbitrary vertex $v \in V'$ and add it to $S$. We then remove from $V'$ the vertex $v$ and every vertex of distance at most two from it.

By construction, the minimum distance between a pair of vertices in $S$ is at least three. The size of the $2$-neighborhood of a vertex is bounded by $1 + \Delta + \Delta(\Delta - 1) = 1 + \Delta^2$ and thus we get the desired lower bound on the size of $S$. ∎
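The greedy construction can be sketched as follows, assuming the graph is given as a dictionary from each vertex to its set of neighbours. The representation and the function name are ours; taking the minimum remaining vertex is just one deterministic way to make the "arbitrary" choice.

```python
def greedy_distance_three_set(adj):
    """adj: dict vertex -> set of neighbours.
    Returns a list S whose vertices are pairwise at distance >= 3."""
    remaining = set(adj)
    S = []
    while remaining:
        v = min(remaining)              # arbitrary choice; min keeps the run deterministic
        S.append(v)
        ball = {v} | adj[v]             # vertices at distance <= 1 from v
        for u in adj[v]:
            ball |= adj[u]              # add vertices at distance 2
        remaining -= ball
    return S
```

On a path with seven vertices ($\Delta = 2$) this picks the vertices 0, 3 and 6, which is above the guaranteed $7/5$.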

Theorem 4.1 now follows from Lemma 4.3 and Theorem 4.2.

###### Proof of Theorem 4.1.

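Let $G$ be a graph of maximum degree at most $\Delta$ and let $k$ be an integer. By Lemma 4.3, we can construct a set $S$ of size $|S| \ge \frac{n}{1+\Delta^2}$ satisfying the conditions of Theorem 4.2. Thus, by Theorem 4.2, we can solve $k$-coloring for $G$ in time

$$O^*\left(2^{n-|S|} \cdot (2 - 2^{-\Delta})^{|S|}\right) = O^*\left(2^n \cdot (1 - 2^{-(\Delta+1)})^{|S|}\right) \le O^*\left(\left(2 \cdot (1 - 2^{-(\Delta+1)})^{\frac{1}{1+\Delta^2}}\right)^n\right). \qquad ∎$$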

In the rest of the subsection we prove Theorem 4.2.

###### Definition 4.4.

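For subsets $V' \subseteq V(G) \setminus S$ and $S' \subseteq S$ denote by $\beta(V', S')$ the number of independent sets $I$ in $G[V']$ that intersect every neighborhood of $S'$, that is, $I \cap N(s) \neq \emptyset$ for every $s \in S'$.

Consider, for a subset $S' \subseteq S$, the following sum

$$h(G, S') := \sum_{V' \subseteq V(G) \setminus S} (-1)^{|V(G)| - |V'|} \beta(V', S')^k.$$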

The following proof is almost identical to the proof of Lemma 3.4 in Section 3.

###### Lemma 4.5.

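$h(G, S')$ is the number of covers of $V(G) \setminus S$ by $k$-tuples $(I_0, \ldots, I_{k-1})$ of independent sets in $G[V(G) \setminus S]$ such that $I_i \cap N(s) \neq \emptyset$ for every $0 \le i \le k-1$ and every $s \in S'$.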

###### Proof.

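Each value of $\beta(V', S')^k$ counts $k$-tuples of independent sets in $G[V']$ that intersect every neighborhood $N(s)$ for $s \in S'$.

Each $k$-tuple $(I_0, \ldots, I_{k-1})$ of that type is counted in the terms corresponding to sets $V'$ such that

$$I_0 \cup \ldots \cup I_{k-1} \subseteq V' \subseteq V(G) \setminus S.$$

By Lemma 3.3 the multiplicity with which such a $k$-tuple is counted is one if

$$I_0 \cup \ldots \cup I_{k-1} = V(G) \setminus S$$

and zero otherwise. ∎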

Consider the following expression.

$$H(G, S) := \sum_{S' \subseteq S} (-1)^{|S'|} h(G, S')$$

$H(G, S)$ is the number of covers of $V(G) \setminus S$ by $k$-tuples of independent sets that do not use the full palette on any neighborhood $N(s)$ for $s \in S$. The precise claim follows.

###### Lemma 4.6.

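$H(G, S)$ is the number of covers of $V(G) \setminus S$ by $k$-tuples $(I_0, \ldots, I_{k-1})$ of independent sets in $G[V(G) \setminus S]$ such that for every $s \in S$ there exists $0 \le i \le k-1$ such that $I_i \cap N(s) = \emptyset$.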

###### Proof.

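In Lemma 4.5 we showed that $h(G, S')$ counts the number of covers of $V(G) \setminus S$ by $k$-tuples $(I_0, \ldots, I_{k-1})$ of independent sets in $G[V(G) \setminus S]$ such that for every $0 \le i \le k-1$ and for every $s \in S'$ we have $I_i \cap N(s) \neq \emptyset$.

A covering $k$-tuple $(I_0, \ldots, I_{k-1})$ of independent sets is counted exactly in the terms corresponding to subsets $S' \subseteq S$ such that for every $s \in S'$ and every $0 \le i \le k-1$, the independent set $I_i$ intersects the neighborhood $N(s)$. These are exactly the subsets $S'$ such that

$$S' \subseteq \{ s \in S \mid \forall\, 0 \le i \le k-1,\ I_i \cap N(s) \neq \emptyset \}.$$

Using Lemma 3.3 with $S_1 = \emptyset$ and $S_2 = \{ s \in S \mid \forall\, 0 \le i \le k-1,\ I_i \cap N(s) \neq \emptyset \}$ we deduce that the multiplicity with which the $k$-tuple is counted is one if

$$\{ s \in S \mid \forall\, 0 \le i \le k-1,\ I_i \cap N(s) \neq \emptyset \} = \emptyset$$

and zero otherwise. ∎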

As outlined at the beginning of the section, we now claim that $H(G, S)$ is positive if and only if $G$ is $k$-colorable. Note that for the correctness of this lemma we still did not use the disjointness of the neighborhoods $N(s)$. We will need this property to improve the computation time.

###### Lemma 4.7.

Let $G$ be a graph and $S$ an independent set in it. Then, $H(G, S) > 0$ if and only if $G$ is $k$-colorable.

###### Proof.

Assume that there exists a $k$-coloring $c : V(G) \to \{0, \ldots, k-1\}$ of $G$. For $0 \le i \le k-1$ denote by

$$I_i := \{ v \in V(G) \setminus S \mid c(v) = i \}$$

the subset of $V(G) \setminus S$ colored by $i$. Each $I_i$ is an independent set as $c$ is a proper coloring of $G$. Furthermore, for each $s \in S$, the neighborhood $N(s)$ does not intersect $I_{c(s)}$. Thus, $(I_0, \ldots, I_{k-1})$ is a cover of $V(G) \setminus S$ by independent sets that do not all intersect any neighborhood of $S$. By Lemma 4.6, $H(G, S) > 0$.

On the other hand, if $H(G, S) > 0$ then by Lemma 4.6 there exists a cover by independent sets and in particular a $k$-coloring $c$ of $G[V(G) \setminus S]$ such that the full palette is not used on any neighborhood $N(s)$ for $s \in S$. Thus, we may extend $c$ to a $k$-coloring of the entire graph $G$ by coloring each $s \in S$ with a color that does not appear in $N(s)$. As $S$ is an independent set, this coloring is proper. ∎
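For very small graphs, the quantity $H(G,S)$ can be evaluated by direct summation, which gives a quick sanity check of Lemmas 4.6 and 4.7. The sketch below is exponential and for illustration only. It assumes bitmask-encoded vertex sets, and it normalizes the inner sign as $(-1)^{|V(G) \setminus S| - |V'|}$ so that each inner sum is a non-negative count; all names are hypothetical.

```python
def popcount(x):
    return bin(x).count("1")


def H_value(n, edges, S, k):
    """Evaluate H(G, S) by brute force for an independent set S (list of vertices)."""
    adj = [0] * n
    for u, v in edges:
        adj[u] |= 1 << v
        adj[v] |= 1 << u
    S_mask = sum(1 << s for s in S)
    rest = ((1 << n) - 1) & ~S_mask                       # V(G) \ S

    def subsets(m):
        return [x for x in range(m + 1) if x & m == x]

    def independent(X):
        return all(not (adj[v] & X) for v in range(n) if X >> v & 1)

    total = 0
    for Sp in subsets(S_mask):                            # S' subset of S
        def hits_all(X):
            return all(adj[s] & X for s in range(n) if Sp >> s & 1)
        h = 0
        for Vp in subsets(rest):                          # V' subset of V(G) \ S
            beta = sum(1 for I in subsets(Vp) if independent(I) and hits_all(I))
            h += (-1) ** (popcount(rest) - popcount(Vp)) * beta ** k
        total += (-1) ** popcount(Sp) * h
    return total
```

For a triangle with $S = \{0\}$, the remaining two vertices are adjacent and both lie in $N(0)$. With $k = 3$ there are 12 covers of them by three independent sets, 6 of which use no empty set, so the value is 6; with $k = 2$ it is 0, matching the fact that a triangle is not 2-colorable.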

Up to this point, we have formalized the outline from the beginning of this section, reducing $k$-coloring of $G$ to a problem of $k$-coloring with some restrictions on the smaller graph $G[V(G) \setminus S]$, and then to the computation of $H(G, S)$.

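Unfortunately, $H(G, S)$ is a sum of $2^{|S|}$ terms, each of the form $h(G, S')$, which is a sum of $2^{|V(G) \setminus S|}$ terms by itself. Evidently, there are $2^{|S|} \cdot 2^{|V(G) \setminus S|} = 2^n$ different terms of the form $\beta(V', S')$ that are used in the definition of $H(G, S)$. Thus, we cannot hope to compute $H(G, S)$ in less than $2^n$ steps if we need to explicitly examine all terms of the form $\beta(V', S')$. Moreover, it is also not clear how quickly we can compute the values of $\beta(V', S')$.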

We begin by explaining how values of $\beta$ can be computed efficiently. The term $h(G, S')$ is a weighted sum of the values $\beta(V', S')^k$ for all $V' \subseteq V(G) \setminus S$. Denote by $\beta_\mu(V', S')$ the indicator function that gets the value $1$ if $V'$ is an independent set in $G$ and for every $s \in S'$ we have $V' \cap N(s) \neq \emptyset$, and $0$ otherwise. We can efficiently compute the value of $\beta_\mu$ for a specific input in a straightforward manner (i.e., checking whether it is an independent set that intersects the relevant sets). We observe that

$$\beta(V', S') = \sum_{V'' \subseteq V'} \beta_\mu(V'', S'),$$

thus, as functions of $V'$, $\beta = \hat{\beta_\mu}$ and we can compute the values of $\beta(V', S')$ for all $V' \subseteq V(G) \setminus S$ in $O^*(2^{|V(G) \setminus S|})$ time using the Inverse Möbius Transform presented in Section 2.1.

An improvement to the running time comes from noticing that for many inputs the value of $\beta(V', S')$ is zero. In particular, if $V' \cap N(s) = \emptyset$ for some $s \in S'$, then $\beta(V', S') = 0$, as no subset (and in particular no independent set) in $V'$ intersects $N(s)$. In the computation of $h(G, S')$ we only need to consider terms corresponding to subsets $V'$ in which for every $s \in S'$ the intersection $V' \cap N(s)$ is non-empty, as the values of other terms are all zero. We present a variant of the Inverse Möbius Transform that computes only the non-zero values by implicitly setting the others to zero. We then show that for most subsets $S'$ the number of non-zero entries is exponentially smaller than $2^{|V(G) \setminus S|}$.

###### Definition 4.8.

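For any $S' \subseteq S$ denote by $B(S') := \{ V' \subseteq V(G) \setminus S \mid \forall s \in S',\ V' \cap N(s) \neq \emptyset \}$ the set of all subsets of $V(G) \setminus S$ intersecting all neighborhoods of $S'$.

As we observed above, for every $V' \notin B(S')$ we have $\beta(V', S') = 0$. We conclude that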

###### Observation 2.

For every $S' \subseteq S$ we have

$$h(G, S') = \sum_{V' \in B(S')} (-1)^{|V(G)| - |V'|} \beta(V', S')^k.$$

###### Lemma 4.9.

If the neighborhoods $N(s)$ are disjoint for all $s \in S$, then we can compute $H(G, S)$ in $O^*\left(\sum_{S' \subseteq S} |B(S')|\right)$ time.

###### Proof.

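It suffices to compute $\beta(V', S')$ for every $S' \subseteq S$ and $V' \in B(S')$ and then use Observation 2. We do so by introducing a variant of the Inverse Möbius Transform that implicitly sets the value of $\beta(V', S')$ to zero for every $V' \notin B(S')$.

We first note that

$$B(S') \cong \mathcal{P}\left(V(G) \setminus \left(S \cup \bigcup_{s \in S'} N(s)\right)\right) \times \prod_{s \in S'} \left(\mathcal{P}(N(s)) \setminus \{\emptyset\}\right).$$

Thus, we can efficiently construct a simple bijection between $B(S')$ and $\{1, \ldots, |B(S')|\}$ as a Cartesian product. We can also efficiently check if a set belongs to $B(S')$. Let $\sigma$ be a map from $B(S')$ to indices in $\{1, \ldots, |B(S')|\}$. If $V' \notin B(S')$ we define $\sigma(V') = \bot$. By the observation above, we can define $\sigma$ in a way for which $\sigma$ and $\sigma^{-1}$ are efficiently computable. We also arbitrarily order the vertices of $V(G) \setminus S$ as $v_1, \ldots, v_{n'}$.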

We describe the algorithm in pseudo-code.

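We view the array $A$ throughout the algorithm as a function $A : B(S') \to \mathbb{Z}$. Denote the function represented by $A$ at the end of the first for loop by $f_0$. By definition, $f_0(V') = \beta_\mu(V', S')$ for every $V' \in B(S')$. Denote by $f_i$ the function represented by $A$ at the end of the $i$-th iteration of the second (outer) for loop.

We observe that $f_i$ is defined using $f_{i-1}$ as

$$f_i(V') = \begin{cases} f_{i-1}(V') + f_{i-1}(V' \setminus \{v_i\}) & \text{if } v_i \in V', \\ f_{i-1}(V') & \text{otherwise,} \end{cases}$$

where $f_{i-1}(V' \setminus \{v_i\})$ is implicitly defined to be zero if $V' \setminus \{v_i\} \notin B(S')$.

By induction on $i$, similar to that of Section 2.1, we can show that

$$f_i(V') = \sum_{\substack{V'' \subseteq V' \\ V'' \setminus \{v_1, \ldots, v_i\} = V' \setminus \{v_1, \ldots, v_i\}}} f_0(V'').$$

In particular, by the end of the algorithm $f_{n'} = \beta(\cdot, S')$ for the entire domain $B(S')$. ∎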
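A restricted transform of this kind can be sketched in Python. This is one possible rendering under stated assumptions, not the authors' listing: vertex sets are bitmasks, the neighborhoods of $S'$ are assumed pairwise disjoint, a dictionary keyed by the members of $B(S')$ plays the role of the index map, and the names `submasks` and `beta_on_B` are ours.

```python
from itertools import product


def submasks(m):
    """Yield all submasks of m, including m itself and 0."""
    x = m
    while True:
        yield x
        if x == 0:
            break
        x = (x - 1) & m


def beta_on_B(n, adj, S_mask, Sp_mask):
    """Return {V': beta(V', S')} for V' in B(S'), touching only members of B(S')."""
    rest = ((1 << n) - 1) & ~S_mask
    nbhds = [adj[s] for s in range(n) if Sp_mask >> s & 1]  # assumed pairwise disjoint
    covered = 0
    for N in nbhds:
        covered |= N
    free = rest & ~covered
    # Enumerate B(S') as a Cartesian product: a non-empty part of each N(s), any free part.
    nonempty = [[x for x in submasks(N) if x] for N in nbhds]
    members = []
    for parts in product(*nonempty):
        base = 0
        for p in parts:
            base |= p
        members.extend(base | f for f in submasks(free))

    def independent(X):
        return all(not (adj[v] & X) for v in range(n) if X >> v & 1)

    A = {X: int(independent(X)) for X in members}           # beta_mu restricted to B(S')
    for v in range(n):
        bit = 1 << v
        if not rest & bit:
            continue
        for X in members:
            if X & bit and (X ^ bit) in A:                   # missing entries count as zero
                A[X] += A[X ^ bit]
    return A
```

As a small check, take edges $(0,1), (0,2), (2,3)$ with $S = S' = \{0\}$: there are 6 sets in $B(S')$, and for $V' = \{1,2,3\}$ the independent sets hitting $N(0) = \{1,2\}$ are $\{1\}, \{2\}, \{1,2\}, \{1,3\}$.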

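After computing $\beta(V', S')$ for every $S' \subseteq S$ and $V' \in B(S')$ we can compute $H(G, S)$ in $O^*\left(\sum_{S' \subseteq S} |B(S')|\right)$ time. We thus finish the proof of Theorem 4.2 with the following counting lemma.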

###### Lemma 4.10.

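Assume that the neighborhoods $N(s)$ are disjoint for all $s \in S$ and that each neighborhood is of size $|N(s)| \le \Delta$. Then, $\sum_{S' \subseteq S} |B(S')| \le 2^{|V(G) \setminus S|} \cdot (2 - 2^{-\Delta})^{|S|}$.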

###### Proof.

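Denote $n(s) := |N(s)|$. Also denote by $N := \bigcup_{s \in S} N(s)$ all neighbors of vertices of $S$ and by $N^c := V(G) \setminus (S \cup N)$ their complement in $V(G) \setminus S$. We have

$$\begin{aligned}
|B(S')| &= 2^{|N^c|} \cdot \prod_{s \in S'} \left(2^{n(s)} - 1\right) \cdot \prod_{s \in S \setminus S'} 2^{n(s)} \\
&= 2^{|N^c|} \cdot \prod_{s \in S'} \left(1 - 2^{-n(s)}\right) \cdot \prod_{s \in S} 2^{n(s)} \\
&= 2^{|N^c|} \cdot \prod_{s \in S'} \left(1 - 2^{-n(s)}\right) \cdot 2^{|N|} \\
&= 2^{|V(G) \setminus S|} \cdot \prod_{s \in S'} \left(1 - 2^{-n(s)}\right).
\end{aligned}$$

For every $s \in S$ we have $n(s) \le \Delta$ and thus $1 - 2^{-n(s)} \le 1 - 2^{-\Delta}$. Hence,

$$\begin{aligned}
|B(S')| &\le 2^{|V(G) \setminus S|} \cdot \prod_{s \in S'} \left(1 - 2^{-\Delta}\right) \\
&= 2^{|V(G) \setminus S|} \cdot \left(1 - 2^{-\Delta}\right)^{|S'|}.
\end{aligned}$$

Therefore we have

$$\begin{aligned}
\sum_{S' \subseteq S} |B(S')| &\le \sum_{S' \subseteq S} 2^{|V(G) \setminus S|} \cdot \left(1 - 2^{-\Delta}\right)^{|S'|} \\
&= 2^{|V(G) \setminus S|} \cdot \sum_{i=0}^{|S|} \binom{|S|}{i} \left(1 - 2^{-\Delta}\right)^{i} \\
&= 2^{|V(G) \setminus S|} \cdot \left(2 - 2^{-\Delta}\right)^{|S|}. \qquad ∎
\end{aligned}$$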
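The counting argument can be checked numerically on small parameters. The sketch below sums the product formula for $|B(S')|$ over all $S' \subseteq S$ and compares it with the bound of Lemma 4.10; the inputs (the neighbourhood sizes and the size of $V(G) \setminus S$) are illustrative assumptions, and the function names are ours.

```python
from itertools import combinations
from math import prod


def sum_of_B_sizes(rest_size, nbhd_sizes):
    """Sum of |B(S')| over all S' subset of S, for disjoint neighbourhoods
    with the given sizes inside a set of rest_size vertices."""
    free = rest_size - sum(nbhd_sizes)                 # |N^c|
    indices = range(len(nbhd_sizes))
    total = 0
    for r in range(len(nbhd_sizes) + 1):
        for chosen in combinations(indices, r):        # chosen plays the role of S'
            total += 2 ** free * prod(
                (2 ** m - 1) if j in chosen else 2 ** m
                for j, m in enumerate(nbhd_sizes))
    return total


def lemma_bound(rest_size, s_size, delta):
    """Right-hand side of Lemma 4.10."""
    return 2 ** rest_size * (2 - 2 ** -delta) ** s_size
```

With neighbourhood sizes 2 and 3 inside six remaining vertices, the sum is $2 \cdot 7 \cdot 15 = 210$, while the bound with $\Delta = 3$ evaluates to $64 \cdot 1.875^2 = 225$.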

### 4.2 From bounded-degree graphs to (α,Δ)-bounded graphs

In this section we prove the main technical theorem of the paper.

###### Theorem 1.2.

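For every $k, \Delta, \alpha > 0$ there exists $\varepsilon_{k,\Delta,\alpha} > 0$ such that we can solve $k$-coloring for $(\alpha,\Delta)$-bounded graphs in $O((2-\varepsilon_{k,\Delta,\alpha})^n)$ time.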

As in Section 4.1, we deduce Theorem 1.2 from the following theorem.

###### Theorem 4.11.

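Let $G$ be a graph and $S \subseteq V(G)$ an independent set in $G$. Assume that the degree of each vertex in $S$ is at most $\Delta$. Then, we can solve $k$-coloring for $G$ in $O^*\left(2^{|V(G)| - |S|} \cdot (2 - \varepsilon_{k,\Delta})^{|S|}\right)$ time, for some constant $\varepsilon_{k,\Delta} > 0$.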

Let $G$ be a graph with a subset $V' \subseteq V(G)$ of vertices such that for every $v \in V'$ we have $\deg(v) \le \Delta$. In a similar fashion to Lemma 4.3 of the previous subsection (and even slightly simpler), we can greedily construct a subset $S \subseteq V'$ of size $|S| \ge \frac{|V'|}{\Delta + 1}$ which is an independent set. Thus, Theorem 4.11 immediately implies Theorem 1.2. Unlike the case of Section 4.1, this time the neighborhoods $N(s)$ for $s \in S$ are not necessarily disjoint. Thus, statements comparable to Lemma 4.10 are not true. Our solution for this problem is surprisingly general. In Section 5 we prove the following new type of removal lemma for small sets.

###### Theorem 1.8.

Let $\mathcal{F}$ be a collection of subsets of a universe $U$ such that every set $F \in \mathcal{F}$ is of size $|F| \le \Delta$. Let $C > 0$ be any constant. Then, there exist subsets $\mathcal{F}' \subseteq \mathcal{F}$ and