# From DNF compression to sunflower theorems via regularity

The sunflower conjecture is one of the most well-known open problems in combinatorics. It has several applications in theoretical computer science, one of which is DNF compression, due to Gopalan, Meka and Reingold [Computational Complexity 2013]. In this paper, we show that improved bounds for DNF compression imply improved bounds for the sunflower conjecture, which is the reverse direction of [Computational Complexity 2013]. The main approach is based on regularity of set systems and a structure-vs-pseudorandomness approach to the sunflower conjecture.


## 1 Introduction

The sunflower conjecture is one of the most well-known open problems in combinatorics. An $r$-sunflower is a family of $r$ sets where all pairwise intersections are the same. A $w$-set system is a collection of sets where each set has size at most $w$. Erdős and Rado [ER60] asked how large a $w$-set system can be without containing an $r$-sunflower. They proved an upper bound of $w!\,(r-1)^w$, and conjectured that the bound can be improved.

###### Conjecture 1.1 (Sunflower conjecture, [ER60]).

Let . There is a constant such that any -set system of size contains an -sunflower.

60 years later, only lower order improvements have been achieved, and the best bounds are still of the order of magnitude of about for any fixed , same as in the original theorem of Erdős and Rado. A good survey on the current bounds is [Kos00].

Sunflowers have been useful in various areas in theoretical computer science. Some examples include monotone circuit lower bounds [Raz85, Ros10], barriers for improved algorithms for matrix multiplication [ASU13] and faster deterministic counting algorithms via DNF compression [GMR13]. The focus of this paper is on this latter application, in particular DNF compression.

A DNF (Disjunctive Normal Form) is a disjunction of conjunctive clauses. The size of a DNF is the number of clauses, and the width of a DNF is the maximal number of literals in a clause. It is a folklore result that any DNF of size can be approximated by another DNF of width , by removing all clauses of larger width. The more interesting direction is whether DNFs of small width can be approximated by DNFs of small size. Namely, can DNFs of small width be “compressed” while approximately preserving their computational structure?

A beautiful result of Gopalan, Meka and Reingold [GMR13] shows that DNFs of small width can be approximated by small size DNFs. Their proof relies on the sunflower theorem (more precisely, a variant thereof due to Rossman [Ros10] that we will discuss shortly). Before stating their result, we introduce some necessary terminology. We say that two functions are -close if over a uniformly chosen input. We say that is a lower bound of , or that is an upper bound of , if for all .

###### Theorem 1.2 (DNF compression using sunflowers, sandwiching bounds [GMR13]).

Let be a width- DNF. Then for every there exist two width- DNFs, and such that

1. for all .

2. and are -close.

3. and have size .

Recently, Lovett and Zhang [LZ18] improved the dependence of the size of the lower bound DNF on (but with a worse dependence on ). In particular, the proof avoids the use of the sunflower theorem.

###### Theorem 1.3 (DNF compression without sunflowers, lower bound [LZ18]).

Let be a width- DNF. Then for every there exists a width- DNF such that

1. for all .

2. and are -close.

3. has size .

It is natural to speculate that a similar bound holds for upper bound DNFs.

###### Conjecture 1.4 (Improved upper bound DNF compression).

Let be a width- DNF. Then for every there exists a width- DNF such that

1. for all .

2. and are -close.

3. has size .

The main result of this paper is that Conjecture 1.4 implies an improved bound for the sunflower conjecture, with a bound of instead of the current bound of . Thus, the connection between sunflower theorems and DNF compression goes both ways.

To simplify the presentation, we assume from now on that . This will allow us to assume that . In any case, for the sunflower conjecture is trivial, as any -set system of size is an -sunflower.

###### Theorem 1.5 (Main theorem).

Assume that Conjecture 1.4 holds. Then for any there exists a constant such that the following holds. Any -set system of size contains an -sunflower.

In fact, Theorem 1.5 holds even with a slightly weaker conjecture instead of Conjecture 1.4, where the size bound can be assumed to be instead of . In addition, we only need to assume this for monotone DNFs.

### 1.1 Proof overview

The proof of Erdős and Rado [ER60] is by a simple case analysis which we now recall. Let be a -set system. Then either contains disjoint sets, which are in particular an -sunflower; or at most sets whose union intersects all other sets. In the latter case, there is an element that belongs to a fraction of the sets in . If we restrict to these sets, and remove the common element, then we reduced the problem to a -set system of size . The proof concludes by induction.
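The case analysis above can be turned into a small recursive search. The following is a minimal sketch, assuming set systems are given as Python collections of sets; the names `find_sunflower` and `is_sunflower` are hypothetical and not from the paper. The search is only guaranteed to succeed when the family is large enough for the pigeonhole step to go through at every level of the recursion; on smaller families it may return `None` even if a sunflower exists.

```python
from collections import Counter


def is_sunflower(sets):
    """Check that all pairwise intersections equal the common intersection."""
    sets = [frozenset(s) for s in sets]
    kernel = frozenset.intersection(*sets)
    return all(a & b == kernel for i, a in enumerate(sets) for b in sets[i + 1:])


def find_sunflower(family, r):
    """Return r sets forming an r-sunflower, or None if this search fails."""
    family = list({frozenset(s) for s in family})  # drop repeated sets
    if len(family) < r:
        return None
    # Case 1: greedily build a maximal subfamily of pairwise disjoint sets.
    disjoint, used = [], set()
    for s in family:
        if used.isdisjoint(s):
            disjoint.append(s)
            used |= s
    if len(disjoint) >= r:
        return disjoint[:r]
    # Case 2: every set meets `used`, so some element of `used` is popular.
    counts = Counter(x for s in family for x in s if x in used)
    if not counts:
        return None
    x, _ = counts.most_common(1)[0]
    # Restrict to the sets containing x, remove x, and recurse.
    link = [s - {x} for s in family if x in s]
    petals = find_sunflower(link, r)
    if petals is None:
        return None
    return [p | {x} for p in petals]
```

For example, on the family of all pairs from a 5-element universe with `r = 3`, the greedy step finds only two disjoint pairs, so the search recurses on a popular element and returns three pairs sharing that element.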

Our approach is to refine this via a structure-vs-pseudorandomness approach. Either there is a set of elements that belong to many sets in (concretely, at least , for an appropriately chosen ), or otherwise the set system is pseudo-random, in the sense that no set is contained in too many sets in . The main challenge is showing that by choosing large enough, this notion of pseudo-randomness is useful. This will involve introducing several new concepts and tying them to the sunflower problem.

The following proof overview follows the same structure as the sections in the paper, to ease readability.

#### Section 2: DNFs and set systems.

First, we note that set systems are equivalent to monotone DNFs. Formally, we identify a set system with the monotone DNF . This equivalence will be useful in the proof, as at different stages one of these viewpoints is more convenient.

The notions of “lower bound DNF” and “upper bound DNF” used in Theorem 1.2, Theorem 1.3 and Conjecture 1.4 have analogs for set systems, which we refer to as proper lower bound and upper bound DNFs (or set systems). For the purpose of this high level overview, we ignore this distinction here.

#### Section 3: Approximate sunflowers.

The notion of approximate sunflowers was initiated by Rossman [Ros10]. It relies on the notion of satisfying set systems.

Let be a set system on a universe . We say that is -satisfying if , where is the corresponding monotone DNF for , and is the -biased distribution on . The importance of satisfying set systems in our context is that a -satisfying set system contains disjoint sets (Claim 3.4).

Let be the intersection of all sets in . We say that is a -approximate sunflower if the set system is -satisfying. An interesting connection between approximate sunflowers and sunflowers is that a -approximate sunflower contains an -sunflower (Corollary 3.5).

#### Section 4: Regular set systems.

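Let $\mathcal{D}$ be a distribution over subsets of $X$. We say that $\mathcal{D}$ is regular if, when sampling $S\sim\mathcal{D}$, the probability that $S$ contains any given set $T$ is exponentially small in the size of $T$. Formally, $\mathcal{D}$ is $\kappa$-regular if for any set $T\subseteq X$ it holds that $\Pr_{S\sim\mathcal{D}}[T\subseteq S]\le\kappa^{-|T|}$.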

A set system is -regular if there exists a -regular distribution supported on sets in . We show that if is -regular, then the same holds for any upper bound set system (Claim 4.3) and any “large enough” lower bound set system (Claim 4.4). These facts will turn out to be useful later.

#### Section 5: Regular set systems are (1/2,1/2)-satisfying.

In this section, we focus on regular set systems , or equivalently regular DNFs . We show that, assuming Conjecture 1.4 (or the slightly weaker Conjecture 5.2), any -regular DNF of width , where , is -satisfying. Namely, , where is uniformly chosen. In particular, this implies that contains two disjoint sets. However, our goal is to prove that contains an -sunflower for , so we are not done yet.

#### Section 6: Intersecting regular set systems.

Let denote the maximal such that there exists a -regular -set system without disjoint sets. It is easy to prove that the sunflower theorem holds for any set system of size (Claim 6.4). However, our discussion so far only allows us to bound ; concretely, assuming Conjecture 1.4 we have .

We show (Lemma 6.5) that nontrivial upper bounds on imply related upper bounds on for every . Concretely, if then where are constants. This concludes the proof, as we get that any -set system of size must contain an -sunflower.

## 2 DNFs and set systems

A DNF is monotone if it contains no negated variables. Monotone DNFs are in one-to-one correspondence with set systems. Formally, if is a set system then the corresponding monotone DNF is

$$f_{\mathcal{F}}(x)=\bigvee_{S\in\mathcal{F}}\bigwedge_{i\in S}x_i.$$

In the other direction, if is a monotone DNF then its corresponding set system is

$$\mathcal{F}_f=\{S_1,\ldots,S_m\}.$$

Observe that a -set system corresponds to a width- monotone DNF, and vice versa. If is the set of elements over which is defined then we write .
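The correspondence between set systems and monotone DNFs is easy to state in code. The sketch below is only an illustration: an assignment is represented by the set of variables set to true, and the helper names `eval_monotone_dnf` and `width` are hypothetical.

```python
def eval_monotone_dnf(family, true_vars):
    """Evaluate the monotone DNF of a set system.

    The DNF is an OR over sets S in `family` of the AND of the variables in S,
    so it is satisfied exactly when some S is contained in `true_vars`.
    """
    true_vars = set(true_vars)
    return any(set(s) <= true_vars for s in family)


def width(family):
    """Width of the DNF, i.e. the largest set size in the set system."""
    return max((len(s) for s in family), default=0)
```

For instance, the set system `[{1, 2}, {3}]` corresponds to the width-2 DNF with clauses `x1 AND x2` and `x3`.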

To recall, we consider both lower bound and upper bound DNFs. As our main motivation is to better understand sunflowers, we restrict attention to monotone DNFs from now on; however, all the definitions can be easily adapted for general DNFs.

We next define proper upper and lower bound DNFs. Proper lower bound DNFs are obtained by removing clauses from the DNF, and proper upper bound DNFs are obtained by removing variables from clauses in the DNF. We describe both in terms of the corresponding set systems.

###### Definition 2.1 (Proper lower bound DNF / set system).

Let be a set system. A proper lower bound set system for is simply a sub set system . Observe that indeed

$$f_{\mathcal{F}'}(x)\le f_{\mathcal{F}}(x)\quad\forall x.$$

###### Definition 2.2 (Proper upper bound DNF / set system).

Let be a set system. A proper upper bound set system for is a set system that satisfies the following: for each there exists such that . Observe that indeed

$$f_{\mathcal{F}'}(x)\ge f_{\mathcal{F}}(x)\quad\forall x.$$

For monotone DNFs, upper bounds and proper upper bounds are the same.

###### Claim 2.3.

Let be set systems over the same universe, such that

$$f_{\mathcal{F}'}(x)\ge f_{\mathcal{F}}(x)\quad\forall x.$$

Then is a proper upper bound set for .

###### Proof.

Assume not. Then there exists such that there is no with . Let be the indicator vector for . Then but , a contradiction. ∎
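Claim 2.3 can be sanity-checked by brute force on small examples. The following sketch, with hypothetical helper names, compares the combinatorial condition of Definition 2.2 with the pointwise inequality between the two DNFs. It enumerates all assignments, so it is only meant for tiny universes.

```python
from itertools import combinations


def is_proper_upper_bound(upper, family):
    """Definition 2.2: every S in `family` contains some S' in `upper`."""
    return all(any(set(t) <= set(s) for t in upper) for s in family)


def dominates(upper, family, universe):
    """Check f_upper(x) >= f_family(x) for every assignment x (brute force)."""
    universe = list(universe)
    for k in range(len(universe) + 1):
        for true_vars in combinations(universe, k):
            xs = set(true_vars)
            f = any(set(s) <= xs for s in family)
            g = any(set(t) <= xs for t in upper)
            if f and not g:
                return False
    return True
```

On any pair of monotone set systems over the same universe, the two functions should agree, which is exactly the content of the claim.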

###### Corollary 2.4.

In Conjecture 1.4, we may assume that is a proper upper bound DNF for .

We note that the lower and upper bound DNFs in [GMR13] are in fact proper lower and upper bounds, and the same holds for the lower bound DNF in [LZ18].

## 3 Approximate sunflowers

We introduce the notion of approximate sunflowers, first defined by Rossman [Ros10]. We first need some notation. Given a finite set and , we denote by the -biased distribution over , where is sampled by including each in independently with probability . The definition of approximate sunflowers relies on the notion of a satisfying set system.

###### Definition 3.1 (Satisfying set system).

Let be a set system and let . We say that is -satisfying if

$$\Pr_{W\sim X_p}[\exists S\in\mathcal{F}: S\subseteq W]>1-\varepsilon.$$

Equivalently, if is the DNF corresponding to , then is -satisfying if

$$\Pr_{x\sim X_p}[f_{\mathcal{F}}(x)=1]>1-\varepsilon.$$
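For a small universe, the probability in Definition 3.1 can be computed exactly by summing over all subsets. The sketch below is illustrative only; the function names are hypothetical and the running time is exponential in the size of the universe.

```python
from itertools import combinations


def satisfying_probability(family, universe, p):
    """Exact probability that a p-biased random subset W contains some S in family."""
    universe = list(universe)
    n = len(universe)
    total = 0.0
    for k in range(n + 1):
        for subset in combinations(universe, k):
            w = set(subset)
            if any(set(s) <= w for s in family):
                # Each element is included independently with probability p.
                total += p ** k * (1 - p) ** (n - k)
    return total


def is_satisfying(family, universe, p, eps):
    """Definition 3.1: the family is (p, eps)-satisfying."""
    return satisfying_probability(family, universe, p) > 1 - eps
```

With the two singletons `[{1}, {2}]` over the universe `{1, 2}` and `p = 0.5`, the only subset that contains no member of the family is the empty set, so the probability is 3/4.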

An approximate sunflower is a set system which is satisfying if we first remove the common intersection of all the sets in the set system.

###### Definition 3.2 (Approximate sunflower).

Let be a set system and let . Let . Then is a -approximate sunflower if the set system is -satisfying.

Rossman proved an analog of the sunflower theorem for approximate sunflowers. Li, Lovett and Zhang [LLZ18] reproved this theorem by using a connection to randomness extractors.

###### Theorem 3.3 (Approximate sunflower lemma [Ros10]).

Let be a -set system and let . If then contains a -approximate sunflower.

To conclude this section, we show that satisfying set systems contain many disjoint sets, and hence approximate sunflowers contain sunflowers.

###### Claim 3.4.

Let and be a -satisfying set system. Then contains pairwise disjoint sets.

###### Proof.

Let . Consider a random coloring of with colors. A coloring induces a partition of into , where is the set of all elements that attain the color . Given a color , a set is -monochromatic if all its elements attain the color . Observe that for each color ,

$$\Pr[\exists S\in\mathcal{F},\ S\text{ is }c\text{-monochromatic}]=\Pr[\exists S\in\mathcal{F},\ S\subseteq W_c].$$

The marginal distribution of each is -biased. By our assumption that is -satisfying, the probability that contains some is more than . So by the union bound,

$$\Pr[\forall c\in[r]\ \exists S\in\mathcal{F},\ S\text{ is }c\text{-monochromatic}]>0.$$

In particular, there exists a coloring where this event happens. Let be the sets where is -monochromatic. Then must be pairwise disjoint. ∎
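The coloring argument is constructive in a randomized sense: one can sample colorings until every color class contains a monochromatic set. The sketch below is a hedged illustration with a hypothetical function name; it assumes all sets are nonempty, and it gives up after a fixed number of trials rather than relying on the probability bound from the proof.

```python
import random


def disjoint_sets_by_coloring(family, universe, r, trials=1000, seed=0):
    """Search for r pairwise disjoint sets via random r-colorings of the universe."""
    rng = random.Random(seed)
    family = [frozenset(s) for s in family]
    universe = list(universe)
    for _ in range(trials):
        # Color every element independently and uniformly with one of r colors.
        color = {x: rng.randrange(r) for x in universe}
        picked = []
        for c in range(r):
            # Look for a set all of whose elements received color c.
            mono = next((s for s in family if all(color[x] == c for x in s)), None)
            if mono is None:
                break
            picked.append(mono)
        else:
            # One monochromatic set per color: distinct colors force disjointness.
            return picked
    return None
```

Sets that are monochromatic in different colors cannot share an element, which is why the returned sets are pairwise disjoint.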

###### Corollary 3.5.

Let be a -approximate sunflower. Then contains an -sunflower.

###### Proof.

Let . Apply Claim 3.4 to the set system which by assumption is -satisfying. We obtain that contains pairwise disjoint sets . This implies that form an -sunflower. ∎

## 4 Regular set systems

The notion of regularity of a set system is pivotal in this paper. At a high level, a set system is regular if no element belongs to too many sets, no pair of elements belongs to too many sets, and so on. It is closely related to the notion of block min-entropy studied in the context of lifting theorems in communication complexity [GLM16].

###### Definition 4.1 (Regular distribution).

Let be a finite set, and let be a distribution on subsets . The distribution is -regular if for any set it holds that

$$\Pr_{S\sim\mathcal{D}}[T\subseteq S]\le\kappa^{-|T|}.$$

###### Definition 4.2 (Regular set system).

A set system is -regular if there exists a -regular distribution supported on the sets in .
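For a concrete distribution on a small universe, the best regularity parameter in Definition 4.1 can be computed directly: it is the minimum, over nonempty sets $T$ that occur with positive probability, of $\Pr[T\subseteq S]^{-1/|T|}$. The sketch below assumes the distribution is given as a dictionary from `frozenset` to probability; the name `regularity` is hypothetical. Note that it measures one fixed distribution, whereas Definition 4.2 only asks for the existence of some regular distribution on the set system.

```python
from itertools import combinations


def regularity(dist):
    """Largest kappa such that Pr[T subset of S] <= kappa**(-|T|) for all nonempty T."""
    universe = sorted(set().union(*dist))
    best = float("inf")
    for k in range(1, len(universe) + 1):
        for t in combinations(universe, k):
            ts = frozenset(t)
            prob = sum(pr for s, pr in dist.items() if ts <= s)
            if prob > 0:
                # The constraint prob <= kappa**(-k) means kappa <= prob**(-1/k).
                best = min(best, prob ** (-1.0 / k))
    return best
```

For the uniform distribution over four disjoint singletons, each element appears with probability 1/4 and no larger set appears at all, so the distribution is 4-regular.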

The following claims show that if is a -regular set system then any proper upper bound set system for it is also -regular, and any “large” proper lower bound set system is approximately -regular.

###### Claim 4.3.

Let be a -regular set system. Let be a proper upper bound set system for . Then is also -regular.

###### Proof.

Let be a -regular distribution supported on . Let be a map such that for all . Define a distribution on as follows:

$$\mathcal{D}'(S')=\sum_{S\in\varphi^{-1}(S')}\mathcal{D}(S).$$

Then for any set ,

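$$\Pr_{S'\sim\mathcal{D}'}[T\subseteq S']=\sum_{S'\in\mathcal{F}':\,T\subseteq S'}\mathcal{D}'(S')=\sum_{S\in\mathcal{F}:\,T\subseteq\varphi(S)}\mathcal{D}(S)\le\sum_{S\in\mathcal{F}:\,T\subseteq S}\mathcal{D}(S)=\Pr_{S\sim\mathcal{D}}[T\subseteq S]\le\kappa^{-|T|}.$$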

###### Claim 4.4.

Let be a -regular set system, and be a -regular distribution supported on . Let be a proper lower bound set system for , and let . Then is -regular.

###### Proof.

Define a distribution on by . Then for any non-empty set ,

## 5 Regular set systems are (1/2,1/2)-satisfying

In this section we use Conjecture 1.4 to prove that regular enough DNFs are -satisfying, where in light of Claim 3.4 we care about . To recall the definitions, a DNF is -satisfying if for a uniformly chosen ,

$$\Pr_x[f(x)=1]>1-\varepsilon.$$

Define

$$\gamma(w)=\sup\{\kappa:\exists\,\kappa\text{-regular }w\text{-set system which is \underline{not} }(1/2,1/2)\text{-satisfying}\}.$$

We start by giving a lower bound on , where the motivation is to help the reader gain intuition.

.

###### Proof.

We construct a -regular -set system which is not -satisfying, for . Let be disjoint sets, each of size for a constant to be determined. Let . Let be the -set system of all sets that contain exactly one element from each set . It is simple to verify that the uniform distribution over is -regular, and hence is -regular. Let . Then

$$\Pr[\exists S\in\mathcal{F},\ S\subseteq W]=\Pr[\forall i\in[w],\ |X_i\cap W|\ge1]=(1-2^{-\kappa})^w=(1-c/w)^w\le\exp(-c).$$

In particular, for we get that is not -satisfying. ∎
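The displayed chain of equalities can be checked numerically. The sketch below takes the identity $(1-2^{-\kappa})^w=(1-c/w)^w$ at face value, which forces $2^{-\kappa}=c/w$; the function name and the choice of parameters are assumptions made for illustration, and $\kappa$ is treated as a real number rather than an integer block size.

```python
import math


def lower_bound_example(w, c):
    """Return (kappa, probability) for the block construction with 2**(-kappa) = c / w.

    The probability is that a uniform random subset hits every one of the
    w blocks, which by independence equals (1 - 2**(-kappa))**w.
    """
    kappa = math.log2(w / c)
    prob = (1.0 - 2.0 ** (-kappa)) ** w
    return kappa, prob


kappa, prob = lower_bound_example(w=1024, c=1.0)
# The bound (1 - c/w)**w <= exp(-c) holds because 1 - x <= exp(-x).
assert prob <= math.exp(-1.0)
```

With `w = 1024` and `c = 1`, the block parameter is `kappa = 10` and the probability is below `exp(-1)`, hence below 1/2.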

As we shall soon see, Conjecture 1.4 implies that the lower bound is not far from tight:

$$\gamma(w)\le(\log w)^{O(1)}.$$

It will be sufficient to assume a slightly weaker version of Conjecture 1.4, where we allow the size of to be somewhat bigger.

###### Conjecture 5.2 (Weaker version of Conjecture 1.4).

Let . For any monotone width- DNF there exists a monotone width- DNF such that

1. is a proper upper bound DNF for .

2. and are -close.

3. has size at most for some absolute constant .

###### Lemma 5.3.

Assume Conjecture 5.2 holds. Then there exists a constant such that the following holds. For let . Let be a -set system which is -regular. Then is -satisfying.

###### Corollary 5.4.

.

We prove Lemma 5.3 in the remainder of this section. We start with some simple claims that would serve as a base case for Lemma 5.3 for .

###### Claim 5.5.

Let . Let be a -regular -set system, where . Then contains pairwise disjoint sets.

###### Proof.

Let be a -regular distribution over . Sample independently . The probability that intersect is at most

$$\Pr[|S\cap S'|\ge1]\le\sum_{i\in S}\Pr[i\in S']\le w/\kappa.$$

Let be chosen independently. Then by the union bound, the probability that any two intersect is at most . In particular, there exist pairwise disjoint sets in . ∎

###### Claim 5.6.

Let . Let be a -regular -set system, where . Then is -satisfying.

###### Proof.

Assume . Claim 5.5 implies that contains disjoint sets . Let . Then

$$\Pr[\exists S\in\mathcal{F},\ S\subseteq W]\ge\Pr[\exists i\in[r],\ S_i\subseteq W]=1-(1-2^{-w})^r>1-\varepsilon.$$

###### Proof of Lemma 5.3.

We will need several properties from in the proof. To simplify notations, we shorthand for throughout. The constant below is the absolute constant from Conjecture 1.4. We need a constant so that the following conditions are satisfied:

1. for and .

2. for .

3. for .

One can check that the function satisfies , with equality when is even. Thus taking satisfies the conditions for a large enough . We then take .

The proof of Lemma 5.3 is by induction on . The base cases are and which follow from Claim 5.6 and condition (i) on . Thus, we assume from now that . We need to prove that for we have

$$\Pr[f(x)=0]<\varepsilon.$$

Let and assume that is -regular for . Let and let be the corresponding DNF. Applying Conjecture 5.2 to with error parameter , we obtain that there exists a -approximate proper upper bound DNF to of size . Let be the corresponding set system to , and observe that is a proper upper bound set system for . Let and let be the corresponding DNF. Then

$$\Pr[f(x)=0]\le\Pr[f_3(x)=0]+(\Pr[f_2(x)=0]-\Pr[f_1(x)=0])\le\Pr[f_3(x)=0]+\gamma.$$

Next, observe that is a proper upper bound set system for . As we assume that is -regular, then by Claim 4.3 we obtain that is also -regular. Let be a -regular distribution supported on . Let , where . As each set has size then, since is -regular, we have

$$\mathcal{D}(S)\le\kappa^{-w/2}.$$

Summing over all we obtain that

 D(F4)≤|F4|⋅κ−w/2≤|F2|⋅κ−w/2≤((logwε)2cκ−1/2)w.

We would need that . As , this follows from condition (ii) on . Let . Then is a -set system. By Claim 4.4 is -regular for

$$\kappa'=\kappa\cdot\mathcal{D}(\mathcal{F}_5)=\kappa(1-\mathcal{D}(\mathcal{F}_4))\ge\kappa-1.$$

Let . Assumption (iii) on gives that . Thus, is -regular. Applying the induction hypothesis, if we denote by the corresponding DNF for , then

$$\Pr[f_5=0]<\varepsilon'.$$

Finally, as we have . Putting these together we obtain that

$$\Pr[f(x)=0]\le\Pr[f_3(x)=0]+\gamma\le\Pr[f_5(x)=0]+\gamma<\varepsilon'+\gamma=\varepsilon/\log w+\varepsilon(1-1/\log w)=\varepsilon.$$

## 6 Intersecting regular set systems

As we showed in Claim 3.4, if is a -satisfying set system, then it contains an -sunflower. However, so far we only proved that a regular enough set system is -satisfying. In this section, we prove that this is enough to show the existence of an -sunflower for any constant , and with a comparable condition of regularity. Our proof is based on the study of regular intersecting set systems.

###### Definition 6.1 (Intersecting set system).

A set system is intersecting if any two sets in it intersect. In other words, it does not contain two disjoint sets.

###### Definition 6.2.

For define

$$\alpha(w,r)=\sup\{\kappa:\exists\,\kappa\text{-regular }w\text{-set system without }r\text{ pairwise disjoint sets}\}.$$

It will be convenient to shorthand , which can equivalently be defined as

$$\beta(w)=\sup\{\kappa:\exists\,\kappa\text{-regular intersecting }w\text{-set system}\}.$$

and for all .

###### Proof.

The first claim follows by our definition that a -set system is a set system where all sets have size at most . In particular, any -set system is also a -set system and hence . The second claim holds since a set system that does not contain disjoint sets, also does not contain disjoint sets. ∎

We start by showing that upper bounds on directly translate to upper bounds on sunflowers. This is reminiscent of the original proof of Erdős and Rado [ER60].

###### Claim 6.4.

Let be a -set system of size . Then contains an -sunflower.

###### Proof.

The proof is by induction on . If contains pairwise disjoint sets then we are done. Otherwise, is not -regular for any . In particular, the uniform distribution over is not -regular. This implies that there exists a nonempty set of size such that

$$\mathcal{F}'=\{S\setminus T: S\in\mathcal{F},\ T\subseteq S\}$$

has size . By induction, contains an -sunflower . Hence is a sunflower in . ∎

The main lemma we prove in this section is that upper bounds on imply upper bounds on .

###### Lemma 6.5.

For all it holds that .

Before proving Lemma 6.5, we first prove some upper and lower bounds on . Although these are not needed in the proof of Lemma 6.5, we feel that they help gain intuition on .

.

###### Proof.

Apply Claim 5.5 for . ∎

It is easy to construct examples that show that ; for example, the family of all sets of size in a universe of size is intersecting and -regular. The following example shows that is super-constant.

.

###### Proof.

We construct an example of a -regular intersecting -set system for . Let to be optimized later and set . Let be disjoint sets of size each, and let . Consider the set system of all sets of the following form:

$$\mathcal{F}=\{S\subseteq X:\exists i\in[m],\ X_i\subseteq S,\ \forall j\ne i,\ |X_j\cap S|=1\}.$$

Observe that is an intersecting -set system.

Let be the uniform distribution over . We show that is -regular, and hence is -regular. There are two extreme cases: for sets of size we have

$$\Pr_{S\sim\mathcal{D}}[T\subseteq S]=\frac{1}{m}+\left(1-\frac{1}{m}\right)\frac{1}{t}\le\frac{2}{t}.$$

For sets we have

$$\Pr_{S\sim\mathcal{D}}[X_i\subseteq S]=\frac{1}{m}.$$

One can verify that these are the two extreme cases which control the regularity, and hence is -regular for

$$\kappa=\min(t/2,\,m^{1/t}).$$

Setting gives . ∎
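The construction in this proof is small enough to build explicitly. The following sketch, with hypothetical function names, takes the number of blocks `m` and the block size `t` as parameters, generates the family, and checks that it is intersecting. It assumes `t >= 2` so that each set contains exactly one full block.

```python
from itertools import product


def intersecting_family(m, t):
    """All sets containing one full block X_i and one element of every other block."""
    blocks = [[(i, j) for j in range(t)] for i in range(m)]
    family = set()
    for i in range(m):
        others = [blocks[k] for k in range(m) if k != i]
        for choice in product(*others):
            family.add(frozenset(blocks[i]) | frozenset(choice))
    return family


def is_intersecting(family):
    """Check that no two sets in the family are disjoint."""
    fam = list(family)
    return all(a & b for idx, a in enumerate(fam) for b in fam[idx + 1:])
```

Two sets always meet because a set that contains the whole block $X_i$ must share the single element that any other set picks from $X_i$. For `m = 3` and `t = 2` the family has 12 sets of size 4, and each element lies in 8 of them, matching the single-element probability $\frac{1}{m}+(1-\frac{1}{m})\frac{1}{t}=\frac{2}{3}$.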

We conjecture that this is essentially tight. In fact, by Claim 3.4 we have that

$$\beta(w)\le\gamma(w).$$

As we proved, Conjecture 5.2 implies , thus it also implies .

###### Proof of Lemma 6.5.

For define

$$\eta(w,r)=r\,2^{r+1}\beta(wr)^r.$$

We will first prove that

$$\alpha(w,2r)\le\max(\eta(w,r),\,2\alpha(w,r))$$

and then that this implies the bound

$$\alpha(w,r)\le\eta(w,r).$$

Let be a -regular -set system which does not contain pairwise disjoint sets, where . We will show that this leads to a contradiction.

Let be the corresponding -regular distribution on . Let be any sub set system with . Claim 4.4 then implies that is -regular. By our choice of , , and hence contains pairwise disjoint sets.

More generally, consider the following setup. Let with for all . Define and . As long as we are guaranteed that is -regular, and hence contains pairwise disjoint sets. Consider the following process:

1. Initialize for all and .

2. As long as do:

1. Let .

2. Find pairwise disjoint sets .

3. Let .

4. Set if , and otherwise.

5. Set

Assume that the process terminates after steps. Let , which by construction is a set of size . Note that as we assume that does not contain pairwise disjoint sets, we obtain that must be an intersecting set system (possibly with some repeated sets). Let . As , and as we terminate when , we have

$$w\ge\frac{1}{2r}.$$

Let , namely taking each set exactly once. As it may be the case that are not all distinct, we only know that . Consider the distribution on given by . Then as is an intersecting set system, we obtain that cannot be -regular for .

Thus, there exists a nonempty set of size such that

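$$\sum_{W\in\mathcal{F}^*:\,T\subseteq W}\mathcal{D}^*(W)\ge\beta^{-t}.$$

This implies that if we denote then

$$\sum_{i\in I}w_i\ge w\beta^{-t}\ge\frac{1}{2r\beta^t}.$$

Next, consider some . Recall that is the union of pairwise disjoint sets . In particular, there must exist such that . We denote . As the number of possible subsets of is , there must exist such that

$$\sum_{i\in I:\,T_i=T^*}w_i\ge2^{-t}\sum_{i\in I}w_i\ge\frac{1}{2r(2\beta)^t}.$$

In particular, and

$$\sum_{i\in I:\,T^*\subseteq S_{i,j_i}}w_i\ge\frac{1}{2r(2\beta)^t}.$$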

It may be that the list of contains repeated sets (namely, that for some ). For each let . In particular, is not empty only for sets with . We can rewrite the sum as

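$$\sum_{i\in I:\,T^*\subseteq S_{i,j_i}}w_i=\sum_{S\in\mathcal{F}:\,T^*\subseteq S}\ \sum_{i\in I(S)}w_i.$$

Next, fix some with and consider the internal sum. Recall that , and hence the sum is a telescoping sum and can be bounded by

$$\sum_{i\in I(S)}w_i\le\mathcal{D}_0(S)-\mathcal{D}_m(S)\le\mathcal{D}(S).$$

We thus obtain that

$$\sum_{S\in\mathcal{F}:\,T^*\subseteq S}\mathcal{D}(S)\ge\sum_{i\in I:\,T^*\subseteq S_{i,j_i}}w_i\ge\frac{1}{2r(2\beta)^t}.$$

Recall that is -regular. We can upper bound by

$$\kappa\le(2r(2\beta)^t)^{1/|T^*|}\le(2r(2\beta)^t)^{r/t}\le2r(2\beta)^r=\eta(w,r).$$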

Putting everything together, we get

$$\alpha(w,2r)\le\max(\eta(w,r),\,2\alpha(w,r)).$$

To conclude the proof, note that if is a power of two then by induction and our choice of we have

$$\alpha(w,2r)\le\max(\eta(w,r),\,2\eta(w,r/2),\,4\eta(w,r/4),\ldots)=\eta(w,r).$$

Thus for a general , if is a power of two then

 α(w,r)≤α(w,s)≤η(w,s/2)≤η(w,