# Improving Gebauer's construction of 3-chromatic hypergraphs with few edges

In 1964 Erdős proved, by a randomized construction, that the minimum number of edges in a $k$-graph that is not two colorable is $O(k^2 2^k)$. To this day, it is not known whether there exist such $k$-graphs with a smaller number of edges. Known deterministic constructions use a much larger number of edges. The most recent one, by Gebauer, requires $2^{k+\Theta(k^{2/3})}$ edges. Applying a derandomization technique we reduce that number to $2^{k+\widetilde{\Theta}(k^{1/2})}$.


## 1. Introduction

In 1964 Erdős proved in [1] that $O(k^2 2^k)$ edges are sufficient to build a $k$-graph (i.e. a $k$-uniform hypergraph) which is not two colorable. To this day that result provides the best known upper bound for the minimum number of edges in such a hypergraph. Erdős' bound results from the fact that, with high probability, a random $k$-graph with that number of edges, built on a set of $k^2$ vertices, cannot be colored properly with two colors.

The best known deterministic construction of a $k$-graph that is not two colorable has been obtained by Gebauer [3]. It requires $2^{k+\Theta(k^{2/3})}$ edges. It is also the first construction in which the number of edges is $2^{(1+o(1))k}$. The main result of the current paper is an upgrade of this construction that allows us to cut down the number of edges to $2^{k+O((k\log(k))^{1/2})}$.

Throughout the paper, $\log$ stands for the binary logarithm. We are only concerned with vertex two-colorings of hypergraphs. A vertex coloring is proper if no edge is monochromatic. Following common convention we use the colors red and blue.

## 2. Gebauer’s construction

We start by recalling the construction of [3], as we are going to modify it. The whole procedure is parametrized by $t$, which is roughly a fixed power of $k$ with an optimized positive exponent. It is convenient to organize the vertices of the constructed hypergraph into a rectangular matrix $M_0$. Slightly abusing the notation, we use $M_0$ for both the matrix and the set of vertices. We use the same convention for submatrices of $M_0$. The length of the rows is denoted by $s$. Its value will be subject to optimization.

### 2.1. Preliminary choice of rows

Vertex coloring can be seen as assigning colors to the entries of the matrix. A color is dominating in a row if at least half of its entries are colored with it (there can be two dominating colors). The main part of the construction is designed to work with a submatrix of rows with the same dominating color. A matrix for which one of the colors is dominating in all rows will be called consistently dominated. We always assume that red is the dominating color in such a matrix.

The ground matrix $M_0$ has $2t-1$ rows. Hence, the hypergraph is built on $(2t-1)s$ vertices. Let $\mathcal{M}$ denote the set of submatrices of $M_0$ formed by every choice of $t$ rows. For every $M\in\mathcal{M}$ we apply the main construction described in the next section. The construction outputs a hypergraph $H_M$. The union of the edge sets of these hypergraphs forms the edge set of the resulting hypergraph. For every coloring of $M_0$ at least one submatrix $M\in\mathcal{M}$ is consistently dominated, since among $2t-1$ rows at least $t$ share a dominating color. The main construction guarantees that in such a case, $H_M$ contains a monochromatic edge.

### 2.2. Main construction

Let $M\in\mathcal{M}$; recall that $M$ has $t$ rows. Our goal is to build a hypergraph $H_M$ on the vertex set $M$ such that for every consistently dominated coloring of $M$, there exists a monochromatic edge in $H_M$. For $\sigma=(\sigma_1,\ldots,\sigma_t)\in[s]^t$, we denote by $M^{\sigma}$ the matrix in which for every $i\in\{1,\ldots,t\}$, the $i$-th row has been cyclically shifted by $\sigma_i$. The construction proceeds as follows.

For every

1. sequence of shifts $\sigma\in[s]^t$,

2. and set of indices $I\subseteq[s]$ of size $k/t$,

add to $H_M$ an edge built from all elements of the columns of $M^{\sigma}$ with indices in $I$.

Note that the edges of $H_M$ are of size $t\cdot(k/t)=k$, as required.

Let us fix a consistently dominated coloring of $M$. We assume without loss of generality that red is the dominating color of the rows. When the sequence of shifts $\sigma$ is chosen uniformly at random, the probability that some fixed column of $M^{\sigma}$ is red is at least $2^{-t}$. As a consequence, for $s\geqslant(k/t)2^t$ the expected number of red columns is at least $k/t$. In particular, for some sequence of shifts, there exists a set of $k/t$ red columns. Hence the edge built for these shifts and columns is monochromatic.
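The averaging argument above can be illustrated on a toy instance. The following is a minimal sketch, not part of the original construction: the helper names `red_columns` and `best_shift_sequence` are hypothetical, colors are encoded as booleans (`True` for red), and the search is exhaustive, so it is only feasible for very small $s$ and $t$.

```python
from itertools import product

def red_columns(rows, shifts):
    """Indices of columns that are entirely red after cyclically shifting row i by shifts[i]."""
    s = len(rows[0])
    # After shifting row i to the right by shifts[i], column c holds rows[i][(c - shifts[i]) % s].
    return [c for c in range(s)
            if all(row[(c - sh) % s] for row, sh in zip(rows, shifts))]

def best_shift_sequence(rows):
    """Exhaustively search all s^t shift sequences for one that maximizes the number of red columns."""
    s = len(rows[0])
    return max(product(range(s), repeat=len(rows)),
               key=lambda shifts: len(red_columns(rows, shifts)))

# Toy instance: t = 2 rows of length s = 8, each with exactly half of its entries red.
rows = [
    [True, False, True, False, True, False, True, False],
    [False, False, False, False, True, True, True, True],
]
shifts = best_shift_sequence(rows)
count = len(red_columns(rows, shifts))
# The expectation argument guarantees at least s / 2^t red columns for some shift sequence.
assert count >= len(rows[0]) // 2 ** len(rows)
```

Any set of `k/t` column indices taken from `red_columns(rows, shifts)` then yields a monochromatic edge in the sense of the construction.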

### 2.3. Counting

We have

$$\binom{2t-1}{t}<2^{2t}$$

choices for the subset of rows in the preliminary step. Then, in the main construction, every sequence of $t$ elements of $[s]$ and every subset of $k/t$ elements of $[s]$ is used to build an edge. The number of choices is

$$s^t\cdot\binom{s}{k/t}\leqslant s^t\cdot\left(\frac{es}{k/t}\right)^{k/t}.$$

For $s=(k/t)2^t$ (we assume for simplicity that it is an integer) we obtain

$$(k/t)^t\,2^{t^2}\cdot e^{k/t}\,2^k=2^{t\log(k/t)+t^2+(k/t)\log(e)+k}.$$

The total number of edges is smaller than

$$2^{2t+t\log(k/t)+t^2+(k/t)\log(e)+k}.$$

Finally we choose $t$ so that the above exponent is minimized. That happens for $t=\Theta(k^{1/3})$. In the end we obtain that the total number of edges is $2^{k+\Theta(k^{2/3})}$.
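The optimization of the exponent can be explored numerically. The sketch below is only an illustration under stated assumptions: it treats $t$ as an arbitrary integer (ignoring divisibility of $k$ by $t$), scans all candidates exhaustively, and uses the hypothetical helper names `gebauer_overhead` and `best_t`. It prints, for a few values of $k$, the minimizing $t$ and the ratio of the overhead (the exponent minus $k$) to $k^{2/3}$.

```python
import math

def gebauer_overhead(k, t):
    """Exponent minus k: 2t + t*log2(k/t) + t^2 + (k/t)*log2(e)."""
    return 2 * t + t * math.log2(k / t) + t * t + (k / t) * math.log2(math.e)

def best_t(k, overhead):
    """Integer t in [1, k] that minimizes the given overhead function (exhaustive scan)."""
    return min(range(1, k + 1), key=lambda t: overhead(k, t))

for k in (10**3, 10**4, 10**5):
    t = best_t(k, gebauer_overhead)
    # The last column stays bounded if the overhead grows like k^(2/3).
    print(k, t, round(gebauer_overhead(k, t) / k ** (2 / 3), 2))
```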

## 3. Improved construction

We modify only the main construction. Recall that we work with a matrix $M$ with $t$ rows. For a fixed consistently dominated coloring $c$ of $M$, a sequence of shifts $\sigma\in[s]^t$ is called good if $M^{\sigma}$ contains at least $k/t$ red columns. The set of good sequences for a coloring $c$ of $M$ is denoted by $G(c)$.

If we fix a consistently dominated coloring of $M$ and choose the sequence of shifts $\sigma$ uniformly at random, the expected number of red columns in $M^{\sigma}$ is at least $s/2^t$. That observation was used to justify that there exists a good sequence. However, it also suggests that a large number of shift sequences might be good. For the constructed hypergraph not to be two colorable, it is sufficient that for every consistently dominated coloring of $M$, at least one such sequence is used in the main construction.

We apply derandomization techniques to construct a relatively small set of sequences of shifts that can be used in the main construction instead of $[s]^t$. For a family of sets $\mathcal{F}$, a set that intersects every element of that family is called a hitting set for $\mathcal{F}$. In these terms we are looking for a small hitting set for the family $\mathcal{G}=\{G(c)\}$, where $c$ ranges over the consistently dominated colorings of $M$.

### 3.1. Sequential choice of shifts

We start by estimating the size of the set of good shift sequences. While it is not directly used in our construction, it provides a good opportunity to introduce some tools. It will also allow us to derive a probabilistic argument that small hitting sets actually exist.

The property of being good is generalized to prefixes in the straightforward way: a sequence of shifts $(\sigma_1,\ldots,\sigma_j)$ is good if the matrix $M$ trimmed to the first $j$ rows and shifted according to the sequence has at least $s/2^j$ red columns.

Suppose that $(\sigma_1,\ldots,\sigma_j)$ is good. We want to estimate the number of possible choices of $\sigma_{j+1}$ for which $(\sigma_1,\ldots,\sigma_{j+1})$ is good as well. If the coloring of the $(j+1)$-th row was "random", then about half of the choices would be right, and almost all of the choices would be almost right. That property does not hold in the worst case scenario and hence we are going to work with relaxed definitions.

For $\varepsilon\in(0,1)$, a sequence of shifts $(\sigma_1,\ldots,\sigma_j)$ is $\varepsilon$-good if the number of red columns in the shifted matrix trimmed to the first $j$ rows is at least $s(1-\varepsilon)^{j-1}/2^j$. Then, every $\varepsilon$-good sequence of shifts of length $t$ gives a shifted matrix with at least

$$s\,\frac{(1-\varepsilon)^{t-1}}{2^t}$$

red columns. For $\varepsilon=1/t$ and $s\geqslant e(k/t)2^t$, the number of red columns is at least $k/t$ as needed. In the modified construction we set $s$ to $\lceil e(k/t)2^t\rceil$.
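The choice of parameters can be checked by a short calculation. This is a sketch that assumes $\varepsilon=1/t$ with $t\geqslant2$ and $s\geqslant e(k/t)2^t$ (values inferred from the counting in Section 3.5.1), and it uses only the elementary inequality $\ln(1+y)\leqslant y$:

$$(t-1)\ln\left(1-\frac{1}{t}\right)=-(t-1)\ln\left(1+\frac{1}{t-1}\right)\geqslant-1,\qquad\text{hence}\qquad\left(1-\frac{1}{t}\right)^{t-1}\geqslant e^{-1},$$

$$s\,\frac{(1-\varepsilon)^{t-1}}{2^t}\geqslant e\,\frac{k}{t}\,2^t\cdot\frac{e^{-1}}{2^t}=\frac{k}{t}.$$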

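We also define $G_{\varepsilon}(c)$ as the set of $\varepsilon$-good sequences for a coloring $c$ of $M$ and $\mathcal{G}_{\varepsilon}$ as $\{G_{\varepsilon}(c)\}$, with $c$ ranging over the consistently dominated colorings of $M$.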

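The following proposition is used to derive a lower bound for the number of $\varepsilon$-good sequences. It is formulated in more general terms than needed here, but we are going to use it again later. For a set $B\subseteq[s]$ and a number $x\in[s]$, the set $B+x$ is defined as the set $B$ shifted cyclically within $[s]$ by $x$, formally $B+x=\{(b+x)\bmod s:\ b\in B\}$ (identifying $[s]$ with the residues modulo $s$). The purely technical proof of the proposition is moved to Appendix A.

###### Proposition 1.

For any positive $\varepsilon<1$ and sets $A,B\subseteq[s]$, let $\alpha=|B|/s$. There exist at least

$$\frac{\varepsilon}{1-(1-\varepsilon)\alpha}\,\alpha s$$

elements $x\in[s]$ for which $|A\cap(B+x)|\geqslant(1-\varepsilon)\alpha|A|$.

For $\alpha\geqslant1/2$ we get that there exist at least

$$\frac{2\varepsilon}{1+\varepsilon}\,s/2$$

elements $x\in[s]$ for which $|A\cap(B+x)|\geqslant(1-\varepsilon)|A|/2$.

Applying the proposition iteratively, we obtain that the number of $\varepsilon$-good sequences of length $j$ is at least

$$\left(\frac{\varepsilon}{1+\varepsilon}\,s\right)^{j}.$$

(For a fixed $j$ and some $\varepsilon$-good sequence $\sigma$ of length $j$, let $A$ be the set of indices of the red columns in the matrix trimmed to the first $j$ rows and shifted according to $\sigma$, and $B$ be the set of indices of red entries of the $(j+1)$-th row.)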
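Proposition 1 can be sanity-checked by brute force on small random instances. The sketch below is an illustration only: it assumes the reading of the proposition in which $\alpha=|B|/s$ and the shifts $x$ with $|A\cap(B+x)|\geqslant(1-\varepsilon)\alpha|A|$ are counted, it represents $[s]$ as `range(s)`, and the helper names are hypothetical.

```python
import random

def shifted(B, x, s):
    """Cyclic shift of the set B within {0, ..., s-1} by x."""
    return {(b + x) % s for b in B}

def count_good_shifts(A, B, s, eps):
    """Number of shifts x with |A & (B + x)| >= (1 - eps) * alpha * |A|, where alpha = |B| / s."""
    alpha = len(B) / s
    threshold = (1 - eps) * alpha * len(A)
    return sum(1 for x in range(s) if len(A & shifted(B, x, s)) >= threshold)

def proposition_bound(B, s, eps):
    """Lower bound eps * alpha * s / (1 - (1 - eps) * alpha) stated in Proposition 1."""
    alpha = len(B) / s
    return eps * alpha * s / (1 - (1 - eps) * alpha)

rng = random.Random(0)
s, eps = 40, 0.25
for _ in range(200):
    A = set(rng.sample(range(s), rng.randint(1, s)))
    B = set(rng.sample(range(s), rng.randint(1, s)))
    # A small tolerance guards against floating-point rounding in the bound.
    assert count_good_shifts(A, B, s, eps) >= proposition_bound(B, s, eps) - 1e-9
```

A passing run is of course no proof; it only confirms that the bound is not violated on the sampled instances.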

For $j=t$ we get a lower bound for the number of $\varepsilon$-good sequences. Once we have that bound, a typical application of the probabilistic method (along the lines of the proof from [1]) allows us to prove that there exists a hitting set for $\mathcal{G}_{\varepsilon}$ of size $O(st((1+\varepsilon)/\varepsilon)^t)$ (see Appendix B). We are, however, interested in a deterministic construction.

### 3.2. Expanders for hitting sets

Linial, Luby, Saks and Zuckerman [4] worked on deterministic constructions of small hitting sets for combinatorial rectangles. In this section we summarize those of their results that are relevant for our developments. We closely follow their definitions.

A graph $G$ is an $(m,\Delta,\alpha)$-expander if it has $m$ vertices, maximum degree $\Delta$, and for any set $S$ of its vertices, the fraction of vertices of $G$ that have a neighbor in $S$ satisfies the lower bound required in [4], which depends on $|S|/m$ and $\alpha$. For a fixed graph $G$ let $W_r$ denote the set of walks in $G$ of length $r$. Let $W_{r,d}$ be the set of subsequences of elements of $W_r$ of length $d$ (not necessarily subsequences of consecutive elements). A set $R\subseteq[m]^d$ is a combinatorial rectangle if it is of the form $R_1\times\cdots\times R_d$ for some $R_1,\ldots,R_d\subseteq[m]$. The volume of a rectangle $R$, denoted by $\mathrm{vol}(R)$, is defined as $|R|/m^d$.

###### Lemma 2 ([4]).

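Let $m,d,r$ be positive integers and $R$ be a rectangle in $[m]^d$. Suppose $G$ is an $(m,\Delta,\alpha)$-expander with vertex set $[m]$. If $r\geqslant1+(4/\alpha)(d+\log(2d/\mathrm{vol}(R)))$, then $W_{r,d}$ contains a point from $R$.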

The above lemma implies that a specific set of sequences hits every combinatorial rectangle in $[m]^d$ of sufficiently large volume.

The following rough estimates for the size of $W_{r,d}$ will be sufficient for our needs. We have

$$|W_r|\leqslant m(\Delta+1)^r$$

and

$$|W_{r,d}|<2^r|W_r|\leqslant m(2(\Delta+1))^r.$$
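The two size estimates can be checked on a toy graph. The sketch below assumes that a walk of length $r$ is a sequence of $r$ vertices in which consecutive vertices are equal or adjacent; the precise definition in [4] may differ. The helper names `walks` and `subsequences` are hypothetical, and the cycle used here is not claimed to be a good expander.

```python
from itertools import combinations

def walks(adj, r):
    """All sequences of r vertices in which consecutive vertices are equal or adjacent."""
    result = [(v,) for v in adj]
    for _ in range(r - 1):
        result = [w + (u,) for w in result for u in [w[-1], *adj[w[-1]]]]
    return result

def subsequences(ws, d):
    """All length-d subsequences (not necessarily of consecutive elements) of the given walks."""
    return {tuple(w[i] for i in idx) for w in ws for idx in combinations(range(len(w)), d)}

# Toy graph: a cycle on m = 6 vertices, so the maximum degree is 2.
m, delta, r, d = 6, 2, 4, 2
adj = {v: [(v - 1) % m, (v + 1) % m] for v in range(m)}
W_r = walks(adj, r)
W_rd = subsequences(W_r, d)
assert len(W_r) <= m * (delta + 1) ** r
assert len(W_rd) < 2 ** r * len(W_r)
```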

Lemma 2 leaves some freedom in the choice of the expander graph. The authors of [4] used the construction of Margulis [5] (see also [2]), which allows us to build an expander with constant $\Delta$ and constant $\alpha>0$. A minor inconvenience is that the construction requires the number of vertices to be a perfect square. However, as observed already in [4], we can consider the rectangles of our interest as subsets of a larger space $[m']^d$, and apply the lemma in that space. For every $m$ we can choose a number $m'$ that is a perfect square and satisfies $m\leqslant m'\leqslant2m$. While that change affects the volumes of rectangles, they get smaller at most by a factor of $2^d$. For our purposes this cost is negligible.

When we are interested in rectangles of volume at least $V$, Lemma 2 instructs us to take

$$r=r(d,V)=1+(4/\alpha)(d+\log(2d/V)).$$

For some specific constant $\hat{C}$ and for all positive integers $d$ and all $V\in(0,1]$ we have

$$r(d,V)\leqslant\hat{C}(d+\log(1/V)).$$

###### Corollary 3.

There exists a constant $C$ such that, for all positive integers $m$ and $d$ and every $V\in(0,1]$, there exists a subset of $[m]^d$ of size at most

$$m\cdot2^{C(d+\log(1/V))}$$

that intersects every combinatorial rectangle in $[m]^d$ of volume at least $V$.

We apply that result to construct a small hitting set for $\mathcal{G}_{\varepsilon}$. That set is then used in the modified main construction instead of the set of all shift sequences.

### 3.3. Under false assumption

Unfortunately, for a fixed consistently dominated coloring of $M$, the set of good or $\varepsilon$-good shift sequences does not need to form a combinatorial rectangle. It is instructive to pretend for a moment that it does. We assume (falsely) in this subsection that $\mathcal{G}_{\varepsilon}$ contains only combinatorial rectangles.

By the discussion that follows Proposition 1, for every consistently dominated coloring of $M$, the set of $\varepsilon$-good shift sequences has volume at least

$$\nu=\left(\frac{\varepsilon}{2(1+\varepsilon)}\right)^{t}.$$

By Corollary 3 there exists a hitting set $S$ for all rectangles of volume at least $\nu$, of size at most $s\cdot2^{C(t+\log(1/\nu))}$. For $\varepsilon=1/t$ and $s=\lceil e(k/t)2^t\rceil$, the size of $S$ is at most $2^{O(t\log(t))}$ (assuming that $k$ is sufficiently large). Note that in the original construction all $s^t$ possible shift sequences were used. Using the set $S$ instead of $[s]^t$ and choosing $t=\Theta((k/\log(k))^{1/2})$, the total number of edges becomes

$$2^{k+O((k\log(k))^{1/2})}.$$

### 3.4. Decomposing good shift sequences

We showed in Section 3.1 that, for every consistently dominated coloring of $M$, the set of $\varepsilon$-good shift sequences is large. While, in general, it does not have the structure of a combinatorial rectangle, in some sense it can be decomposed into a small number of such. We start by altering the way that the sequences of shifts are represented. For clarity of the exposition we assume that $t$ is a power of 2.

Let $T$ be a rooted plane complete binary tree with $t$ leaves (i.e. all the internal nodes of $T$ have two children, left and right, and all the leaves are at the same distance from the root). A subtree rooted at some internal node of $T$ consists of that node and all its descendants. A node of $T$ is at level $j$ if its distance to the set of leaves is $j$. Let $L_j$ be the set of inner nodes at level $j$. Note that $|L_j|=t/2^j$; we denote that value by $d_j$. For $h=\log(t)$, the tree $T$ has $h+1$ levels with all the leaves on level 0.

We associate the leaves of $T$ with the rows of $M$ in such a way that the $i$-th leaf from the left corresponds to the $i$-th row. Inner nodes of the tree are going to be labeled by elements of $[s]$. These labels represent the relative shifts between neighboring rows of $M$. For an inner node $v$, if $x$ is the rightmost leaf of the left subtree of $v$ and $y$ is the leftmost leaf of the right subtree of $v$, then the label of $v$ describes how row $y$ is shifted with respect to row $x$.
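The correspondence between tree labels and shift sequences can be made concrete. The following is a minimal sketch under stated assumptions: rows are indexed from 0, the labels are passed level by level (level 1 first, each level listed from left to right), the first row is left unshifted, and the function name `shifts_from_labels` is hypothetical. Since every inner node separates two neighboring rows, the absolute shifts are obtained as prefix sums of the labels modulo $s$.

```python
def shifts_from_labels(levels, s):
    """Convert labels of inner nodes (grouped by level, left to right) into absolute row shifts.

    levels[j - 1][i] is the label of the i-th node on level j. It is read as the shift of the
    leftmost row of the node's right subtree relative to the rightmost row of its left subtree.
    """
    t = 2 ** len(levels)
    relative = [0] * t  # relative[y] = shift of row y with respect to row y - 1 (row 0 is fixed)
    for j, labels in enumerate(levels, start=1):
        assert len(labels) == t // 2 ** j
        for i, label in enumerate(labels):
            y = i * 2 ** j + 2 ** (j - 1)  # leftmost leaf of the right subtree
            relative[y] = label % s
    shifts, current = [], 0
    for rel in relative:
        current = (current + rel) % s
        shifts.append(current)
    return shifts

# t = 4 rows: level 1 has two nodes (between rows 0|1 and 2|3), level 2 is the root (rows 1|2).
example = shifts_from_labels([[3, 5], [2]], s=8)
```

In this encoding, changing the label of a node shifts the whole block of rows under its right subtree rigidly with respect to the block under its left subtree, while the shifts inside each block stay unchanged.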

A labeling of a subtree rooted at a node $v$ is $\varepsilon$-good if, for $n$ being the number of descendant leaves of $v$, the submatrix of the rows that correspond to these leaves, shifted according to the labels of the inner nodes of the subtree, has at least $s((1-\varepsilon)/2)^{n}$ red columns. Note that $\varepsilon$-good labellings of the whole tree correspond to $\varepsilon$-good sequences (up to a cyclic shift of the whole matrix, which is clearly redundant in the original construction).

We order the nodes of $L_j$ from left to right and represent labellings of the nodes of $L_j$ as elements of $[s]^{d_j}$. We are going to work bottom up and label inner nodes in groups consisting of the nodes of the same level. A labeling of $T$ is $\varepsilon$-good up to level $j$ if all the subtrees rooted at level at most $j$ are $\varepsilon$-good. In all the places where we use this definition, it can be assumed that the labeling is undefined for the nodes of higher levels. Suppose that $\lambda$ is a labeling of $T$ that is $\varepsilon$-good up to level $j-1$. Then, a sequence of labels $\ell\in[s]^{d_j}$ is called an $\varepsilon$-good level extension (of $\lambda$) if the labeling in which the labels of the nodes of level $j$ have been set to $\ell$ is $\varepsilon$-good up to level $j$.

###### Proposition 4.

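Suppose that a labeling $\lambda$ of $T$ is $\varepsilon$-good up to level $j-1$. Then, the set of its $\varepsilon$-good level extensions forms a combinatorial rectangle of volume at least

$$\nu_j=\left(\varepsilon\left((1-\varepsilon)/2\right)^{2^{j-1}}\right)^{d_j}.$$

###### Proof.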

Fix $j$ and suppose that a labeling $\lambda$ is $\varepsilon$-good up to level $j-1$. We want to assign labels to the nodes of $L_j$ in such a way that all the subtrees rooted at level $j$ are $\varepsilon$-good as well. Note that for any pair of distinct nodes of level $j$, the properties of the corresponding subtrees being $\varepsilon$-good are determined by disjoint sets of rows of the underlying matrix. That justifies that the set of $\varepsilon$-good level extensions forms a combinatorial rectangle.

Let $v$ be a node of $L_j$ and let $A$ and $B$ be the sets of indices of red columns in the shifted submatrices corresponding to, respectively, the left and right subtrees of $v$. By the assumptions we know that both these sets have cardinality at least

$$s\left((1-\varepsilon)/2\right)^{2^{j-1}}.$$

We need to estimate the number of $x\in[s]$ for which the set $A\cap(B+x)$ has cardinality at least

$$s\left((1-\varepsilon)/2\right)^{2^{j}}.$$

Proposition 1 gives that there exist at least

$$\varepsilon\left((1-\varepsilon)/2\right)^{2^{j-1}}s$$

such values. We obtain that the volume of the combinatorial rectangle of $\varepsilon$-good level extensions is at least

$$\left(\varepsilon\left((1-\varepsilon)/2\right)^{2^{j-1}}\right)^{d_j}.$$

By Corollary 3, there exists a set $S_j\subseteq[s]^{d_j}$ of cardinality at most

$$s\cdot2^{C(d_j+\log(1/\nu_j))}$$

that is a hitting set for the family of sets of $\varepsilon$-good level extensions of labellings that are $\varepsilon$-good up to level $j-1$. That implies the following proposition.

###### Proposition 5.

The set $HS=S_1\times\cdots\times S_h$ is a hitting set for the family of sets of $\varepsilon$-good labellings of $T$.

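It remains to estimate the size of $HS$. We have

$$|HS|\leqslant\prod_{j=1}^{h}s\cdot2^{C(d_j+\log(1/\nu_j))}=s^{\log(t)}\cdot2^{C\sum_{j=1}^{h}(d_j+\log(1/\nu_j))}$$

and

$$\sum_{j=1}^{h}\log(1/\nu_j)=\sum_{j=1}^{h}d_j\left(\log(1/\varepsilon)+2^{j-1}\log(2/(1-\varepsilon))\right).$$

Therefore, for our parametrization (i.e. $\varepsilon=1/t$ and $s=\lceil e(k/t)2^t\rceil$), and for all sufficiently large $k$ we get

$$|HS|\leqslant2^{4t\log(t)}.$$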

### 3.5. Modified main construction

Let $HS$ be the set from Proposition 5. As we already observed, labellings of $T$ correspond to shift sequences up to a cyclic shift of the whole matrix. For a labeling $\lambda$ let $\sigma(\lambda)$ be a shift sequence that is compatible with $\lambda$. Observe that if $\lambda$ is an $\varepsilon$-good labeling, then $\sigma(\lambda)$ is an $\varepsilon$-good shift sequence. Recall that we chose $s$ so that if $\sigma$ is an $\varepsilon$-good sequence for some consistently dominated coloring of $M$, then $M^{\sigma}$ has at least $k/t$ red columns. The modified main construction proceeds as follows.

For every

1. labeling $\lambda\in HS$ of the tree $T$,

2. and set of indices $I\subseteq[s]$ of size $k/t$,

add to $H_M$ an edge built from all elements of the columns of $M^{\sigma(\lambda)}$ with indices in $I$.

By Proposition 5, for every consistently dominated coloring of $M$, at least one $\varepsilon$-good labeling $\lambda\in HS$ is used in the construction. Then, for every such coloring, the matrix $M$ shifted according to $\sigma(\lambda)$ has at least $k/t$ red columns. As a consequence at least one of the edges of $H_M$ is monochromatic.

#### 3.5.1. Counting

Just like in the original construction, we have less than $2^{2t}$ choices for the subset of rows in the preliminary step. Then, in the modified main construction, we use every sequence of $HS$ with every subset of $k/t$ elements of $[s]$ to build an edge. The number of choices is smaller than

$$2^{4t\log(t)}\cdot\binom{s}{k/t}<2^{4t\log(t)}\cdot\left(\frac{es}{k/t}\right)^{k/t}.$$

Substituting the value of $s$ we obtain a value that is smaller than

$$2\cdot2^{4t\log(t)}\cdot e^{2k/t}\,2^k=2^{1+4t\log(t)+(2k/t)\log(e)+k}.$$

The bound is multiplied by $2$ to compensate for the ceiling in the definition of $s$. Taking into account the preliminary choices of rows, the total number of edges is smaller than

$$2^{2t+1+4t\log(t)+(2k/t)\log(e)+k}.$$

For $t=\Theta((k/\log(k))^{1/2})$, the total number of edges becomes $2^{k+O((k\log(k))^{1/2})}$.
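The two exponents can be compared numerically. The sketch below is an illustration only: it evaluates the upper bounds from Sections 2.3 and 3.5.1 as stated, treats $t$ as an arbitrary integer (ignoring divisibility and the power-of-2 assumption), and searches a generous window of candidates. The function names are hypothetical.

```python
import math

LOG2_E = math.log2(math.e)

def original_overhead(k, t):
    """Exponent minus k for the original construction (Section 2.3)."""
    return 2 * t + t * math.log2(k / t) + t * t + (k / t) * LOG2_E

def improved_overhead(k, t):
    """Exponent minus k for the modified construction (Section 3.5.1)."""
    return 2 * t + 1 + 4 * t * math.log2(t) + (2 * k / t) * LOG2_E

def minimum_over_t(overhead, k):
    """Smallest overhead over integers t with 1 <= t <= 4 * sqrt(k)."""
    return min(overhead(k, t) for t in range(1, 4 * math.isqrt(k) + 1))

for k in (10**4, 10**6, 10**8):
    print(k,
          round(minimum_over_t(original_overhead, k)),
          round(minimum_over_t(improved_overhead, k)))
```

Because of the constants in the exponent, the advantage of the modified bound is asymptotic: it shows up only once $k$ is large enough, and for moderate $k$ the original bound can be the smaller of the two.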

## References

• [1] Paul Erdős, On a combinatorial problem. II, Acta Mathematica Academiae Scientiarum Hungaricae 15 (1964), 445–447.
• [2] Ofer Gabber and Zvi Galil, Explicit constructions of linear-sized superconcentrators, J. Comput. System Sci. 22 (1981), no. 3, 407–420, Special issue dedicated to Michael Machtey. MR 633542
• [3] Heidi Gebauer, On the construction of 3-chromatic hypergraphs with few edges, Journal of Combinatorial Theory. Series A 120 (2013), no. 7, 1483–1490.
• [4] Nathan Linial, Michael Luby, Michael Saks, and David Zuckerman, Efficient construction of a small hitting set for combinatorial rectangles in high dimension, Combinatorica 17 (1997), no. 2, 215–234. MR 1479299
• [5] G. A. Margulis, Explicit constructions of expanders, Problemy Peredači Informacii 9 (1973), no. 4, 71–80. MR 0484767

## Appendix A Proof of Proposition 1

###### Proof.

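Let $X$ denote the size of $A\cap(B+x)$, when $x\in[s]$ is chosen uniformly at random. By the fact that $|B|=\alpha s$ and linearity of expectation we obtain

$$E(X)=\alpha|A|.$$

From the definition of $X$, we get also

$$X\leqslant|A|.$$

We can observe now that a distribution that minimizes $\Pr[X>(1-\varepsilon)\alpha|A|]$ and satisfies the above conditions is supported only by the values $(1-\varepsilon)\alpha|A|$ and $|A|$. There is only one such distribution that satisfies $E(X)=\alpha|A|$. Straightforward calculations give

$$\Pr[X>(1-\varepsilon)\alpha|A|]\geqslant\frac{\varepsilon\alpha}{1-(1-\varepsilon)\alpha}.$$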
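The calculation behind the last inequality can be spelled out as follows. This is a sketch in which the abbreviations $\theta$ and $p$ are introduced only for convenience, and $A$ is assumed to be nonempty. Write $\theta=(1-\varepsilon)\alpha|A|$ and $p=\Pr[X>\theta]$. Since $X\leqslant\theta$ outside that event and $X\leqslant|A|$ always,

$$\alpha|A|=E(X)\leqslant(1-p)\,\theta+p\,|A|,\qquad\text{so}\qquad p\geqslant\frac{\alpha|A|-\theta}{|A|-\theta}=\frac{\varepsilon\alpha}{1-(1-\varepsilon)\alpha}.$$

Multiplying this probability by $s$ gives the number of shifts claimed in Proposition 1.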

## Appendix B Small hitting sets exist

Recall that, for a fixed consistently dominated coloring of the matrix $M$ with $t$ rows, the volume of the set of $\varepsilon$-good sequences is at least

$$p=\left(\frac{\varepsilon}{1+\varepsilon}\right)^{t}.$$

The volume is exactly the probability that a uniformly random sequence is $\varepsilon$-good. Let $S$ be a set built from $m$ uniformly and independently sampled random sequences from $[s]^t$. (Since the sequences are sampled with repetitions, it may happen that $|S|<m$.) The following formula upper bounds the expected number of consistently dominated colorings of $M$ for which the set of $\varepsilon$-good sequences is not hit by $S$:

$$2^{st}\cdot(1-p)^m.$$

Therefore, whenever $2^{st}\cdot(1-p)^m<1$, some set of $m$ sequences hits all the sets of $\varepsilon$-good sequences for consistently dominated colorings. For $\varepsilon=1/t$ and the value of $s$ fixed in Section 3.1, it is sufficient to take $m$ of the order $st(t+1)^t$ to satisfy the inequality. As a consequence there exists a hitting set for $\mathcal{G}_{\varepsilon}$ of size $O(st(t+1)^t)$.
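The required number of samples can be computed directly from the inequality. The sketch below is an illustration with a hypothetical function name; it assumes the bound $p=(\varepsilon/(1+\varepsilon))^t$ and works with natural logarithms to avoid handling the huge number $2^{st}$.

```python
import math

def sufficient_sample_size(s, t, eps):
    """Smallest integer m with 2^(s*t) * (1 - p)^m < 1, where p = (eps / (1 + eps))^t."""
    p = (eps / (1 + eps)) ** t
    # Taking logarithms, the condition reads m * (-ln(1 - p)) > s * t * ln(2).
    return math.floor(s * t * math.log(2) / -math.log1p(-p)) + 1

m = sufficient_sample_size(s=8, t=2, eps=0.5)
```

Since $-\ln(1-p)\geqslant p$, the value returned is at most $st\ln(2)/p+1$, which for $\varepsilon=1/t$ is of the order $st(t+1)^t$.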