Let be the matrix that represents the adjacency matrix of the intersection bipartite graph of all subsets of size of . Thus, each row and column of is indexed by a subset in . The size of is , and if and only if the two subsets intersect.
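To make the definition concrete, the following minimal sketch builds the intersection matrix for small parameters. The names `ground_size` (size of the ground set) and `subset_size` (size of each subset) are placeholders chosen for this illustration, and listing the subsets in lexicographic order is an arbitrary choice.

```python
from itertools import combinations


def intersection_matrix(ground_size, subset_size):
    """Return (subsets, matrix) for all subsets of the given size.

    The ground set is {1, ..., ground_size}. Entry matrix[i][j] is 1
    if and only if subsets[i] and subsets[j] share at least one element.
    """
    subsets = [frozenset(c)
               for c in combinations(range(1, ground_size + 1), subset_size)]
    matrix = [[1 if s & t else 0 for t in subsets] for s in subsets]
    return subsets, matrix


# Small example: the 2-element subsets of {1, 2, 3, 4}.
subsets, A = intersection_matrix(4, 2)
for s, row in zip(subsets, A):
    print(sorted(s), row)
```

For these parameters each subset is disjoint only from its complement, so every row contains exactly one zero.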
Intersecting families of subsets have been studied extensively over the years, and some of the known results can be restated as results about families of submatrices of the matrix . For example, Pyber  proved that the maximal cross-intersecting family of subsets of is of size , and thus, the largest all-ones submatrix of is of size . Another example is a theorem of Bollobás  about cross-intersecting sets, which makes it possible to show that the largest submatrix representing a crown graph in is of size .
Here we propose to continue exploring various families of maximal submatrices of . In particular, we would like to find small submatrices of whose Boolean rank is large. The Boolean rank of a matrix of size is equal to the smallest integer , such that can be factorized as a product of two Boolean matrices, , where is a matrix of size and is a matrix of size , and all additions and multiplications are Boolean. The Boolean rank is also equal to the minimal number of monochromatic combinatorial rectangles required to cover all of the ones of , and it is equal to the minimal number of complete bi-cliques needed to cover the edges of the bipartite graph whose adjacency matrix is (see ). Lastly, the Boolean rank is also tightly related to the notion of nondeterministic communication complexity .
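The equivalence between Boolean factorizations and rectangle covers can be checked by brute force on tiny matrices. The sketch below is only an illustration: the function names are hypothetical, and the search is exponential, so it is usable only for matrices with a handful of rows.

```python
from itertools import combinations


def boolean_product(X, Y):
    """Boolean matrix product: OR of ANDs instead of sum of products."""
    inner = len(Y)
    return [[int(any(X[i][t] and Y[t][j] for t in range(inner)))
             for j in range(len(Y[0]))]
            for i in range(len(X))]


def boolean_rank(M):
    """Smallest number of all-ones rectangles covering the ones of M."""
    rows, cols = len(M), len(M[0])
    ones = {(i, j) for i in range(rows) for j in range(cols) if M[i][j]}
    if not ones:
        return 0
    # Every all-ones rectangle lies inside one of the form
    # (row set R) x (columns that are all-one on R), so these suffice.
    rectangles = set()
    for size in range(1, rows + 1):
        for R in combinations(range(rows), size):
            C = [j for j in range(cols) if all(M[i][j] for i in R)]
            if C:
                rectangles.add(frozenset((i, j) for i in R for j in C))
    rectangles = list(rectangles)
    for r in range(1, len(ones) + 1):
        for choice in combinations(rectangles, r):
            if set().union(*choice) == ones:
                return r


X = [[1, 0], [1, 1], [0, 1]]
Y = [[1, 1, 0], [0, 1, 1]]
M = boolean_product(X, Y)
print(M, boolean_rank(M))
```

Here `M` has a factorization with inner dimension two and is not a single rectangle, so the cover search returns two as well.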
The Boolean rank of was shown in  to be for any . Furthermore, it was proved in  that there exists a family of submatrices of , each of size , where and , whose Boolean rank is also , for a large range of values of . These submatrices are rather large, and a question that arises is whether there are smaller submatrices of whose Boolean rank is , or as close as possible to . We answer this question and prove that for a large enough , there are, in fact, submatrices of size of , whose Boolean rank is .
Natural candidates for small matrices with a large Boolean rank are isolation sets (or fooling sets as they are called in communication complexity). An isolation set for a Boolean matrix is a subset of entries in that are all ones of , such that no two ones in are in the same row or column of , and no two ones in are contained in an all-ones submatrix of size of . Throughout the paper we will represent an isolation set of a given matrix as a submatrix of , where the ones of the isolation set are on the main diagonal of , and is called an isolation matrix. The Boolean rank of an isolation matrix of size is equal to , and therefore the size of the maximal isolation set in a given matrix is a lower bound on the Boolean rank of that matrix (see for example [5, 1]). Hence, finding large isolation sets in partially answers the question of finding small submatrices of with a large Boolean rank.
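The definition translates directly into a pairwise test. The sketch below assumes the usual reading in which the forbidden all-ones submatrix is 2 x 2: two chosen ones at positions (i1, j1) and (i2, j2) lie in such a submatrix exactly when the two "cross" entries (i1, j2) and (i2, j1) are also ones. The function names are hypothetical.

```python
from itertools import combinations


def is_isolation_set(M, entries):
    """Check whether the (row, column) positions in `entries` form an
    isolation set (fooling set) of the 0/1 matrix M."""
    # Every chosen entry must be a one of M.
    if any(M[i][j] != 1 for i, j in entries):
        return False
    # No two chosen entries may share a row or a column.
    rows = [i for i, _ in entries]
    cols = [j for _, j in entries]
    if len(set(rows)) < len(rows) or len(set(cols)) < len(cols):
        return False
    # No two chosen ones may lie in a common 2 x 2 all-ones submatrix.
    for (i1, j1), (i2, j2) in combinations(entries, 2):
        if M[i1][j2] and M[i2][j1]:
            return False
    return True


def is_isolation_matrix(M):
    """Square matrix whose main diagonal is an isolation set."""
    return is_isolation_set(M, [(i, i) for i in range(len(M))])


identity = [[1, 0], [0, 1]]
triangular = [[1, 0, 0], [1, 1, 0], [1, 1, 1]]
all_ones = [[1, 1], [1, 1]]
print(is_isolation_matrix(identity),
      is_isolation_matrix(triangular),
      is_isolation_matrix(all_ones))
```

The identity and triangular matrices pass the test, while the all-ones matrix fails because its two diagonal ones share a 2 x 2 all-ones block.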
If , then is just the all-ones matrix, since every two subsets of size intersect, and thus, the largest isolation set is of size . Therefore, the question of finding large isolation sets in is interesting only for . The simplest form of an isolation matrix is the identity matrix, and thus, we first consider the problem of determining the size of the largest identity submatrix in , and prove the following:
Theorem 1
The largest identity submatrix in is of size , where .
Recall that the complement of is the adjacency matrix of the Kneser graph , in which the vertices are all subsets of size of , and there is an edge between two subsets if and only if . Furthermore, the complement of the identity matrix is the adjacency matrix of the crown graph of the same size. Thus, from Theorem 1, we immediately get that the largest submatrix representing a crown graph in is of size . In particular, this is also the maximal size of a clique in , which corresponds to the fact that the chromatic number of is .
Another simple isolation matrix is the triangular matrix with ones in every entry on the main diagonal and below it, and zeros elsewhere. We give an optimal construction of such a triangular matrix in , where our construction uses similar ideas to those used by Tuza , and the upper bound follows easily from a result of Frankl , who proved a skew version of a theorem of Bollobás .
Theorem 2
For any and a large enough , the maximal triangular submatrix of is of size , where .
As can be seen, the size of the maximal triangular submatrix of does not depend on (as long as is large enough). Thus, for a large enough , the maximal identity submatrix promised by Theorem 1, is a larger isolation submatrix in . But is the largest isolation matrix in ? If then , and in this case, this is, of course, the maximal isolation set. It is also not hard to verify that if , there exists an isolation set of size in , and this is the maximal isolation set in this case (see for example ). As we prove, for , there are larger isolation sets, and the submatrix is not the largest isolation matrix for these values of and . In fact, when is large enough, there exists in an isolation set of size .
Theorem 3
For any , the matrix has an isolation set of the following size:
If and , there exists an isolation set of size .
If there exists an isolation set of size .
Notice that for any fixed given , the size of the isolation set starts at when , and then grows by an additive term of two when is increased by one, until the point that . Then, we get an isolation set of maximal size . Our construction is also maximal for , and, as we prove, it is also maximal for . It is an open question whether the construction is maximal for .
If and , then the size of any isolation set in is at most .
2 The maximal identity submatrix in
In all that follows we denote the identity matrix of size by , and refer to the subsets representing a row or column of as row or column indices. Therefore, each row or column index is a subset of . We now prove Theorem 1, and show that the maximal identity submatrix of is of size , where .
First notice that there exists such a large identity submatrix in . Just take row indices of the form and column indices of the form , for . This defines an identity submatrix of of size .
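One family of index sets that yields an identity submatrix can be sketched as follows. This is an assumption of the sketch rather than necessarily the exact family intended above: all row indices share one fixed block of `subset_size - 1` elements, all column indices share a second, disjoint block of the same size, and the i-th row and the i-th column additionally share one private element. The names `ground_size` and `subset_size` are placeholders, and the family has `ground_size - 2 * subset_size + 2` members.

```python
def identity_index_sets(ground_size, subset_size):
    """Row and column index sets over {1, ..., ground_size} such that
    row i meets column j if and only if i == j."""
    count = ground_size - 2 * subset_size + 2
    row_block = set(range(1, subset_size))                    # shared by rows
    col_block = set(range(subset_size, 2 * subset_size - 1))  # shared by columns
    offset = 2 * subset_size - 2
    rows = [frozenset(row_block | {offset + i}) for i in range(1, count + 1)]
    cols = [frozenset(col_block | {offset + i}) for i in range(1, count + 1)]
    return rows, cols


rows, cols = identity_index_sets(7, 3)
submatrix = [[1 if r & c else 0 for c in cols] for r in rows]
for r, line in zip(rows, submatrix):
    print(sorted(r), line)
```

With a ground set of size 7 and subsets of size 3, the rows are {1,2,5}, {1,2,6}, {1,2,7} and the columns are {3,4,5}, {3,4,6}, {3,4,7}, which gives a 3 x 3 identity matrix.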
We next show that this is the largest identity submatrix possible in . Clearly this is true for a submatrix on the main diagonal of . Assume, by contradiction, that there exists an identity submatrix on the main diagonal of , and let be the row and column indices of , where we have that if and only if . But then we get an independent set of size in that includes . Thus, the complement of , that is, the Kneser graph , has a clique of size . This is in contradiction to the fact that the chromatic number of is (see ). In general though, the identity submatrix does not have to be on the main diagonal of , and thus, a different proof is needed.
We first need the following claim proved in  that characterizes the decompositions of the identity matrix. For completeness we include its proof.
Claim 1 ()
Let be a Boolean decomposition of the identity matrix , where is an Boolean matrix and is an Boolean matrix. Denote by the columns of and by the rows of . Then:
For each , either for some , where denotes the standard basis vector, or is the all-zeros vector, or is the all-zeros vector.
Furthermore, for each , there exists some such that .
Proof: If we write the decomposition with outer products, then , where denotes the outer product of a column vector and a row vector , i.e. it is a matrix of size .
Assume first that there exists an index for which Item 1 of the claim does not hold. But then the matrix contains a one that is not on the main diagonal of the matrix, and since the addition is the Boolean addition, the sum . Thus, Item 1 always holds for any decomposition of .
Now assume that there exists some , such that there is no for which . But then the entry on the main diagonal of will be a zero.
Claim 2
Let be a decomposition of the identity matrix , where is an Boolean matrix and is an Boolean matrix. Then the total number of ’s in both and is at most .
Proof: By Claim 1, for each , there exists some such that . Assume, without loss of generality, that for . Then the maximal number of ’s in any decomposition of occurs when for all the remaining indices, , it holds that one of or is the all-zero vector and the other is the all-one vector. Therefore, the number of ones in both and is at most .
Lemma 3
The largest identity submatrix of is , where .
Proof: Let be any identity submatrix of . Consider now the decomposition of , where and , and let , such that . Notice that is an matrix and is an matrix, and the total number of ’s in both and is exactly . But, by Claim 2, the total number of ’s in both and is at most . Thus, , and therefore as claimed.
The following bound on the largest crown graph that is a submatrix of is an immediate consequence of Lemma 3.
The largest matrix representing a crown graph that is a submatrix of the Kneser matrix , is of size , where .
3 Maximal triangular matrices in
As stated in the introduction, the following theorem of Bollobás  makes it possible to show that the largest submatrix representing a crown graph in is of size , and this result is tight. That is, there exists a simple construction of such a large submatrix in .
Theorem 5 ()
Let be pairs of sets, such that for , and if and only if . Then
This theorem has several generalizations, among them a result of Frankl , who considered the skew version of the problem and showed that the same bound holds even under the following relaxed assumptions: Let be pairs of sets, such that for , for every , and if . Then . Note that for this formulation of the problem, all entries below the main diagonal are ones, but above the main diagonal there can be either zeros or ones.
Here we consider the following special case: What is the maximal number of pairs of subsets , such that for every , and if and only if . Such a set of pairs of subsets defines a triangular submatrix of of size , for some large enough , with ones on the main diagonal and below it, and zeros elsewhere. Denote such a matrix by , and notice that is an isolation matrix.
Using the result of Frankl  stated above, it can be shown that the size of any triangular submatrix of is bounded above by , where . To verify this, simply add to any maximal triangular submatrix an additional first row and last column that are all zero (for a large enough , it is always possible to define one more row index and column index that do not intersect with any of the given row and column indices of the submatrix). Thus, we get a matrix in which the main diagonal is all-zero, and below the main diagonal all elements are one. By the result of Frankl, the size of such a matrix is at most . Hence, the size of any maximal triangular matrix is bounded above by .
We now proceed to prove Theorem 2, and show a construction of a triangular submatrix of that matches the above upper bound. The construction we describe is recursive, using an idea similar to what was done by Tuza .
Let be the maximal , such that is a submatrix of for a large enough , having row indices that are subsets of size and column indices that are subsets of size . We want to find , for . We first give the stopping conditions for the recursion for .
Claim 5
For any it holds that .
Proof: To see that , take as row indices the subsets , and as column indices the subsets .
For the upper bound, assume by contradiction that , and let the column indices be . Since the last row of the matrix is all-ones, the index of the last row intersects all column indices. Thus, it contains the subset , in contradiction to the fact that the size of the row indices is . Hence, .
Similar arguments hold for , while exchanging the row and column indices.
Now we can prove the general recursive formula for .
Lemma 6
For any it holds:
Proof: The proof is by induction on and . If or then the lemma follows directly from Claim 5. Otherwise, using the induction hypothesis, let be a triangular submatrix of size , with row indices of size and column indices of size , and let be a triangular submatrix of size , with row indices of size and column indices of size .
Assume that each row index of is disjoint from all column indices of (this is always possible for a large enough range of elements for the indices), and let be a new element that does not appear in any of the row or column indices of or of . Add to each column index of and to each row index of . Therefore, the row and column indices of these two matrices are now subsets of size and , respectively, and each column index of intersects all row indices of (as they all contain ).
Now add to one more row and column, as a last row and column, defined by the row index , and the column index , where is a subset of size and is a subset of size , and and are disjoint from all row and column indices of . Denote the resulting matrix by .
Consider the following triangular matrix defined by all row and column indices of and (after adding and the additional row and column as described above):
where is the all-ones matrix and is the all-zeros matrix. The size of is , and as stated, the row and column indices are subsets of size and , respectively. Hence, as claimed.
Before we solve this recursion, we need to recall the following definitions about recursion trees that are useful to describe the expansion of a recursive formula. A rooted tree, is a directed tree that has one node designated as the root of the tree, and all edges are directed away from the root. If is a directed edge in a directed tree, then is the child of in the tree. A leaf in the tree is a node with no edges coming out of it. A node that is not a leaf is called an internal node of the tree. A rooted tree is called a full binary tree if each node that is not a leaf has exactly two children. It is well known, and easy to prove by induction, that the number of leaves of a full binary tree is one more than the total number of internal nodes of the tree.
Now, using these definitions and the recursion given in Lemma 6, we can prove the following lower bound on . It follows immediately from this bound that as claimed.
For any and a large enough , .
Proof: If then by Claim 5 we have that , and similarly if . Therefore, assume that , and thus by Lemma 6, . The solution of this recursion is similar to the following recursion defined by Pascal’s identity:
The only difference is the stopping conditions and the fact that the recursive formula for has a plus one term. Therefore, if we want to solve the recursion for , we can expand instead the recursion for , and take into account the differences.
Let be a rooted binary labeled tree describing the expansion of the recursion , where is the label of the root of the tree, and and are the labels of the two children of . In general the children of a node labeled by will be labeled by and . The labels of the leaves of the tree will be the stopping conditions of the recursion.
A similar tree , with the same structure as , can be used to describe the expansion of using Pascal’s identity, where the labels are the binomial coefficients expanded by the recursion. Since describes the expansion of , then the sum of the labels of its leaves is exactly .
Now in order to solve the recursion for , note that for each stopping term we lose compared to the expansion of Pascal’s identity, since , whereas . Thus, we have to subtract for each leaf of the tree from the sum , for a total of ones, where is the number of leaves of .
However, the recursion for has a plus one term in each step of the recursion that is not a stopping condition, whereas the recursion of does not have such a term. Thus, we should sum these ones and add them to the total summed in the leaves. The number of such ones that we should add is equal to the number of internal nodes of , since each internal node corresponds to a recursive step. But is a full binary tree, and thus, it has internal nodes. Summarizing the above discussion, we get that:
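The counting argument can be sanity-checked numerically. The sketch below rests on assumptions that are labeled here because the explicit formulas are not reproduced: the recursion is taken to be f(a, b) = f(a-1, b) + f(a, b-1) + 1 with stopping values f(a, 1) = a and f(1, b) = b, where a and b are the sizes of the row and column indices (names chosen for this sketch). Under these assumptions each leaf falls short of the corresponding binomial coefficient by one, each internal node adds one, and the net result is the binomial coefficient minus one.

```python
from functools import lru_cache
from math import comb


@lru_cache(maxsize=None)
def triangular_size(a, b):
    """Assumed recursion: two recursive branches plus one, with
    stopping values f(a, 1) = a and f(1, b) = b."""
    if a == 1:
        return b
    if b == 1:
        return a
    return triangular_size(a - 1, b) + triangular_size(a, b - 1) + 1


@lru_cache(maxsize=None)
def tree_counts(a, b):
    """(number of leaves, number of internal nodes) of the recursion tree."""
    if a == 1 or b == 1:
        return (1, 0)
    left_leaves, left_internal = tree_counts(a - 1, b)
    right_leaves, right_internal = tree_counts(a, b - 1)
    return (left_leaves + right_leaves, left_internal + right_internal + 1)


for a in range(1, 8):
    for b in range(1, 8):
        leaves, internal = tree_counts(a, b)
        # Full binary tree: one more leaf than internal nodes.
        assert internal == leaves - 1
        # Pascal total, minus one per leaf, plus one per internal node.
        assert triangular_size(a, b) == comb(a + b, a) - leaves + internal

print(triangular_size(2, 2), triangular_size(3, 3))
```

For a = b = 2 the recursion gives 2 + 2 + 1 = 5, which equals the binomial coefficient 6 minus one.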
4 Constructions of large isolation sets for
In this section we prove Theorem 3, and give constructions of families of large isolation sets in , where for a large enough , the constructions are the best possible, as we get an isolation set of size . The proof of the theorem contains several parts, according to the range of values of compared to . We first provide a basic construction of isolation sets of size for , and then use this construction to build large isolation sets for , for , and finally for .
4.1 A construction of isolation sets of size for
We now prove that if then there exists an isolation set of size in . We first need to show that there exists an isolation matrix, not necessarily in , of a certain structure, such that each row and column of this matrix has the same number of ones.
Claim 8
For any , there exists an isolation matrix of size , such that there are ones and zeros in each column of .
Proof: Take the circulant matrix , whose first column is . It is not hard to verify that is an isolation matrix when (when the matrix is skew-symmetric). Also, each column of is a cyclic permutation of the first column, and thus, each column contains ones and zeros. See for example Figure 1.
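A circulant matrix of this kind can be generated and checked mechanically. The exact parameters are an assumption of this sketch: the matrix is taken to have odd order 2s - 1, with a first column consisting of s ones followed by s - 1 zeros, where `s` is a name introduced here for illustration.

```python
def circulant_isolation_matrix(s):
    """Circulant 0/1 matrix of order 2s - 1 whose first column is
    s ones followed by s - 1 zeros (assumed parameters)."""
    order = 2 * s - 1
    first_column = [1] * s + [0] * (s - 1)
    # Column j is the first column shifted cyclically down by j positions.
    return [[first_column[(i - j) % order] for j in range(order)]
            for i in range(order)]


def diagonal_is_isolation_set(M):
    """Diagonal ones, and no pair i != j with M[i][j] = M[j][i] = 1."""
    size = len(M)
    if any(M[i][i] != 1 for i in range(size)):
        return False
    return all(not (M[i][j] and M[j][i])
               for i in range(size) for j in range(i + 1, size))


C = circulant_isolation_matrix(3)
for row in C:
    print(row)
print(diagonal_is_isolation_set(C))
```

The check succeeds because an off-diagonal entry at cyclic distance d is one only when d is at most s - 1, while its mirror entry sits at cyclic distance 2s - 1 - d, which is then at least s.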
Lemma 9
If and , there exists an isolation set of size in .
Proof: Let be the isolation matrix described in Claim 8, with and . Let be the identity matrix of size , the all-ones matrix of size , and the all-zeros matrix of size . Finally, let and be the following matrices achieved by concatenating the above matrices as follows:
Observe that . Furthermore, since each row of and each column of are vectors of length with exactly ones, we can view them as the characteristic vectors of subsets in . Thus, is an isolation submatrix of of size as required.
4.2 A construction of large isolation sets for
Let and , where . There exists an isolation matrix in of size .
Proof: Let and . Thus, , and therefore, by Lemma 9, there exists an isolation matrix of size in , where the row and column indices of are subsets of size of .
Since , there are still elements from that were not used to construct the row and column indices of . Add to each row index of half of these elements, and to each column index the other half.
Now the row and column indices are subsets of of size , and the resulting matrix is an isolation matrix of size in , as .
4.3 A construction of large isolation sets for
Lemma 11
Let and , where . There exists an isolation matrix in of size .
Proof: If then by Lemma 9, there exists an isolation matrix of size as required, since . Otherwise, and and define . Also let be the all-zero matrix, the all-one matrix, and the isolation submatrix of , of size , as promised by Lemma 9. Finally, let be another isolation matrix of size that has the following structure:
We now show how to construct an isolation matrix of size of the following structure (the dimensions of the submatrices of are specified alongside the figure):
Since the sum of dimensions of and is , then is a matrix of size as claimed.
In what follows we show that there is a way to assign row and column indices that are all subsets of , such that we get the above structure of and . Then we can conclude that is an isolation submatrix of , since this structure of and guarantees that any two ones on the diagonal of are not in an all-ones submatrix of size .
The row and column indices of : Denote the row and column indices of by and , respectively. According to the construction described in Lemma 9, both the row and column indices of are subsets of size of defined as follows:
For : , where .
Note that the largest element in a column index of is .
Furthermore, it appears in exactly the last column indices of .
The row and column indices of : Let and denote the row and column indices of by and , where:
For : where .
where if , and otherwise, . Note that the indices are well defined as , and the maximal element in is . Furthermore, each index is a subset of size as required.
It is not hard to verify that has the structure described above, and that all row and column indices are subsets of . Therefore, is an isolation submatrix of of size as required.
Now if we consider the matrix defined by all the row and column indices and , then we get the matrix as above. To verify that has the structure claimed, note that the first row indices of , that is, , do not intersect with any of the column indices of , since the largest element in a column index of is , and the smallest element in these row indices is , as , and so .
As to the row indices , they intersect the last column indices of , whereas the column indices intersect with row indices of .
See also Figure 2 for an example.
4.4 A construction of maximal isolation sets for
Let and . There exists an isolation matrix in of size .
Proof: Let and let be an isolation matrix of size , with row and column indices that are subsets of size of as defined in the proof of Lemma 11. Now add rows and columns to with the following indices:
For , add the row indices
For , add the column indices .
The resulting matrix is an isolation matrix of size . See Figure 3 for an example.