Network coding has been attracting increased attention for almost two decades since the seminal papers [1, 16]. Multicast networks have received most of this attention. A recent survey on the foundation of multicast network coding can be found in . The multicast network-coding problem can be formulated as follows: given a network with one source which has messages, for each edge find a function of the packets received at the starting node of the edge, such that each receiver can recover all the messages from its received packets. Such an assignment of a function to each edge is called a solution for the network. Therefore, the received packets on an edge can be expressed as functions of the source messages. If these functions are linear, we obtain a linear network coding solution; otherwise we have a nonlinear solution. In linear network coding, each linear function on an edge consists of coding coefficients for each incoming packet. If the coding coefficients and the packets are scalars, it is called a scalar network coding solution. If the messages and the packets are vectors and the coding coefficients are matrices, then it is called a vector network coding solution. A network which has a solution is called a solvable network. It is well known that a multicast network with one source, messages, and receivers, is solvable if and only if the min-cut between the source and each receiver is at least .
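As a small illustration of the min-cut condition, the following Python sketch computes the max-flow (which equals the min-cut for these unit-capacity edges) from the source to every receiver of a toy three-layer network. The network size (three middle-layer nodes, each receiver attached to two of them), the node names, and the helper functions are assumptions chosen for illustration only; they are not taken from the paper.

```python
from collections import deque
from itertools import combinations


def max_flow(capacity, source, sink):
    """Edmonds-Karp max-flow on a dict-of-dicts capacity map."""
    # Residual graph: forward edges carry the capacity, reverse edges start at 0.
    residual = {}
    for u, nbrs in capacity.items():
        residual.setdefault(u, {})
        for v, c in nbrs.items():
            residual.setdefault(v, {})
            residual[u][v] = residual[u].get(v, 0) + c
            residual[v].setdefault(u, 0)
    flow = 0
    while True:
        # Breadth-first search for a shortest augmenting path.
        parent = {source: None}
        queue = deque([source])
        while queue and sink not in parent:
            u = queue.popleft()
            for v, c in residual[u].items():
                if c > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:
            return flow
        # Find the bottleneck along the path, then push that much flow.
        bottleneck = float("inf")
        v = sink
        while parent[v] is not None:
            bottleneck = min(bottleneck, residual[parent[v]][v])
            v = parent[v]
        v = sink
        while parent[v] is not None:
            u = parent[v]
            residual[u][v] -= bottleneck
            residual[v][u] += bottleneck
            v = u
        flow += bottleneck


def toy_combination_network(num_middle, per_receiver):
    """Hypothetical toy network: one source, num_middle middle nodes,
    and one receiver for every per_receiver-subset of the middle nodes."""
    capacity = {"s": {}}
    middle = [f"m{i}" for i in range(num_middle)]
    for m in middle:
        capacity["s"][m] = 1
        capacity[m] = {}
    receivers = []
    for idx, subset in enumerate(combinations(middle, per_receiver)):
        r = f"r{idx}"
        receivers.append(r)
        for m in subset:
            capacity[m][r] = 1
    return capacity, receivers


capacity, receivers = toy_combination_network(num_middle=3, per_receiver=2)
min_cuts = {r: max_flow(capacity, "s", r) for r in receivers}
print(min_cuts)
```

In this toy instance every receiver has min-cut 2, so by the condition above the network can deliver two messages to each receiver.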
The functions on the edges of the network form the network code. The coding coefficients form the network coding vectors on the edges. The vector of coding coefficients is called the local coding vector when the function on the edge is considered as a linear combination of the packets received at the starting node of the edge. When we consider the function on the edge as a linear combination of the messages, the vector of coding coefficients (for the messages) is called the global coding vector. To recover the messages, a receiver should obtain global coding vectors whose linear span has dimension . In other words, the matrix formed by these global coding vectors should be invertible. This matrix is called a transfer matrix of . The previous description constitutes the framework for scalar linear network coding. The framework for vector network coding was presented in . Each message and each packet is a vector of length and the coding coefficients are matrices. The global coding vectors, on the edges, consist of matrices of size , which together form matrices. W.l.o.g., we assume that each matrix of a global coding vector is a generator matrix of a -subspace of . To recover the messages, a receiver should have on its incoming edges, , such global coding vectors which form together an transfer matrix of rank .
The field size of the solution is an important parameter that directly influences the complexity of the calculations at the network nodes. It is known that any field size suffices for a solution. However, it is conjectured that the smallest field size allowing a solution is much smaller [12, 13]. An efficient algorithm to find such a field size and the related network code was given in . It is conjectured that the minimum alphabet size is much smaller, but this was proved only for two messages . For this purpose we distinguish between the smallest alphabet size required for each one of the three types of network coding solutions. Given a network , we define to be the smallest field size for which has a scalar linear solution. Similarly, is the smallest alphabet size (not necessarily a prime power) for which has a scalar nonlinear solution, and is the smallest value , a prime power, such that has a vector solution over . By definition, , and we define the vector gap by
Two other gaps ( and ) are defined similarly, but this paper will be mostly devoted to the vector gap.
One of the most celebrated families of networks is the family of combination networks , which were used for various topics in network coding. The combination network, where , is shown in Fig. 1. The network has three layers: the first layer consists of a single source with messages. The source transmits packets to the nodes of the middle layer. Any nodes in the middle layer are connected to a receiver, and each one of the receivers demands all the messages. It was proved in  that a solution for such a network exists if and only if a related error-correcting code exists. This network was also generalized to compare scalar and vector network coding . Its sub-networks were used to prove that finding the minimum required field size of a (linear or nonlinear) scalar network code for a certain multicast network is NP-complete .
The goal of this work is to consider two problems which are related to vector coding solutions for combination networks and their sub-networks. In Section II, we describe network coding solutions (vector, scalar, linear and nonlinear) for the combination network. In particular, we consider the combination network and the maximum number of nodes in the middle layer for such a network. This number is related to the largest length of certain MDS codes. While there exists a proof of the upper bound on such a length for linear and nonlinear codes, we are not aware of any proof based on the properties of the subspaces. These codes are also MDS array codes, which were considered in the past for storage and are very popular today as distributed-storage codes, e.g., see [8, 24] and references therein. In Section III, the vector gap is considered. Such vector gaps, which are very large, were considered in for any number of messages . The networks which were used for the proof are generalizations of the combination networks in which for each receiver there are some redundant edges on the paths between the source and the receiver. The extra edges were used to distribute the -space formed by the vector messages of length on more than edges. This enables some edges to transmit only a fraction of a one-dimensional space. However, a similar idea cannot be used for scalar linear network coding. The question whether such gaps can be obtained if there are no such redundant edges remained open. In Section III, we give a positive answer to this question and prove that there exists a vector gap in such networks, called minimal multicast networks, for any number of messages. The gap increases with the number of messages. This also proves the existence of a gap for two messages, which was left open in . The networks which will be used for this purpose are sub-networks of the combination networks.
The proof will be based on the chromatic number of the -Kneser graph and a generalized version of it, the -Kneser hypergraph, which was not defined before. The coloring problem raises an intriguing combinatorial problem which has independent intellectual merit. Several more related problems will be presented in Section IV and will be considered in the full version of this paper. The same is true for some proofs of claims in the paper.
II Vector Solution and Bound for MDS Codes
In this section, we first describe the three types of solutions for the combination network. The key result is the following theorem proved in . Let denote a code over of length with codewords and minimum Hamming distance . If this code is linear, it is denoted by .
Theorem 1. The combination network is solvable over if and only if there exists an code.
In view of Theorem 1, what are the functions on the edges of the combination network in the three types of solutions?
For the scalar nonlinear solution, an code, each coordinate in a codeword is a function of information symbols which are represented by the messages. The function for the th symbol of a codeword is the function on the link from the source to the th node in the middle layer.
For the scalar linear solution, an code is required. It has an generator matrix and the entries of its th column are the coding coefficients of the linear function on the link from the source to the th node in the middle layer.
In both cases, the nodes of the middle layer transmit their information to the related receivers. Each receiver obtains symbols from the middle-layer nodes, each one has the same global coding vector on its incoming and outgoing edges. Since the minimum Hamming distance of the code is , it follows that for each two different sets of messages, each receiver obtains a different -tuple of symbols from the middle layer nodes. Hence, it can recover the messages.
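The recovery condition for the scalar linear case can be checked mechanically: a receiver attached to a given set of middle-layer nodes can invert its transfer matrix exactly when the corresponding columns of the generator matrix are linearly independent. The sketch below tests this for every choice of columns over a prime field. The field F_5, the two-message generator matrix, and the function names are illustrative assumptions, not objects defined in the paper.

```python
from itertools import combinations


def rank_mod_p(rows, p):
    """Rank of an integer matrix over the prime field F_p (Gauss-Jordan)."""
    m = [[x % p for x in row] for row in rows]
    n_cols = len(m[0]) if m else 0
    rank = 0
    for col in range(n_cols):
        pivot = next((r for r in range(rank, len(m)) if m[r][col]), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        inv = pow(m[rank][col], -1, p)  # modular inverse, Python 3.8+
        m[rank] = [(x * inv) % p for x in m[rank]]
        for r in range(len(m)):
            if r != rank and m[r][col]:
                factor = m[r][col]
                m[r] = [(a - factor * b) % p for a, b in zip(m[r], m[rank])]
        rank += 1
    return rank


def every_k_columns_independent(G, p):
    """True if every k columns of the k x n matrix G are independent over F_p."""
    k, n = len(G), len(G[0])
    for cols in combinations(range(n), k):
        sub = [[row[c] for c in cols] for row in G]
        if rank_mod_p(sub, p) < k:
            return False
    return True


# Assumed example: columns (1, a) for a in F_5, plus the column (0, 1).
G = [[1, 1, 1, 1, 1, 0],
     [0, 1, 2, 3, 4, 1]]
# Appending the column (2, 3) = 2 * (1, 4) creates a dependent pair.
G_bad = [row + [extra] for row, extra in zip(G, [2, 3])]

print(every_k_columns_independent(G, 5))
print(every_k_columns_independent(G_bad, 5))
```

In this example the six columns of `G` are pairwise independent, while the extended matrix `G_bad` fails the check because two of its columns are proportional.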
For the vector network coding solution, the matrices of size on the edges from the source to the middle-layer nodes form together a matrix which has dimension , i.e., it represents a -subspace of . Now, to have a solution for the combination network, each subspaces, related to the edges between the source and the middle-layer nodes, span the -space defined by the messages of the source.
A fundamental combinatorial structure that underpins some of the generalized combination networks is a structure we call a -independent configuration. We use to denote all the -dimensional subspaces of a vector space , and to denote the Gaussian coefficient (where the field size is understood from context).
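For readers who want to evaluate Gaussian coefficients numerically, the following minimal sketch uses the standard product formula, which counts the k-dimensional subspaces of an n-dimensional space over a field with q elements. The function name and the parameter names n, k, q are our own choices for illustration.

```python
def gaussian_coefficient(n, k, q):
    """Number of k-dimensional subspaces of an n-dimensional space over F_q.

    Uses the product formula
        prod_{i=0}^{k-1} (q^(n-i) - 1) / (q^(i+1) - 1),
    whose value is always an integer.
    """
    if k < 0 or k > n:
        return 0
    numerator = 1
    denominator = 1
    for i in range(k):
        numerator *= q ** (n - i) - 1
        denominator *= q ** (i + 1) - 1
    return numerator // denominator


print(gaussian_coefficient(4, 2, 2))  # 2-dimensional subspaces of F_2^4
```

For instance, the call above evaluates (15 * 7) / (1 * 3) = 35.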
Definition 2. Let be a prime power, be positive integers, , and denote . A -independent configuration (IC) is a set , such that for all ,
We say is the size of the IC.
Lemma 3. Let be a -IC. If then
If the claim is immediate by considering the size of a -spread .
Assume now , and denote . Let us write , and define
where . By the definition of an IC, , where . It follows that any vector , , may be written uniquely as where and . We now define
for all . It is easily seen that .
Furthermore, for any ,
Thus, the set contains pairwise disjoint -subspaces of . Thus,
We now make the connection between ICs and a certain family of combination networks.
Lemma 4. The combination network has a vector solution over with messages of length if and only if there exists a -IC of size .
In the first direction assume that a vector solution over with messages of length exists. We note that by construction, any node in the middle layer has a subspace , with . If the terminal gets from the middle layer the subspaces , then
which implies that . Thus, is a -IC.
In the other direction, assume is a -IC. We can easily construct a vector network coding solution to the combination network. Simply send to the th middle layer node. Since is a -IC it follows that receiver has a full rank transfer matrix from which it can recover the messages.
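To make the construction in this proof concrete, the sketch below checks a small configuration over F_2: five 2-dimensional subspaces of F_2^4 forming a spread, where any two of them together span the whole space. We assume here, as the surrounding text suggests, that the IC condition asks every set of as many subspaces as there are messages to span the full message space. The particular subspaces, the bitmask encoding of vectors, and the function names are illustrative assumptions.

```python
from itertools import combinations


def rank_f2(vectors):
    """Rank over F_2 of vectors encoded as integer bitmasks."""
    pivots = {}  # leading bit position -> reduced basis vector
    for v in vectors:
        while v:
            top = v.bit_length() - 1
            if top not in pivots:
                pivots[top] = v
                break
            v ^= pivots[top]
    return len(pivots)


def is_independent_configuration(subspaces, num_messages, dim):
    """True if every num_messages of the subspaces together span F_2^dim."""
    return all(
        rank_f2([v for basis in combo for v in basis]) == dim
        for combo in combinations(subspaces, num_messages)
    )


# Bases (as 4-bit masks) of five 2-subspaces of F_2^4, built as the row
# spaces of [I | A] for A in {0, I, M, M^2} and of [0 | I], where M is the
# companion matrix of x^2 + x + 1 over F_2 (an assumed, standard example).
spread = [
    (0b1000, 0b0100),  # [I | 0]
    (0b1010, 0b0101),  # [I | I]
    (0b1001, 0b0111),  # [I | M]
    (0b1011, 0b0110),  # [I | M^2]
    (0b0010, 0b0001),  # [0 | I]
]

print(is_independent_configuration(spread, num_messages=2, dim=4))

# A sixth subspace that shares the vector 1111 with [I | I] breaks the property.
extended = spread + [(0b1100, 0b0011)]
print(is_independent_configuration(extended, num_messages=2, dim=4))
```

Sending the i-th subspace to the i-th middle-layer node, as in the proof, would then give each receiver in this toy instance a full-rank 4 x 4 transfer matrix over F_2.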
Lemmas 3 and 4 generalize an upper bound on the length of MDS codes (use in Lemma 3). The related results for (scalar) linear codes are given in . Corollary 7 [18, p. 321] asserts that for an MDS code, we have that . This result is strengthened in Theorem 11 [18, p. 326] by using a more complicated proof based on projective geometry. The theorem asserts that if and is odd then . A more complicated proof for the same result is given for nonlinear codes in [20, pp. 12-13].
Lemmas 3 and 4 can be generalized for a family of networks which generalize the combination network . Some interesting consequences implied by this generalization will be discussed in the full version of this paper.
We can use Lemma 4 to upper bound the vector gap in the combination networks. For this we will use Bertrand’s postulate (e.g., see ) that the interval contains a prime power for any integer ; and that the interval contains a prime for all large enough . This implies the following result.
. For all positive integers and , let denote the combination network. Then , and for all large enough , .
III Minimal Multicast Networks
In this section we will prove that for each number of messages , there exists a minimal multicast network for which vector network coding outperforms scalar network coding. A minimal multicast network can deliver messages from the source to the receivers, but if any edge is removed, it can deliver at most messages to at least one of the receivers. From a practical point of view, considering such minimal networks is interesting as it minimizes the used network resources. From a theoretical point of view, minimal networks can be regarded as a fair setting for a comparison between the three types of network coding solutions.
Definition 6. A multicast network is said to be minimal if every edge crosses a cut of size .
Thus, in a minimal network, the removal of any edge makes at least one cut have size strictly less than , and therefore the new network is incapable of a solution.
To achieve the goal of this section, a sub-network of the combination network, denoted by , will be used. The network has one source in the first layer, and nodes in the middle layer, each node represents a different -subspace of . From each nodes in the middle layer which represent the -subspaces for which
there are links to a unique receiver.
For the remainder of this work, let be any generator matrix for a -subspace of . Also, the splitting of a matrix is the matrices of size obtained by taking the first columns of , then the next columns, and so on.
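The splitting operation described above is simple to state in code. The following sketch cuts a matrix, given as a list of rows, into consecutive column blocks of equal width. The function name, the parameter `block_width`, and the sample matrix are illustrative assumptions.

```python
def split_columns(matrix, block_width):
    """Split a matrix (list of rows) into consecutive blocks of block_width columns."""
    n_cols = len(matrix[0])
    if n_cols % block_width != 0:
        raise ValueError("number of columns must be a multiple of block_width")
    return [
        [row[start:start + block_width] for row in matrix]
        for start in range(0, n_cols, block_width)
    ]


# Assumed example: a 2 x 4 generator matrix split into two 2 x 2 blocks.
G = [[1, 0, 0, 1],
     [0, 1, 1, 1]]
blocks = split_columns(G, 2)
print(blocks)
```

Each block then plays the role of one matrix coefficient on the corresponding edge.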
It is obvious from the definition of that for vector network coding, the minimum alphabet size for which it is solvable is . The coding coefficients on the edge from the source to the node represented by the -subspace are formed by splitting of into matrices of size . The global coding vector from a node of the middle layer to a receiver is the same one as the global coding vector (which coincides with the local coding vector) from the source to . It implies by (1) that the transfer matrix of each receiver is of full rank. It is not difficult to prove that a smaller alphabet size is impossible.
For the scalar solution we form a new hypergraph , where is the set of middle-layer vertices of . Each set of vertices from the middle layer from which there are links to a joint receiver of the third layer (i.e., (1) is satisfied) is connected in by a hyperedge. When this hypergraph is the well-known -Kneser graph . Hence, we will denote the general hypergraph by and call it the -Kneser hypergraph. (This is not to be confused with the -Kneser graph .)
A coloring of a graph is an assignment of a set of colors to the set of vertices such that for each edge , the vertices and are assigned different colors. The chromatic number of a graph , denoted , is the minimum number of colors needed to color . Before we discuss the -Kneser hypergraph we will concentrate on the -Kneser graph [6, 7] which is related to , i.e., a sub-network of a combination network with two messages.
The network has two messages, and in a scalar network coding solution on the link between the source and each node in the middle layer there is a global coding vector from . The two vectors on two distinct such edges, which transmit information to two middle-layer nodes (that represent two disjoint -subspaces of ), must be linearly independent. The set of such pairs of nodes is exactly the pairs of vertices which define edges in . Hence, each color of a vertex in will be associated with a vector of , such that two different colors will be associated with two linearly independent vectors. Since the largest set of vectors in which are pairwise linearly independent is , it follows that if the chromatic number of is then the alphabet size for the linear scalar solution is the smallest prime power greater than or equal to .
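The final step of this argument can be sketched numerically. A two-dimensional space over a field with q elements contains q + 1 pairwise linearly independent vectors, so a coloring with chi colors needs q + 1 >= chi; under that reading, the scalar field size is the smallest prime power that is at least chi - 1. The function names below and this reading of the bound are our assumptions for illustration.

```python
def is_prime_power(n):
    """True if n = p^m for a prime p and an integer m >= 1."""
    if n < 2:
        return False
    p = 2
    while p * p <= n:
        if n % p == 0:
            # p is the smallest prime factor; n must be a pure power of p.
            while n % p == 0:
                n //= p
            return n == 1
        p += 1
    return True  # no factor up to sqrt(n), so n is prime


def smallest_prime_power_at_least(n):
    """Smallest prime power q with q >= n."""
    q = max(n, 2)
    while not is_prime_power(q):
        q += 1
    return q


def scalar_field_size_for_two_messages(chromatic_number):
    """Assumed reading: need q + 1 >= chromatic_number with q a prime power."""
    return smallest_prime_power_at_least(chromatic_number - 1)


for chi in (6, 7, 11):
    print(chi, scalar_field_size_for_two_messages(chi))
```

For example, under this reading a chromatic number of 7 would force a scalar field size of 7, since 6 is not a prime power.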
Theorem 7. For a prime power and an integer , or , there exists a minimal network with two messages for which .
The scope of Theorem 7 is somewhat limited due to the restrictions on the value of . We can remove these restrictions, but severely reduce the guaranteed vector gap to merely .
. For a prime power and any integer , there exists a minimal network with two messages for which .
We will prove that . Recall that the vertex set of is , where . Assume a coloring of with colors. Let , , be the set of vertices colored with color . Then each is a -intersecting family in the language of , and an anticode of diameter in the language of . Also, the set forms a tiling (partition) of .
For messages, the vector network code for the network is exactly as in the network. The coding coefficients on the edge from the source to the middle-layer node represented by the -subspace are formed by splitting into matrices of size . For the scalar linear network code we consider the -Kneser hypergraph . Our generalization is different from other generalizations, e.g., and references therein. As for the coloring of the graph, we generalize the definition of coloring to hypergraphs as follows. The vertices are colored by a set of colors in such a way that all the vertices of each hyperedge receive distinct colors. The chromatic number of such a hypergraph , denoted , is the smallest number of colors required to color .
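The hypergraph coloring rule just defined (all vertices of a hyperedge receive distinct colors) can be checked directly, and for very small instances the chromatic number can be found by exhaustive search. The sketch below does both. The toy hypergraph is an arbitrary assumed example, not a Kneser hypergraph, and the brute-force search is only practical for a handful of vertices.

```python
from itertools import product


def is_proper_hypergraph_coloring(hyperedges, color):
    """True if, in every hyperedge, all vertices have pairwise distinct colors."""
    return all(len({color[v] for v in edge}) == len(edge) for edge in hyperedges)


def chromatic_number(vertices, hyperedges):
    """Smallest number of colors admitting a proper coloring (exhaustive search)."""
    for k in range(1, len(vertices) + 1):
        for assignment in product(range(k), repeat=len(vertices)):
            color = dict(zip(vertices, assignment))
            if is_proper_hypergraph_coloring(hyperedges, color):
                return k
    return len(vertices)


# Assumed toy hypergraph on five vertices with three hyperedges of size three.
vertices = [0, 1, 2, 3, 4]
hyperedges = [(0, 1, 2), (1, 2, 3), (0, 3, 4)]
print(chromatic_number(vertices, hyperedges))
```

In this toy example the vertices 0, 1, 2, 3 pairwise share a hyperedge, so four colors are necessary, and vertex 4 can reuse the color of vertex 1.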
Theorem 9. For any prime power and integers , ,
Let be a set of pairwise-disjoint subspaces of . Such a set is called a spread and it exists for all and . The vertices of related to the -subspaces that are in must all be colored with distinct colors, which implies the claim of the theorem.
Corollary 10. For any prime power and integers , , there exists a minimal network with messages for which .
In it was proved that for even there exists a multicast network (not minimal) for which , and for odd there exists a multicast network (not minimal) for which . Corollary 10 implies that if is fixed, a vector gap larger than for any function can be obtained. It is well known [9, 11] that a scalar linear network coding solution can be translated to a vector coding solution with vectors of length over . Corollary 10 and the vector gaps proved in imply that a translation from a vector coding solution with vectors of length over to a scalar linear solution will require an alphabet of size , with an interesting trade-off between and in .
Finally, Theorem 9 can be improved and as a consequence also Corollary 10. A lower bound on the chromatic number of is obtained by using a normal spread (also called a geometric spread) [4, 17] and the chromatic number of as given in [6, 7]. For example we have:
. For a prime power and an integer , or , there exists a minimal network with messages for which .
In the full version of the paper we will also prove that the vector-gap problem for minimal multicast networks with two messages can be reduced to sub-networks of the combination network. The proof is based on a few reductions, where the first one is similar to the one in .
The family of combination networks and their sub-networks was used to prove two results. The first one is an upper bound on the number of nodes in the middle layer for a vector network coding solution. The second one is that, for any number of messages, vector network coding outperforms scalar network coding for minimal multicast networks with respect to the field size. The first result is an MDS bound for vector spaces; its proof is based on vector spaces and is simpler than the one for nonlinear MDS codes. The second result induces an interesting question on the chromatic number of -Kneser hypergraphs.
There are a few more problems which are induced directly from our discussion.
Can the vector gap in minimal multicast networks with more than two messages be reduced to subgraphs of the combination networks?
What is the maximum vector gap for minimal multicast networks, with messages and vectors of length ? Is it the one obtained by using the chromatic number ?
Can vector gaps for multicast networks with two messages be larger than the one obtained for minimal multicast networks?
What is the largest possible vector gap as a function of and for a multicast network with messages?
A. Wachter-Zeh and M. Schwartz were supported in part by a German Israeli Project Cooperation (DIP) grant under grant no. PE2398/1-1 and KR3517/9-1.
[1] R. Ahlswede, N. Cai, S.-Y. R. Li, and R. W. Yeung, “Network information flow”, IEEE Trans. Inform. Theory, vol. 46, pp. 1204–1216, 2000.
[2] T. M. Apostol, Introduction to Analytic Number Theory. Springer-Verlag, NY, 1976.
[3] R. C. Baker, G. Harman, and J. Pintz, “The difference between consecutive primes, II”, Proceedings of the London Mathematical Society, vol. 83, pp. 532–562, 2001.
[4] A. Beutelspacher and J. Ueberberg, “A characteristic property of geometric t-spreads in finite projective spaces”, Eur. J. Comb., vol. 12, pp. 277–281, 1991.
[5] M. Blaum, J. Bruck, and A. Vardy, “MDS array codes with independent parity symbols”, IEEE Trans. Inform. Theory, vol. 42, pp. 529–542, 1996.
[6] A. Blokhuis, A. E. Brouwer, and T. Szőnyi, “On the chromatic number of q-Kneser graphs”, Designs, Codes and Crypt., vol. 65, pp. 187–197, 2012.
[7] A. Chowdhury, C. Godsil, and G. Royle, “Colouring lines in projective space”, J. Combin. Theory Ser. A, vol. 113, pp. 39–52, 2006.
[8] A. Dimakis, P. B. Godfrey, Y. Wu, M. J. Wainwright, and K. Ramchandran, “Network coding for distributed storage systems”, IEEE Trans. Inform. Theory, vol. 56, pp. 4539–4551, 2010.
[9] J. B. Ebrahimi and C. Fragouli, “Algebraic algorithms for vector network coding”, IEEE Trans. Inform. Theory, vol. 57, pp. 996–1007, 2011.
[10] T. Etzion and A. Vardy, “Error-correcting codes in projective space”, IEEE Trans. Inform. Theory, vol. 57, pp. 1165–1173, 2011.
[11] T. Etzion and A. Wachter-Zeh, “Vector network coding based on subspace codes outperforms scalar linear network coding”, IEEE Trans. Inform. Theory, vol. 64, pp. 2460–2473, 2018.
[12] C. Fragouli and E. Soljanin, “Information flow decomposition for network coding”, IEEE Trans. Inform. Theory, vol. 52, pp. 829–848, 2006.
[13] C. Fragouli and E. Soljanin, “(Secure) Linear network coding multicast”, Designs, Codes and Crypt., vol. 78, pp. 269–310, 2016.
[14] P. Frankl and R. M. Wilson, “The Erdős-Ko-Rado theorem for vector spaces”, J. Combin. Theory Ser. A, vol. 43, pp. 228–236, 1986.
[15] S. Jaggi, P. Sanders, P. A. Chou, M. Effros, S. Egner, K. Jain, and L. Tolhuizen, “Polynomial time algorithms for multicast network code construction”, IEEE Trans. Inform. Theory, vol. 51, pp. 1973–1982, 2005.
[16] S.-Y. R. Li, R. W. Yeung, and N. Cai, “Linear network coding”, IEEE Trans. Inform. Theory, vol. 49, pp. 371–381, 2003.
[17] G. Lunardon, “Normal spreads”, Geometriae Dedicata, vol. 75, pp. 245–261, 1999.
[18] F. J. MacWilliams and N. J. A. Sloane, The Theory of Error-Correcting Codes, North-Holland, 1978.
[19] F. Meunier, “Colorful subhypergraphs in Kneser hypergraph”, The Electronic Journal of Combinatorics, vol. 21, #P1.8, 2014.
[20] D. Raghavarao, Constructions and Combinatorial Problems in the Design of Experiments, John Wiley, 1971.
[21] A. Rasala Lehman and E. Lehman, “Complexity classification of network information flow problems”, Proc. 15th Annu. ACM/SIAM Symp. Disc. Alg. (SODA), pp. 142–150, New Orleans, LA, January 2004.
[22] S. Riis and R. Ahlswede, “Problems in network coding and error correcting code”, Lecture Notes in Computer Science, vol. 4123, pp. 861–897, 2006.
[23] M. Schwartz and T. Etzion, “Codes and anticodes in the Grassman graph”, J. Combin. Theory Ser. A, vol. 97, pp. 27–42, 2002.
[24] N. Silberstein, T. Etzion, and M. Schwartz, “Locality and availability of array codes constructed from subspaces”, IEEE Trans. Inform. Theory, to appear.