New Bounds on the Field Size for Maximally Recoverable Codes Instantiating Grid-like Topologies

01/21/2019 ∙ by Xiangliang Kong, et al. ∙ Zhejiang University

In recent years, the rapidly increasing amount of data created and processed through the internet has led distributed storage systems to adopt erasure-coding-based schemes. To balance data recovery under correlated failures against efficient encoding and decoding, distributed storage systems employing maximally recoverable codes have emerged. Unifying a number of topologies considered both in theory and practice, Gopalan et al. [15] initiated the study of maximally recoverable codes for grid-like topologies. In this paper, we focus on the maximally recoverable codes that instantiate grid-like topologies T_{m×n}(1,b,0). To characterize the property of codes for these topologies, we introduce the notion of pseudo-parity check matrix. Then, using the hypergraph independent set approach, we establish the first polynomial upper bound on the field size needed for achieving maximal recoverability in topologies T_{m×n}(1,b,0), when n is large enough. We further improve this general upper bound for topologies T_{4×n}(1,2,0) and T_{3×n}(1,3,0). By relating the problem to generalized Sidon sets in F_q, we also obtain non-trivial lower bounds on the field size for maximally recoverable codes that instantiate topologies T_{4×n}(1,2,0) and T_{3×n}(1,3,0).


I Introduction

With the rapidly increasing amounts of data created and processed in internet-scale companies such as Google, Facebook, and Amazon, the efficient storage of such copious amounts of data has become a fundamental and acute problem in modern computing. This has led to distributed storage systems that rely on distinct storage nodes. Modern large-scale distributed storage systems, such as data centers, used to store data in a redundant form to ensure reliability against node failures. However, this strategy entails a large storage overhead and is not adaptive to modern systems supporting the “Big Data” environment.

To ensure reliability with better storage efficiency, erasure coding schemes are employed, for example in Windows Azure [19] and in Facebook’s Hadoop cluster [33]. However, in a traditional erasure coding scheme, if one node fails, which is the most common failure scenario, recovering it may require accessing a large number of the remaining nodes. This is a time-consuming recovery process. To address this efficiency problem, a large body of work has emerged along two directions: local regeneration and local reconstruction.

The concept of local regeneration was introduced by Dimakis et al. [8]. They established a tradeoff between the repair bandwidth and the storage capacity of a node, and introduced a new family of codes, called regenerating codes, which attain this tradeoff. The concept of local reconstruction was introduced by Gopalan et al. [14], who initiated the study of Local Reconstruction Codes (LRCs). We say a certain node has locality r if it can be recovered by accessing only r other nodes, and LRCs are linear codes with all-symbol locality r. In recent years, the theory of regenerating codes and LRCs has developed rapidly. Many related works focus on bounds and constructions of optimal codes; see [45, 46, 32, 38, 41, 40, 43, 30, 39, 31, 20, 34] and the references therein.

The notion of maximal recoverability was first introduced by Chen et al. [7] for multi-protection group codes, and then extended by Gopalan et al. [13] to general settings. In [13], the authors introduced the topology of the code to specify the supports of the parity check equations, and they also obtained a general upper bound on the minimal size of the field over which maximally recoverable (MR) codes exist.

Unlike the parity check matrix, the topology of the code only specifies the number of redundant symbols and the data symbols on which the redundant ones depend. This makes the topology a crucial characterization of the structure of the code used under distributed storage settings. With the purpose of deploying longer codes in storage, Gopalan et al. [15] proposed a family of topologies called grid-like topologies, which unify a number of topologies considered both in theory and practice.

Consider an m×n matrix, each entry storing a symbol from a finite field F. Every row satisfies a given set of b parity constraints, and every column satisfies a given set of a parity constraints. In addition, there are h global parity constraints that involve all entries of the matrix. The topology of the code under these three kinds of constraints is denoted by T_{m×n}(a,b,h). In [15], the authors considered maximally recoverable codes for general grid-like topologies, and they established a super-polynomial lower bound on the field size needed for achieving maximal recoverability in any grid-like topology with a, b, h ≥ 1. They also tried to characterize correctable erasure patterns for grid-like topologies of the form T_{m×n}(a,b,0), and obtained a full combinatorial characterization for the case of T_{m×n}(1,b,0).

The general lower bound given in [15] is obtained from the case of a basic topology , where the lower bound requires field size . Recently, by relating the problem to the independence number of the Birkhoff polytope graph, Kane et al. [21] improved the lower bound to using the representation theory of the symmetric group. They also obtained an upper bound using recursive constructions.

As for other related works, Gandikota et al. [12] considered maximal recoverability for erasure patterns of bounded size. Shivakrishna et al. [35] considered the recoverability of a special kind of erasure patterns called extended erasure patterns for topologies . It is worth noting that Gopi et al. [16] recently obtained a super-linear lower bound for maximally recoverable LRCs, which can be viewed as the MR codes for topology .

In this paper, we focus on the maximally recoverable codes that instantiate topologies of the form T_{m×n}(1,b,0), which can be regarded as tensor product codes of column codes with a single parity constraint and row codes with b parity constraints. In order to describe the parity constraints globally, we introduce the notion of pseudo-parity check matrix, which can be viewed as a generalization of the parity check matrix. Based on this, using tools from extremal graph theory and additive combinatorics, we prove the following results:

  • The first polynomial upper bound on the minimal size of the field required for the existence of MR codes that instantiate topologies T_{m×n}(1,b,0):

    where ;

  • Further improved upper bounds on the field size required for MR codes instantiating topologies T_{4×n}(1,2,0) and T_{3×n}(1,3,0):

  • A polynomial lower bound on the minimal size of the field required for MR codes instantiating topologies :

    and a linear lower bound on the minimal size of the field required for MR codes instantiating topologies :

The paper is organised as follows. In Section II, we give the formal definitions of general topologies, grid-like topologies and maximal recoverability; we also include some known results for topologies T_{m×n}(1,b,0) and the tools from hypergraph independent sets. In Section III, we introduce the notion of pseudo-parity check matrix and regular irreducible erasure patterns. In Section IV, we present our proof of the general polynomial upper bound on the minimal size of the field required for the existence of MR codes that instantiate topologies T_{m×n}(1,b,0). In Section V, we improve the general upper bound for MR codes that instantiate topologies T_{4×n}(1,2,0) and T_{3×n}(1,3,0), and we also establish non-trivial lower bounds for both cases. In Section VI, we conclude our work and list some open problems.

II Preliminaries

II-A Notation

We use the following standard mathematical notations throughout this paper.

  • Let q be the power of a prime p, F_q be the finite field with q elements, F_q^n be the vector space of dimension n over F_q, and F_q^{m×n} be the collection of all m×n matrices with elements in F_q.

  • For any vector , let and For a set define , where for and .

  • [n,k,d]_q denotes a linear code of length n, dimension k and distance d over the field F_q. We will write [n,k,d] instead of [n,k,d]_q when the particular choice of the field is not important.

  • Let C be an [n,k,d] code and S ⊆ [n], |S| = k. We say that S is an information set if the restriction C|_S = F_q^k.

  • An [n,k,d] code is called Maximum Distance Separable (MDS) if d = n−k+1. In particular, an [n,k,d] code is MDS if and only if every subset of k of its coordinates is an information set. Alternatively, an [n,k,d] code is MDS if and only if it corrects any collection of n−k simultaneous erasures (see [24]).

  • Let C_1 be an [m,k_1,d_1] code and C_2 be an [n,k_2,d_2] code. The tensor product C_1 ⊗ C_2 is an [mn, k_1 k_2, d_1 d_2] code such that the codewords of C_1 ⊗ C_2 are matrices of size m×n, where each column belongs to C_1 and each row belongs to C_2. If U ⊆ [m] is an information set of C_1 and V ⊆ [n] is an information set of C_2, then U×V is an information set of C_1 ⊗ C_2 (see [24]).

  • Let I_n be the n×n identity matrix, and let 1 and 0 be the all-one and all-zero vectors, respectively.
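The information-set characterization of MDS codes in the list above can be checked by brute force on small examples. The following is a minimal sketch, assuming a prime field GF(p) and a code given by a k×n generator matrix; the helper names `det_mod_p` and `is_mds` and the two example matrices are illustrative and not taken from the paper.

```python
from itertools import combinations

def det_mod_p(matrix, p):
    """Determinant of a square matrix over GF(p), p prime, by Gaussian elimination."""
    a = [[x % p for x in row] for row in matrix]
    size = len(a)
    det = 1
    for col in range(size):
        # Find a row at or below the diagonal with a non-zero entry in this column.
        pivot = next((r for r in range(col, size) if a[r][col]), None)
        if pivot is None:
            return 0
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det  # a row swap flips the sign of the determinant
        det = det * a[col][col] % p
        inv = pow(a[col][col], -1, p)  # modular inverse (Python 3.8+)
        for r in range(col + 1, size):
            factor = a[r][col] * inv % p
            for c in range(col, size):
                a[r][c] = (a[r][c] - factor * a[col][c]) % p
    return det % p

def is_mds(generator, p):
    """True if every set of k columns of the k x n generator matrix is an information set."""
    k, n = len(generator), len(generator[0])
    return all(
        det_mod_p([[row[j] for j in cols] for row in generator], p) != 0
        for cols in combinations(range(n), k)
    )

# A Reed-Solomon style [5, 2] code over GF(7): column x is (1, x).
rs_generator = [[pow(x, i, 7) for x in [1, 2, 3, 4, 5]] for i in range(2)]
# A [4, 2] code whose first and last columns coincide, so it cannot be MDS.
not_mds = [[1, 0, 1, 1], [0, 1, 1, 0]]

print(is_mds(rs_generator, 7), is_mds(not_mds, 7))
```

The check enumerates all k-subsets of coordinates, so it is only meant for toy parameters.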

II-B Maximal recoverability for general topologies

Let be variables over the field . Consider an matrix where each is an affine function of the s over :

(1)

We refer to the matrix as a topology. Fix an assignment , where . Viewed as a parity check matrix, it defines a linear code, which is denoted by . We say that the code instantiates . A set of columns of is called potentially independent if there exists an assignment where such that the columns of indexed by are linearly independent.

Definition II.1.

[13] The code instantiating the topology is called maximally recoverable if every set of columns that is potentially independent in is linearly independent in .

Using the Sparse Zeros Lemma (see Theorem 6.13 in [23]), Gopalan et al. [13] proved the following upper bound on the size of field over which the maximally recoverable codes for any topologies exist.

Theorem II.2.

[13] Let be an arbitrary topology. If , then there exists an MR instantiation of over the field .

II-C Grid-like topologies

Unifying and generalizing a number of topologies considered both in coding theory and practice, Gopalan et al. [15] proposed the following family of topologies called grid-like topologies via dual constraints.

Definition II.3.

[15] Let be integers. Consider an array of symbols over the field . Let , , and . Let denote the topology where there are parity check equations per column, parity check equations per row, and global parity check equations that depend on all symbols. Topologies of the form are called grid-like topologies.

Furthermore, we say a collection of arrays in to be a code that instantiates the topology , if there exist , and in such that for each codeword :

1. Each column satisfies the constraints

(2)

2. Each row satisfies the constraints

(3)

3. All the symbols satisfy global constraints

(4)

Definition II.4.

An erasure pattern is a set of symbols. Pattern is correctable for the topology if there exists a code instantiating the topology where the variables can be recovered from the parity check equations (2), (3) and (4).

Clearly, constraints in (2) and (3) guarantee the local dependencies in each column and row respectively, and constraints in (4) ensure some additional recoverability. Notably, constraints (2) specify a code and constraints (3) specify a code . If , i.e., there are no extra global constraints for all symbols, then the code specified with the settings from Definition II.3 is exactly the tensor product code .

Definition II.5.

A code that instantiates the topology is Maximally Recoverable (MR) if it can correct every failure pattern that is correctable for the topology.

Maximal recoverability requires a code that instantiates the topology to have many good properties, especially the MDS property.

Proposition II.6.

[15] Let be an MR instantiation of the topology . We have

1. The dimension of is given by

(5)

Moreover,

(6)

2. Let , and , be arbitrary. Then is an MDS code. Any subset , is an information set.

3. Assume

(7)

then the code is an MDS code and the code is an MDS code. Moreover, for all , restricted to column is the code , and for all , restricted to row is the code .

Considering the topology , the MR code that instantiates this topology can be viewed as the tensor product code . Based on the MDS properties of both and , for a corresponding erasure pattern, we know that if some column has fewer than erasures or some row has fewer than erasures, we can decode it. Therefore, the erasure patterns that really matter have at least erasures in each column and at least erasures in each row.

Definition II.7.

An erasure pattern for the topology is called irreducible, if for any , and .
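The reduction to irreducible patterns can be sketched as a peeling procedure. The code below is an illustration only and rests on an assumption, since the thresholds are missing from this copy: a row holding at most b erasures is repaired by the row code, and a column holding at most a erasures is repaired by the column code, where a and b are the numbers of column and row parity constraints. The function name `irreducible_core` is hypothetical.

```python
from collections import Counter

def irreducible_core(erasures, a, b):
    """Peel an erasure pattern down to its irreducible part.

    erasures: iterable of (row, column) cells.
    Assumption: rows with at most b erasures and columns with at most a
    erasures are locally repairable, so their erasures are removed.
    """
    core = set(erasures)
    while True:
        row_count = Counter(i for i, _ in core)
        col_count = Counter(j for _, j in core)
        light_rows = {i for i, c in row_count.items() if c <= b}
        light_cols = {j for j, c in col_count.items() if c <= a}
        if not light_rows and not light_cols:
            # Every remaining row has more than b erasures and every
            # remaining column has more than a erasures (or nothing is left).
            return core
        core = {(i, j) for (i, j) in core
                if i not in light_rows and j not in light_cols}

# With a = 1 and b = 2, a full 2 x 3 block survives peeling,
# while the stray erasure at (2, 5) is repaired within its row.
block = {(i, j) for i in range(2) for j in range(3)}
print(sorted(irreducible_core(block | {(2, 5)}, a=1, b=2)))
```

An empty result means local row and column repairs alone recover the whole pattern; a non-empty result is the part that decides whether the pattern is correctable.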

These kinds of patterns were originally mentioned in [15] and also appeared in [35]. While Gopalan et al. [15] were trying to characterize the correctable erasure patterns for grid-like topologies, they considered the natural question: are irreducible patterns uncorrectable? In order to address this question, they introduced the following notion of regularity for erasure patterns.

Definition II.8.

[15] Consider the topology and an erasure pattern . We say that is regular if for all , and , we have

(8)

By reducing the regular erasure patterns to the irreducible case, the authors proved the following equivalent condition of the correctable erasure patterns for the topology .

Theorem II.9.

[15] An erasure pattern is correctable for the topology if and only if it is regular for .
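For small grids, regularity can be tested exhaustively. Inequality (8) did not survive in this copy, so the sketch below assumes the form recalled from [15]: for all row sets U with |U| = u ≥ a and column sets V with |V| = v ≥ b, the pattern E satisfies |E ∩ (U×V)| ≤ va + ub − ab. The function name `is_regular` and the example patterns are illustrative.

```python
from itertools import combinations

def is_regular(erasures, m, n, a, b):
    """Brute-force regularity test for an erasure pattern on an m x n grid.

    Assumed condition: |E ∩ (U x V)| <= v*a + u*b - a*b for every
    U ⊆ [m] with |U| = u >= a and every V ⊆ [n] with |V| = v >= b.
    """
    pattern = set(erasures)
    for u in range(a, m + 1):
        for rows in combinations(range(m), u):
            for v in range(b, n + 1):
                for cols in combinations(range(n), v):
                    hits = sum((i, j) in pattern for i in rows for j in cols)
                    if hits > v * a + u * b - a * b:
                        return False
    return True

# Topology with a = b = 1 on a 3 x 4 grid.
l_shape = {(0, 0), (0, 1), (1, 0)}
square = {(0, 0), (0, 1), (1, 0), (1, 1)}
print(is_regular(l_shape, 3, 4, 1, 1), is_regular(square, 3, 4, 1, 1))
```

The 2×2 square fails because its four erasures exceed the bound 2 + 2 − 1 = 3 on the sub-grid it occupies. The enumeration is exponential in m and n and is only intended for sanity checks.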

II-D Independent sets in hypergraphs

A hypergraph is a pair , where is a finite set and is a family of subsets of . The elements of are called vertices and the subsets in are called hyperedges. An independent set of a hypergraph is a set of vertices containing no hyperedges and the independence number of a hypergraph is the size of its largest independent set.

There are many results on the independence number of hypergraphs obtained through different methods (see [2], [3], [9], [22]). In the following section, we will apply the general lower bound derived by Kostochka et al. [22]. Before stating their theorem, we need a few definitions and notations. Let be a hypergraph with vertex set and hyperedge set . We call a -uniform hypergraph, if all the hyperedges have the same size , i.e., . For any vertex , we define the degree of to be the number of hyperedges containing , denoted by . The maximum of the degrees of all the vertices is called the maximum degree of and denoted by . The independence number of is denoted by . For a set of vertices, define the -degree of to be the number of hyperedges containing .
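The definitions above can be made concrete on a toy hypergraph. This is a minimal sketch with hypothetical helper names (`independence_number`, `max_degree`); it uses exhaustive search, which is unrelated to the probabilistic argument behind the bound of Kostochka et al. [22].

```python
from itertools import combinations

def independence_number(vertices, hyperedges):
    """Size of a largest vertex set that contains no hyperedge (exhaustive search)."""
    vertices = list(vertices)
    edges = [frozenset(e) for e in hyperedges]
    for size in range(len(vertices), -1, -1):
        for subset in combinations(vertices, size):
            chosen = set(subset)
            if not any(edge <= chosen for edge in edges):
                return size
    return 0

def max_degree(vertices, hyperedges):
    """Largest number of hyperedges that contain a single vertex."""
    return max(sum(v in edge for edge in hyperedges) for v in vertices)

# A 3-uniform hypergraph on 5 vertices with two hyperedges sharing vertex 2.
edges = [{0, 1, 2}, {2, 3, 4}]
print(independence_number(range(5), edges), max_degree(range(5), edges))
```

Here dropping vertex 2 leaves {0, 1, 3, 4}, which contains neither hyperedge, so the independence number is 4 and the maximum degree is 2.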

Theorem II.10.

[22] Fix . There exists such that if is an -graph on vertices with maximum -degree , then

(9)

where and as .

III Pseudo-parity check matrix and Regular irreducible erasure patterns

In this section, we shall introduce two important notions: pseudo-parity check matrix and regular irreducible erasure patterns, which are crucial in the proofs of both upper and lower bounds.

III-A Pseudo-parity check matrix

Let be an linear code with a parity check matrix , then we have the following well-known fact about .

Fact III.1.

[24] Assume a subset of the coordinates of are erased, then they can be recovered if and only if the parity check matrix restricted to coordinates in has full rank.

Take as the tensor product code that instantiates the topology , where and are codes specified by (2) and (3), respectively. For simplicity, for each codeword write

where for each is a codeword in and for each is a codeword in .

Denote and as the parity check matrices of and respectively, assume

Then consider the following matrix:

(10)

where

(11)

From the above construction, we can see that includes all the parity check constraints of , and it can be easily verified that for each codeword . Since the size of is , instead of the parity check matrix of , it can only be regarded as an approximation of the parity check matrix. Therefore, we call a pseudo-parity check matrix of the code .

Similar to III.1, using basic linear algebra arguments, we have the following proposition for pseudo-parity check matrix of code .

Proposition III.2.

Assume a subset of the coordinates of are erased, then they can be recovered if and only if the pseudo-parity check matrix restricted to coordinates in has full column rank.
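The rank criterion of Proposition III.2 can be tried out numerically. The sketch below is an illustration under stated assumptions: the field is a prime field GF(p), cell (i, j) of the m×n array is flattened to index i·n + j, and the pseudo-parity check matrix is built by stacking the row-code checks of every row on top of the column-code checks of every column. The exact block layout of (10) and (11) is not recoverable from this copy, so the layout and all function names here are hypothetical.

```python
def rank_mod_p(matrix, p):
    """Rank of a matrix over GF(p), p prime, by Gauss-Jordan elimination."""
    rows = [[x % p for x in row] for row in matrix]
    ncols = len(rows[0]) if rows else 0
    rank = 0
    for col in range(ncols):
        pivot = next((r for r in range(rank, len(rows)) if rows[r][col]), None)
        if pivot is None:
            continue
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        inv = pow(rows[rank][col], -1, p)
        rows[rank] = [x * inv % p for x in rows[rank]]
        for r in range(len(rows)):
            if r != rank and rows[r][col]:
                f = rows[r][col]
                rows[r] = [(x - f * y) % p for x, y in zip(rows[r], rows[rank])]
        rank += 1
    return rank

def pseudo_parity_check(h_col, h_row, m, n):
    """Stack the checks of h_row on each row and of h_col on each column."""
    checks = []
    for i in range(m):
        for h in h_row:  # each h has length n
            check = [0] * (m * n)
            for j in range(n):
                check[i * n + j] = h[j]
            checks.append(check)
    for j in range(n):
        for h in h_col:  # each h has length m
            check = [0] * (m * n)
            for i in range(m):
                check[i * n + j] = h[i]
            checks.append(check)
    return checks

def is_recoverable(erasures, h_col, h_row, m, n, p):
    """Full column rank on the erased coordinates means the erasures are recoverable."""
    H = pseudo_parity_check(h_col, h_row, m, n)
    idx = [i * n + j for (i, j) in sorted(erasures)]
    restricted = [[check[c] for c in idx] for check in H]
    return rank_mod_p(restricted, p) == len(idx)

# Single parity checks on both rows and columns of a 3 x 4 array over GF(5).
h_col, h_row = [[1, 1, 1]], [[1, 1, 1, 1]]
l_shape = {(0, 0), (0, 1), (1, 0)}
square = {(0, 0), (0, 1), (1, 0), (1, 1)}
print(is_recoverable(l_shape, h_col, h_row, 3, 4, 5),
      is_recoverable(square, h_col, h_row, 3, 4, 5))
```

In this toy instance the L-shaped pattern is recoverable, while the full 2×2 square is not: its four unknowns are tied together by only three independent checks.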

When , if is MR, from the MDS property of the code , we know that has rank 1. In particular, when considering the existence of MR codes for topologies , w.l.o.g., we can fix to be the simple parity code , i.e., . Hence, the pseudo-parity check matrix of has the form:

(12)

Remark III.3.

Let and , an -MR LRC (for specific definition, see [16]) can be viewed as an MR code for topology . Therefore, it has simpler erasure patterns compared to the tensor product cases. And instead of using the pseudo-parity check matrix, it can be verified that the parity check matrix of any -MR LRC admits the form

where for each , is a parity check matrix of an MDS code and is an matrix over corresponding to the global parities.

Compared to MR LRCs, MR codes for topologies have another difference. For an -MR LRC, the MDS codes within each local group can be different, so the corresponding parity check matrix above can admit different s. However, an MR code for topology is actually a tensor product code . Thus, for each , if we take the coordinates in as a local group, then once the code is fixed, the corresponding MDS codes within each local group are all and the corresponding parity check matrices in are all .

III-B Regular irreducible erasure patterns

Let be an erasure pattern of the topology , then it can be presented in the following form:

where stands for the erasure and stands for the non-erasure. Given two different erasure patterns and , we say that and are of the same type if can be obtained from by applying elementary row and column transformations.

For a reducible erasure pattern , there exists some or , such that the number of the erasures in or is less than or . Therefore, from the MDS properties of the code and , erasures in or can be simply repaired by using only the parities within or . Hence, the very erasure patterns that affect the MR property of the code are irreducible erasure patterns. In other words, if we can construct a code instantiating the topology that can correct all correctable irreducible erasure patterns, then this code is an MR instantiation for the topology .

Now, we focus on the irreducible erasure patterns that are correctable. Given an irreducible erasure pattern , denote as the number of s in , and . From the irreducibility of , we have

Meanwhile, from Theorem II.9, we know that for topology , an erasure pattern is correctable if and only if is regular. Thus we have

Combining the above three inequalities together, we have

(13)

for every correctable irreducible erasure pattern in . Therefore,

(14)

which indicates that once (or ) is given, the magnitude of cannot be too large.

Denote as the set of all the types of regular irreducible erasure patterns for topology , i.e., for each , one can regard as a representative of all the erasure patterns that have the same type as . Since and , for each regular irreducible erasure pattern , we have and . For convenience, we can take each type of erasure patterns in as a submatrix of an matrix with elements from . Therefore, we can obtain the following upper bound of :

(15)

IV A polynomial upper bound on the maximum field size required for MR codes

In this section, we take the prime , which is the natural setting for distributed storage, and we establish our polynomial upper bound on the minimal field size required for MR codes that instantiate the topology .

Theorem IV.1.

Let . Then for any , there exists an MR code that instantiates the topology over the field , where .

In order to do this, we will exhibit a column code and a row code over a relatively small field, so that for every correctable irreducible erasure pattern , the code can correct . Thus the tensor product code is an MR code that instantiates the topology . We also need the following lemma, known as the Combinatorial Nullstellensatz.

Lemma IV.2.

(Combinatorial Nullstellensatz) [1] Let be an arbitrary field, let be a polynomial of degree which contains a non-zero coefficient at with , and let be subsets of such that for all . Then there exist such that .
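A toy instance may help to see what the lemma guarantees. The sketch assumes the standard statement from [1]: if f has degree t_1 + ... + t_n, the coefficient of x_1^{t_1} ··· x_n^{t_n} is non-zero, and |S_i| > t_i for all i, then f does not vanish on all of S_1 × ... × S_n. The polynomial, the prime p = 5 and the helper name `nonvanishing_point` are chosen only for illustration.

```python
from itertools import product

def nonvanishing_point(f, sets, p):
    """Return the first point of S_1 x ... x S_n where f is non-zero mod p, or None."""
    for point in product(*sets):
        if f(*point) % p != 0:
            return point
    return None

p = 5

def f(x, y):
    # f(x, y) = x (x - 1) y has degree 3, and the coefficient of x^2 y^1 is 1,
    # so the lemma applies with (t_1, t_2) = (2, 1).
    return x * (x - 1) * y

S1 = [0, 1, 2]  # |S1| = 3 > t_1 = 2
S2 = [0, 1]     # |S2| = 2 > t_2 = 1
point = nonvanishing_point(f, [S1, S2], p)
print(point)

# If S1 is shrunk to size t_1 = 2, the size condition fails and f can vanish everywhere.
print(nonvanishing_point(f, [[0, 1], [0, 1]], p))
```

The lemma only asserts existence; the exhaustive search here is just a way to observe it, and the second call shows that the size condition cannot simply be dropped.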

Proof of Theorem IV.1.

Since the case is trivial, w.l.o.g., we assume . For simplicity, we fix as the simple parity code and focus on obtaining the code .

Denote as the set of all the types of regular irreducible erasure patterns for topology . Assume the parity check matrix of the code is , then the pseudo-parity check matrix is of the form in (12). Thus, our goal is to construct a matrix such that:

  1. Every distinct columns of are linearly independent.

  2. For each regular irreducible erasure pattern , the pseudo-parity check matrix of satisfies: .

Given a regular irreducible erasure pattern , w.l.o.g., assume and , then has the form

where represents the sub-erasure pattern of over the row. Thus

Let . Since , by applying elementary row and column transformations, we have

(16)

where consists of all the columns in corresponding to an in and consists of all the remaining columns in , obtained by substituting columns of with the same parts in . Thus a non-zero element of equals some in and a non-zero element of equals or for some in . For example, take , and

then

From the above simplified form of in (16), we have

By Definition II.8 and (14), we have . Thus if and only if there exists an minor in such that .

Now, take

where each is a variable over . Therefore, our goal is to find a proper valuation of these over such that the resulting matrix satisfies both requirement (i) and requirement (ii).

  • For requirement (i)

For any , let be the submatrix of formed by the columns indicated by , i.e.,

Define