I Introduction
This survey article deals with the use of erasure coding for the reliable and efficient storage of large amounts of data in settings such as that of a data center. The amount of data stored in a single data center can run into tens or hundreds of petabytes. Reliability of data storage is ensured in part by introducing redundancy in some form, ranging from simple replication to the use of more sophisticated erasure-coding schemes such as Reed-Solomon codes. Minimizing the storage overhead that comes with ensuring reliability is a key consideration in the choice of erasure-coding scheme. More recently a second problem has surfaced, namely, that of node repair.
In [1], [2] the authors study the Facebook warehouse cluster and analyze the frequency of node failures as well as the resultant network traffic relating to node repair. It was observed in [1] that a median of 50 nodes are unavailable per day and that a median of 180 TB of cross-rack traffic is generated as a result of node unavailability. It was also reported that 98.08% of the cases have exactly one block missing in a stripe. The erasure code that was deployed in this instance was an $[n=14, k=10]$ Reed-Solomon (RS) code. Here $n$ denotes the block length of the code and $k$ the dimension. The conventional repair of an $[n,k]$ RS code is inefficient in that the repair of a single node calls for contacting $k$ other (helper) nodes and downloading $k$ times the amount of data stored in the failed node. Thus there is significant practical interest in the design of erasure-coding techniques that offer low storage overhead and can also be repaired efficiently.
Coding theorists have responded to this need by coming up with two new classes of codes, namely ReGenerating (RG) and Locally Recoverable (LR) codes. The focus in an RG code is on minimizing the amount of data download needed to repair a failed node, termed the repair bandwidth, while LR codes seek to minimize the number of helper nodes contacted for node repair, termed the repair degree. In a different direction, coding theorists have also reexamined the problem of node repair in RS codes and have come up with new and more efficient repair techniques. This survey provides an overview of these recent developments. An outline of the survey itself appears in Fig. 1.
RG codes are discussed in Section II. The two principal classes of RG codes, namely Minimum Bandwidth Regenerating (MBR) and Minimum Storage Regenerating (MSR) codes, appear in the two sections that follow. These two classes of codes are at the two extreme ends of a tradeoff known as the storage-repair bandwidth (SRB) tradeoff. A discussion on codes that correspond to the interior points of this tradeoff appears in Section V. The theory of regenerating codes has been extended in several directions and these are explored in Section VI. Section VII examines LR codes. There have been several approaches to extending the theory of LR codes to handle multiple erasures and these are dealt with in Section VIII. A class of codes known as Locally ReGenerating (LRG) codes that offer both low repair bandwidth and low repair degree within a single erasure code is discussed in Section IX. This is followed by Section X, which discusses recent advances in the repair of Reed-Solomon codes. A brief description of a different approach based on capacity considerations and leading to the development of a liquid cloud storage system appears in Section XI. The final section discusses practical evaluations and implementations.
Disclaimer: This survey is presented from the perspective of the authors and is biased in this respect. Given the explosion of research activity in this area, the survey also does not claim to be comprehensive and we offer our apologies to the authors whose work has inadvertently or for lack of space, not been appropriately cited. We direct the interested reader to some of the excellent surveys of codes on distributed storage contained in the literature including [3], [4], [5] and [6].
II Regenerating Codes
Definition 1 ([7]).
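Let $\mathbb{F}_q$ denote a finite field of size $q$. Then a regenerating (RG) code $\mathcal{C}$ over $\mathbb{F}_q$ having integer parameter set $\{(n,k,d),(\alpha,\beta),B\}$ where $1 \le k \le n-1$, $k \le d \le n-1$, $\beta \le \alpha$, maps a file $\underline{u} \in \mathbb{F}_q^B$ on to a collection $\{\underline{c}_i\}_{i=1}^{n}$ of $\alpha$-tuples over $\mathbb{F}_q$ using an encoding map
$$E(\underline{u}) = [\underline{c}_1^T, \underline{c}_2^T, \ldots, \underline{c}_n^T]^T$$
with the $\alpha$ components of $\underline{c}_i$ stored on the $i$th node in such a way that the following two properties (see Fig. 2) are satisfied: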
Data Collection: The message $\underline{u}$ can be uniquely recovered from the contents of any $k$ nodes.
Node Repair: If the $f$th node storing $\underline{c}_f$ fails, then a replacement node can

contact any subset $D$ of the remaining $(n-1)$ nodes of size $|D| = d$,

map the $\alpha$ contents $\underline{c}_h$ of each helper node $h \in D$ on to a collection of $\beta$ repair symbols $\underline{a}_{h,f}^{D} \in \mathbb{F}_q^{\beta}$,

pool together the $d\beta$ repair symbols thus computed to use them to create a replacement vector $\hat{\underline{c}}_f \in \mathbb{F}_q^{\alpha}$ whose $\alpha$ components are stored in the replacement node, in such a way that the contents of the resultant nodes, with the replacement node replacing the failed node, once again form a regenerating code.
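A regenerating code is said to be an exact-repair (ER) regenerating code if the contents of the replacement node are exactly the same as those of the failed node, i.e., $\hat{\underline{c}}_f = \underline{c}_f$. Else the code is said to be a functional-repair (FR) regenerating code. A regenerating code is said to be linear if

$E(\underline{u}_1 + \theta \underline{u}_2) = E(\underline{u}_1) + \theta E(\underline{u}_2)$ for all $\underline{u}_1, \underline{u}_2 \in \mathbb{F}_q^B$, $\theta \in \mathbb{F}_q$, and

the map mapping the contents $\underline{c}_h$ of the $h$th helper node on to the corresponding $\beta$ repair symbols $\underline{a}_{h,f}^{D}$ is linear over $\mathbb{F}_q$.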
Thus a regenerating code is a code over the vector alphabet $\mathbb{F}_q^{\alpha}$ and the quantity $\alpha$ is termed the sub-packetization level of the regenerating code. The total number $d\beta$ of $\mathbb{F}_q$ symbols transferred for the repair of a failed node is called the repair bandwidth of the regenerating code. The rate of the regenerating code is given by $R = \frac{B}{n\alpha}$. Its reciprocal $\frac{n\alpha}{B}$ is the storage overhead.
II-A Cut-Set Bound
Let us assume that $\mathcal{C}$ is a functional-repair regenerating code having parameter set $\{(n,k,d),(\alpha,\beta),B\}$. Since an exact-repair regenerating code is also a functional-repair code, this subsumes the case when $\mathcal{C}$ is an exact-repair regenerating code. Over time, nodes will undergo failures and every failed node will be replaced by a replacement node. Let us assume to begin with that we are only interested in the behavior of the regenerating code over a finite-but-large number $N$ of node repairs. For simplicity, we assume that repair is carried out instantaneously. Then at any given time instant $t$, there are $n$ functioning nodes whose contents taken together comprise a regenerating code. At this time instant, a data collector could connect to $k$ nodes, download all of their contents and decode to recover the underlying message vector $\underline{u}$. Thus in all, there are at most $N\binom{n}{k}$ distinct data collectors, which are distinguished based on the particular set of $k$ nodes to which the data collector connects.
Next, we create a source node $S$ that possesses the $B$ message symbols $\{u_i\}_{i=1}^{B}$, and draw edges connecting the source to the initial set of $n$ nodes. We also draw edges between the $d$ helper nodes that assist a replacement node and the replacement node itself, as well as edges connecting each data collector with the corresponding set of $k$ nodes from which the data collector downloads data. All edges are directed in the direction of information flow. We associate a capacity $\beta$ with edges emanating from a helper node to a replacement node and an infinite capacity with all other edges. Each node can only store $\alpha$ symbols over $\mathbb{F}_q$. We take this constraint into account using a standard graph-theory construct, in which a node is replaced by $2$ nodes separated by a directed edge (leading towards a data collector) of capacity $\alpha$. We have in this way arrived at a graph (see Fig. 3) in which there is one source $S$ and at most $N\binom{n}{k}$ sinks $\{T_i\}$.
Each sink $T_i$ would like to be able to reconstruct all the $B$ source symbols from the symbols it receives. This is precisely the multicast setting of network coding. A principal result in network coding tells us that in a multicast setting, one can transmit messages along the edges of the graph in such a way that each sink $T_i$ is able to reconstruct the source data, provided that the minimum capacity of a cut separating $S$ from $T_i$ is at least $B$. A cut separating $S$ from $T_i$ is simply a partition of the nodes of the network into $2$ sets: $A_i$ containing $S$ and $A_i^c$ containing $T_i$. The capacity of the cut is the sum of capacities of the edges leading from a node in $A_i$ to a node in $A_i^c$. A careful examination of the graph will reveal that the minimum capacity of a cut separating a sink $T_i$ from the source $S$ is given by $\sum_{i=0}^{k-1} \min\{\alpha, (d-i)\beta\}$ (Fig. 3 shows an example cut separating source from sink). This leads to the following upper bound on file size [7]:
$$B \ \le \ \sum_{i=0}^{k-1} \min\{\alpha, (d-i)\beta\}. \qquad (1)$$
Network coding also tells us that when only a finite number of regenerations take place, this bound is achievable and furthermore achievable using linear network coding, i.e., using only linear operations at each node in the network when the size of the finite field is sufficiently large. In a subsequent result [8], Wu established using the specific structure of the graph, that even in the case when the number of sinks is infinite, the upper bound in (1) continues to be achievable using linear network coding.
In summary, by drawing upon network coding, we have been able to characterize the maximum file size of a regenerating code given parameters $\{k,d,\alpha,\beta\}$ for the case of functional repair when there is no constraint placed on the size $q$ of the finite field $\mathbb{F}_q$. Note, interestingly, that the upper bound on file size is independent of $n$. Quite possibly, the role played by $n$ is that of determining the smallest value of field size $q$ for which a linear network code can be found having file size $B$ satisfying (1). A functional-repair regenerating code having parameters $\{(n,k,d),(\alpha,\beta),B\}$ is said to be optimal provided (a) the file size $B$ achieves the bound in (1) with equality and (b) reducing either $\alpha$ or $\beta$ will cause the bound in (1) to be violated.
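The bound and the optimality condition above lend themselves to a quick numerical check. The following is a minimal sketch, assuming the cut-set bound of [7] in its standard form $B \le \sum_{i=0}^{k-1} \min\{\alpha, (d-i)\beta\}$; the function names `cutset_bound` and `is_optimal` are illustrative and do not come from the survey.

```python
def cutset_bound(k, d, alpha, beta):
    """Right-hand side of the cut-set bound: sum over i = 0..k-1 of min(alpha, (d - i) * beta)."""
    return sum(min(alpha, (d - i) * beta) for i in range(k))


def is_optimal(k, d, alpha, beta, B):
    """Check conditions (a) and (b): equality in the bound, and the bound is
    violated as soon as alpha or beta is reduced by one symbol."""
    if cutset_bound(k, d, alpha, beta) != B:
        return False
    return (cutset_bound(k, d, alpha - 1, beta) < B
            and cutset_bound(k, d, alpha, beta - 1) < B)


if __name__ == "__main__":
    # Note that n does not appear anywhere: the bound depends only on (k, d, alpha, beta).
    print(cutset_bound(3, 4, 4, 1))   # largest file size for k=3, d=4, alpha=4, beta=1
    print(is_optimal(3, 4, 4, 1, 9))  # alpha=4 is as small as possible for B=9, beta=1
    print(is_optimal(3, 4, 5, 1, 9))  # alpha=5 wastes storage for the same B
```

The third call illustrates condition (b): the parameter set still meets (1) with equality, but $\alpha$ can be lowered without violating the bound, so it is not optimal in the sense defined above.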
II-B Storage-Repair Bandwidth Tradeoff
We have thus far specified code parameters $(k,d,\alpha,\beta)$ and asked what is the largest possible value of file size $B$. If however, we fix parameters $(B,k,d)$ and ask instead what are the smallest values of $(\alpha,\beta)$ for which one can hope to achieve (1), it turns out, as might be evident from the form of the summands on the RHS of (1), that there are several pairs $(\alpha,\beta)$ for which equality holds in (1). In other words, there are different flavors of optimality.
For a given file size $B$, the storage overhead and normalized repair bandwidth are given respectively by $\frac{n\alpha}{B}$ and $\frac{d\beta}{B}$. Thus $\alpha$ reflects the amount of storage overhead while $\beta$ determines the normalized repair bandwidth. The several pairs $(\alpha,\beta)$ for which equality holds in (1) represent a tradeoff between storage overhead on the one hand and normalized repair bandwidth on the other, as can be seen from the example plot in Fig. 4. Clearly, the smallest value of $\alpha$ for which equality can hold in (1) is given by $\alpha = \frac{B}{k}$. Given $\alpha = \frac{B}{k}$, the smallest permissible value of $\beta$ is given by $\beta = \frac{\alpha}{d-k+1}$. This represents the minimum storage regeneration point, and codes achieving (1) with $\alpha = \frac{B}{k}$ and $\beta = \frac{\alpha}{d-k+1}$ are known as minimum storage regenerating (MSR) codes. At the other end of the tradeoff, we have the minimum bandwidth regenerating (MBR) code whose associated $(\alpha,\beta)$ values are given by $\beta = \frac{B}{dk - \binom{k}{2}}$, $\alpha = d\beta$.
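As a worked illustration of the two extreme points, the sketch below computes the MSR and MBR values of $(\alpha, \beta)$ for a given $(B, k, d)$ and confirms that both meet the cut-set bound with equality. It assumes the standard expressions $\alpha = B/k$, $\beta = \alpha/(d-k+1)$ at the MSR point and $\beta = B/(dk - \binom{k}{2})$, $\alpha = d\beta$ at the MBR point; the helper names are hypothetical.

```python
from fractions import Fraction
from math import comb


def cutset_bound(k, d, alpha, beta):
    """Right-hand side of the cut-set bound (1)."""
    return sum(min(alpha, (d - i) * beta) for i in range(k))


def msr_point(B, k, d):
    """(alpha, beta) at the minimum-storage end of the tradeoff."""
    alpha = Fraction(B, k)
    return alpha, alpha / (d - k + 1)


def mbr_point(B, k, d):
    """(alpha, beta) at the minimum-bandwidth end of the tradeoff."""
    beta = Fraction(B, k * d - comb(k, 2))
    return d * beta, beta


if __name__ == "__main__":
    B, n, k, d = 18, 5, 3, 4          # toy parameters chosen for illustration
    for name, (alpha, beta) in (("MSR", msr_point(B, k, d)),
                                ("MBR", mbr_point(B, k, d))):
        assert cutset_bound(k, d, alpha, beta) == B
        print(name, "alpha =", alpha, "beta =", beta,
              "storage overhead =", n * alpha / B,
              "repair bandwidth =", d * beta)
```

Exact rational arithmetic is used so that equality in (1) can be tested without rounding; for file sizes that do not divide evenly, $\alpha$ and $\beta$ come out fractional, which simply signals that $B$ must be scaled to obtain integer parameters.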
Remark 1.
Since a regenerating code can tolerate $(n-k)$ erasures by the data collection property, it follows that the minimum Hamming distance $d_{\min}$ of a regenerating code must satisfy $d_{\min} \ge (n-k+1)$. By the Singleton bound, the largest size $M$ of a code of block length $n$ and minimum distance $d_{\min}$ is given by $M \le Q^{\,n-d_{\min}+1} \le Q^{k}$, where $Q$ is the size of the alphabet of the code. Since $Q = q^{\alpha}$ in the case of a regenerating code, it follows that the size $M$ of a regenerating code must satisfy $M \le q^{k\alpha}$, or equivalently $q^{B} \le q^{k\alpha}$, i.e., $B \le k\alpha$. But $B = k\alpha$ in the case of an MSR code and it follows that an MSR code is an MDS code over a vector alphabet. Such codes also go by the name MDS array code.
From a practical perspective, exact-repair regenerating codes are easier to implement as the contents of the nodes in operation do not change with time. Partly for this reason and partly for reasons of tractability, with few exceptions, most constructions of regenerating codes belong to the class of exact-repair regenerating codes. Examples of functional-repair regenerating codes include the construction in [9] as well as the construction in [10].
Early constructions of regenerating codes focused on the two extreme points of the storage-repair bandwidth (SRB) tradeoff, namely the MSR and MBR points. The various constructions of MBR and MSR codes are described in Sections III and IV. Not surprisingly, given the vast amount of data stored, the storage industry places a premium on low storage overhead. In this connection, we note that the maximum rate of an MBR code is given by:
$$R_{\text{MBR}} \ = \ \frac{B}{n\alpha} \ = \ \frac{\left(dk - \binom{k}{2}\right)\beta}{n d \beta} \ = \ \frac{dk - \binom{k}{2}}{nd},$$
which can be shown to be upper bounded by $\frac{1}{2}$ and is achieved when $k = d = n-1$. In the case of MSR codes, there is no such limitation and MSR codes can have rates approaching $1$.
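The rate limitation on MBR codes can be confirmed by brute force for small block lengths. The sketch below assumes the MBR rate expression $R = (dk - \binom{k}{2})/(nd)$ and the parameter ranges $1 \le k \le d \le n-1$ from Definition 1; the function name `mbr_rate` and the search range are choices made for this example only.

```python
from fractions import Fraction
from math import comb


def mbr_rate(n, k, d):
    """Rate B / (n * alpha) of an MBR code, using B = (dk - C(k,2)) * beta and alpha = d * beta."""
    return Fraction(k * d - comb(k, 2), n * d)


def rates_up_to(n_max):
    """Yield (rate, n, k, d) for every admissible parameter triple with n <= n_max."""
    for n in range(2, n_max + 1):
        for d in range(1, n):
            for k in range(1, d + 1):
                yield mbr_rate(n, k, d), n, k, d


if __name__ == "__main__":
    half = Fraction(1, 2)
    for rate, n, k, d in rates_up_to(20):
        assert rate <= half                             # never exceeds 1/2
        assert (rate == half) == (k == d == n - 1)      # equality only when k = d = n - 1
    print("checked all (n, k, d) with n <= 20")
```

An exhaustive check over a finite range is of course not a proof, but it matches the closed-form argument: the numerator $k(2d-k+1)/2$ is increasing in $k$ for $k \le d$, so the rate is at most $(d+1)/(2n) \le 1/2$.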
An RG code is said to be a Help-By-Transfer (HBT) RG code if repair of a failed node can be accomplished without incurring any computation at a helper node. If no computation is required at either the helper nodes or the replacement node, then the code is termed a Repair-By-Transfer (RBT) RG code. Clearly, an RBT RG code is also an HBT RG code.
III MBR Codes
Remark III.1.
If the $B$ message symbols are drawn randomly with uniform distribution from $\mathbb{F}_q$, it can be shown that in any regenerating code achieving the cut-set bound, the contents of each node correspond to a random variable that is uniform over $\mathbb{F}_q^{\alpha}$. In an MBR code, repair is accomplished by downloading a total of just $d\beta = \alpha$ symbols which, clearly, is the minimum possible.
Remark III.2.
Let $\mathcal{C}$ be an MBR code. If $\mathcal{C}$ has the RBT property, it trivially follows that all scalar code symbols of $\mathcal{C}$ are replicated at least twice. In [11], it is shown that for an MBR code it is not possible to have even a single scalar code symbol replicated more than twice. Thus the RBT property implies that the collection of $n\alpha$ scalar code symbols associated with a codeword represents a set of $\frac{n\alpha}{2}$ distinct code symbols, each repeated twice. The converse is not true in general. However when , it can be shown that the two properties are equivalent.
Remark III.3.
In [12], it is shown that for $d < n-1$, it is not possible to construct an MBR code that has the HBT property.
III-A Polygonal MBR Codes
In the following, we describe with the help of an example, one of the first explicit families of MBR codes [13]. We term these codes polygonal MBR codes. The construction holds for parameters $d = n-1$, $\beta = 1$ and the constructed MBR codes possess the RBT property.
Example 1.
Consider the parameters $(n=5, k=3, d=4)$ and $(\alpha=4, \beta=1)$. Thus $B = kd - \binom{k}{2} = 9$. First construct a complete graph with $n=5$ vertices and $\binom{5}{2} = 10$ edges. The nine message symbols are then encoded using a $[10,9]$ MDS code to produce ten code symbols. Each code symbol is then uniquely assigned an edge. Each node of the MBR code stores the code symbols corresponding to the edges incident on that node (see Fig. 5). The data collection property follows as any collection of $k=3$ nodes yields nine distinct (MDS) code symbols. If a node fails, the replacement node can download from each of the remaining four nodes the code symbol corresponding to the edge it shares with the failed node. Hence repair is accomplished by merely transferring the data without any computation (RBT).
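The example can be traced in a few lines of code. The sketch below is illustrative only: it assumes the $[10,9]$ MDS code is the binary single-parity-check code (one possible choice; the example above does not fix the MDS code or the field), and the function names are hypothetical.

```python
from itertools import combinations

N, K = 5, 3
EDGES = list(combinations(range(N), 2))   # the 10 edges of the complete graph on 5 vertices


def encode(message):
    """Encode 9 bits with a [10, 9] single-parity-check code (MDS over GF(2)),
    assigning one code symbol to each edge."""
    assert len(message) == 9
    codeword = list(message) + [sum(message) % 2]
    return dict(zip(EDGES, codeword))


def node_contents(edge_symbols, v):
    """Node v stores the alpha = 4 symbols on the edges incident on v."""
    return {e: s for e, s in edge_symbols.items() if v in e}


def repair(nodes, failed):
    """Repair-by-transfer: each surviving node forwards, unmodified, the symbol
    on the edge it shares with the failed node (beta = 1 symbol per helper)."""
    rebuilt = {}
    for h in nodes:
        if h != failed:
            edge = tuple(sorted((h, failed)))
            rebuilt[edge] = nodes[h][edge]
    return rebuilt


def collect(nodes, chosen):
    """Data collection from any k = 3 nodes: 9 of the 10 symbols are seen
    directly and the missing one follows from the parity constraint."""
    known = {}
    for v in chosen:
        known.update(nodes[v])
    missing = [e for e in EDGES if e not in known]
    assert len(missing) == 1
    known[missing[0]] = sum(known.values()) % 2
    return [known[e] for e in EDGES[:9]]


if __name__ == "__main__":
    message = [1, 0, 1, 1, 0, 0, 1, 0, 1]
    symbols = encode(message)
    nodes = {v: node_contents(symbols, v) for v in range(N)}
    assert all(repair(nodes, f) == nodes[f] for f in range(N))
    assert all(collect(nodes, c) == message for c in combinations(range(N), K))
    print("repair and data collection succeed for every choice of nodes")
```

Each code symbol appears in exactly two nodes (the endpoints of its edge), which is the double-replication structure that makes computation-free repair possible.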
Remark III.4.
For the general construction, in order to construct an $\{(n,k,d=n-1),(\alpha=n-1,\beta=1)\}$ MBR code, one first forms the complete graph on $n$ vertices. Each edge is then mapped to a code symbol of an $[\binom{n}{2}, B]$ MDS code, where $B = kd - \binom{k}{2}$ is the file size parameter. An $O(n^2)$ field-size requirement is thus imposed by the underlying scalar MDS code.
III-B Product-Matrix (PM) MBR Codes
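A second, general construction for MBR codes is the PM construction [14], which derives its name from the fact that the contents of nodes can be expressed in the form of a product of two matrices. The two matrices are respectively an encoding matrix and a second, message matrix containing the message symbols. This construction yields MBR codes for all feasible parameters $(n,k,d)$, $\beta=1$, with an $O(n)$ field-size requirement. The encoding matrix is of the form $\Psi = [\Phi \ \ \Delta]$, where $\Phi$, $\Delta$ are $(n \times k)$, $(n \times (d-k))$ matrices respectively. Let the $i$th row of $\Psi$ be denoted by $\underline{\psi}_i^T$. The submatrices $\Phi$ and $\Delta$ are here chosen such that any $d$ rows of $\Psi$ and any $k$ rows of $\Phi$ are linearly independent. The symmetric $(d \times d)$ message matrix $M$ is derived from the $B = kd - \binom{k}{2}$ message symbols as follows:
$$M = \begin{bmatrix} S & T \\ T^T & 0 \end{bmatrix},$$
where $S$ is a symmetric $(k \times k)$ matrix containing $\binom{k+1}{2}$ message symbols and $T$ is a $(k \times (d-k))$ matrix containing the remaining $k(d-k)$ message symbols.
The $i$th node, under the PM-MBR construction, stores the matrix product $\underline{\psi}_i^T M$. The repair data passed on by helper node $h$ to replacement node $f$ is given by $\underline{\psi}_h^T M \underline{\psi}_f$.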
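A small numerical sketch may help make the repair step concrete. It assumes a prime field $\mathbb{F}_7$, a Vandermonde encoding matrix (one admissible choice satisfying the two independence conditions, not necessarily the one used in [14]), and the message-matrix layout with a symmetric block, a rectangular block and a zero block; all names below are illustrative.

```python
from itertools import combinations

P = 7                        # prime field size (assumption for this toy example)
N, K, D = 6, 3, 4            # (n, k, d) with beta = 1, alpha = d, B = kd - C(k,2) = 9
XS = list(range(1, N + 1))   # distinct evaluation points, one per node


def psi(x):
    """Row of the Vandermonde encoding matrix for the node with evaluation point x."""
    return [pow(x, j, P) for j in range(D)]


def dot(a, b):
    return sum(u * v for u, v in zip(a, b)) % P


def solve_mod(A, b, p=P):
    """Solve the square system A y = b over GF(p) by Gauss-Jordan elimination."""
    m = len(A)
    M = [[v % p for v in row] + [bi % p] for row, bi in zip(A, b)]
    for c in range(m):
        piv = next(i for i in range(c, m) if M[i][c])
        M[c], M[piv] = M[piv], M[c]
        inv = pow(M[c][c], -1, p)
        M[c] = [v * inv % p for v in M[c]]
        for i in range(m):
            if i != c and M[i][c]:
                f = M[i][c]
                M[i] = [(vi - f * vc) % p for vi, vc in zip(M[i], M[c])]
    return [M[i][m] for i in range(m)]


def message_matrix(u):
    """Symmetric d x d matrix [[S, T], [T^T, 0]] filled with the 9 message symbols."""
    M = [[0] * D for _ in range(D)]
    it = iter(u)
    for i in range(K):
        for j in range(i, D):
            M[i][j] = M[j][i] = next(it)
    return M


def store(M):
    """Node with point x stores the row vector psi(x)^T M (alpha = d symbols)."""
    return {x: [dot(psi(x), [row[c] for row in M]) for c in range(D)] for x in XS}


def repair(stored, f, helpers):
    """Each helper h sends the single symbol psi_h^T M psi_f; the replacement
    node inverts the helpers' encoding rows to get M psi_f = (psi_f^T M)^T."""
    received = [dot(stored[h], psi(f)) for h in helpers]
    return solve_mod([psi(h) for h in helpers], received)


if __name__ == "__main__":
    stored = store(message_matrix([1, 2, 3, 4, 5, 6, 0, 1, 2]))
    for f in XS:
        for helpers in combinations([x for x in XS if x != f], D):
            assert repair(stored, f, helpers) == stored[f]
    print("every node can be repaired from any d helpers")
```

The symmetry of $M$ is what allows the replacement node to read off its own contents from $M\underline{\psi}_f$ without any further computation beyond the matrix inversion.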
III-C Other Work
In [15], the authors introduce a family of RBT MBR codes for , that are constructed based on a congruent transformation applied to a skew-symmetric matrix of message symbols. In comparison with the $O(n^2)$ field requirement of polygonal MBR codes, in this construction, a field-size of suffices. In [16], the authors stay within the PM framework, but provide a different set of encoding matrices for MSR and MBR codes that have least-possible update complexity within the PM framework. The authors of [16] also analyze the codes for their ability to correct errors and provide corresponding decoding algorithms. The paper [12] proves the non-existence of HBT MBR codes with . The paper also provides PM-based constructions for two relaxations, namely (i) any failed node which is a part of a collection of systematic nodes can be recovered in HBT fashion from any $d$ other nodes and (ii) for every failed node, there exists a corresponding set of $d$ helper nodes which permit HBT repair. The paper [11] provides binary MBR constructions for the parameters , and studies the existence of MBR codes with inherent double replication, for all parameters. In [17], the authors provide regenerating-code constructions that asymptotically achieve the MSR or MBR point as increases and these codes can be constructed over any field, provided the file size is large enough. In [18], the authors introduce some extensions to the classical MBR framework by permitting the presence of a certain number of error-prone nodes during repair/reconstruction and by introducing flexibility in choosing the parameter $d$ during node repair.
Open Problems 1.
Determine the smallest possible field size $q$ of an MBR code for given $(n,k,d)$.
IV MSR Codes
Among the class of RG codes, MSR codes have received the greatest attention, for reasons that include the following: (a) the storage overhead of an MSR code can be made as small as desired, (b) MSR codes are MDS codes and (c) MSR codes have been challenging to construct.
IV-A Introduction
As noted previously, an MSR code with parameters $\{(n,k,d),(\alpha,\beta),B\}$ has file size $B = k\alpha$ and $\beta = \frac{\alpha}{d-k+1}$. Although MSR codes are vector MDS codes that have optimum repair bandwidth of $d\beta$ for the repair of any node among the $n$ nodes, there are papers in the literature that refer to a code as an MSR code even if optimal repair holds only for the systematic nodes. In the current paper, we refer to such codes as systematic MSR codes. While only $\beta$ symbols are sent by each of the $d$ helper nodes, the number of symbols accessed by the helper node in order to generate these $\beta$ symbols could be larger than $\beta$. The class of MSR codes that access at each helper node only as many symbols as are transferred are termed optimal-access MSR codes. MSR codes that alter a minimum number of parity symbols while updating a single, systematic symbol are called update-optimal MSR codes.
There are several exact-repair (ER) MSR constructions available in the literature. In [9], Shah et al. show that interference alignment (IA) is necessarily present in every exact-repair MSR code, and use IA techniques to construct systematic MSR codes, known as MISER codes, for $d = n-1 \ge 2k-1$. The IA condition in the context of MSR codes (observed earlier in [19]) demands that the interference components in the data passed by helper nodes must be aligned so that they can be cancelled at the replacement node by data received from the systematic helper nodes. In [20], Suh et al. build on [9] to construct MSR codes for with optimal repair bandwidth for all nodes, under the condition that the helper-node set necessarily includes systematic nodes. In [21], the well-known Product Matrix (PM) framework is introduced to provide MSR constructions for $d \ge 2k-2$, thereby settling the problem of MSR code construction in the low-rate regime, $\frac{k}{n} \le \frac{1}{2}$. While the method adopted in [21] to provide a construction for $d > 2k-2$ is to suitably shorten a code for $d = 2k-2$, an extension of the PM framework that yields constructions for any $d \ge 2k-2$ in a single step is provided in [22]. Apart from a few notable constructions such as the Hadamard-design-based code [23] for and its generalization for for systematic node repair, the problem of high-rate constructions (i.e., $\frac{k}{n} > \frac{1}{2}$) for all-node repair remained open. The first major result in this direction is due to Cadambe et al. [24], where the authors apply the notion of symbol extension in interference alignment, in which multiple symbols are grouped together to form a single vector symbol, to jointly achieve interference alignment. The symbol-extension viewpoint is then used to show that ER MSR codes exist for all $(n,k,d)$, as $\alpha$ goes to infinity. The second major development was the zigzag code construction [25, 26], the first non-asymptotic high-rate MSR code construction with $d = n-1$, permitting rates as close to $1$ as desired, with additional desirable properties such as optimal access and optimal update.
Zigzag codes however, require a sub-packetization level ($\alpha$) that grows exponentially with $k$ and a very large finite field size, while the earlier PM codes for the low-rate regime, have and field-size that is linear in $n$. In a subsequent work [27], the authors present a systematic MSR construction having and rate . A second systematic MSR code with is presented in [28]. A lower bound on the sub-packetization level of a general MSR code is derived in [29]. The same paper shows that in the case of an optimal-access MSR code. An improved lower bound for general MSR codes
(2) 
appears in [30]. These developments made it clear that the ultimate goal in MSR code construction was to construct a high-rate MSR code that simultaneously had low sub-packetization level $\alpha$, low field-size $q$, arbitrary repair degree $d$ and the optimal-access property.
In [31], a parity-check viewpoint is adopted to construct a high-rate MSR code for with a sub-packetization level , requiring however, a large field-size. The construction was extended in [32], to satisfying . In [33], the authors provide a construction of MSR codes that holds for all , but which once again required large field size. In [34], the authors provide a construction for an optimal-access systematic MSR code that holds for any parameter set having sub-packetization matching the lower bound given in [29]. In [25, 26, 28, 27, 31, 34, 32, 33], the Combinatorial Nullstellensatz (see [35]) is used to prove the MDS property, due to which the codes are non-explicit and have large field sizes.
In [36], an explicit optimal-access, systematic MSR code is constructed with optimal , but for limited values of . In [37], the authors present two different classes of explicit MSR constructions, one of which possesses the optimal-access property. Both constructions are for any with sub-packetization level growing exponentially in .
In a major advance, in [38], Ye and Barg present an explicit construction of a high-rate, optimal-access MSR code with , field size no larger than , and . Essentially the same construction was independently rediscovered in [39] from a different coupled-layer perspective, where layers of an arbitrary MDS code are coupled by a simple pairwise coupling transform to yield an MSR code. Just prior to the appearance of these two papers, in an earlier version of [40], the authors show how a systematic MSR code can be converted into an MSR code by increasing the sub-packetization level by a factor of using a pairwise symbol transformation. This result is then extended in [40], to present a technique that takes an MDS code, increases sub-packetization level by a factor of and converts it into a code in which the optimal repair of nodes can be carried out. By applying this transform repeatedly times, it is shown that any scalar MDS code can be transformed into an MSR code. It turns out that the three papers [38, 39, 40], either explicitly or implicitly, employed as a key part of the construction, essentially the same pairwise-coupling transform.
Let . More recently, the lower bound was derived in [41] for optimal-access MSR codes. The same paper also shows that the sub-packetization level of an MDS code that can optimally repair any of the nodes must satisfy . These results established that the earlier constructions in [31, 32, 38, 39, 40, 42] were optimal in terms of sub-packetization level . It is also shown in [41], that a vector MDS code that can repair failed nodes belonging to a fixed set of nodes with minimum repair bandwidth and in optimal-access fashion, and having minimum sub-packetization level must necessarily have a coupled-layer structure, similar to that found in [38, 39, 40]. An explicit construction of MSR codes for with achieving the lower bound for was recently provided in [42].
Open Problems 2.
Derive a tight lower bound on the subpacketization level of MSR codes and provide matching constructions.
Open Problems 3.
Provide constructions of explicit optimal-access MSR codes for any with optimal sub-packetization.
IV-B Constructions of MSR Codes
Product Matrix Construction [21]:
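We provide a brief description of the PM construction for parameter set $\{(n,k,d=2k-2),(\alpha=k-1,\beta=1)\}$, for which $B = k\alpha = \alpha(\alpha+1)$. The message symbols are arranged in the form of a $(d \times \alpha)$ matrix $M$:
$$M = \begin{bmatrix} S_1 \\ S_2 \end{bmatrix},$$
where $S_1$, $S_2$ are symmetric $(\alpha \times \alpha)$ matrices, each containing $\binom{\alpha+1}{2}$ message symbols.
Encoding is carried out using an $(n \times d)$ matrix $\Psi = [\Phi \ \ \Lambda\Phi]$, where $\Phi$ is an $(n \times \alpha)$ matrix and $\Lambda$ is an $(n \times n)$ diagonal matrix. Let the $i$th row of $\Psi$ be $\underline{\psi}_i^T$, the $i$th row of $\Phi$ be $\underline{\phi}_i^T$ and the $i$th diagonal element in $\Lambda$ be $\lambda_i$. The $\alpha$ symbols stored in node $i$ are given by:
$$\underline{\psi}_i^T M \ = \ \underline{\phi}_i^T S_1 + \lambda_i \underline{\phi}_i^T S_2.$$
The matrix $\Psi$ is required to satisfy the properties: 1) any $d$ rows of $\Psi$ are linearly independent, 2) any $\alpha$ rows of $\Phi$ are linearly independent and 3) the $n$ diagonal elements of $\Lambda$ are distinct.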
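Node Repair: Let $f$ be the index of the failed node; thus the aim is to reconstruct $\underline{\psi}_f^T M$. The $i$th helper node $h_i$, $1 \le i \le d$, passes on the information $\underline{\psi}_{h_i}^T M \underline{\phi}_f$. Upon aggregating the repair information we obtain the vector
$$\Psi_{\text{rep}}\, M \underline{\phi}_f, \qquad \Psi_{\text{rep}} = [\underline{\psi}_{h_1}, \underline{\psi}_{h_2}, \ldots, \underline{\psi}_{h_d}]^T.$$
As any $d$ rows of $\Psi$ are linearly independent, the vector $M \underline{\phi}_f$ can be recovered. From $M \underline{\phi}_f$, we can obtain $S_1 \underline{\phi}_f$ and $S_2 \underline{\phi}_f$. Since $S_1$ and $S_2$ are symmetric, we can recover the contents $\underline{\phi}_f^T S_1 + \lambda_f \underline{\phi}_f^T S_2$ of the replacement node.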
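Data Collection: Let $\Psi_{DC} = [\Phi_{DC} \ \ \Lambda_{DC}\Phi_{DC}]$ be the $(k \times d)$ sub-matrix of $\Psi$ corresponding to the $k$ nodes contacted for data collection. We wish to retrieve $M$ from $\Psi_{DC} M$. This can be done in three steps:

First compute $\Psi_{DC} M \Phi_{DC}^T = \Phi_{DC} S_1 \Phi_{DC}^T + \Lambda_{DC} \Phi_{DC} S_2 \Phi_{DC}^T$ and set $P = \Phi_{DC} S_1 \Phi_{DC}^T$, $Q = \Phi_{DC} S_2 \Phi_{DC}^T$.

It is clear that $P, Q$ are symmetric. Thus we know both $P_{ij} + \lambda_i Q_{ij}$ and $P_{ji} + \lambda_j Q_{ji} = P_{ij} + \lambda_j Q_{ij}$. Since $\lambda_i \ne \lambda_j$ for $i \ne j$, we can recover $P_{ij}$ and $Q_{ij}$ for all $i \ne j$.

Since we know $P_{ij}$ for $j \ne i$, we can compute the vector $\underline{\phi}_i^T S_1 [\underline{\phi}_1, \ldots, \underline{\phi}_{i-1}, \underline{\phi}_{i+1}, \ldots, \underline{\phi}_{\alpha+1}]$. Since any $\alpha$ rows of $\Phi$ are linearly independent, we can recover $\underline{\phi}_i^T S_1$. For any set of $\alpha$ distinct indices $i$, say $1 \le i \le \alpha$, we can compute $[\underline{\phi}_1, \ldots, \underline{\phi}_{\alpha}]^T S_1$, from which $S_1$ can be recovered. $S_2$ can be similarly recovered from $Q$. The present description assumes data collection from the first $k$ nodes, while a similar argument holds true for any arbitrary set of $k$ nodes.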
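The repair and data-collection procedures above can be exercised end to end on a toy instance. This is a sketch under stated assumptions rather than the construction of [21] verbatim: it works over the prime field $\mathbb{F}_{13}$, takes $\Psi$ to be a Vandermonde matrix with rows $[1, x, x^2, x^3]$ so that $\underline{\phi} = [1, x]$ and $\lambda = x^2$, and uses hypothetical function names throughout.

```python
from itertools import combinations

P = 13                       # prime field size (assumption); squares of 1..6 are distinct mod 13
N, K, D = 6, 3, 4            # (n, k, d = 2k - 2)
ALPHA = K - 1                # alpha = k - 1, beta = 1, B = 6
XS = [1, 2, 3, 4, 5, 6]      # one evaluation point per node


def phi(x): return [1, x % P]
def lam(x): return x * x % P
def psi(x): return phi(x) + [lam(x) * v % P for v in phi(x)]   # row of [Phi, Lambda Phi]
def dot(a, b): return sum(u * v for u, v in zip(a, b)) % P


def solve_mod(A, b, p=P):
    """Solve the square system A y = b over GF(p) by Gauss-Jordan elimination."""
    m = len(A)
    M = [[v % p for v in row] + [bi % p] for row, bi in zip(A, b)]
    for c in range(m):
        piv = next(i for i in range(c, m) if M[i][c])
        M[c], M[piv] = M[piv], M[c]
        inv = pow(M[c][c], -1, p)
        M[c] = [v * inv % p for v in M[c]]
        for i in range(m):
            if i != c and M[i][c]:
                f = M[i][c]
                M[i] = [(vi - f * vc) % p for vi, vc in zip(M[i], M[c])]
    return [M[i][m] for i in range(m)]


def store(S1, S2):
    """Node x stores psi(x)^T M with M = [S1; S2] stacked vertically."""
    M = S1 + S2
    return {x: [dot(psi(x), [row[c] for row in M]) for c in range(ALPHA)] for x in XS}


def repair(stored, f, helpers):
    """Helpers send psi_h^T M phi_f; solve for y = M phi_f = [S1 phi_f; S2 phi_f]."""
    y = solve_mod([psi(h) for h in helpers], [dot(stored[h], phi(f)) for h in helpers])
    return [(y[c] + lam(f) * y[ALPHA + c]) % P for c in range(ALPHA)]


def recover_sym(X, phis):
    """Recover symmetric S from the off-diagonal entries X[i][j] = phi_i^T S phi_j."""
    W = []
    for i in range(ALPHA):
        others = [j for j in range(len(phis)) if j != i]
        W.append(solve_mod([phis[j] for j in others], [X[i][j] for j in others]))  # S phi_i
    cols = [solve_mod(phis[:ALPHA], [W[i][c] for i in range(ALPHA)]) for c in range(ALPHA)]
    return [[cols[c][r] for c in range(ALPHA)] for r in range(ALPHA)]


def collect(stored, nodes):
    """Three-step data collection from any k nodes, returning (S1, S2)."""
    phis, lams, m = [phi(x) for x in nodes], [lam(x) for x in nodes], len(nodes)
    R = [[dot(stored[a], phi(b)) for b in nodes] for a in nodes]     # P + Lambda Q
    Pm = [[0] * m for _ in range(m)]
    Qm = [[0] * m for _ in range(m)]
    for i in range(m):
        for j in range(m):
            if i != j:
                Qm[i][j] = (R[i][j] - R[j][i]) * pow((lams[i] - lams[j]) % P, -1, P) % P
                Pm[i][j] = (R[i][j] - lams[i] * Qm[i][j]) % P
    return recover_sym(Pm, phis), recover_sym(Qm, phis)


if __name__ == "__main__":
    S1, S2 = [[1, 2], [2, 3]], [[4, 5], [5, 6]]
    stored = store(S1, S2)
    for f in XS:
        for helpers in combinations([x for x in XS if x != f], D):
            assert repair(stored, f, helpers) == stored[f]
    for nodes in combinations(XS, K):
        assert collect(stored, list(nodes)) == (S1, S2)
    print("repair and data collection verified for all node subsets")
```

With the Vandermonde choice, the three required properties reduce to the evaluation points being distinct and their squares being distinct, which is why the points are restricted to $1, \ldots, 6$ in $\mathbb{F}_{13}$.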
Coupled Layer Code:
We present here the constructions in [38, 39, 40] from a coupled-layer perspective. We explain the construction here only for parameter sets of the form:
where . (The construction can however, be extended to yield MSR codes for any using a technique called shortening). The coupled-layer code can be constructed in two steps: (a) in the first step, we layer , MDS codewords to form an uncoupled data cube, (b) in the second step, the symbols within the uncoupled data cube are transformed using a pairwise-forward-transform (PFT) to obtain the coupled-layer code. While we discuss only the case when the MDS code employed in the layers is a scalar MDS code, there is a straightforward extension that permits the use of vector MDS codes (see [39]).
Let us first consider the symbols of an uncoupled code where each code symbol is a vector of symbols in . These symbols can be organized to form a three-dimensional (3D) data cube (see Fig. 7), where is the node index and where serves to index the contents of a node. For fixed , we think of the symbols as forming a plane or a layer and thus the value of may be regarded as identifying a plane or layer. The symbols in each layer of the uncoupled data cube form an MDS code.
Let be the parity-check (pc) matrix of an arbitrarily chosen scalar MDS code defined over . Let denote the element of lying in the th row, and th column. Then the symbols of the uncoupled code satisfy the pc equations:
(3) 
Next, consider an identical data cube (see Fig. 7) containing the symbols
corresponding to the coupled-layer code. This data cube will be referred to as the coupled data cube. The symbols of the coupled data cube are derived from the symbols of the uncoupled data cube as follows. Let be an element in , . Let us define . Each symbol which is such that is paired with a symbol . The values of the symbols so paired, are derived from those of their counterparts in the uncoupled data cube as per the linear transformation given below, termed as the PFT:
(4) 
In the case of the symbols when , the relation between symbols in the two data cubes is even simpler and given by: . The pairwise reverse transform (PRT) is simply the inverse of the PFT and is used to obtain the uncoupled symbols from the coupled symbols . The pc equations satisfied by the coupled-layer code can be derived using the pc equations (3) satisfied by the symbols in the uncoupled data cube and the PRT:
(5) 
Node Repair: Let be the failed node. To recover the symbols , each of the remaining nodes sends helper information: . Focusing on (5) for such that and retaining on the left side the unknown symbols, leads to equations of the form:
(6) 
where is a known value. These equations can be solved for the contents of the replacement node.
Data Collection: Please refer to [39] for the proof of data collection property.
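Ye-Barg Codes [37]:
In [37] the authors present two constructions, for non-optimal-access MSR and optimal-access MSR codes respectively. These are the only known MSR constructions that are explicit and yield MSR codes for any parameter set $(n,k,d)$. The same codes are also optimal for the repair of multiple nodes. We describe here, for simplicity, the construction of MSR codes having parameters $\{(n,k,d=n-1),(\alpha=r^n,\beta=r^{n-1})\}$ where $r = n-k$, defined over a finite field $\mathbb{F}_q$ for $q \ge rn$. Let $\{c(x;\underline{z}) \mid x \in [0,n-1],\ \underline{z} = (z_0, \ldots, z_{n-1}) \in \mathbb{Z}_r^n\}$ be the collection of symbols of a codeword, where $x$ is the node index and $\underline{z}$ is the scalar symbol index. The code is defined via the pc equations given below:
$$\sum_{x=0}^{n-1} \lambda_{x,z_x}^{\,j}\, c(x;\underline{z}) \ = \ 0, \qquad \forall\, j \in [0,r-1],\ \underline{z} \in \mathbb{Z}_r^n, \qquad (7)$$
where the $\{\lambda_{x,y}\}_{x \in [0,n-1],\, y \in [0,r-1]}$ are all distinct, thereby requiring a field size $q \ge rn$.
Node Repair: Let $x_0$ be the failed node and $D = [0,n-1] \setminus \{x_0\}$ be the set of helper nodes. For $\underline{z} \in \mathbb{Z}_r^n$ and $i \in [0,r-1]$, let $\underline{z}(x_0 \to i)$ denote the vector obtained from $\underline{z}$ by replacing its $x_0$th component with $i$. The helper information sent by a node $x \in D$ is given by $\{\sigma_x(\underline{z}) = \sum_{i=0}^{r-1} c(x;\underline{z}(x_0 \to i))\}$, one symbol for each of the $r^{n-1}$ choices of the components $\{z_y\}_{y \ne x_0}$. Next, fixing $\{z_y\}_{y \ne x_0}$ and summing equations (7) over the $r$ values of $z_{x_0}$, we get:
$$\sum_{i=0}^{r-1} \lambda_{x_0,i}^{\,j}\, c(x_0;\underline{z}(x_0 \to i)) \ + \ \sum_{x \in D} \lambda_{x,z_x}^{\,j}\, \sigma_x(\underline{z}) \ = \ 0, \qquad \forall\, j \in [0,r-1]. \qquad (8)$$
It can be shown that the collection of $n+r-1$ symbols $\{c(x_0;\underline{z}(x_0 \to i))\}_{i \in [0,r-1]} \cup \{\sigma_x(\underline{z})\}_{x \in D}$ forms an $[n+r-1, n-1]$ MDS code. Therefore, all the $\{c(x_0;\underline{z}(x_0 \to i))\}$ can be computed from the known values $\{\sigma_x(\underline{z})\}_{x \in D}$ supplied by the helper nodes and the symbols of the failed node can thus be recovered from (8).
Data Collection: For every $\underline{z} \in \mathbb{Z}_r^n$, the collection $\{c(x;\underline{z})\}_{x \in [0,n-1]}$ forms an $[n,k]$ MDS code. Therefore, any $r = n-k$ erased symbols can be recovered.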
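To make the repair procedure tangible, the following sketch instantiates a code of this type for $n=4$, $k=2$ over the prime field $\mathbb{F}_{11}$. It assumes the parity-check form $\sum_x \lambda_{x,z_x}^{j} c(x;\underline{z}) = 0$ for $j \in [0,r-1]$ with all $\lambda_{x,y}$ distinct, as in the first construction of [37]; the choice of coefficients, the systematic encoder and all function names are assumptions made for this illustration.

```python
from itertools import product

P = 11                               # prime field with q >= r * n (assumption)
N, K = 4, 2
R = N - K                            # r = n - k; alpha = r^n = 16, beta = r^(n-1) = 8
LAM = {(x, y): x * R + y + 1 for x in range(N) for y in range(R)}   # rn distinct elements
ZS = list(product(range(R), repeat=N))


def solve_mod(A, b, p=P):
    """Solve the square system A y = b over GF(p) by Gauss-Jordan elimination."""
    m = len(A)
    M = [[v % p for v in row] + [bi % p] for row, bi in zip(A, b)]
    for c in range(m):
        piv = next(i for i in range(c, m) if M[i][c])
        M[c], M[piv] = M[piv], M[c]
        inv = pow(M[c][c], -1, p)
        M[c] = [v * inv % p for v in M[c]]
        for i in range(m):
            if i != c and M[i][c]:
                f = M[i][c]
                M[i] = [(vi - f * vc) % p for vi, vc in zip(M[i], M[c])]
    return [M[i][m] for i in range(m)]


def encode(data):
    """Systematic encoding: for each z, nodes 0..k-1 hold data[z] and the r parity
    symbols are chosen so that the r parity-check equations indexed by z hold."""
    code = {}
    for z in ZS:
        A = [[pow(LAM[x, z[x]], j, P) for x in range(K, N)] for j in range(R)]
        b = [-sum(pow(LAM[x, z[x]], j, P) * data[z][x] for x in range(K)) for j in range(R)]
        parity = solve_mod(A, b)
        for x in range(N):
            code[x, z] = data[z][x] if x < K else parity[x - K]
    return code


def repair(code, f):
    """Rebuild node f; every helper sends r^(n-1) symbols, each a sum of r of its own."""
    helpers = [x for x in range(N) if x != f]
    rebuilt = {}
    for rest in product(range(R), repeat=N - 1):
        zs = [rest[:f] + (i,) + rest[f:] for i in range(R)]      # the r vectors z(f -> i)
        sent = {x: sum(code[x, z] for z in zs) % P for x in helpers}
        A = [[pow(LAM[f, i], j, P) for i in range(R)] for j in range(R)]
        b = [-sum(pow(LAM[x, zs[0][x]], j, P) * sent[x] for x in helpers) for j in range(R)]
        for i, value in enumerate(solve_mod(A, b)):
            rebuilt[zs[i]] = value
    return rebuilt


DATA = {z: [(3 * i + x) % P for x in range(K)] for i, z in enumerate(ZS)}
CODE = encode(DATA)

if __name__ == "__main__":
    for f in range(N):
        assert repair(CODE, f) == {z: CODE[f, z] for z in ZS}
    print("download per repair:", (N - 1) * R ** (N - 1), "symbols; naive MDS repair:", K * R ** N)
```

Each helper transmits $r^{n-1} = \alpha/r$ symbols, so the total download is $(n-1)\alpha/r$, which equals $d\beta$ with $\beta = \alpha/(d-k+1)$ for $d = n-1$. Helpers must read all $\alpha$ of their symbols to form the sums, consistent with this being the non-optimal-access variant.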
Multiple Node Repair: Let be the number of erasures to be recovered. It was shown in [24] that the minimum repair bandwidth required to repair erasures in an MDS code having sub-packetization level is lower bounded by . Given that is the number of helper nodes that need to be contacted during the repair of nodes, is lower bounded by: . The Ye-Barg code presented above achieves this bound [37]. The node repair discussed here assumes a centralized repair setting whereas an alternate, cooperative repair approach is discussed in Section VI-A.
V On the Storage-Repair Bandwidth Tradeoff under Exact Repair
We distinguish between the SRB tradeoffs for exact-repair and functional-repair RG codes by referring to them as the ER and FR tradeoff respectively. The file size under exact repair cannot exceed that in the FR case since ER may be regarded as a special case of FR. However, unlike in the case of functional-repair codes, the data collection problem in the exact-repair setting cannot be identified with a multicast problem, simply because each replacement node for a failed node acts as a sink for a different set of data. Thus it is not clear that the cut-set bound for FR can be achieved under ER, leaving the door open for an SRB tradeoff in the case of ER that lies strictly above and to the right of the FR tradeoff in the plane. There do exist constructions of exact-repair MBR and MSR codes meeting the cut-set bound with equality, showing that the ER tradeoff coincides with the FR tradeoff at the extreme MSR and MBR points.
V-A The Nonexistence of ER Codes Achieving the FR Tradeoff
The first major result on the ER tradeoff, established in [44], shows that apart from the MBR point and a small region adjacent to the MSR point, there do not exist ER codes whose values correspond to an interior point of the FR tradeoff. We set to be the value of at the MSR point.
Theorem V.1.
For any given values of , ER codes having parameters corresponding to an interior point on the FR tradeoff do not exist, except possibly for in the range
(9) 
corresponding to a small region in the neighborhood of the MSR point.
Proof.
(Sketch) By restricting attention to any symbols of an RG code having parameter set one obtains a second RG code with parameter set in which all the remaining nodes participate in the repair of a failed node. This simplifies the analysis of the repair setting and, with this in mind, in the proof we set . When the message vector is picked uniformly at random, we have associated nodal random variables and repair-data variables , where denotes the data passed from node to replacement node . The repair matrix (see Fig. 8) is an matrix whose th entry , is . The diagonal elements of do not figure in the discussion and may be set equal to . Given subsets , we set , . We introduce the index sets , and for . The file size can be expressed in terms of the joint entropy of the node and repair-data variables (with logs computed to base ):
(10)  
(11)  
(12) 
The cut-set bound in (1) corresponds to the inequalities: . For the bound to hold with equality, the joint random variables and must have maximum entropy. However, it can be shown that the entropy of a row in the repair matrix is limited by if the cut-set bound holds with equality. This leads to a contradiction, concluding the proof. ∎
Theorem V.1 does not, however, rule out the possibility of an ER code whose tradeoff approaches the FR tradeoff asymptotically, i.e., as the file size .
V-B The SRB Tradeoff for
It is possible that the entropies of the random variables involved satisfy Shannon inequalities, other than the ones noted above, that shed light on the ER tradeoff. For the particular case , Tian [45] was able to identify such an inequality with the help of a modified version of the Information Theory Inequality Prover (ITIP) [46], [47].
Let , represent the normalization of and with respect to file size . A point is said to be achievable if for any , there exists an exact-repair RG code whose is close to . The normalized tradeoff, i.e., the tradeoff expressed in terms of and , allows comparison of codes across file sizes . In the limit as , the SRB tradeoff becomes a smooth curve. Let , be RG codes over having respective parameter sets and . Consider a codeword array obtained by vertically stacking codeword arrays of and codeword arrays of . The code comprising all such arrays is said to be the space-shared code of and . Then is also an RG code with parameter set . The notion of space-sharing clearly extends to multiple codes.
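The effect of space-sharing on the normalized parameters can be sketched in a few lines. The snippet below assumes only what vertical stacking implies: the per-node storage, per-helper repair download and file size of the stacked arrays add up. The tuple layout (alpha, beta, B), the function names and the numerical values are hypothetical and serve only as an illustration.

```python
from fractions import Fraction

def space_share(code1, code2, t1, t2):
    """Parameters (alpha, beta, B) obtained by vertically stacking
    t1 codeword arrays of code1 and t2 codeword arrays of code2."""
    a1, b1, B1 = code1
    a2, b2, B2 = code2
    return (t1 * a1 + t2 * a2, t1 * b1 + t2 * b2, t1 * B1 + t2 * B2)

def normalized(code):
    """Normalize alpha and beta with respect to the file size B."""
    alpha, beta, B = code
    return Fraction(alpha, B), Fraction(beta, B)

# Hypothetical constituent codes, given as (alpha, beta, B).
code_a = (1, 1, 3)
code_b = (3, 1, 6)

shared = space_share(code_a, code_b, t1=2, t2=1)
print(shared, normalized(shared))
```

The normalized point of the stacked code is a convex combination of the normalized points of the two constituent codes, with weights proportional to the file size each constituent contributes. This is why space-sharing traces out the line segment joining two achievable points.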
Theorem V.2.
For , the achievable region is given by
(13) 
Proof.
Of the four inequalities listed, the first three follow from the entropy constraints listed in (12) above. The last inequality does not follow from (12), and was found in [45] using ITIP. It remains to construct codes that operate at points in the plane satisfying the inequalities with equality. A single parity-check code serves as an MSR code for . An MBR code can be constructed using the polygonal construction described in Sec. III. A hand-crafted code operating at the interior point of deflection (see Fig. 9) is given in [45]. Every point on the lines determined by equality in (13) is achieved by a code obtained by space-sharing among and . ∎
V-C Layered Codes for Interior Points
A simple code-construction technique based on the layering (see Fig. 10 for an example) of MDS codes turns out to provide codes that perform well with respect to file size in the interior region of the SRB tradeoff. Let be an MDS code having parameters . Let be such that and . Let denote an ordering of the collection of all possible subsets of . Let , be message vectors, not necessarily distinct, and be the codeword in associated with . We create an array in which we place the symbols of codeword in the location specified by subset . This array represents an array code which possesses the data collection property of an RG code, but not the repair property. By replicating the array a certain number of times, one obtains a regenerating code with parameters , operating between the MSR and MBR points. Further details can be found in [48]. We will refer to this code as the canonical layered code . The canonical layered-code construction has been extended to construct codes with by making use of an outer code designed using linearized polynomials. An alternate generalization of the canonical code to the case of involves adding additional layers consisting of carefully designed parity symbols. Such an approach leads to the improved layered codes in [49], which turn out to be optimal for the set of parameters .
V-D ER Tradeoff Strictly Away from FR Tradeoff for all
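The placement step of the layering idea can be illustrated with a toy sketch. It assumes, purely for illustration, that the MDS code is a binary single parity-check code of length m and that one codeword is placed on each m-subset of the n nodes. These choices, the parameter names n and m, and the function names are hypothetical and are not the parameters of the construction in [48]; the replication step that yields the repair property is not modelled.

```python
from itertools import combinations

def layered_array(n, m, messages):
    """Place one single parity-check codeword (length m, over GF(2))
    on each m-subset of the n nodes. Returns a list of layers, each a
    dict mapping node index -> stored bit."""
    subsets = list(combinations(range(n), m))
    assert len(messages) == len(subsets), "one message per subset"
    layers = []
    for subset, msg in zip(subsets, messages):
        assert len(msg) == m - 1, "each message has m - 1 bits"
        parity = 0
        for bit in msg:
            parity ^= bit  # parity bit is the XOR of the message bits
        codeword = list(msg) + [parity]
        layers.append(dict(zip(subset, codeword)))
    return layers

def node_contents(layers, node):
    """Symbols stored by a node: one per layer whose subset contains it."""
    return [layer[node] for layer in layers if node in layer]

# Toy example: n = 4 nodes, subsets of size m = 3, hence 4 layers.
layers = layered_array(4, 3, [(1, 0), (0, 1), (1, 1), (0, 0)])
for node in range(4):
    print(node, node_contents(layers, node))
```

In this toy setting every node appears in three of the four layers, and within each layer any single missing symbol equals the XOR of the other two, which is the MDS property of the single parity-check code.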
In [50], it was shown that the ER tradeoff cannot approach the FR tradeoff even when for any value of . This was established by deriving a positive lower bound on the gap between the ER and FR tradeoffs.
Theorem V.3.
The ER tradeoff between and for any exact-repair regenerating code, with is strictly separated from the FR tradeoff, apart from the MSR and MBR endpoints as well as the region surrounding the MSR point appearing in (9).
The proof of the theorem involves identifying contradicting bounds on the entropy of various trapezoidal-shaped subsets within the repair matrix. Subsequent papers [51], [52] derive better bounds, thereby improving the gap to go beyond . In [53], the authors adopt a different approach by first providing three different expressions for the entropy of the data file involving mutual information between various repair-data variables, and then taking a linear combination of these expressions that leads to a significantly tighter bound on :
(14) 
The authors in [54] improve upon the result in (14) using repair-matrix techniques, in combination with the bound in Thm. V.3, leading to the best-known outer bound on the ER tradeoff. For the case of , the outer bound is achieved by the improved layered codes, thus characterizing the ER tradeoff. The bound also characterizes certain interior points when [50].
V-E Determinant Codes for Interior Points
The construction given in [55] has parameters , and file size , where