Access-optimal Linear MDS Convertible Codes for All Parameters

In large-scale distributed storage systems, erasure codes are used to achieve fault tolerance in the face of node failures. Tuning code parameters to observed failure rates has been shown to significantly reduce storage cost. Such tuning of redundancy requires "code conversion", i.e., a change in code dimension and length on already encoded data. Convertible codes are a new class of codes designed to perform such conversions efficiently. The access cost of conversion is the number of nodes accessed during conversion. Existing literature has characterized the access cost of conversion of linear MDS convertible codes only for a specific and small subset of parameters. In this paper, we present lower bounds on the access cost of conversion of linear MDS codes for all valid parameters. Furthermore, we show that these lower bounds are tight by presenting an explicit construction for access-optimal linear MDS convertible codes for all valid parameters. En route, we show that one of the degrees of freedom in the design of convertible codes, which was inconsequential in the previously studied parameter regimes, turns out to be crucial when going beyond these regimes and adds to the challenge in the analysis and code construction.

I Introduction

Erasure codes are an essential tool for providing resilience against node failures in a distributed storage system [2, 3, 4, 5, 6, 7, 8]. When using an $[n, k]$ erasure code, $k$ chunks of data are encoded into $n$ chunks, called a stripe. These $n$ chunks are then distributed among $n$ different “nodes” in the system, where nodes correspond to distinct storage devices typically residing on distinct servers. For the purposes of theoretical study, each stripe can be viewed as a codeword, by viewing each of the $n$ chunks as one of the codeword symbols. The parameters $n$ and $k$ are usually chosen based on node failure rate, which might vary over time. Redundancy tuning, i.e., changing $n$ and $k$ in response to fluctuations in the failure rate of storage devices, can achieve significant savings (11% to 44%) in storage space [9]. Due to practical system constraints, changing $n$ alone is typically insufficient and both $n$ and $k$ have to be changed simultaneously [9]. The resource cost of changing $n$ and $k$ on already encoded data can be prohibitively high and is a key barrier in the practical adoption of redundancy tuning [1]. Other reasons to change $n$ and $k$ on already encoded data might include variations in data popularity, failure rate uncertainty, or restrictions on the total amount of used storage.

The code conversion problem defined in [1] involves converting multiple stripes of an $[n^I, k^I]$ code (denoted by $\mathcal{C}^I$) into (potentially multiple) stripes of an $[n^F, k^F]$ code (denoted by $\mathcal{C}^F$), along with desired constraints on decodability such as both codes being Maximum Distance Separable (MDS). Considering multiple stripes enables code conversions to allow for changes in the code dimension (from $k^I$ to $k^F$). Convertible codes [1] are code pairs $(\mathcal{C}^I, \mathcal{C}^F)$ that enable code conversion, usually designed to minimize the cost of conversion. A detailed description of the convertible codes framework is provided in Section II-A.

There are several ways in which one might measure the cost of conversion. We focus on the access cost of conversion, which is measured in terms of the total number of nodes that need to be accessed during conversion. In [1], the authors focus on the so-called merge regime, wherein multiple initial stripes are merged into one. Specifically, they consider the case where $k^F = \lambda^I k^I$ for some integer $\lambda^I \geq 2$, and propose explicit constructions for convertible codes that achieve optimal access cost for the merge regime. We review these results for the merge regime in Section II-B.

The results presented in this work are twofold. (1) We present lower bounds on the access cost of conversion for linear MDS codes for all valid parameters, that is, all $(n^I, k^I; n^F, k^F)$ such that $n^I > k^I$ and $n^F > k^F$. (2) We show that the proposed lower bounds are tight by presenting an explicit construction of linear MDS convertible codes that is access-optimal for all parameter regimes. To achieve this, we first define and study the split regime in Section III, where $k^I = \lambda^F k^F$ for an integer $\lambda^F \geq 2$, that is, a single initial stripe is split into multiple final stripes. We prove a (tight) lower bound on the access cost of conversion in the split regime, and describe a conversion procedure which has optimal access cost when used with any systematic MDS code. We then present in Section IV a tight lower bound on the access cost of conversion for linear MDS convertible codes for all valid parameters (termed the general regime) by reducing conversion in the general regime to a combination of generalizations of conversions in the split and merge regimes. While the split and the merge regimes might seem somewhat restrictive, we show that, perhaps surprisingly, the proposed conversion procedure for the general regime that builds on top of the generalized split and merge regimes is optimal. Interestingly, one of the degrees of freedom in the design of convertible codes (called “partitions”, described subsequently in Section II-A), which is inconsequential in the split and merge regimes, turns out to be crucial in the general regime. The proposed construction for access-optimal convertible codes for the general regime builds on the constructions for the split and merge regimes, while separately optimizing along this additional degree of freedom.

II Background and Related work

II-A Convertible codes [1]

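A conversion from an $[n^I, k^I]$ initial code $\mathcal{C}^I$ to an $[n^F, k^F]$ final code $\mathcal{C}^F$ is a procedure that takes as input a set of initial stripes from $\mathcal{C}^I$ and outputs a set of final stripes from $\mathcal{C}^F$, such that the final stripes together encode the same information as the initial stripes. To avoid degeneracy, $n^I > k^I$ and $n^F > k^F$ is assumed. Let $\mathbb{F}_q$ be a finite field, and consider a message $\mathbf{m} \in \mathbb{F}_q^M$, where $M = \mathrm{lcm}(k^I, k^F)$. The number of initial stripes is $\lambda^I = M / k^I$ and the number of final stripes is $\lambda^F = M / k^F$. Let $[i] = \{1, \ldots, i\}$, $r^I = n^I - k^I$, and $r^F = n^F - k^F$. Let $\mathbf{m}|_S$ denote the projection of $\mathbf{m}$ onto the coordinates in the set $S$, and let $\mathcal{C}(\mathbf{m})$ denote the encoding of $\mathbf{m}$ under code $\mathcal{C}$. Consider an initial partition $\mathcal{P}_I = \{P^I_1, \ldots, P^I_{\lambda^I}\}$ of $[M]$ such that $|P^I_i| = k^I$ for all $i \in [\lambda^I]$, and a final partition $\mathcal{P}_F = \{P^F_1, \ldots, P^F_{\lambda^F}\}$ of $[M]$ such that $|P^F_j| = k^F$ for all $j \in [\lambda^F]$. These partitions determine how message symbols are mapped to each of the initial and final stripes. For example, the $i$-th initial stripe will only encode the symbols of $\mathbf{m}$ indexed by $P^I_i$.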

Definition 1 (Convertible code [1]).

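An $(n^I, k^I; n^F, k^F)$ convertible code over $\mathbb{F}_q$ is defined by:

  1. a pair of codes $(\mathcal{C}^I, \mathcal{C}^F)$ over $\mathbb{F}_q$ such that $\mathcal{C}^I$ is $[n^I, k^I]$ and $\mathcal{C}^F$ is $[n^F, k^F]$;

  2. a pair of partitions $\mathcal{P}_I, \mathcal{P}_F$ of $[M = \mathrm{lcm}(k^I, k^F)]$ such that $|P^I_i| = k^I$ for all $P^I_i \in \mathcal{P}_I$ and $|P^F_j| = k^F$ for all $P^F_j \in \mathcal{P}_F$; and

  3. a conversion procedure which, for any $\mathbf{m} \in \mathbb{F}_q^M$, takes the set of initial codewords $\{\mathcal{C}^I(\mathbf{m}|_{P^I_i}) : P^I_i \in \mathcal{P}_I\}$ as input, and outputs the corresponding set of final codewords $\{\mathcal{C}^F(\mathbf{m}|_{P^F_j}) : P^F_j \in \mathcal{P}_F\}$.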

In this paper, we will restrict our focus to the case where $\mathcal{C}^I$ and $\mathcal{C}^F$ are both linear and MDS.
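The derived quantities used throughout can be computed directly from the four code parameters. The following is a minimal sketch, assuming (as in the convertible codes framework of [1]) that $M = \mathrm{lcm}(k^I, k^F)$, $\lambda^I = M / k^I$, $\lambda^F = M / k^F$, $r^I = n^I - k^I$, and $r^F = n^F - k^F$. The function name, the dictionary keys, and the choice to label every parameter set that is neither merge nor split as "general" are illustrative assumptions, not part of the paper.

```python
from math import gcd


def conversion_parameters(n_i, k_i, n_f, k_f):
    """Derive M, lambda^I, lambda^F, r^I, r^F and a regime label (hypothetical helper)."""
    if not (n_i > k_i >= 1 and n_f > k_f >= 1):
        raise ValueError("need n_i > k_i >= 1 and n_f > k_f >= 1")
    m = k_i * k_f // gcd(k_i, k_f)  # message length M = lcm(k^I, k^F)
    if k_f > k_i and k_f % k_i == 0:
        regime = "merge"    # k^F = lambda^I * k^I with lambda^I >= 2
    elif k_i > k_f and k_i % k_f == 0:
        regime = "split"    # k^I = lambda^F * k^F with lambda^F >= 2
    else:
        regime = "general"  # everything else (labeling choice of this sketch)
    return {
        "M": m,
        "lambda_I": m // k_i,  # number of initial stripes
        "lambda_F": m // k_f,  # number of final stripes
        "r_I": n_i - k_i,
        "r_F": n_f - k_f,
        "regime": regime,
    }


merge_example = conversion_parameters(6, 4, 10, 8)
general_example = conversion_parameters(9, 6, 7, 4)
```

For instance, converting a $[9, 6]$ code into a $[7, 4]$ code involves $M = 12$ message symbols, two initial stripes, and three final stripes, which falls outside both the merge and split regimes.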

The access cost of a conversion procedure is the total number of nodes that are read or written during conversion. Recall that each node in a stripe corresponds to a single symbol from the corresponding codeword, therefore access cost is equivalent to the number of codeword symbols that are read or written during conversion. We distinguish three types of nodes during conversion: unchanged nodes, which remain as is during the conversion process, and are present in both the initial and final configuration (possibly in different stripes); retired nodes, which are present in the initial configuration and throughout the conversion, but not in the final configuration; and new nodes, which are introduced during conversion, and are present in the final configuration, but not in the initial configuration. Unchanged and retired nodes may be accessed for reading during conversion, and new nodes are always accessed for writing during the conversion. A convertible code that maximizes the number of unchanged nodes is said to be stable.

The read access set of an $(n^I, k^I; n^F, k^F)$ convertible code is a set of tuples $\mathcal{D} \subseteq [\lambda^I] \times [n^I]$, where $(i, j) \in \mathcal{D}$ corresponds to the $j$-th node in initial stripe $i$. After a conversion, each new node holds a fixed linear combination of the contents of the nodes indexed by $\mathcal{D}$. We denote the accessed nodes from initial stripe $i$ as $\mathcal{D}_i = \{j : (i, j) \in \mathcal{D}\}$. Thus, the access cost of a conversion with read access set of size $|\mathcal{D}|$ and $\ell$ new nodes is $|\mathcal{D}| + \ell$. Clearly, there always exists a conversion procedure with read access cost $M$, which reconstructs the original message and re-encodes according to $\mathcal{C}^F$. We refer to this procedure as the default approach.

An $(n^I, k^I; n^F, k^F)$ convertible code is access-optimal if and only if it achieves the minimum access cost over all $(n^I, k^I; n^F, k^F)$ convertible codes.

II-B Merge regime [1]

The merge regime is the subset of valid parameter values for convertible codes where $k^F = \lambda^I k^I$, for some integer $\lambda^I \geq 2$. Thus, in this regime we have $M = k^F$ and $\lambda^F = 1$. This regime was the focus of [1], wherein the following lower bound on access cost was shown.

Theorem 1 ([1]).

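For all linear MDS $(n^I, k^I; n^F, k^F = \lambda^I k^I)$ convertible codes, the access cost of conversion is at least $r^F + \lambda^I \min\{k^I, r^F\}$. Furthermore, if $r^I < r^F$, the access cost of conversion is at least $r^F + \lambda^I k^I$.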

An explicit construction for access-optimal convertible codes for all values in the merge regime was also provided in [1].
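To get a feel for how much the merge-regime bound can save over the default approach, the following sketch evaluates both. It assumes the bound from [1] in the form $r^F + \lambda^I \min\{k^I, r^F\}$ when $r^I \geq r^F$ and $r^F + \lambda^I k^I$ when $r^I < r^F$, and it counts the default approach as reading all $\lambda^I k^I$ message symbols and writing $r^F$ new nodes. The function names are hypothetical.

```python
def merge_access_lower_bound(n_i, k_i, n_f, lam_i):
    """Lower bound on access cost in the merge regime (k^F = lam_i * k^I)."""
    k_f = lam_i * k_i
    r_i, r_f = n_i - k_i, n_f - k_f
    if r_f <= 0 or r_i <= 0:
        raise ValueError("need n_i > k_i and n_f > lam_i * k_i")
    if r_i < r_f:
        # Not enough initial parities: no saving over the default approach.
        return r_f + lam_i * k_i
    return r_f + lam_i * min(k_i, r_f)


def merge_default_cost(k_i, n_f, lam_i):
    """Default approach: read lam_i * k^I data nodes, write r^F new nodes."""
    return lam_i * k_i + (n_f - lam_i * k_i)


# Merging two [6, 4] stripes into one [10, 8] stripe.
bound = merge_access_lower_bound(n_i=6, k_i=4, n_f=10, lam_i=2)
default = merge_default_cost(k_i=4, n_f=10, lam_i=2)
```

In this example the bound evaluates to 6 accessed nodes, against 10 for the default approach.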

II-C Other related works

The closest related work [1] proposes the convertible codes framework considered in this work (discussed at length above). Several other works in the literature [10, 11, 12, 13, 14] have considered variants of the code conversion problem, largely within the context of so-called “regenerating codes” [15]. The study of regenerating codes, which are a class of codes that optimize for recovery of a small subset of nodes within a stripe (as opposed to decoding all original data), was initiated by Dimakis et al. [15]. Subsequently, numerous works have studied and constructed optimal regenerating codes (e.g., [16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30]). Specific instances of code conversion can be viewed as instances of the repair problem, for example, increasing $n$ while keeping $k$ fixed as studied in [10, 14].

In a recent work [31], Su et al. study a related problem in the context of coded computation for distributed matrix multiplication. They propose a coding scheme for coded matrix multiplication with the property that certain changes to the code parameters only require local re-encoding of the data stored in each server.

II-D Notation

This subsection introduces notation that generalizes the notation used in [1] and is used throughout this paper. Let be a generator matrix of MDS code for . An encoding vector in relation to is associated to each node in the initial or final stripes. The encoding vector of node in stripe with partition set is defined such that , and 0 everywhere outside of . The difference between and is that the former describes the encoding of the -th symbol relative to the information encoded in a single initial (resp. final) stripe, while the latter describes the encoding of the -th symbol of the -th initial (resp. final) stripe relative to the information jointly encoded by all initial (resp. final) stripes (i.e., the message ).

Let be the encoding vectors for a particular stripe, and let . Let be the encoding vectors of unchanged nodes, and define , where the index or is dropped if or , respectively. Let be the encoding vectors of nodes that are read from initial stripe , and define as the set of all encoding vectors of nodes that are read. Finally, let be the encoding vectors of new nodes, and define as the encoding vectors of new nodes of a particular stripe . Notice that it must hold that . For simplicity, we sometimes refer to a node and its encoding vector interchangeably.

III Split regime

The split regime of convertible codes corresponds to the case where a single initial stripe is split into multiple final stripes. This regime is, in some sense, the opposite of the merge regime, in which multiple initial stripes are combined into one final stripe. Specifically, an $(n^I, k^I; n^F, k^F)$ convertible code is in the split regime if $k^I = \lambda^F k^F$ for an integer $\lambda^F \geq 2$, with arbitrary $n^I$ and $n^F$. Notice that in this regime we have that $M = k^I$, and thus $\lambda^I = 1$ and $\lambda^F = k^I / k^F$.

First, in Section III-A, we show a lower bound on access cost for the split regime. In Section III-B we show a matching upper bound: for every systematic MDS code $\mathcal{C}^I$, we present a conversion procedure whose cost matches the lower bound, so there exists an access-optimal convertible code having $\mathcal{C}^I$ as its initial code.

III-A Access cost lower bound for the split regime

In this subsection, we lower bound the access cost of conversion in the split regime. This is done by first showing a lower bound on write access cost, and then showing a lower bound on the read access cost of conversion.

The following fact simplifies the analysis of the split regime.

Proposition 2.

For a linear MDS $(n^I, k^I = \lambda^F k^F; n^F, k^F)$ convertible code, all possible pairs of initial and final partitions $(\mathcal{P}_I, \mathcal{P}_F)$ are equivalent (up to relabeling).

There is only one possible initial partition $\mathcal{P}_I = \{[k^I]\}$, hence any two final partitions can be made equivalent by relabeling nodes. Therefore, we do not need to consider differences in partitions in our analysis of the split regime.

Proposition 3.

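In a linear MDS $(n^I, k^I = \lambda^F k^F; n^F, k^F)$ convertible code, there are at most $k^F$ unchanged nodes in each of the final stripes (i.e., at least $r^F$ new nodes per stripe). Hence, there are at most $\lambda^F k^F$ unchanged nodes in total.

For any final stripe, any subset of its nodes of size at least $k^F + 1$ is linearly dependent due to the MDS property of the final code. Unchanged nodes are also initial nodes, and any $k^F + 1 \leq k^I$ initial nodes are linearly independent. Thus, having more than $k^F$ unchanged nodes in a final stripe contradicts the fact that $\mathcal{C}^I$ is MDS. Hence, each final stripe has at most $k^F$ unchanged nodes.

Therefore, the total write access cost in the split regime is at least $\lambda^F r^F$.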

Now we focus on bounding the read access cost. The general strategy we use to obtain bounds on read access cost is to consider a specially chosen set of nodes from a final stripe, which by the MDS property of the final code is enough to decode all data in that stripe. We then use the fact that final stripes are the result of conversion to identify a set of initial nodes that contain all the information contained in . The MDS property of the initial code constrains the information available in , which allows us to derive a lower bound on its size and thus a lower bound on the number of read nodes.

Lemma 4.

For all linear MDS $(n^I, k^I = \lambda^F k^F; n^F, k^F)$ convertible codes, the read access set $\mathcal{D}$ satisfies $|\mathcal{D}| \geq (\lambda^F - 1) k^F + \min\{r^F, k^F\}$.

If $r^F \geq k^F$, then all data should be decodable by accessing only new nodes in the final stripes, and the result follows easily since all data must have been read to create the new nodes. Therefore, assume for the rest of this proof that $r^F < k^F$.

Suppose, for the sake of contradiction, that . Let be a node in some final stripe which is neither read nor written. Such a stripe and node exist since otherwise every node in the final stripes would be accessed (for either read or write) and thus would be in the span of , which is a contradiction since .

Let be a subset of nodes of the same final stripe such that and . Such a subset exists by virtue of Proposition 3. Further, let be such that . Clearly is of size and can reconstruct the contents of , by the MDS property of the final code. In other words, .

Let be the unchanged nodes in . Since and only have new nodes, they are both contained in , therefore . Notice that the subset consists only of initial nodes. Furthermore, it holds that and . Thus:

This implies that is spanned by less than initial nodes (which do not include ). However, by the MDS property of the initial code, any subset of less than initial nodes that does not contain node , has no information about . This causes a contradiction with the fact that . Thus, we must have .

It is easy to show that if we only read unchanged nodes, it is not possible to do better than the default approach. This follows from the fact that unchanged nodes are already present in the final stripes, and hence using them to create the new nodes would contradict the MDS property. Retired nodes, on the other hand, do not have this drawback. Thus, intuitively, based on Lemma 4, one might expect to achieve an efficient conversion by reading from the retired nodes. However, we next show that it is not possible to achieve lower read access cost than the default approach when $r^I < r^F$.

Lemma 5.

For all linear MDS $(n^I, k^I = \lambda^F k^F; n^F, k^F)$ convertible codes, if $r^I < r^F$ then the read access set $\mathcal{D}$ satisfies $|\mathcal{D}| \geq \lambda^F k^F$.

Suppose, for the sake of contradiction, that . Let be a node in some final stripe which is neither read nor written. Such a stripe and node always exist as described in the proof of Lemma 4. We will choose a subset of nodes of size . By the MDS property of the final code, node is decodable from , i.e., . There are two cases for the choice of depending on the total number of accessed nodes in stripe :

Case 1: If , then let . That is, only contains nodes that are read or written. It is easy to see that .

Clearly, contains only initial nodes, and the following holds:

However, this is a contradiction with the fact that , since by the MDS property of the initial code, contains no information about node .

Case 2: If , then choose , where and is any subset of of size . That is, contains all the nodes of final stripe that are read or written (in addition to other unchanged nodes distinct from ). It is easy to see that and thus . Furthermore, the subset consists only of initial nodes.

Notice that there are at most read nodes outside of final stripe (i.e., in ). Therefore, we can bound by . On the other hand, it is clear that . Combining these, we get:

However, this is a contradiction with the fact that , since by the MDS property of the initial codes, contains no information about node .

By combining all the results in this subsection, we obtain the following lower bound on the access cost of conversion in the split regime.

Theorem 6.

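The total access cost of any linear MDS $(n^I, k^I = \lambda^F k^F; n^F, k^F)$ convertible code is at least $\lambda^F (k^F + r^F)$ if $r^I < r^F$, and at least $\lambda^F r^F + (\lambda^F - 1) k^F + \min\{r^F, k^F\}$ otherwise.

Follows from Proposition 3 and Lemmas 4 and 5.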

As we show in the next subsection, this lower bound is tight since it is achievable.
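The split-regime bound can be evaluated numerically as a sanity check. The sketch below assumes the bound has the form suggested by the results of this subsection: a write cost of $\lambda^F r^F$, and a read cost of $\lambda^F k^F$ when $r^I < r^F$ or $(\lambda^F - 1) k^F + \min\{r^F, k^F\}$ otherwise. The function name and return convention are hypothetical.

```python
def split_access_lower_bound(n_i, n_f, k_f, lam_f):
    """Return (read, write) lower bounds for the split regime (k^I = lam_f * k^F)."""
    k_i = lam_f * k_f
    r_i, r_f = n_i - k_i, n_f - k_f
    if r_i <= 0 or r_f <= 0 or lam_f < 2:
        raise ValueError("need n_i > lam_f * k_f, n_f > k_f and lam_f >= 2")
    write = lam_f * r_f  # at least r^F new nodes per final stripe
    if r_i < r_f:
        read = lam_f * k_f  # same as the default approach
    else:
        read = (lam_f - 1) * k_f + min(r_f, k_f)
    return read, write


# Splitting one [10, 8] stripe into two [6, 4] stripes.
read_bound, write_bound = split_access_lower_bound(n_i=10, n_f=6, k_f=4, lam_f=2)
default_read = 2 * 4  # default approach reads all lam_f * k^F data nodes
```

Here the read bound is 6 nodes instead of the 8 read by the default approach, while 4 new nodes must be written in either case.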

III-B Access-optimal convertible codes for the split regime

In this subsection we present a construction of access-optimal convertible codes in the split regime. Under this construction, any systematic MDS code can be used as the initial code. The final code corresponds to the projection of the initial code onto the coordinates of any systematic nodes. Since our construction can be applied to existing codes and only specifies the conversion procedure, we introduce the following definition capturing the property of codes that can be converted efficiently.

Definition 2.

A code is if and only if there exists an code (along with partitions and conversion procedure) that form an access-optimal convertible code.

The conversion procedure that leads to optimal access cost (meeting the lower bound in Theorem 6) is as follows.

Conversion procedure: All the systematic nodes are used as unchanged nodes. When $r^I < r^F$ or $r^F \geq k^F$, the conversion is trivial since one cannot do better than the default approach. The conversion procedure for the nontrivial case proceeds as follows. For all but one final stripe, all unchanged nodes are read ($(\lambda^F - 1) k^F$ in total), and the new nodes are naively constructed from them. For the remaining final stripe, $r^F$ retired nodes are read, and then the unchanged nodes from the other final stripes are used to remove their interference from the retired nodes to obtain the new nodes.

Theorem 7.

Every systematic linear MDS code is .

If $r^I < r^F$ or $r^F \geq k^F$, then the default approach achieves the bound stated in Theorem 6. Thus, assume $r^I \geq r^F$ and $r^F < k^F$. Let $G$ be the generator matrix of $\mathcal{C}^I$ and assume nodes are numbered in the same order as the columns of $G$. Define $\mathcal{C}^F$ as the code generated by the matrix formed by taking the first $k^F$ rows of $G$, and columns $1, \ldots, k^F$ and $k^I + 1, \ldots, k^I + r^F$. Consider the following conversion procedure: read the unchanged nodes of final stripes $2, \ldots, \lambda^F$ and the first $r^F$ retired nodes. To construct the new nodes for stripe 1, simply project the retired nodes onto their first $k^F$ coordinates by using the unchanged nodes that were read. To construct the new nodes for stripe $i > 1$, simply use the unchanged nodes of stripe $i$. This conversion procedure reads a total of $(\lambda^F - 1) k^F + r^F$ nodes and writes a total of $\lambda^F r^F$ new nodes, which matches the bound from Theorem 6.
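The conversion procedure above can be simulated end to end on a toy code. The following is a minimal sketch under several assumptions that are not taken from the paper: the field is the prime field GF(257), the initial code is the systematic code $[I \mid P]$ with $P$ a Cauchy matrix (every square submatrix of a Cauchy matrix is invertible, so the code is MDS), and all function names are hypothetical. The sketch covers only the nontrivial case $r^I \geq r^F$ and $r^F < k^F$.

```python
import random

P_MOD = 257  # prime field size (assumption of this sketch)


def cauchy(rows, cols):
    """Cauchy matrix with entries 1 / (a - (rows + b)) mod P_MOD."""
    return [[pow((a - (rows + b)) % P_MOD, -1, P_MOD) for b in range(cols)]
            for a in range(rows)]


def encode(data, parity_matrix, r):
    """Systematic encoding: the data symbols followed by r parity symbols.

    Only the first len(data) rows of parity_matrix are used.
    """
    parities = [sum(d * row[b] for d, row in zip(data, parity_matrix)) % P_MOD
                for b in range(r)]
    return list(data) + parities


def split_convert(initial, parity_matrix, k_f, r_f, lam_f):
    """Compute the new parities of every final stripe; count nodes read."""
    k_i = lam_f * k_f
    reads = set()

    def read(idx):
        reads.add(idx)
        return initial[idx]

    new_parities = [None] * lam_f
    interference = [0] * r_f
    for j in range(1, lam_f):
        # Read the unchanged (systematic) nodes of final stripe j.
        block = [read(j * k_f + t) for t in range(k_f)]
        # Naively re-encode them under the final code (first k_f rows of P).
        new_parities[j] = encode(block, parity_matrix, r_f)[k_f:]
        # Track their contribution to the retired initial parities.
        for b in range(r_f):
            interference[b] += sum(block[t] * parity_matrix[j * k_f + t][b]
                                   for t in range(k_f))
    # Stripe 0: read r_f retired parities and remove the interference.
    new_parities[0] = [(read(k_i + b) - interference[b]) % P_MOD
                       for b in range(r_f)]
    return new_parities, len(reads)


k_f, r_f, lam_f, r_i = 4, 2, 2, 3
k_i = lam_f * k_f
P = cauchy(k_i, r_i)
rng = random.Random(0)
message = [rng.randrange(P_MOD) for _ in range(k_i)]
initial = encode(message, P, r_i)
new_parities, n_reads = split_convert(initial, P, k_f, r_f, lam_f)
# Reference: encode each final stripe directly from the message.
expected = [encode(message[j * k_f:(j + 1) * k_f], P, r_f)[k_f:]
            for j in range(lam_f)]
```

With these parameters the procedure reads $(\lambda^F - 1) k^F + r^F = 6$ nodes, and `new_parities` agrees with `expected`, the parities obtained by encoding each final stripe from scratch.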

Notice that convertible codes created using the construction above are stable. We show this property is, in fact, necessary.

Lemma 8.

All access-optimal convertible codes for the split regime are stable.

Theorem 7 shows that there exist stable access-optimal codes for the split regime. Since any unstable convertible code must incur higher write access cost and at least as much read access cost, it cannot be access-optimal.

IV General regime

In this section, we study the general regime of convertible codes with arbitrary valid parameter values (i.e., any $n^I > k^I$ and $n^F > k^F$). Recall that the choice of partitions was inconsequential in the split and merge regimes. In contrast, it turns out that the choice of initial and final partitions plays an important role in the general regime. This makes the general regime significantly harder to analyze. We deal with this complexity by reducing conversion in the general regime to generalized versions of the split and merge conversions, and by identifying the conditions on initial and final partitions that minimize total access cost.

In Section IV-A, we explore a generalization of the split regime and of the merge regime. In Section IV-B, these generalizations are used to lower bound the access cost of conversion in the general regime. In Section IV-C, we describe a conversion procedure and construction for access-optimal conversion in the general regime which utilizes ideas from the constructions for generalizations of split and merge regimes.

IV-A Generalized split and merge regimes

The generalized split and merge regimes are similar to the split and merge regimes, except that the generalized variants allow for initial or final stripes of unequal sizes. This flexibility enables the generalized split and merge regimes to be used as building blocks in the analysis of the general regime. In these generalized variants, the message length is defined to be (which coincides with the definition of in the split and merge regime), but now the sets in the initial and final partitions need not be all of the same size.

Since the initial (or final) stripes might be of different lengths, we define them as shortenings of a common code .

Definition 3.

An $s$-shortening of an $[n, k]$ code $\mathcal{C}$ is the code formed by all the codewords in $\mathcal{C}$ that have 0 in a fixed subset of $s$ positions, with those $s$ positions deleted.

Shortening a code has the effect of decreasing the length and dimension while keeping $n - k$ fixed. It can be shown that an $s$-shortening of an $[n, k]$ MDS code is an $[n - s, k - s]$ MDS code. Lengthening is the inverse operation of shortening, and has the effect of increasing length and dimension while keeping $n - k$ fixed. For linear codes, an $s$-lengthening of a code can be defined as adding $s$ additional columns to its parity check matrix. Similarly, it can be shown that for an $[n, k]$ MDS code, there exists an $s$-lengthening of it that is an $[n + s, k + s]$ MDS code (assuming a large enough field size).
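Shortening is easy to carry out on a systematic generator matrix: forcing a systematic position to 0 and deleting it amounts to dropping the matching row and column. The sketch below illustrates this and checks the MDS property by brute force. It assumes a toy $[8, 5]$ code over GF(257) of the form $[I \mid P]$ with $P$ a Cauchy matrix, and it only shortens on systematic positions; these choices and all function names are illustrative assumptions.

```python
from itertools import combinations

P_MOD = 257  # prime field size (assumption of this sketch)


def cauchy_parity(k, r):
    """k x r Cauchy matrix with entries 1 / (a - (k + b)) mod P_MOD."""
    return [[pow((a - (k + b)) % P_MOD, -1, P_MOD) for b in range(r)]
            for a in range(k)]


def systematic_generator(parity):
    """Generator matrix [I | P] of a systematic code."""
    k = len(parity)
    return [[int(a == c) for c in range(k)] + list(parity[a]) for a in range(k)]


def shorten(generator, positions):
    """Shorten on systematic positions: drop row a and column a for each a."""
    drop = set(positions)
    return [[v for c, v in enumerate(row) if c not in drop]
            for a, row in enumerate(generator) if a not in drop]


def rank_mod_p(matrix):
    """Rank over GF(P_MOD) via Gaussian elimination."""
    m = [row[:] for row in matrix]
    rank = 0
    for col in range(len(m[0])):
        pivot = next((i for i in range(rank, len(m)) if m[i][col] % P_MOD), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        inv = pow(m[rank][col], -1, P_MOD)
        m[rank] = [(v * inv) % P_MOD for v in m[rank]]
        for i in range(len(m)):
            if i != rank and m[i][col] % P_MOD:
                factor = m[i][col]
                m[i] = [(x - factor * y) % P_MOD for x, y in zip(m[i], m[rank])]
        rank += 1
    return rank


def is_mds(generator):
    """MDS check: every set of k columns of the generator has rank k."""
    k, n = len(generator), len(generator[0])
    return all(rank_mod_p([[row[c] for c in cols] for row in generator]) == k
               for cols in combinations(range(n), k))


G = systematic_generator(cauchy_parity(5, 3))  # an [8, 5] code
G_short = shorten(G, [0, 3])                   # a 2-shortening: [6, 3]
```

Both `G` and `G_short` pass the brute-force MDS check, and the redundancy $n - k = 3$ is the same before and after shortening.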

IV-A1 Generalized split regime

In the generalized split regime, is fixed, is arbitrary, and the final partition is such that and . Let . Then is a MDS code, and the code corresponding to each final stripe is some fixed shortening of . In this case, we define .

Definition 4.

A convertible code for the generalized split regime is a variant of a convertible code defined by:

  1. and as and codes, where ,

  2. a partition where , and

  3. a conversion procedure such that each final stripe , is an -shortening of where .

The generalized split regime has an access cost lower bound similar to the split regime presented in Section III. We show this by showing that a more efficient conversion procedure for the generalized split regime would imply the existence of a conversion procedure for split regime violating Theorem 6.

Theorem 9.

For all linear MDS convertible codes, the read access set satisfies:

Suppose, for the sake of contradiction, that there exists a conversion procedure with read access cost for some convertible code in the generalized split regime with codes and . We modify the initial code by lengthening it to an MDS code , such that and . This adds extra “pseudo-nodes” to the initial code, which we denote with .

We then define a new conversion procedure from code to code which uses the conversion procedure for the generalized split regime convertible code as a subroutine, and then simply reads all the added pseudo-nodes to construct the new nodes. This procedure only reads the read access set from along with the pseudo-nodes.

Hence, the total read access is,

However, the codes and with the new conversion procedure clearly form an MDS convertible code. Therefore, this is in contradiction to Theorem 6. Then, it must hold that .

This lower bound is achievable for all pairs of initial and final parameters. Similar to the case of the split regime, shown in Section III-B, we can use any systematic MDS codes as initial and final codes, and access all but a set of nodes of size (forming the largest final stripe) to perform this conversion, as described below.

Conversion procedure: All the systematic nodes are used as unchanged nodes. When or , the conversion is trivial since one cannot do better than the default approach. The conversion procedure for the nontrivial case proceeds as follows. For all but the largest final stripe, all unchanged nodes are read ( in total), and the new nodes are naively constructed from them. For the largest final stripe, the retired nodes are read, and then the unchanged nodes from the other final stripes are used to remove their interference from the retired nodes to obtain new nodes.

IV-A2 Generalized merge regime

In the generalized merge regime, the sets in the initial partition need not be all of the same size. In this case, we fix and , while is arbitrary. The initial partition is such that and . Let . Then is a MDS code, , and the code corresponding to each initial stripe is some fixed shortening of .

Definition 5.

A convertible code for the generalized merge regime is a variant of a convertible code defined by:

  1. as and codes, where

  2. partition where , and

  3. a conversion procedure such that each initial stripe , is an -shortening of where .

The next theorem gives a lower bound on the read access cost of a convertible code.

Theorem 10.

For all convertible code, for all . Furthermore, if , then for all .

Follows from the proofs of Lemmas 10, 11, and 13 in [1], with some straightforward modifications to account for the difference in the number of nodes of each initial stripe.

We can achieve this lower bound by shortening an access-optimal convertible code, where and .

IV-B Access cost lower bound for the general regime

In this subsection, we study the access cost lower bound for conversions in the general regime (i.e., for all valid parameter values, $n^I > k^I$ and $n^F > k^F$). As in the merge and split regimes, we show that when $r^I \geq r^F$, significant reduction in access cost can be achieved. However, when $r^I < r^F$, one cannot do better than the default approach.

For an convertible code with and partitions , let for and let .

Lemma 11.

For all linear MDS convertible codes with :

Moreover, if then for all .

Let be an initial stripe. There are two cases.

Case : In this case, we can reduce this conversion to a conversion in the generalized split regime by focusing on initial stripe , and considering messages which are zero everywhere outside of . This is equivalent to a convertible code. Then, the result follows from Theorem 9.

Case : Let . In this case, we can reduce this conversion to conversion in the generalized merge regime by focusing on final stripe , and considering messages which are zero everywhere outside of . This is equivalent to a convertible code. Then, the result follows from Theorem 10.

We prove a lower bound on the total access cost of conversion in the general regime by using Lemma 11 on all initial stripes and finding a partition that minimizes the value of the sum.

Theorem 12.

For every linear MDS convertible code such that , it holds that:

if . Furthermore, if or , then .

Clearly, it holds that . Then, the case follows directly from Lemma 11. Otherwise, by the same lemma we have:

(1)

First, we consider the case . Notice that in this case and . If , then the result is trivial, so assume . Since for all , we have:

which proves the result.

Now, we consider the case . Assume, for now, that the right hand side of Eq. 1 is minimized when:

(2)

Then, from Eq. 1 we have:

(3)

If , then the result is trivial, so assume . Then, by manipulating the terms of Eq. 3, the result is obtained.

It only remains to prove that the right hand side of Eq. 1 is minimized when Eq. 2 holds.

Notice that this is equivalent to showing that