A Unified Form of EVENODD and RDP Codes and Their Efficient Decoding

03/09/2018 ∙ by Hanxu Hou, et al. ∙ The Chinese University of Hong Kong NetEase, Inc

Array codes have been widely employed in storage systems, such as Redundant Arrays of Inexpensive Disks (RAID). The row-diagonal parity (RDP) codes and EVENODD codes are two popular double-parity array codes. As the capacity of hard disks increases, better fault tolerance, achieved by array codes with three or more parity disks, is needed. Although many extensions of RDP codes and EVENODD codes have been proposed, their main drawback is high decoding complexity. In this paper, we present a new construction for all families of EVENODD codes and RDP codes, and propose a unified form of them. Under this unified form, RDP codes can be treated as shortened codes of EVENODD codes. Moreover, an efficient decoding algorithm based on an LU factorization of the Vandermonde matrix is proposed for the case where the number of continuous surviving parity columns is no less than the number of erased information columns. The new decoding algorithm is faster than the existing algorithms when more than three information columns fail. The proposed decoding algorithm is also applicable to other Vandermonde array codes. Thus the proposed MDS array code is of practical value for storage systems that need higher reliability.


I Introduction

Array codes have been widely employed in storage systems, such as Redundant Arrays of Inexpensive Disks (RAID) [1, 2], to enhance data reliability. In the current RAID-6 system, two disks are dedicated to the storage of parity-check bits, so that any two disk failures can be tolerated. Many existing array codes can recover from any two disk failures, such as the EVENODD codes [3] and the row-diagonal parity (RDP) codes [4].

As the capacities of hard disks are increasing at a much faster pace than bit error rates are decreasing, the protection offered by double parity will soon be inadequate [5]. The issue of reliability is more pronounced in solid-state drives, which have significant wear-out rates when the frequency of disk writes is high. In order to tolerate three or more disk failures, the EVENODD codes were extended in [6], and the RDP codes were extended in [7, 8]. All of the above coding methods are binary array codes, whose codewords are $m \times n$ arrays with each entry belonging to the binary field $\mathbb{F}_2$, for some positive integers $m$ and $n$. Binary array codes enjoy the advantage that encoding and decoding can be done by Exclusive OR (XOR) operations. The disks are identified with columns, and the bits in each column are stored in the corresponding disk. A binary array code is said to be systematic if, for some positive integer $r$ less than $n$, the right-most $r$ columns store the parity bits, while the left-most $k = n - r$ columns store the uncoded data bits. If the array code can tolerate $r$ arbitrary erasures, then it is called a maximum-distance separable (MDS) array code. In other words, in an MDS array code, the information bits can be recovered from any $k$ columns.

I-A Related Works

There are many follow-up studies on EVENODD codes [3] and RDP codes [4] along different directions, such as extensions of fault tolerance [6, 9, 7], improvements to the repair problem [10, 11, 12, 13], and efficient decoding methods [14, 15, 16, 17] for their extensions.

Huang and Xu [14] extended the EVENODD codes to STAR codes with three parity columns. The EVENODD codes were extended by Blaum, Bruck and Vardy [6, 9] to three or more parity columns, with the additional assumption that the multiplicative order of 2 mod $p$ is equal to $p-1$. A sufficient condition for the extended EVENODD codes to be MDS with more than eight parity columns is given in [18]. Goel and Corbett [7] proposed the RTP codes, which extend the RDP codes to tolerate three disk failures. Blaum [8] generalized the RDP codes to correct more than three column erasures and showed that the extended EVENODD codes and generalized RDP codes share the same MDS property condition. Blaum and Roth [19] proposed Blaum-Roth codes, which are non-systematic MDS array codes constructed over a Vandermonde matrix. Some efficient systematic encoding methods for Blaum-Roth codes are given in [19, 20, 21]. We refer to the existing MDS array codes in [3, 4, 6, 9, 7, 8, 19, 14, 15, 16, 17] as Vandermonde MDS array codes, as their constructions are based on Vandermonde matrices.

Decoding complexity in this work is defined as the number of XORs required to recover the erased columns, at most $r$ of them (the number of parity columns, counting both information erasures and parity erasures), from $k$ surviving columns (where $k$ is the number of information columns). There are many decoding methods for extended EVENODD codes [15] and generalized RDP codes; however, most of them focus on three parity columns ($r = 3$). Jiang et al. [15] proposed a decoding algorithm for extended EVENODD codes with $r = 3$. To further reduce the decoding complexity of the extended EVENODD codes with $r = 3$, Huang and Xu [14] invented STAR codes. One extension of RDP codes with three parity columns is the RTP codes, whose decoding has been improved by Huang et al. [17]. Two efficient interpolation-based encoding algorithms for Blaum-Roth codes were proposed in [20, 21]. However, the efficient algorithms in [20, 21] are not applicable to the decoding of the extended EVENODD codes and generalized RDP codes. An efficient erasure decoding method that solves a Vandermonde linear system over a polynomial ring was given in [19] for Blaum-Roth codes, and the decoding method is also applicable to the erasure decoding of extended EVENODD codes if the number of information erasures is no larger than the number of continuous surviving parity columns. There is no efficient decoding method for arbitrary erasures, and one needs to employ a traditional decoding method such as Cramer's rule to recover the erased columns.

I-B Contributions

In this paper, we present a unified form of EVENODD codes and RDP codes that includes the existing RDP codes and their extensions in [4, 8], along with the existing EVENODD codes and their extensions in [3, 6, 9]. Under this unified form, these two families of codes are shown to be closely related. Based on this unified form, we also propose a fast method for the recovery of failed columns. This method is based on a factorization of the Vandermonde matrix into very sparse lower and upper triangular matrices. Similar to the decoding method in [19], the proposed fast decoding method can recover as many erasures as there are parity columns, provided that the number of information erasures is no larger than the number of continuous surviving parity columns. We then illustrate the methodology by applying it to EVENODD codes and RDP codes. We compare the decoding complexity of the proposed method with that of the method presented in [19] for the extended EVENODD codes and generalized RDP codes. The proposed method has lower decoding complexity than the decoding algorithm given in [19], and is also applicable to other Vandermonde MDS array codes.

II Unified Form of EVENODD Codes and RDP Codes

In this section, we first present EVENODD codes and RDP codes. Then, we give a unified form of them and illustrate that RDP codes are shortened EVENODD codes under this form.

The array codes considered in this paper contain $p-1$ rows and $k+r$ columns, where $p$ is an odd number. In the following, we let $k$ and $r$ be positive integers which are both no larger than $p$. Let $g = (g(0), g(1), \ldots, g(r-1))$ be an $r$-tuple consisting of distinct integers that range from 0 to $p-1$. The $i$-th entry of column $j$ is denoted by $a_{i,j}$ and $b_{i,j}$ for EVENODD codes and RDP codes, respectively. Subscripts are taken modulo $p$ throughout the paper unless otherwise specified.

II-A EVENODD Codes

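For an odd $p$, we define the EVENODD code as follows. It is a $(p-1) \times (k+r)$ array code, with the first $k$ columns storing the information bits and the last $r$ columns storing the parity bits. For $j = 0, 1, \ldots, k-1$, column $j$ is called an information column and stores the information bits $a_{0,j}, a_{1,j}, \ldots, a_{p-2,j}$; for $j = k, k+1, \ldots, k+r-1$, column $j$ is called a parity column and stores the parity bits $a_{0,j}, a_{1,j}, \ldots, a_{p-2,j}$.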

Given the information array for and , we add an extra imaginary row , for , to this information array. The parity bits in column are computed by

(1)

and the parity bits stored in column , , are computed by

(2)

where

(3)

We denote the EVENODD codes defined in the above equations as . The default values in are , and we simply write if the values in are default. An example of is given in Table I. Under the above definition, the EVENODD code in [3] is with , and the extended EVENODD code in [6] is with .
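As a minimal sketch of the encoding rules (1)-(3), the Python below assumes the usual EVENODD convention: for a slope s, the augmented parity bit in row i is the XOR of the information bits in positions (i - s*j mod p, j), and the stored parity bit is that value plus the adjuster bit taken from the imaginary row p-1. The function name evenodd_encode, the list-of-columns layout, and the treatment of the row-parity column as slope 0 are illustrative assumptions, not the authors' implementation.

```python
def evenodd_encode(info, p, g):
    """Return the r parity columns (p-1 bits each) for k information columns.

    info: list of k columns, each a list of p-1 bits.
    g:    tuple of r slopes; g[0] is assumed to be 0 (row parity).
    """
    k = len(info)
    # Append the imaginary all-zero row p-1 to every information column.
    cols = [list(col) + [0] for col in info]
    parity = []
    for slope in g:
        # Augmented parity column: XOR along lines of the given slope, rows mod p.
        aug = [0] * p
        for i in range(p):
            for j in range(k):
                aug[i] ^= cols[j][(i - slope * j) % p]
        adjuster = aug[p - 1]  # bit in the imaginary row; always 0 when slope == 0
        parity.append([bit ^ adjuster for bit in aug[:p - 1]])
    return parity


# One nonzero information bit in row 3 of column 1, with p = 5 and g = (0, 1, 2).
info = [[0, 0, 0, 0], [0, 0, 0, 1], [0, 0, 0, 0]]
print(evenodd_encode(info, 5, (0, 1, 2)))
# -> [[0, 0, 0, 1], [1, 1, 1, 1], [1, 0, 0, 0]]
```

In the example, the single bit falls on the line through the imaginary row for slope 1, so the adjuster bit is 1 and the whole stored parity column is flipped.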

TABLE I: Encoding of . Note that, by (3), and

II-B RDP Codes

RDP code is an array code of size . Given the parameters that satisfy , we add an extra imaginary row to the information array , for and , as in . The parity bits of the are computed as follows:

(4)
(5)

Like , the default value of are . The first 4 rows in Table II are the array of . The RDP code in [4] is with and is the extended RDP in [8].

II-C Unified Form

There is a close relationship between and when both array codes have the same number of parity columns. The relationship can be seen by augmenting the arrays as follows. For RDP codes, we define the corresponding augmented array as a array with the top rows the same as in , and the last row defined by for and

(6)

Note that (6) is the extension of (5) when . The auxiliary row in the augmented array is defined such that the column sums of columns are equal to zero. The above claim is proved as follows.

Lemma 1.

For , we have .

Proof.

The summation of all bits in column of the augmented array is the summation of all bits in columns 0 to . Since the summation of all bits in column is the summation of all bits in columns 0 to , we have that the summation of all bits in column is equal to 0. ∎
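Lemma 1 can be checked numerically. The sketch below builds the augmented RDP array under the assumption that the row-parity column is the XOR of the information columns and that every further parity column XORs the information columns together with the row-parity column along lines of slope g[l], with row indices taken modulo p. The function name rdp_augmented and the chosen parameters are illustrative.

```python
import random


def rdp_augmented(info, p, g):
    """Return the augmented array as a list of k + r columns of p bits each."""
    k = len(info)
    cols = [list(col) + [0] for col in info]     # imaginary zero row p-1
    row_parity = [0] * p
    for j in range(k):
        for i in range(p):
            row_parity[i] ^= cols[j][i]
    cols.append(row_parity)                      # column k
    for slope in g[1:]:
        diag = [0] * p
        for i in range(p):
            for j in range(k + 1):               # information + row-parity columns
                diag[i] ^= cols[j][(i - slope * j) % p]
        cols.append(diag)
    return cols


random.seed(0)
p, k, g = 5, 4, (0, 1, 2)
info = [[random.randint(0, 1) for _ in range(p - 1)] for _ in range(k)]
array = rdp_augmented(info, p, g)
# Every diagonal parity column of the augmented array has even weight.
print([sum(col) % 2 for col in array[k + 1:]])   # -> [0, 0]
```

The result does not depend on the random data: each diagonal column collects every bit of columns 0 to k exactly once, and the row-parity column cancels the information columns.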

By the above lemma, we can compute for as

An example of the augmented array of an RDP code is given in Table II.

TABLE II: The augmented array of the RDP code.

Similarly, for an , the augmented array is a array defined as follows. The first columns are the same as those of , i.e., for and , . For , we define the parity bits in column as

(7)

We note that is the same as defined in (3). According to (2), the parity bits in column of can be obtained from the augmented array by

Lemma 2.

The bits in column for of the augmented array can be obtained from by

(8)
(9)
Proof.

Note that

(11)

where (II-C) comes from (1) and (2), (11) comes from that for , and (II-C) comes from the fact that

for , . Therefore, we can obtain the bit by (8) and the other bits in parity column by (9). ∎
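The conversion in Lemma 2 can be sketched as a round trip. The code assumes that (8) sets the last-row bit of an augmented parity column to the XOR of all bits of the stored parity column and of the row-parity column, and that (9) adds this bit to each stored parity bit; both forms follow from the assumed encoding rule (XOR along lines of a given slope, rows modulo an odd p) rather than from the authors' code. All function names are hypothetical.

```python
import random


def augmented_column(info, p, slope):
    """Augmented parity column (p bits): XOR of information bits along a slope."""
    cols = [list(col) + [0] for col in info]     # imaginary zero row p-1
    return [
        sum(cols[j][(i - slope * j) % p] for j in range(len(cols))) % 2
        for i in range(p)
    ]


def to_stored(aug):
    """Stored parity column: add the last augmented bit to the other bits."""
    return [bit ^ aug[-1] for bit in aug[:-1]]


def to_augmented(stored, row_parity):
    """Rebuild the augmented column from the stored and row-parity columns."""
    last = 0
    for bit, row_bit in zip(stored, row_parity):
        last ^= bit ^ row_bit                        # assumed form of (8)
    return [bit ^ last for bit in stored] + [last]   # assumed form of (9)


random.seed(0)
p, k = 7, 5
info = [[random.randint(0, 1) for _ in range(p - 1)] for _ in range(k)]
row_parity = to_stored(augmented_column(info, p, 0))
for slope in (1, 2, 3):
    aug = augmented_column(info, p, slope)
    assert to_augmented(to_stored(aug), row_parity) == aug
```

The round trip relies on p being odd: the adjuster bit is added to an even number (p-1) of stored bits, so it cancels when those bits are summed.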

The augmented array of the EVENODD code is given in Table III.

TABLE III: The augmented array of the EVENODD code.

The augmented array of the RDP code can be obtained by shortening the augmented array of the EVENODD code, and we summarize this fact in the following.

Proposition 3.

Let of be the same as of . The augmented array of can be obtained from shortening the augmented array of as follows: (i) imposing the following additional constraint on the information bits

(13)

for ; (ii) removing column of the augmented array of .

Proof.

Consider the augmented array of and assume that the information bits of column satisfy (13). By (1), the parity bits in column are all zeros. After deleting column from the augmented array of and reindexing the columns after this deleted column by reducing all indices by one, we have a new array with columns of a shortened . Let the augmented array of with the information columns being the same as the first information columns of the augmented array of such that these columns are the same as those of the array of the shortened . Then column of the augmented array of is the same as column of the array of the shortened according to (13) and (4). Recall that the bit in column , and , of the augmented array of is computed by (5) (or (6)). Since for and , is the same as in the array of the shortened that is defined by (7). Therefore, we can obtain the augmented by shortening the augmented by imposing the condition (13) and removing column , and this completes the proof. ∎
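The shortening argument can be illustrated with a small sketch. It assumes that the EVENODD code in the proposition has one more information column than the RDP code, that constraint (13) forces this extra information column to be the XOR of the others, and that parity columns are XORs along lines of slope g[l] with rows taken modulo p. The helper names are hypothetical.

```python
import random


def shifted_sum(cols, p, slope):
    """XOR over j of column j cyclically shifted down by slope * j rows."""
    out = [0] * p
    for j, col in enumerate(cols):
        for i in range(p):
            out[i] ^= col[(i - slope * j) % p]
    return out


def evenodd_augmented_parity(info_cols, p, g):
    """Augmented parity columns of an EVENODD code (columns have p rows)."""
    return [shifted_sum(info_cols, p, slope) for slope in g]


def rdp_augmented_parity(info_cols, p, g):
    """Augmented parity columns of an RDP code: row parity, then diagonals."""
    row_parity = shifted_sum(info_cols, p, 0)
    diagonals = [shifted_sum(info_cols + [row_parity], p, s) for s in g[1:]]
    return [row_parity] + diagonals


random.seed(0)
p, k, g = 5, 4, (0, 1, 2)
# k information columns, each extended by the imaginary zero row p-1.
info = [[random.randint(0, 1) for _ in range(p - 1)] + [0] for _ in range(k)]
rdp = rdp_augmented_parity(info, p, g)
# (i) constrain the extra information column to be the XOR of the others.
constrained = info + [shifted_sum(info, p, 0)]
evenodd = evenodd_augmented_parity(constrained, p, g)
assert constrained[-1] == rdp[0]      # extra column plays the role of row parity
assert evenodd[0] == [0] * p          # the EVENODD row-parity column vanishes
assert evenodd[1:] == rdp[1:]         # (ii) dropping it leaves the RDP parities
```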

By Proposition 3, the unified form of EVENODD codes and RDP codes is the augmented array of the EVENODD code. In the following, we focus on EVENODD codes, as the augmented array of an RDP code can be viewed as a shortened augmented array of an EVENODD code.

III Algebraic Representation

Let $\mathbb{F}_2[x]$ be the ring of polynomials over the binary field $\mathbb{F}_2$, and $R_p$ be the quotient ring $\mathbb{F}_2[x]/(1+x^p)$. An element in $R_p$ can be represented by a polynomial of degree strictly less than $p$ with coefficients in $\mathbb{F}_2$; we will refer to an element of $R_p$ as a polynomial in the sequel. Note that the multiplication of two polynomials in $R_p$ is performed modulo $1+x^p$.

The ring $R_p$ has been discussed in [22, 23] and has been used in designing regenerating codes with low computational complexity. Let $M_p(x) = 1 + x + \cdots + x^{p-1}$. The ring $R_p$ is isomorphic to a direct sum of two finite fields $\mathbb{F}_2[x]/(1+x)$ and $\mathbb{F}_2[x]/(M_p(x))$ if and only if 2 is a primitive element in $\mathbb{F}_p$ [24]; that is, $\mathbb{F}_2[x]/(M_p(x))$ is a finite field when 2 is a primitive element in $\mathbb{F}_p$. In [25], $\mathbb{F}_2[x]/(M_p(x))$ was used for performing computations in $\mathbb{F}_{2^{p-1}}$, when $p$ is a prime such that 2 is a primitive element in $\mathbb{F}_p$. In addition, Blaum et al. [6, 9] discussed the rings $\mathbb{F}_2[x]/(M_p(x))$ in detail.
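A minimal sketch of arithmetic in this quotient ring, assuming the modulus is 1 + x^p so that multiplication by x^i is a cyclic shift of the p coefficients. Polynomials are stored as Python integers whose bit i is the coefficient of x^i; the function name mul_mod is illustrative.

```python
def mul_mod(a, b, p):
    """Multiply a and b in F_2[x]/(1 + x^p); bit i of an int is the x^i coefficient."""
    mask = (1 << p) - 1
    result = 0
    for i in range(p):
        if (b >> i) & 1:
            # x^i * a is a cyclic rotation of a's p coefficients by i positions.
            result ^= ((a << i) | (a >> (p - i))) & mask
    return result


p = 5
a = 0b10010                      # x + x^4
print(mul_mod(a, 0b00100, p))    # (x + x^4) * x^2 = x^3 + x  -> 10
all_ones = (1 << p) - 1          # 1 + x + x^2 + x^3 + x^4
print(mul_mod(all_ones, 0b11, p))  # times (1 + x) gives 1 + x^5 = 0  -> 0
```

The second product is zero because 1 + x^p factors as (1 + x)(1 + x + ... + x^(p-1)), which is the factorization behind the direct-sum decomposition mentioned above.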

We will represent each column in a augmented array of by a polynomial in , so that a array is identified with an -tuple

(14)

in , where . Under this representation, the augmented array of can be defined in terms of a Vandermonde matrix.

In the array, the bits in column can be represented as a polynomial

for . The first polynomials are called information polynomials, and the last polynomials are the parity polynomials. The parity bit of augmented array of defined in (7) is equivalent to the following equation over the ring

(15)

where is the Vandermonde matrix

(16)

and additions and multiplications in the above calculations are performed in . (15) can be verified as follows:

(17)

Let , we have

which is the same as (7). In other words, each parity column in the augmented array of EVENODD codes is obtained by adding some cyclically shifted version of the information columns.

Recall that is a polynomial over for . When we reduce a polynomial modulo , it means that we replace the coefficient with for . When , we have that . If we reduce modulo , we obtain itself, of which the coefficients are the bits of column of , for . Recall that the coefficients of for are computed by (7). If we reduce modulo , i.e., replace the coefficients for by , which are that are bits in column of . In fact, we have shown how to convert augmented array of into original array of .
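The two steps described above, forming parity polynomials from cyclically shifted information polynomials and then reducing them, can be sketched as follows. The code assumes the parity polynomial for slope s is the sum over j of the j-th information polynomial times x^(s*j) modulo 1 + x^p, and that reduction modulo 1 + x + ... + x^(p-1) adds the top coefficient to every other coefficient. Polynomials are integers with bit i holding the coefficient of x^i; all names are illustrative.

```python
def rotl(a, s, p):
    """Multiply a by x^s modulo 1 + x^p (cyclic rotation of p coefficients)."""
    s %= p
    return ((a << s) | (a >> (p - s))) & ((1 << p) - 1)


def augmented_parities(info, p, g):
    """Parity polynomials of the augmented array, one per slope in g."""
    out = []
    for slope in g:
        acc = 0
        for j, poly in enumerate(info):
            acc ^= rotl(poly, slope * j, p)
        out.append(acc)
    return out


def reduce_mod_Mp(a, p):
    """Reduce modulo 1 + x + ... + x^(p-1): add the x^(p-1) coefficient to all others."""
    if (a >> (p - 1)) & 1:
        a ^= (1 << p) - 1
    return a


# Information polynomials 0, x^3, 0 with p = 5 and slopes (0, 1, 2).
aug = augmented_parities([0, 0b01000, 0], 5, (0, 1, 2))
print(aug)                                   # -> [8, 16, 1]
print([reduce_mod_Mp(a, 5) for a in aug])    # -> [8, 15, 1]
```

After reduction, every polynomial has degree at most p-2, so its coefficients fit in the p-1 rows of the original array.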

By Proposition 3, we can obtain the augmented array of by multiplying

and removing the -th component, which is always equal to zero, in the resultant product. If we arrange all coefficients in the polynomials with degree strictly less than , we get the original array of .

When for , the MDS property condition of is the same as that of the extended EVENODD codes [6], and the MDS property condition of and was given in [6] and [18], respectively. Note that the MDS property condition depends on that 2 is a primitive element in . This is the reason of the assumption of primitivity of 2 in . In the following of the paper, we assume that with for and with for , and the proposed and are MDS codes. We will focus on the erasure decoding for these two codes.

When some columns of are erased, we assume that the number of erased information columns is no larger than the number of continuous surviving parity columns. Note that one needs to recover the failure columns by downloading surviving columns. First, we represent the downloaded columns by some information polynomials and continuous parity polynomials. Then, we can subtract all the downloaded information polynomials from the parity polynomials to obtain a Vandermonde linear system. Although can be described by the Vandermonde matrix given in (16) over and we can solve the Vandermonde linear system over to recover the failure columns, it is more efficient to solve the Vandermonde linear system over . First, we will show in the next section that we can first perform calculation over and then reduce the results modulo in the decoding process. An efficient decoding algorithm to solve Vandermonde linear system over based on LU factorization of Vandermonde matrix is then proposed in Section V.

IV Vandermonde Matrix over

Before we focus on the efficient decoding method of and , we first present some properties of Vandermonde matrix. As the decoding algorithm hinges on a quick method in solving a Vandermonde system of equations over , we discuss some properties of the linear system of Vandermonde matrix over in this section.

Let be an Vandermonde matrix

(18)

where are distinct integers such that the difference of each pair of them is relatively prime to . The entries of are considered as polynomials in . We investigate the action of multiplication over by defining the function :

for . Obviously, is a homomorphism of abelian group and we have for .

The function

is not surjective. If a vector

is equal to for some , it is necessary that

(19)

This is due to the fact that each polynomial is obtained by adding certain cyclically shifted version of ’s. In other words, if is in the image of , then either there are even number of nonzero terms in all , or there are odd number of nonzero terms in all for .

The function is also not injective. We can see this by observing that if we add the polynomial to a component of , for example, adding to , then

Hence, if we add to two distinct components of input vector , then the value of does not change. We need the following lemma before discussing the properties of the Vandermonde linear system over .
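Both observations can be checked with a small experiment. The sketch assumes arithmetic modulo 1 + x^p, a Vandermonde map whose t-th output component is the sum over i of u[i] times x^(i * exps[t]), and that the polynomial being added is 1 + x + ... + x^(p-1); the name vandermonde_map and the exponents are hypothetical, and the matrix orientation may differ from the one in (18).

```python
import random


def rotl(a, s, p):
    """Multiply a by x^s modulo 1 + x^p (cyclic rotation of p coefficients)."""
    s %= p
    return ((a << s) | (a >> (p - s))) & ((1 << p) - 1)


def vandermonde_map(u, exps, p):
    """Component t of the image is the XOR over i of u[i] * x^(i * exps[t])."""
    image = []
    for e in exps:
        acc = 0
        for i, poly in enumerate(u):
            acc ^= rotl(poly, i * e, p)
        image.append(acc)
    return image


random.seed(0)
p, exps = 7, (0, 1, 2)
u = [random.randrange(1 << p) for _ in exps]
v = vandermonde_map(u, exps, p)

# Not surjective: all image components share the same weight parity.
assert len({bin(c).count("1") % 2 for c in v}) == 1

# Not injective: adding 1 + x + ... + x^(p-1) to two components changes nothing.
all_ones = (1 << p) - 1
w = list(u)
w[0] ^= all_ones
w[2] ^= all_ones
assert vandermonde_map(w, exps, p) == v
```

The second check works because the all-ones polynomial is invariant under cyclic shifts, so the two added copies cancel in every output component.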

Lemma 4.

[6, Lemma 2.1] Suppose that is an odd number and is relatively prime to , then and are coprime in , and and are relatively prime in for any positive integer .

If the vector satisfies (19), in the next theorem, we show that there are many vectors such that .

Theorem 5.

Let