On the Optimal Minimum Distance of Fractional Repetition Codes

05/14/2020 ∙ by Bing Zhu, et al. ∙ Central South University ∙ The Chinese University of Hong Kong

Fractional repetition (FR) codes are a class of repair-efficient erasure codes that can recover a failed storage node with both optimal repair bandwidth and optimal repair complexity. In this paper, we study the minimum distance of FR codes, which is the smallest number of nodes whose failure leads to the unrecoverable loss of the stored file. We consider upper bounds on the minimum distance and present several families of explicit FR codes attaining these bounds. The optimal constructions are derived from regular graphs and from combinatorial designs.

I Introduction

Modern distributed cloud storage systems that are built on a large number of independent storage devices can provide large-scale storage services in a cost-efficient manner. However, due to the commodity nature of physical storage nodes, component failures may occur unexpectedly (and even frequently) in such systems. To ensure data availability, distributed storage systems need to introduce a certain level of data redundancy in order to protect the stored data against node failures. A simple solution that is deployed in realistic systems is to store several replicas of each data object, and upon failure of a single storage node, the lost data can be exactly recovered by downloading the remaining replicas from other nodes. Data replication supports efficient repair and is easy to manage in practice, yet it suffers from the main drawback of high storage overhead. Instead, erasure coding has emerged as a promising technology for distributed storage systems with the advantage of achieving higher storage efficiency [1]. For example, an $(n, k)$ maximum distance separable (MDS) code encodes a data file consisting of $k$ symbols into $n$ coded blocks in such a manner that the original data can be reconstructed by accessing any $k$ out of the $n$ coded blocks (this property is called the MDS property). In this traditional coding framework, a failed node can be repaired by contacting $k$ surviving nodes and then re-encoding the lost data. This recovery method turns out to be bandwidth consuming, since the repair of one block incurs the transfer of $k$ coded blocks over the storage network.

Regenerating codes are introduced in [2] with the capability of minimizing the network traffic required for repairing a failed storage node. In an regenerating code, a data object is encoded into coded packets, which are distributed across storage nodes, each containing packets. The source data can be reconstructed by contacting any nodes in the system, where is called the reconstruction degree. When a node fails, the lost packets can be regenerated by connecting to any subset of surviving nodes and downloading packets from each node. The number of nodes contacted for repair (i.e., ) is called the repair locality, and the total number of packet transmissions (i.e., ) is called the repair bandwidth. If the repair bandwidth equals the storage capacity of the failed node, the corresponding code is called a minimum-bandwidth regenerating (MBR) code. In the regenerating framework proposed in [2], each set of packets is obtained as linear combinations of the packets stored in the connected node, which increases the computational complexity of repair. When the transferred packets are instead taken as subsets of the packets in the helper nodes, the repair regime is called repair-by-transfer [3] or uncoded repair [4]. In other words, this repair approach enjoys the same efficiency as data replication. Erasure codes with this desirable property can be found in [3]–[10].

Fractional repetition (FR) codes are a special class of MBR codes proposed in [4] that enable uncoded repairs of failed storage nodes. They have a tailor-made encoding architecture in which an outer MDS code first encodes the data objects and then an inner FR code replicates and distributes the coded packets across the storage nodes in a sophisticated manner. The stored file can be recovered by downloading a sufficient number of distinct coded packets and decoding the original data according to the MDS property. Moreover, the repair of a failed storage node can be completed by contacting some specific sets of helper nodes that contain the remaining replicas of the lost packets, which differs from conventional MBR codes, wherein any set of surviving nodes is eligible for node repair.

Consider a distributed storage system consisting of nodes, where each node stores the same number of packets. Suppose that the size of stored data objects is . The minimum distance of this storage system, denoted by , is the number of nodes such that

  1. there exists at least one set of nodes whose erasure leads to the unrecoverable loss of the source data;

  2. the stored file can be recovered for any subset of node erasures.

In [11], Kamath et al. derived a Singleton-like bound on the minimum distance as follows

(1)

Furthermore, Papailiopoulos and Dimakis showed in [12] that erasure codes with the local repair property (i.e., ) will incur a penalty on the maximum possible minimum distance. They proved that for an erasure code with repair locality , the minimum distance of the corresponding storage system is upper bounded by

(2)

Clearly, the upper bound in (2) reduces to that in (1) when . To the best of our knowledge, existing FR codes attaining the upper bounds above are known only for some scattered examples [13] or for small reconstruction degrees (e.g., [14, 15]).
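To make the relation between the two bounds concrete, the following minimal sketch evaluates both of them numerically. It assumes the forms in which these bounds are commonly stated, namely $d \le n - \lceil M/\alpha \rceil + 1$ for (1) and $d \le n - \lceil M/\alpha \rceil - \lceil M/(r\alpha) \rceil + 2$ for (2), where $n$ is the number of nodes, $\alpha$ the node capacity, $M$ the file size, and $r$ the repair locality. The function names and the sample parameters are hypothetical and chosen only for illustration.

```python
def ceil_div(a: int, b: int) -> int:
    """Integer ceiling of a / b for a positive divisor b."""
    return -(-a // b)


def singleton_like_bound(n: int, alpha: int, file_size: int) -> int:
    """Assumed form of bound (1): d <= n - ceil(M / alpha) + 1."""
    return n - ceil_div(file_size, alpha) + 1


def locality_bound(n: int, alpha: int, file_size: int, r: int) -> int:
    """Assumed form of bound (2):
    d <= n - ceil(M / alpha) - ceil(M / (r * alpha)) + 2."""
    return n - ceil_div(file_size, alpha) - ceil_div(file_size, r * alpha) + 2


if __name__ == "__main__":
    # Hypothetical parameters: 10 nodes, 3 packets per node, file of 9 packets.
    n, alpha, file_size = 10, 3, 9
    for r in (2, 3, 4):
        print(r, singleton_like_bound(n, alpha, file_size),
              locality_bound(n, alpha, file_size, r))
```

Under these assumed forms, the two bounds coincide as soon as $r\alpha \ge M$, because the second ceiling term then equals one; a smaller locality strictly lowers the bound.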

Method Code parameter File size Requirements
Theorem 4
Corollary 5
Theorem 6
Theorem 7
Theorem 8 , where is the largest integer such that Inequality (19) holds.
Corollary 9
Table I: Summary of Optimal FR Codes Attaining the Singleton-like Bound in (1), Where Is a Prime and Is a Prime Power

Contributions: In this paper, we study upper bounds on the minimum distance of FR codes and present explicit code constructions that attain these bounds. Specifically, the main contributions are summarized as follows.

  • We show that the Singleton-like upper bound in (1) is equivalent to a simple lower bound on the reconstruction degree of FR codes. Based on this relation, we present several families of optimal FR codes attaining the bound in (1), which are summarized in Table I. For the given file size, the minimum distance of the corresponding system is optimal with respect to the Singleton-like bound in (1). (In [13, 14], the authors showed via examples that the constructed FR codes attain the bound in (1), where the corresponding reconstruction degrees are mainly within the range . As summarized in Table I, our proposed constructions support a wide range of code parameters.)

  • We propose explicit constructions of FR codes attaining the minimum distance bound in (2), i.e., optimal FR codes that also have the local repair property. The proposed constructions are derived from regular graphs with large girth.

  • We derive an improved upper bound on the minimum distance of FR codes, which is tighter than the existing bounds especially for some large file sizes. Moreover, we also discuss some optimal FR codes attaining the proposed upper bound.

Organization: The remainder of this paper is organized as follows. Section II provides a brief overview of FR codes, including the necessary definitions, properties, and related works. Section III introduces several families of FR codes that achieve the Singleton-like bound in (1). Section IV proposes optimal FR codes attaining the upper bound in (2). Section V presents a new upper bound on the minimum distance of FR codes and compares this bound to the existing bounds. Finally, Section VI concludes the paper.

II Preliminaries

II-A Fractional Repetition Codes

An incidence structure is a triple , where  and  are nonempty (finite) sets and is a subset of , i.e., . The elements in  are called points, and the elements in are called blocks. We say that a point is incident with a block if . An -FR code is an incidence structure with such that each point is incident with blocks in and each block is incident with points in . Thus, the number of points in is . Moreover, the dual code of , denoted by , is the incidence structure , where is the subset of defined by

Since the roles of points and blocks are reversed in , it follows that forms a -FR code [16].
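The duality described above can be sketched in a few lines. The example below is a minimal illustration, not the authors' implementation: an FR code is represented as a dictionary from block labels to sets of points, and the dual is obtained by swapping the two roles. The sample structure (the complete graph on four vertices, with vertices as blocks and edges as points) and all names are assumptions made for illustration.

```python
from itertools import combinations


def dual(blocks):
    """Swap the roles of points and blocks.

    `blocks` maps a block label to the set of points incident with it.
    The result maps each point to the set of block labels incident with it.
    """
    dual_blocks = {}
    for label, pts in blocks.items():
        for p in pts:
            dual_blocks.setdefault(p, set()).add(label)
    return dual_blocks


# Illustrative structure (an assumption): the complete graph K4, where each
# vertex is a block and each edge is a point.
edges = [frozenset(e) for e in combinations(range(4), 2)]
k4_blocks = {v: {e for e in edges if v in e} for v in range(4)}
k4_dual = dual(k4_blocks)

if __name__ == "__main__":
    print(len(k4_blocks), {len(b) for b in k4_blocks.values()})
    print(len(k4_dual), {len(b) for b in k4_dual.values()})
```

In this example the original code has 4 blocks of size 3 with every point replicated twice, while the dual has 6 blocks of size 2 with every point replicated three times; applying `dual` twice returns the original structure.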

To store a data object of size , we first perform the encoding operation by adopting an MDS code and then spread the coded packets across storage nodes using an -FR code in the following manner: each coded packet is associated with a point in and each storage node is associated with a block in . Therefore, each packet is equally replicated times and each node contains exactly packets. According to the MDS property, we need to collect distinct coded packets for data retrieval, and the smallest number of nodes in that cover at least distinct packets gives the reconstruction degree of . In this sense, the value of (i.e., the dimension of the outer MDS code) is closely related to the reconstruction degree of the inner FR code, which motivates the following definition.

For a given reconstruction degree , the supported file size of , denoted by , is defined as

Intuitively, the value of refers to the smallest number of distinct points in any subset of blocks in . Thus by setting as the dimension of the outer MDS code, we can decode the stored data by contacting arbitrary storage nodes in . For , let be the reconstruction degree such that . Then, the minimum distance of -based system is when the size of stored data object is .
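For small codes, the quantities in this definition can be computed by exhaustive search. The sketch below is a brute-force illustration under stated assumptions: blocks are given as a list of sets, the supported file size for degree $k$ is the minimum number of distinct points over all $k$-subsets of blocks, and the minimum distance is taken as $n - k + 1$ for the smallest such $k$ (any $k$ nodes suffice, while some $k - 1$ nodes do not). The function names and the K4-based example are hypothetical.

```python
from itertools import combinations


def supported_file_size(blocks, k):
    """Smallest number of distinct points covered by any k blocks."""
    return min(len(set().union(*chosen)) for chosen in combinations(blocks, k))


def reconstruction_degree(blocks, file_size):
    """Smallest k such that every k blocks cover at least file_size points."""
    for k in range(1, len(blocks) + 1):
        if supported_file_size(blocks, k) >= file_size:
            return k
    raise ValueError("file_size exceeds the number of distinct points")


def minimum_distance(blocks, file_size):
    """n - k + 1, where k is the reconstruction degree for this file size."""
    return len(blocks) - reconstruction_degree(blocks, file_size) + 1


# Illustrative FR code (an assumption): vertices of K4 as nodes, edges as packets.
edges = [frozenset(e) for e in combinations(range(4), 2)]
k4_blocks = [frozenset(e for e in edges if v in e) for v in range(4)]

if __name__ == "__main__":
    print([supported_file_size(k4_blocks, k) for k in range(1, 5)])
    print([minimum_distance(k4_blocks, m) for m in range(1, 7)])
```

The search enumerates all $k$-subsets of blocks, so it is only practical for small $n$; this is in line with the difficulty of computing the file size in general.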

However, it is a non-trivial task to calculate the file size of FR codes with given parameters. In [4], a tight upper bound on the supported file size of an -FR code is given as

(3)

where is defined recursively by

(4)

For example, consider now the -FR code with , and . It can be computed that

We note here that achieves the upper bound on the file size in (3) for , yet it attains the Singleton-like bounds in (1) and (2) only for .
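The recursive bound can be evaluated mechanically. The sketch below assumes the recursion has the form given for FR codes in [4], that is, $g(1) = \alpha$ and $g(k+1) = g(k) + \alpha - \lceil (\rho\, g(k) - k\alpha)/(n - k) \rceil$ for an FR code with $n$ nodes, node capacity $\alpha$, and repetition degree $\rho$. Both this form and the function names are assumptions made for illustration.

```python
def ceil_div(a: int, b: int) -> int:
    """Integer ceiling of a / b for a positive divisor b."""
    return -(-a // b)


def file_size_upper_bound(n: int, alpha: int, rho: int, k: int) -> int:
    """Evaluate g(k) under the assumed recursion
    g(1) = alpha, g(j + 1) = g(j) + alpha - ceil((rho * g(j) - j * alpha) / (n - j)).
    Requires 1 <= k <= n.
    """
    g = alpha
    for j in range(1, k):
        g = g + alpha - ceil_div(rho * g - j * alpha, n - j)
    return g


if __name__ == "__main__":
    # Hypothetical parameters: the FR code of the complete graph K5
    # (5 nodes, 4 packets per node, each packet stored twice).
    print([file_size_upper_bound(5, 4, 2, k) for k in range(1, 5)])
```

For the complete graph on five vertices, any $k$ nodes share exactly $\binom{k}{2}$ packets, so the true file sizes are $4k - \binom{k}{2}$; the recursion reproduces these values for $k = 1, \dots, 4$, which is a quick sanity check of the assumed form.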

In [16], the complementary supported file size of an -FR code , denoted by , is defined as follows

In contrast, refers to the size of the largest set of packets that are not covered by storage nodes in . By virtue of , the elementary relationship between the supported file size of an FR code and its dual is revealed, as detailed in the following lemma.

Lemma 1.

([16]) Let be an -FR code with , and let be the dual code. Then, the supported file size of can be determined as

(5)

II-B Related Work

The construction of FR codes has attracted considerable attention over the past decade, with most constructions derived from regular graphs and combinatorial designs. In the pioneering work in [4], the authors provided explicit constructions of FR codes based on regular graphs and Steiner systems. The constructions from bipartite cage graphs in [17] have the advantage that the corresponding system can be easily expanded without frequent reconfigurations. In [18], Silberstein and Etzion presented some optimal FR codes that achieve the upper bound on the supported file size in (3), which are constructed from incidence structures including extremal graphs, transversal designs and generalized polygons. Constructions of FR codes from partially ordered sets are considered in [19], wherein the resulting codes can store larger files than MBR codes for the same system parameters. In [13], Olmez and Ramamoorthy investigated constructions of FR codes based on resolvable combinatorial designs including affine geometries, Hadamard designs, and mutually orthogonal Latin squares. New FR codes can be obtained from existing codes by using techniques such as the Kronecker product [13], the tensor product [16], and symbol extension [20]. FR codes that support local repair are studied in [15] and [21], which are devised from bipartite graphs and symmetric designs, respectively. Moreover, the FR codes proposed in [22] enjoy the additional property that each helper node contributes the same amount of data during the repair of multiple failed nodes, and FR codes attaining lower bounds on the reconstruction degree are discussed in [23].

Despite the rich source for code constructions, it remains a challenging task to calculate the supported file size of FR codes, which is essentially equivalent to the problem of determining the expansion of bipartite graphs [13]. Among the constructions above, only a few have determined the supported file size of designed FR codes for certain parameter ranges due to the algebraic properties of corresponding constructions.

III Optimal FR Codes Attaining the Singleton-like Bound in (1)

In this section, we consider explicit constructions of FR codes that achieve the Singleton-like bound in (1). We begin with the following useful theorem that establishes a connection between the Singleton-like bound on the minimum distance and a lower bound on the reconstruction degree of FR codes.

Theorem 2.

Let be an -FR code. Then, we have

(6)

with equality holding if and only if .

Proof:

Since each storage node in contains coded packets, we obtain at most packets from any collection of nodes, which gives that . As is an integer, the inequality in (6) follows. Because the stored file can be recovered from any nodes, the system can tolerate node failures, i.e., .

Suppose . We can write for some integer . In this case, it follows that , meaning that any nodes do not contain enough packets for successful decoding. This implies .

Conversely, suppose now . Because holds in general, we obtain , which is equivalent to . ∎

Based on the result above, our main objective in this section is to design FR codes such that the following relation holds

(7)

Clearly, the following result holds for .

Lemma 3.

An FR code attains the Singleton-like bound in (1) for if there are no repeated blocks in .

In the following discussions, we mainly focus on optimal FR codes with reconstruction degree . Since an incidence structure is a basic concept in the theory of graphs and combinatorial designs, we discuss code constructions from the two methods separately.

III-A Optimal Constructions from Regular Graphs

A graph consists of a vertex set and an edge set, and two vertices are said to be adjacent if there exists an edge between them. The degree of a vertex is defined as the number of edges that are incident with it, and if every vertex in has the same degree of , then is called an -regular graph. In particular, Turán graphs are a special family of regular graphs defined as follows. Let and be two integers such that divides . An -Turán graph is formed by partitioning a set of vertices into distinct subsets of equal size, and connecting two vertices by an edge if and only if they belong to different subsets. It follows that the degree of each vertex is .

In graph theory, a cycle is a nonempty trail in which the only repeated vertices are the first and last vertices (i.e., each vertex has degree two). The length of a cycle is its number of edges, and the girth of a graph is the minimum length of a cycle in the graph. As a concrete example, the girth of an -Turán graph with and is , and if , the girth is .
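The definitions of Turán graphs and girth can be checked on small instances. The following sketch builds a Turán graph from the partition rule and computes the girth with a breadth-first search from every vertex. It relies on the standard facts that a Turán graph with at least three parts contains a triangle (girth 3), and that with two parts of size at least two it is a complete bipartite graph (girth 4). The function names and the parameter convention (`n` vertices, `r` parts) are assumptions made for illustration.

```python
from collections import deque
from itertools import combinations
from math import inf


def turan_graph(n: int, r: int):
    """Adjacency sets of the Turan graph with n vertices split into r equal parts."""
    if n % r != 0:
        raise ValueError("r must divide n")
    size = n // r
    adj = {v: set() for v in range(n)}
    for u, v in combinations(range(n), 2):
        if u // size != v // size:  # vertices lie in different parts
            adj[u].add(v)
            adj[v].add(u)
    return adj


def girth(adj):
    """Length of a shortest cycle of a simple graph (inf if the graph is acyclic)."""
    best = inf
    for source in adj:
        dist = {source: 0}
        parent = {source: None}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for w in adj[u]:
                if w not in dist:
                    dist[w] = dist[u] + 1
                    parent[w] = u
                    queue.append(w)
                elif parent[u] != w:  # non-tree edge closes a cycle
                    best = min(best, dist[u] + dist[w] + 1)
    return best


if __name__ == "__main__":
    for n, r in ((6, 3), (6, 2)):
        g = turan_graph(n, r)
        print(n, r, {len(nb) for nb in g.values()}, girth(g))
```

Each vertex is joined to all vertices outside its own part, so the common degree is `n - n // r`, which the example confirms for both instances.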

The rationale of the regular-graph-based construction is to treat each vertex as a block and each edge as a point. Hence, each packet in the resulting FR code is replicated twice and the degree of each vertex determines the capacity of storage nodes, i.e., an -regular graph with vertices can be adopted to yield an -FR code. In [18], Silberstein and Etzion derived the supported file size of the FR codes based on Turán graphs and regular graphs with large girth. Specifically, the file size of the FR code constructed from an -regular graph with girth is

(8)

and an -Turán graph based FR code has file size

(9)

for .
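The effect of girth on the file size can be verified by brute force on a small graph. The sketch below uses the Petersen graph (3-regular, 10 vertices, girth 5) as an illustrative example that is not taken from the text. It relies on a simple counting argument: $k$ nodes cover $k\alpha$ minus the number of edges they induce, and any set of fewer vertices than the girth induces a forest with at most $k - 1$ edges, so the file size is $k\alpha - (k - 1)$ in that range for a connected graph. All function names are hypothetical.

```python
from itertools import combinations


def petersen_edges():
    """Edge set of the Petersen graph: outer 5-cycle, spokes, inner pentagram."""
    edges = set()
    for i in range(5):
        edges.add(frozenset((i, (i + 1) % 5)))          # outer cycle
        edges.add(frozenset((i, i + 5)))                # spoke
        edges.add(frozenset((5 + i, 5 + (i + 2) % 5)))  # inner pentagram
    return edges


def graph_fr_blocks(edges):
    """One block per vertex, containing the edges (packets) incident with it."""
    vertices = sorted({v for e in edges for v in e})
    return [frozenset(e for e in edges if v in e) for v in vertices]


def supported_file_size(blocks, k):
    """Smallest number of distinct packets covered by any k nodes."""
    return min(len(set().union(*chosen)) for chosen in combinations(blocks, k))


if __name__ == "__main__":
    blocks = graph_fr_blocks(petersen_edges())
    alpha = 3
    for k in range(1, 6):
        print(k, supported_file_size(blocks, k), k * alpha - (k - 1))
```

For $k = 5$ the formula no longer applies, since five vertices can form a 5-cycle and then share five packets rather than four.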

Theorem 4.

Let be an -regular graph with girth . Then, the FR code based on attains the Singleton-like bound in (1) for .

Proof:

Since the file size of is  for , we have

(10)

The right-hand side of (10) equals if

(11)

which gives that

(12)

This completes the proof. ∎

Remark 1. It is shown in [18] that the FR code based on an -regular graph with girth is optimal with respect to the upper bound on the supported file size in (3). We note that in this case, is also optimal with respect to the Singleton-like bound in (1) for .

The following corollary is an immediate consequence of the discussion above.

Corollary 5.

Suppose that is an -regular graph with girth . Then, the FR code based on attains the Singleton-like bound in (1) for .

We illustrate the power of the results above by evaluating FR codes from the incidence graph of projective planes. Let be a prime power. The incidence graph of a projective plane of order is a -regular graph that consists of vertices [24]. Furthermore, the girth of this incidence graph is . Then,

  1. if , there exists a -FR code attaining the Singleton-like bound in (1) for ;

  2. if , there exists a -FR code attaining the Singleton-like bound in (1) for .
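The incidence-graph parameters quoted above can be checked on the smallest projective plane. The sketch below builds the Fano plane (order 2) from the cyclic difference set {0, 1, 3} modulo 7, which is a standard construction chosen here for illustration and is not taken from the text. It then forms the incidence graph and checks that it has $2(q^2 + q + 1) = 14$ vertices of degree $q + 1 = 3$. Since the graph is bipartite and any two lines meet in exactly one point, it has no cycles of length 4, so its girth is 6. All names are hypothetical.

```python
from itertools import combinations


def fano_lines():
    """Lines of the Fano plane as translates of the difference set {0, 1, 3} mod 7."""
    return [frozenset((i + d) % 7 for d in (0, 1, 3)) for i in range(7)]


def incidence_graph(lines, num_points):
    """Bipartite graph joining point p to line j whenever p lies on line j."""
    adj = {("P", p): set() for p in range(num_points)}
    adj.update({("L", j): set() for j in range(len(lines))})
    for j, line in enumerate(lines):
        for p in line:
            adj[("P", p)].add(("L", j))
            adj[("L", j)].add(("P", p))
    return adj


def has_four_cycle(lines):
    """A 4-cycle in the incidence graph means two lines share two or more points."""
    return any(len(a & b) >= 2 for a, b in combinations(lines, 2))


if __name__ == "__main__":
    lines = fano_lines()
    graph = incidence_graph(lines, 7)
    num_edges = sum(len(nb) for nb in graph.values()) // 2
    print(len(graph), {len(nb) for nb in graph.values()}, num_edges)
    print(has_four_cycle(lines))
```

Treating each vertex as a storage node and each edge as a packet, this graph yields an FR code with 14 nodes, 3 packets per node, and 21 packets that are each stored twice.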

Next, we proceed to consider constructions of FR codes derived from Turán graphs. From a practical perspective, we focus on -Turán graphs with and .

Theorem 6.

An -Turán graph based FR code attains the Singleton-like bound in (1) for .

Proof:

Note that each node contains packets, thus we need to prove that

(13)

Since the right-hand side term in the inequality is a positive integer, we can remove the floor operator, i.e.,

(14)

which completes the proof. ∎

For example, a -Turán graph based FR code attains the Singleton-like bound in (1) for .

III-B Optimal Constructions from Combinatorial Designs

A combinatorial design (or design) is an incidence structure , in which the blocks in are a collection of subsets of whose intersections have specified numerical properties. In design theory, different types of combinatorial designs have been introduced with the block intersection numbers satisfying certain requirements [25]. For example, a Steiner system is a set of points together with a family of -subsets of with the property that every pair of points in is contained in exactly one block. By a simple counting argument, it follows that each point is incident with blocks. Moreover, the largest subset of which intersects every block in in either zero or two points is called the maximal arc in .

A design is said to be resolvable if its block set  can be partitioned into several parallel classes, each of which is a set of blocks that partition the point set . If any two blocks from different parallel classes intersect in a constant number of points, then such a design is called an affine resolvable design [25].
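The notions of Steiner system, parallel class, and affine resolvability can be illustrated on the affine plane of prime order $q$. This is a standard example chosen here as an assumption and is not the specific family used later in the text. Its blocks are the lines $y = mx + c$ together with the vertical lines $x = c$, grouped into $q + 1$ parallel classes by slope. The sketch checks that every pair of points lies on exactly one line, that each class partitions the point set, and that lines from different classes meet in exactly one point. All names are hypothetical.

```python
from itertools import combinations, product


def affine_plane(q: int):
    """Points and parallel classes of the affine plane over Z_q (q must be prime)."""
    points = set(product(range(q), repeat=2))
    classes = []
    for m in range(q):  # lines y = m * x + c, one class per slope m
        classes.append([frozenset((x, (m * x + c) % q) for x in range(q))
                        for c in range(q)])
    # vertical lines x = c form the last parallel class
    classes.append([frozenset((c, y) for y in range(q)) for c in range(q)])
    return points, classes


def is_steiner_pairwise(points, blocks):
    """True if every pair of points is contained in exactly one block."""
    return all(sum(1 for b in blocks if p in b and s in b) == 1
               for p, s in combinations(points, 2))


def is_resolution(points, classes):
    """True if every class is a set of disjoint blocks covering all points."""
    return all(sum(len(b) for b in cls) == len(points)
               and set().union(*cls) == points
               for cls in classes)


if __name__ == "__main__":
    q = 3
    points, classes = affine_plane(q)
    blocks = [b for cls in classes for b in cls]
    print(len(points), len(blocks), len(classes))
    print(is_steiner_pairwise(points, blocks), is_resolution(points, classes))
```

For $q = 3$ this gives 9 points and 12 blocks of size 3, and each point lies on $(9 - 1)/(3 - 1) = 4$ blocks, one from each parallel class, in agreement with the counting argument for Steiner systems.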

By comparing the definitions of FR codes and combinatorial designs, we observe that any design satisfying the property that each block contains the same number of points and each point occurs in the same number of blocks can be leveraged to yield an FR code.

We start by investigating FR codes constructed from Steiner systems. Note that the dual of an based FR code  is a -FR code with . If the applied Steiner system has a maximal arc of size , then the supported file size of is shown in [13] to be

(15)

where .

Theorem 7.

Let be an FR code constructed from a Steiner system with , such that it has a maximal arc of size . Then, achieves the Singleton-like bound in (1) for .

Proof:

Considering the file size of in (15), it remains to prove that

(16)

since each node in contains packets. Therefore, we obtain

(17)

The proof is completed since the right-hand side term in (17) is strictly smaller than . ∎

Steiner system Code parameter Range of
Table II: Optimal FR Codes Derived from Steiner Systems

Remark 2. Except for finitely many constructions of with , Steiner systems are mainly known to exist for small values of , i.e., . Furthermore, there are not many general results regarding the existence of maximal arcs in Steiner systems. However, it is worth noting here that a Steiner system has at least one maximal arc if and  [26], and a Steiner system has a maximal arc of size if , , and is a prime power [27]. Indeed, these two infinite families of Steiner systems that have maximal arcs can be of practical interest for real-world systems since the duals of the designed FR codes have an applicable repetition degree of or .

Table II lists several explicit optimal FR codes derived from Steiner systems and respectively. In addition to the results above, we can also construct FR codes that are optimal with respect to the Singleton-like bound in (1) for relatively large values of , e.g., there exists a -FR code achieving the Singleton-like bound in (1) for .

In what follows, we discuss another construction of FR codes from affine resolvable designs. Let be a prime power and be an integer. Based on affine geometries, a family of affine resolvable designs is introduced in [13], which contains parallel classes. Using this design, the authors devised a -FR code whose file size is

(18)

where . In the derivation of , the parameter should be chosen in such a manner that if , then and if , then .

Theorem 8.

Let be a prime power and be an integer. Let denote the largest integer such that

(19)

Then, there exists a -FR code that attains the Singleton-like bound in (1) for .

Proof:

By substituting the file size expression in (18) and into (7), we obtain

(20)

which can be rewritten as

(21)

For simplicity, we define the following function

where . It can be computed that

(22)

which suggests that is an increasing function of . Since , we have

(23)

Furthermore, the left-hand side term in (21) follows from the condition given in (19), which completes the proof. ∎

For any given prime power , we can find the appropriate  such that Inequality (19) holds. By choosing the parameters and , we can construct FR codes that achieve the Singleton-like bound in (1). For example, if , we obtain in this case. By setting and , there exists a -FR code attaining the Singleton-like bound in (1) for . If , then it follows that . We can obtain a -FR code that achieves the Singleton-like bound in (1) for when and .

Remark 3. One of the key advantages of an affine resolvable design based FR code is that each helper node contributes the same number of packets for node repair, thus achieving load-balancing between helper nodes [22]. Besides the construction based on affine geometries, we can also study affine resolvable designs from mutually orthogonal Latin squares (MOLS) [28]. A Latin square of order is an array in which each cell contains one point from the set , such that each point occurs exactly once in each row and exactly once in each column. For a Latin square of order , we denote the -entry by , where . We say that two Latin squares and of order are orthogonal if, for any , there is a unique cell such that and . A set of Latin squares of order , say , are said to be mutually orthogonal if  and are orthogonal for all . Suppose that is a prime and is a positive integer. In [13], the authors presented an explicit construction of MOLS of order , which can be used to design a -FR code with file size

(24)

where . Since the expression for supported file size is similar to that in (15), we have the following result.

Corollary 9.

Let be a prime and be a positive integer. There exists a -FR code attaining the Singleton-like bound in (1) for .

Consider now . In this case, we can obtain a -FR code attaining the Singleton-like bound in (1) for .
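The definitions of Latin squares and orthogonality in Remark 3 can be checked directly. The sketch below uses the classical construction of $p - 1$ mutually orthogonal Latin squares of prime order $p$, with entries $L_a(i, j) = (a i + j) \bmod p$ for $a = 1, \dots, p - 1$. This is a textbook construction adopted here as an assumption; it is not necessarily the construction of [13], and the function names are hypothetical.

```python
from itertools import combinations


def mols_prime(p: int):
    """p - 1 Latin squares of prime order p: L_a[i][j] = (a * i + j) mod p."""
    return [[[(a * i + j) % p for j in range(p)] for i in range(p)]
            for a in range(1, p)]


def is_latin(square):
    """Every symbol occurs exactly once in each row and in each column."""
    n = len(square)
    symbols = set(range(n))
    rows_ok = all(set(row) == symbols for row in square)
    cols_ok = all({square[i][j] for i in range(n)} == symbols for j in range(n))
    return rows_ok and cols_ok


def are_orthogonal(a, b):
    """Superimposing the two squares yields every ordered pair exactly once."""
    n = len(a)
    pairs = {(a[i][j], b[i][j]) for i in range(n) for j in range(n)}
    return len(pairs) == n * n


if __name__ == "__main__":
    squares = mols_prime(5)
    print(len(squares), all(is_latin(s) for s in squares))
    print(all(are_orthogonal(s, t) for s, t in combinations(squares, 2)))
```

The construction requires $p$ to be prime: for a composite modulus some multipliers $a$ are not invertible and the resulting arrays are no longer Latin squares.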

Remark 4. All the optimal FR codes evaluated in this section satisfy the condition that , i.e., the repair locality is larger than (or equal to) the reconstruction degree. In this scenario, the Singleton-like bound in (1) is essentially the same as the upper bound in (2). Since node repair locality is also an important metric in practical distributed storage systems, we consider in the following section locally repairable FR codes that achieve the upper bound in (2).

IV Optimal FR Codes Attaining the Minimum Distance Bound in (2)

In this section, we investigate FR codes with repair locality that attain the upper bound in (2), i.e., optimal FR codes that also enjoy the desirable local repair property. Our observation is that for a given FR code , the repair locality can be uniquely determined based on the code structure (for example, an explicit algorithm for computing the repair locality of FR codes is presented in [29]). On the other hand, the reconstruction degree depends on the size of stored files. As is a non-decreasing function of , we can employ to store data objects of certain large sizes such that the corresponding reconstruction degrees are larger than the repair locality. Although this method yields locally repairable FR codes, whether they achieve the minimum distance bound in (2) remains to be studied. Unfortunately, for the code constructions based on resolvable designs in [13], the authors showed that simply increasing the value of may result in suboptimal FR codes with respect to the bound in (2). However, this is not always the case and we present in the subsequent discussion explicit locally repairable FR codes that attain the upper bound in (2).

Similar to Theorem 2, we have the following result.

Lemma 10.

Let be an -FR code with repair locality . If the reconstruction degree satisfies

(25)

then attains the minimum distance bound in (2).

In the following, we consider optimal FR codes attaining the bound in (2) from regular graphs with large girth. Moreover, the dual codes can also achieve the bound in (2) for certain scenarios.

Theorem 11.

Let be an FR code constructed from an -regular graph with girth . Then, attains the upper bound in (2) if the reconstruction degree satisfies the requirements listed in Table III.

Proof:

Recall that is an -FR code with repair locality , and its file size is for . We will focus on the scenario wherein the reconstruction degree is strictly larger than , and we consider two different cases depending on the relation between and .

Case 1: , where is a positive integer. In this case, we have

(26)

which equals if

(27)

It follows that . Thus, we obtain since is an integer.

Table III: The Requirements on for A Given Tuple (columns: Relation among , , and ; Requirements)

Case 2: , where and are positive integers. Similarly, we can compute that

(28)

and

(29)

Note that if , then , and . Therefore, we must have in order to generate an FR code attaining the bound in (2). If , we obtain

(30)

Then, holds if

(31)

The left-hand side term in (31) suggests that

(32)

and the right-hand side term can be rewritten as

(33)

Let . If , the inequality above always holds, and if , we have

(34)

The proof is completed by further taking the supported file size of for (i.e., ) into consideration. ∎

Remark 5. In [13, 15], the authors considered FR codes designed from regular graphs with given girth that attain the minimum distance bound in (2), where they focused on either the case (see, e.g., Lemma 15 in [13]) or (see, e.g., Theorem 6 in [15]). We note that there exist two main distinctions between the codes in [13, 15] and those evaluated in Theorem 11. The first distinction is that the reconstruction degree is restricted to in [13] and in [15], whereas the optimal FR code in this section is discussed for . The second distinction is that the relation among the parameters and is not investigated in [13, 15], although it is shown here to be important in forming an optimal FR code. Furthermore, their results do not cover the case when divides . Indeed, the regular-graph-based optimal FR codes in both [13] and [15] can be viewed as special cases of those in Theorem 11.

Theorem 12.

Let be an FR code constructed from an -regular graph with vertices and girth . Then, the dual code attains the upper bound in (2) if one of the following conditions holds

  1. if is an integer such that , and the size of stored file is ;

  2. if is an integer such that , and the size of stored file is ;

  3. if is an integer such that , and the size of stored file is ;

  4. if is an integer such that , and the size of stored file is .

Proof:

According to Lemma 1, the dual is an -FR code with file size

(35)

In other words, the reconstruction degree of is when decoding a data object of size . Since each storage node in has a repair locality , we need to prove that

(36)

Indeed, we can discuss four different cases depending on the value of . For example, if , then we let , where is a positive integer. Since , Equation (36) can be rewritten as

(37)

which gives that

(38)

Therefore, if is an integer such that , then attains the upper bound in (2) when storing a data object of size . In addition, the other three cases can be obtained following a similar procedure, which completes the proof. ∎

For example, we consider a cycle graph with vertices. There exists an -FR code that attains the upper bound in (2) when the size of stored file is . If , then there exists a -FR code that achieves the upper bound in (2) for and , respectively.

Corollary 13.

Let be an FR code derived from an -regular graph with vertices and girth . Then, attains the upper bound in (2) if one of the following holds

  1. if is an integer such that , and the size of stored file is ;

  2. if is an integer such that , and the size of stored file is ;

  3. if is an integer such that , and the size of stored file is ;

  4. if is an integer such that , and the size of stored file is .

Proof:

The proof follows similar steps as that given in Theorem 12, where the only difference is that the supported file size of is for