 # Capacity of Locally Recoverable Codes

Motivated by applications in distributed storage, the notion of a locally recoverable code (LRC) was introduced a few years back. In an LRC, any coordinate of a codeword is recoverable by accessing only a small number of other coordinates. While different properties of LRCs have been well-studied, their performance on channels with random erasures or errors has been mostly unexplored. In this note, we analyze the performance of LRCs over such stochastic channels. In particular, for input-symmetric discrete memoryless channels, we give a tight characterization of the gap to Shannon capacity when LRCs are used over the channel.

## 1 Introduction

A code $\mathcal{C}$, a collection of vectors, is called locally recoverable with locality $r$ if the content of any coordinate can be recovered by accessing only $r$ other coordinates [6, 10].

Formally, a $q$-ary code $\mathcal{C}$ of length $n$, cardinality $q^k$ and distance $d$ is a set of $q^k$ length-$n$ vectors over an alphabet $Q$ of size $q$, with minimum pairwise Hamming distance $d$. The quantity $k$ is called the dimension of $\mathcal{C}$, and $R = k/n$ is called the rate of the code. If $Q$ is a finite field and $\mathcal{C}$ is a linear subspace of $Q^n$, then $k$ is the dimension of $\mathcal{C}$ as a vector space. Below, $[n] = \{1, \dots, n\}$, and for any $c \in Q^n$ and $i \in [n]$, $c_i$ is the projection of $c$ on the $i$th coordinate. By extension, for any $I \subseteq [n]$, $c_I$ is the projection of $c$ onto the coordinates of $I$.

###### Definition.

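A code $\mathcal{C}$ of length $n$ is a locally recoverable code (LRC) with locality $r$ if every coordinate $i \in \{1, \dots, n\}$ is contained in a subset $J_i \subseteq \{1, \dots, n\}$ of size $r+1$ such that there is a function $\phi_i$ with the property that for every codeword $c \in \mathcal{C}$

$$c_i = \phi_i(c_{j_1}, \dots, c_{j_r}), \tag{1}$$

where $j_1 < j_2 < \dots < j_r$ are the elements of $J_i \setminus \{i\}$. We use the notation $(n, k, r)$ to refer to a code of length $n$, dimension $k$ and locality $r$.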
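As a minimal illustration of the recovery rule (1), the sketch below assumes the simplest possible recovery function: each repair group of $r+1$ bits carries a single XOR parity, so any one bit is the modulo-2 sum of the other $r$. The helper names `encode_group` and `recover` are hypothetical and are not part of the original text.

```python
from functools import reduce
from operator import xor


def encode_group(data_bits):
    """Append one parity bit so that the r+1 bits of the group XOR to zero."""
    return list(data_bits) + [reduce(xor, data_bits, 0)]


def recover(group, i):
    """Recover coordinate i of a group from the other r coordinates only."""
    return reduce(xor, (b for j, b in enumerate(group) if j != i), 0)


if __name__ == "__main__":
    group = encode_group([1, 0, 1])          # r = 3, group size r + 1 = 4
    # Every coordinate, data or parity, is recoverable from the other three.
    print(all(recover(group, i) == group[i] for i in range(len(group))))
```

Here the function $\phi_i$ is the same XOR for every coordinate of the group; more general LRCs may use a different $\phi_i$ per coordinate.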

Locally recoverable codes have been the subject of intense research, including constructions [14, 2, 16, 11], bounds [3, 1, 15] and generalizations [17, 12, 13, 8]. In this paper, we investigate the maximum achievable rate of locally recoverable codes such that reliable transmission is possible over a discrete memoryless channel (DMC). Although LRCs have attracted a lot of interest, surprisingly, with one exception, no prior work addresses this quite basic theoretical question.

The earlier result holds for a binary erasure channel (BEC) with erasure probability $p$. The Shannon capacity of such a channel is $1-p$. It was shown that to achieve a rate of $1-p-\epsilon$, the locality must scale as $\Omega(\log(1/\epsilon))$. The constant within $\Omega(\cdot)$ is not clear, and the method therein does not extend to the binary symmetric channel (BSC) or other binary-input memoryless channels.

In this note we do a finer analysis of the gap to capacity for LRCs. For a discrete memoryless channel given by an input-output stochastic transition matrix $W$ (we sometimes also refer to a DMC by $X \to Y$ to describe the input-output random variables), let $\mathrm{Cap}(W)$ be the Shannon capacity of the channel, and let $\mathrm{Cap}(W, r)$ be the capacity of the channel when we are constrained to use only a locally recoverable code with locality $r$. Let us define

$$\mathrm{Gap}(W, r) \equiv \mathrm{Cap}(W) - \mathrm{Cap}(W, r).$$

An impossibility result in this regard gives a lower bound on the gap, while an achievability scheme gives an upper bound on the gap. Our results are summarized in Table 1. Here, $h(\cdot)$ is the binary entropy function. While the results are stated for binary-input channels, it is not difficult to extend them to the $q$-ary case. For the BEC and BSC, the results are also plotted in Fig. 1 for $r = 2$. Note that we are able to exactly calculate the capacity for the BEC, while we have tight upper and lower bounds for the BSC.

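Table 1: The gap to capacity of LRCs over binary-input symmetric DMCs.

| Channel | Lower bound on $\mathrm{Gap}(W, r)$ | Upper bound on $\mathrm{Gap}(W, r)$ |
| --- | --- | --- |
| BEC($p$) | $\frac{(1-p)^{r+1}}{r+1}$ | $\frac{(1-p)^{r+1}}{r+1}$ (a) |
| BSC($p$) | $\frac{(1-h(p))^{r+1}}{r+1}$ | $\frac{1}{r+1}\left(1 - h\left(\frac{1-(1-2p)^{r+1}}{2}\right)\right)$ (b) |
| General $W$ | $\frac{\mathrm{Cap}(W)^{r+1}}{r+1}$ | $\frac{1}{r+1}\left(1 - h\left(\frac{1-(1-2h^{-1}(1-\mathrm{Cap}(W)))^{r+1}}{2}\right)\right)$ |

• (a) also achievable by linear codes.

• (b) we conjecture this bound to be tight.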

To prove the lower (converse) and upper bounds for the BEC we rely on simple information inequalities and random coding methods. It is difficult to extend the converse bounding arguments to other channels. However, in some sense the BEC is the ‘best’ channel among all binary-input memoryless symmetric channels of the same capacity. We can use that fact to lower bound the gap to capacity for more general channels, including the BSC. By the same argument, a random coding method for the BSC also gives an upper bound on the gap to capacity for any binary-input symmetric channel, as the BSC is the ‘worst’ among all such channels in the same sense.

Figure 1: Capacities of locally recoverable codes with locality 2 over BEC and BSC.

Our results also hold for an extended definition of locally recoverable codes.

###### Definition.

A code $\mathcal{C}$ of length $n$ and cardinality $q^k$ is said to have the $(\rho, r)$ locality property (to be a $(\rho, r)$ LRC), where $\rho \ge 2$, if each coordinate $i$ is contained in a subset $J_i$ of size at most $r + \rho - 1$ such that the restriction of the code to the coordinates in $J_i$ forms a code of distance at least $\rho$. Notice that the values of any $\rho - 1$ coordinates of $J_i$ are determined by the values of the remaining coordinates, thus enabling local recovery. $J_i$ is called the repair group of coordinate $i$.

As an example, we show an upper bound on the gap to capacity for LRCs with $\rho = 3$, and give directions for the general case (see Sec. 5).

Sections 2 and 3 deal with the binary erasure and binary symmetric channels respectively, while Sec. 4 deals with other binary input channels.

## 2 LRC Capacity of the Binary Erasure Channel

For a binary erasure channel with erasure probability $p$, the Shannon capacity is $1 - p$. Suppose that, when we are constrained to use a locally recoverable code with locality $r$ as the input, the capacity is $\mathrm{Cap}_{\mathrm{BEC}}(p, r)$.

###### Theorem 1.

The capacity of LRCs with locality $r$ over BEC($p$) is given by

$$\mathrm{Cap}_{\mathrm{BEC}}(p, r) = 1 - p - \frac{(1-p)^{r+1}}{r+1}.$$

In the remainder of this section we prove this theorem.
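To get a feel for Theorem 1, the following sketch simply evaluates the stated formula and the corresponding gap to the Shannon capacity $1-p$. The function names are hypothetical, and the chosen values of $p$ and $r$ are only illustrative.

```python
def cap_bec_lrc(p: float, r: int) -> float:
    """Theorem 1: capacity of locality-r LRCs over BEC(p)."""
    return 1.0 - p - (1.0 - p) ** (r + 1) / (r + 1)


def gap_bec(p: float, r: int) -> float:
    """Gap to the Shannon capacity 1 - p of the BEC."""
    return (1.0 - p) - cap_bec_lrc(p, r)


if __name__ == "__main__":
    # The gap (1-p)^(r+1)/(r+1) shrinks quickly as the locality r grows.
    for r in (1, 2, 5, 10):
        print(r, cap_bec_lrc(0.5, r), gap_bec(0.5, r))
```

At $p = 0$ the formula reduces to $1 - \frac{1}{r+1}$, the rate of a code in which every group of $r+1$ symbols spends one symbol on local redundancy.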

### 2.1 Converse Bound

First we show the converse result.

###### Lemma 1.

The capacity of LRCs with locality $r$ over a BEC with erasure probability $p$ satisfies

$$\mathrm{Cap}_{\mathrm{BEC}}(p, r) \le 1 - p - \frac{(1-p)^{r+1}}{r+1}.$$

###### Proof.

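Assume that a code $\mathcal{C}$ of length $n$, rate $R$ and locality $r$ is used over the BEC. A random codeword $X_1^n$, chosen uniformly from $\mathcal{C}$, is sent over the channel, and the received vector is $Z_1^n$. Let $I \subseteq \{1, \dots, n\}$ denote the set of erased coordinates.

Using Fano’s inequality, the probability of error satisfies (up to an additive term of $1/\log|\mathcal{C}|$, which vanishes as $n$ grows)

$$P_e \ge \frac{H(X_1^n \mid Z_1^n)}{\log|\mathcal{C}|}.$$

Now, note that $I$ is a function of $Z_1^n$ and is independent of $X_1^n$. Therefore,

$$\begin{aligned}
H(X_1^n \mid Z_1^n) &= H(X_1^n \mid Z_1^n, I) \\
&= H(X_1^n, Z_1^n, I) - H(Z_1^n, I) \\
&= H(X_1^n) + H(Z_1^n, I \mid X_1^n) - H(I) - H(Z_1^n \mid I) \\
&= H(X_1^n) + H(I \mid X_1^n) + H(Z_1^n \mid I, X_1^n) - H(I) - H(Z_1^n \mid I) \\
&= H(X_1^n) + H(Z_1^n \mid I, X_1^n) - H(Z_1^n \mid I) \\
&= H(X_1^n) + 0 - H(Z_1^n \mid I) \\
&= \log|\mathcal{C}| - H(Z_1^n \mid I).
\end{aligned}$$

This implies

$$P_e \ge 1 - \frac{H(Z_1^n \mid I)}{\log|\mathcal{C}|}.$$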

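Now,

$$H(Z_1^n \mid I = u) \le n - |u| - \frac{L_u}{r+1},$$

where $L_u$ is the number of coordinates that are not in $u$ and whose entire recovery group is also not in $u$. Hence,

$$H(Z_1^n \mid I) \le n - \mathbb{E}_{\mathrm{BEC}}|I| - \frac{1}{r+1}\mathbb{E}_{\mathrm{BEC}} L_I = n - np - \frac{1}{r+1}\mathbb{E}_{\mathrm{BEC}} L_I,$$

where the subscript BEC denotes that the average is with respect to the randomness in the BEC. Let us now derive $\mathbb{E}_{\mathrm{BEC}} L_I$. Let $\chi_i$ be the indicator random variable of the event that the $i$th coordinate as well as its recovery group are not in $I$ (not erased). We have

$$\Pr(\chi_i = 1) = (1-p)^{r+1}.$$

Therefore, $\mathbb{E}_{\mathrm{BEC}} L_I = \sum_{i=1}^{n} \Pr(\chi_i = 1) = n(1-p)^{r+1}$. Since $\log|\mathcal{C}| = nR$, it follows that

$$P_e \ge 1 - \frac{1 - p - \frac{(1-p)^{r+1}}{r+1}}{R}.$$

To achieve a vanishing probability of error, one must have

$$R \le 1 - p - \frac{(1-p)^{r+1}}{r+1}.$$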
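The expectation $n(1-p)^{r+1}$ of the count of coordinates whose whole recovery group survives can be checked by brute force on a small example. The sketch below assumes disjoint repair groups of size $r+1$ (an assumption made for simplicity; the proof itself does not need disjointness) and enumerates every erasure pattern exactly. The function name `expected_L` is hypothetical.

```python
from itertools import product


def expected_L(n: int, r: int, p: float) -> float:
    """Exact expectation of L_I for disjoint repair groups of size r+1 over BEC(p)."""
    assert n % (r + 1) == 0
    total = 0.0
    for pattern in product((0, 1), repeat=n):       # 1 marks an erased coordinate
        prob = 1.0
        for erased in pattern:
            prob *= p if erased else 1.0 - p
        # Each fully unerased group contributes all of its r+1 coordinates.
        count = sum(
            r + 1
            for start in range(0, n, r + 1)
            if not any(pattern[start:start + r + 1])
        )
        total += prob * count
    return total


if __name__ == "__main__":
    n, r, p = 6, 2, 0.3
    print(expected_L(n, r, p), n * (1 - p) ** (r + 1))
```

The enumeration is exponential in $n$ and is only meant as a sanity check for very short lengths.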

### 2.2 Achievability

###### Lemma 2.

There exists a family of LRCs with locality $r$ and rate

$$R \ge 1 - p - \frac{(1-p)^{r+1}}{r+1},$$

that, when used over a BEC($p$), results in a probability of error that goes to $0$ with $n$.

###### Proof.

We will show this by constructing a code. Partition the set of coordinates into groups of size $r+1$ each (we assume that $r+1$ divides $n$). Now, consider the $r+1$ bits of a group as a super-symbol. Consider the input-output channel induced by these super-symbols instead of the BEC. We find the capacity of this channel, and then normalize by $r+1$.

Let us choose the codewords in the following way. Within each group, $r$ symbols are chosen uniformly and independently (Bernoulli($1/2$)). The last symbol of each group is the modulo-2 sum of the other $r$ symbols. Here we assume that we employ a joint-typicality decoder that considers each block of $r+1$ bits as a super-symbol over an alphabet of size $2^{r+1}$. The rate of this code for which the probability of error vanishes is given by

$$\frac{1}{r+1} I(X_1^{r+1}; Y_1^{r+1}),$$

where $X_1^{r+1}$ and $Y_1^{r+1}$ represent the $(r+1)$-bit input and output. Now we have

$$\frac{1}{r+1} I(X_1^{r+1}; Y_1^{r+1}) = \frac{1}{r+1}\left(H(Y_1^{r+1}) - (r+1)h(p)\right) = \frac{1}{r+1} H(Y_1^{r+1}) - h(p).$$

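We can now calculate $H(Y_1^{r+1})$. Let the number of erasures in $y_1^{r+1}$ be $t$. There are two cases to consider.

First case, $t = 0$. Then,

$$\Pr(Y_1^{r+1} = y_1^{r+1}) = \begin{cases} \frac{1}{2^r}(1-p)^{r+1} & \text{if } \mathrm{wt}(y_1^{r+1}) \text{ is even}, \\ 0 & \text{if } \mathrm{wt}(y_1^{r+1}) \text{ is odd}. \end{cases}$$

Second case, $t \ge 1$. Then,

$$\Pr(Y_1^{r+1} = y_1^{r+1}) = p^t (1-p)^{r+1-t}\, \frac{2^{t-1}}{2^r}.$$

There are $2^r$ even-weight outputs without erasures and, for each of the $\binom{r+1}{t}$ erasure patterns with $t \ge 1$ erasures, $2^{r+1-t}$ possible outputs. Therefore,

$$\begin{aligned}
H(Y_1^{r+1}) &= -2^r \frac{(1-p)^{r+1}}{2^r} \log\frac{(1-p)^{r+1}}{2^r} - \sum_{t=1}^{r+1} \binom{r+1}{t} p^t (1-p)^{r+1-t}\, \frac{2^{t-1}}{2^r}\, 2^{r+1-t} \log\left(p^t (1-p)^{r+1-t}\, \frac{2^{t-1}}{2^r}\right) \\
&= -(r+1)(1-p)^{r+1}\log(1-p) + (1-p)^{r+1} r - \sum_{t=1}^{r+1} \binom{r+1}{t} p^t (1-p)^{r+1-t} \bigl(t \log p + (r+1-t)\log(1-p) + t - 1 - r\bigr) \\
&= -(r+1)(1-p)^{r+1}\log(1-p) + (1-p)^{r+1} r - (r+1)\bigl(\log(1-p) - 1\bigr)\bigl(1 - (1-p)^{r+1}\bigr) - \bigl(\log p - \log(1-p) + 1\bigr)\, p\, (r+1).
\end{aligned}$$

We have

$$\begin{aligned}
\frac{1}{r+1} I(X_1^{r+1}; Y_1^{r+1}) &= -(1-p)^{r+1}\log(1-p) + \frac{(1-p)^{r+1} r}{r+1} - \bigl(\log(1-p) - 1\bigr)\bigl(1 - (1-p)^{r+1}\bigr) - \bigl(\log p - \log(1-p) + 1\bigr) p - h(p) \\
&= 1 - p - \frac{(1-p)^{r+1}}{r+1}.
\end{aligned}$$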
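The closed form just obtained can be cross-checked numerically. The sketch below, written under the same assumptions as the proof (a uniformly chosen even-weight input of $r+1$ bits sent over a BEC), enumerates the output distribution directly and computes $\frac{1}{r+1}\bigl(H(Y_1^{r+1}) - (r+1)h(p)\bigr)$. The names `h2` and `bec_group_rate` are hypothetical, and logarithms are taken to base 2.

```python
from itertools import product
from math import log2


def h2(x: float) -> float:
    """Binary entropy function in bits."""
    return 0.0 if x <= 0.0 or x >= 1.0 else -x * log2(x) - (1 - x) * log2(1 - x)


def bec_group_rate(p: float, r: int) -> float:
    """(1/(r+1)) I(X;Y) for a uniform even-weight (r+1)-bit input over BEC(p)."""
    n = r + 1
    codewords = [x for x in product((0, 1), repeat=n) if sum(x) % 2 == 0]
    out = {}                                        # output word -> probability
    for x in codewords:
        for erased in product((False, True), repeat=n):
            y = tuple("?" if e else b for b, e in zip(x, erased))
            t = sum(erased)
            out[y] = out.get(y, 0.0) + p ** t * (1 - p) ** (n - t) / len(codewords)
    entropy_y = -sum(q * log2(q) for q in out.values() if q > 0)
    return (entropy_y - n * h2(p)) / n


if __name__ == "__main__":
    p, r = 0.3, 3
    print(bec_group_rate(p, r), 1 - p - (1 - p) ** (r + 1) / (r + 1))
```

The enumeration grows as $4^{r+1}$, so it is only practical for small localities.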

It turns out that the above method extends to other channels. The achievability result for the BEC also holds with linear codes.

###### Proposition 1.

There exists a family of linear LRCs with locality $r$ and rate

$$R \ge 1 - p - \frac{(1-p)^{r+1}}{r+1},$$

that, when used over a BEC($p$), results in a probability of error that goes to $0$ with $n$.

###### Proof.

To see this, randomly choose a $k \times n$ generator matrix in the following way. Partition the set of coordinates into groups of size $r+1$ each. For each group, choose $r$ columns randomly and uniformly from $\mathbb{F}_2^k$. The $(r+1)$st column of each group is just the coordinate-wise modulo-2 sum of all the other columns of the group.

Now let us keep each column of this matrix independently with probability $1-p$ (these are the unerased positions) and form a submatrix. We would like to find the rank of this submatrix. As long as at most $r$ columns are chosen from a group, it is equivalent to choosing that many columns randomly and uniformly from $\mathbb{F}_2^k$. Let $S$ be the set of chosen columns, and let $T$ be the number of groups from which all $r+1$ columns are chosen. The submatrix then has rank at least equal to the rank of a matrix whose $|S| - T$ columns are chosen randomly and uniformly from $\mathbb{F}_2^k$. The rank of such a matrix is $k$ with probability $1 - o(1)$ as long as $|S| - T \ge k + \omega(1)$. Now, with probability $1 - o(1)$ we have $|S| - T \ge n\left(1 - p - \frac{(1-p)^{r+1}}{r+1}\right) - o(n)$. Therefore, as long as $k \le n\left(1 - p - \frac{(1-p)^{r+1}}{r+1}\right) - o(n)$, the rank of the submatrix is $k$ with probability at least $1 - o(1)$. Therefore there must exist a matrix in the ensemble for which the surviving submatrix has rank $k$ with probability $1 - o(1)$. ∎
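The random linear construction in this proof is easy to simulate. The sketch below is one possible rendering, assuming columns of the generator matrix are stored as Python integers used as bitmasks over $\mathbb{F}_2$; the helper names (`gf2_rank`, `random_lrc_columns`, `decodable`) and the parameters in the usage example are hypothetical choices, not values from the text.

```python
import random


def gf2_rank(columns):
    """Rank over GF(2) of a list of columns, each stored as an int bitmask."""
    basis = {}                                  # leading-bit position -> vector
    for v in columns:
        while v:
            top = v.bit_length() - 1
            if top not in basis:
                basis[top] = v
                break
            v ^= basis[top]                     # cancel the leading bit
    return len(basis)


def random_lrc_columns(k, n, r, rng):
    """n columns in groups of r+1: r uniform columns followed by their XOR."""
    cols = []
    for _ in range(n // (r + 1)):
        group = [rng.getrandbits(k) for _ in range(r)]
        parity = 0
        for c in group:
            parity ^= c
        cols.extend(group + [parity])
    return cols


def decodable(k, n, r, p, rng):
    """True if the columns surviving a BEC(p) still span all k dimensions."""
    cols = random_lrc_columns(k, n, r, rng)
    kept = [c for c in cols if rng.random() >= p]
    return gf2_rank(kept) == k


if __name__ == "__main__":
    rng = random.Random(1)
    n, r, p = 600, 2, 0.5
    limit = 1 - p - (1 - p) ** (r + 1) / (r + 1)     # about 0.458
    for rate in (0.40, 0.44, 0.48):
        k = int(rate * n)
        hits = sum(decodable(k, n, r, p, rng) for _ in range(50))
        print(rate, round(limit, 3), hits / 50)
```

In such experiments one would expect the success frequency to drop as the rate moves past the limit of Theorem 1, although for finite $n$ the transition is not sharp.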

## 3 LRC Capacity of the Binary Symmetric Channel

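For a binary symmetric channel with error probability $p$, the Shannon capacity is $1 - h(p)$. Suppose that, when we are constrained to use a locally recoverable code with locality $r$ as the input, the capacity is $\mathrm{Cap}_{\mathrm{BSC}}(p, r)$.

###### Theorem 2.

The capacity of LRCs with locality $r$ over BSC($p$) satisfies

$$1 - h(p) - \frac{1}{r+1}\left(1 - h\left(\frac{1-(1-2p)^{r+1}}{2}\right)\right) \le \mathrm{Cap}_{\mathrm{BSC}}(p, r) \le 1 - h(p) - \frac{(1-h(p))^{r+1}}{r+1}.$$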

In the remainder of this section we prove this theorem.
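The two sides of Theorem 2 are straightforward to evaluate. The sketch below computes both bounds for a given flip probability $p \le 1/2$ and locality $r$; the function names are hypothetical and logarithms are base 2.

```python
from math import log2


def h2(x: float) -> float:
    """Binary entropy function in bits."""
    return 0.0 if x <= 0.0 or x >= 1.0 else -x * log2(x) - (1 - x) * log2(1 - x)


def bsc_lrc_lower(p: float, r: int) -> float:
    """Achievable rate of Theorem 2 (left-hand side)."""
    q = (1 - (1 - 2 * p) ** (r + 1)) / 2
    return 1 - h2(p) - (1 - h2(q)) / (r + 1)


def bsc_lrc_upper(p: float, r: int) -> float:
    """Converse bound of Theorem 2 (right-hand side)."""
    cap = 1 - h2(p)
    return cap - cap ** (r + 1) / (r + 1)


if __name__ == "__main__":
    for p in (0.01, 0.05, 0.11, 0.25):
        print(p, bsc_lrc_lower(p, 2), bsc_lrc_upper(p, 2))
```

At $p = 0$ both expressions equal $1 - \frac{1}{r+1}$, and at $p = 1/2$ both are zero, so the two bounds meet at the endpoints.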

### 3.1 Converse

The upper bound of Theorem 2 follows from more general results about binary-input symmetric discrete memoryless channels. We postpone the proof to the next section.

### 3.2 Achievability

###### Lemma 3.

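There exists a family of LRCs with locality $r$ and rate

$$R \ge 1 - h(p) - \frac{1}{r+1}\left(1 - h\left(\frac{1-(1-2p)^{r+1}}{2}\right)\right),$$

that, when used over a BSC($p$), results in a probability of error that goes to $0$ with $n$.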

###### Proof.

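We will show the above by constructing a code. Again, partition the set of coordinates into groups of size $r+1$ each. Now, consider the $r+1$ bits of a group as a super-symbol. Consider the input-output channel induced by these super-symbols instead of the BSC. We find the capacity of this channel.

Let us choose the codewords in the following way. Within each group, $r$ symbols are chosen uniformly and independently (Bernoulli($1/2$)). The last symbol of each group is the modulo-2 sum of the other $r$ symbols. The rate of this code for which the probability of error vanishes is given by

$$\frac{1}{r+1} I(X_1^{r+1}; Y_1^{r+1}),$$

where $X_1^{r+1}$ and $Y_1^{r+1}$ represent the $(r+1)$-bit input and output. Note that we arrive at this rate by considering the group of $r+1$ bits as a super-symbol from an alphabet of size $2^{r+1}$, and using a joint-typicality decoder. Now we have

$$\frac{1}{r+1} I(X_1^{r+1}; Y_1^{r+1}) = \frac{1}{r+1}\left(H(Y_1^{r+1}) - (r+1)h(p)\right) = \frac{1}{r+1} H(Y_1^{r+1}) - h(p).$$

We can now calculate $H(Y_1^{r+1})$. We have

$$\begin{aligned}
\Pr(Y_1^{r+1} = y_1^{r+1}) &= \sum_{x_1^{r+1}} \Pr(Y_1^{r+1} = y_1^{r+1} \mid X_1^{r+1} = x_1^{r+1}) \Pr(X_1^{r+1} = x_1^{r+1}) \\
&= \frac{1}{2^r} \sum_{x_1^{r+1}:\, \mathrm{wt}(x_1^{r+1}) \text{ even}} \Pr(Y_1^{r+1} = y_1^{r+1} \mid X_1^{r+1} = x_1^{r+1}) \\
&= \frac{1}{2^r} \sum_{x_1^{r+1}:\, \mathrm{wt}(x_1^{r+1}) \text{ even}} p^{d_H(x_1^{r+1}, y_1^{r+1})} (1-p)^{r+1-d_H(x_1^{r+1}, y_1^{r+1})} \\
&= \begin{cases} \frac{1}{2^r} \sum_{w \text{ even}} \binom{r+1}{w} p^w (1-p)^{r+1-w}, & \text{when } \mathrm{wt}(y_1^{r+1}) \text{ is even}, \\ \frac{1}{2^r} \sum_{w \text{ odd}} \binom{r+1}{w} p^w (1-p)^{r+1-w}, & \text{when } \mathrm{wt}(y_1^{r+1}) \text{ is odd}, \end{cases} \\
&= \begin{cases} \frac{1}{2^{r+1}} \left(1 + (1-2p)^{r+1}\right), & \text{when } \mathrm{wt}(y_1^{r+1}) \text{ is even}, \\ \frac{1}{2^{r+1}} \left(1 - (1-2p)^{r+1}\right), & \text{when } \mathrm{wt}(y_1^{r+1}) \text{ is odd}. \end{cases}
\end{aligned}$$

Therefore,

$$H(Y_1^{r+1}) = -\frac{2^r}{2^{r+1}} \left(1 + (1-2p)^{r+1}\right) \log\frac{1 + (1-2p)^{r+1}}{2^{r+1}} - \frac{2^r}{2^{r+1}} \left(1 - (1-2p)^{r+1}\right) \log\frac{1 - (1-2p)^{r+1}}{2^{r+1}}.$$

After some simplifications, namely $H(Y_1^{r+1}) = r + h\left(\frac{1-(1-2p)^{r+1}}{2}\right)$, we have

$$\frac{1}{r+1} I(X_1^{r+1}; Y_1^{r+1}) = 1 - h(p) - \frac{1}{r+1}\left(1 - h\left(\frac{1-(1-2p)^{r+1}}{2}\right)\right).$$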
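As a sanity check of this simplification, the sketch below recomputes the mutual information by direct enumeration, assuming the same setting as the proof (a uniform even-weight input of $r+1$ bits over a BSC with flip probability $p$), and compares it with the closed form. The names `bsc_group_rate` and `bsc_closed_form` are hypothetical.

```python
from itertools import product
from math import log2


def h2(x: float) -> float:
    """Binary entropy function in bits."""
    return 0.0 if x <= 0.0 or x >= 1.0 else -x * log2(x) - (1 - x) * log2(1 - x)


def bsc_group_rate(p: float, r: int) -> float:
    """(1/(r+1)) I(X;Y) for a uniform even-weight (r+1)-bit input over BSC(p)."""
    n = r + 1
    codewords = [x for x in product((0, 1), repeat=n) if sum(x) % 2 == 0]
    entropy_y = 0.0
    for y in product((0, 1), repeat=n):
        q = 0.0
        for x in codewords:
            d = sum(a != b for a, b in zip(x, y))   # Hamming distance
            q += p ** d * (1 - p) ** (n - d)
        q /= len(codewords)
        if q > 0:
            entropy_y -= q * log2(q)
    return (entropy_y - n * h2(p)) / n


def bsc_closed_form(p: float, r: int) -> float:
    """Closed form derived in the proof of Lemma 3."""
    q = (1 - (1 - 2 * p) ** (r + 1)) / 2
    return 1 - h2(p) - (1 - h2(q)) / (r + 1)


if __name__ == "__main__":
    print(bsc_group_rate(0.1, 3), bsc_closed_form(0.1, 3))
```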

## 4 General binary input-symmetric channels

The results for general binary input-symmetric channels follow from the converse and achievability results for the BEC and the BSC because, in some sense, these channels are respectively the best and the worst among the general cases. To formalize this, we need the notion of a more capable channel. All the channels below are discrete memoryless channels.

###### Definition.

A channel $X \to Y$ is said to be more capable than another channel $X \to Z$ if for any input distribution on $X$,

$$I(X; Y) \ge I(X; Z).$$

It is known that among the binary-input symmetric discrete memoryless channels of the same capacity, the BSC is the least capable and the BEC is the most capable. The following result follows from [4, ex. 16, p. 116].

###### Proposition 2.

Suppose the channel $X \to Y$ is more capable than the channel $X \to Z$, and a code of rate $R$ achieves a probability of error $\epsilon$ over the channel $X \to Z$. Then there exists a code of rate $R$ that achieves a probability of error $\epsilon'$ over $X \to Y$, where $\epsilon' \to 0$ as $\epsilon \to 0$.
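The ordering between the BEC and the BSC of equal capacity can be checked numerically for single-letter inputs. The sketch below compares $I(X;Y)$ for a Bernoulli($a$) input over a BEC with erasure probability $p$ and over a BSC with flip probability $h^{-1}(p)$, which have the same capacity $1-p$. The helper names and the bisection-based inverse `h2_inv` are assumptions of this sketch.

```python
from math import log2


def h2(x: float) -> float:
    """Binary entropy function in bits."""
    return 0.0 if x <= 0.0 or x >= 1.0 else -x * log2(x) - (1 - x) * log2(1 - x)


def h2_inv(y: float, tol: float = 1e-12) -> float:
    """Inverse of h2 on [0, 1/2], computed by bisection."""
    lo, hi = 0.0, 0.5
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if h2(mid) < y:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def mi_bec(a: float, p: float) -> float:
    """I(X;Y) for X ~ Bernoulli(a) over BEC(p)."""
    return (1 - p) * h2(a)


def mi_bsc(a: float, p: float) -> float:
    """I(X;Y) for X ~ Bernoulli(a) over BSC(p)."""
    return h2(a * (1 - p) + (1 - a) * p) - h2(p)


if __name__ == "__main__":
    cap = 0.5
    p_bec, p_bsc = 1 - cap, h2_inv(1 - cap)
    for a in (0.1, 0.25, 0.5):
        print(a, mi_bec(a, p_bec), mi_bsc(a, p_bsc))
```

The two mutual informations coincide at the uniform input $a = 1/2$, where both equal the common capacity.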

Since we have an impossibility (converse) result for BEC and an achievability result for BSC, using Prop. 2, we can obtain the following result.

###### Theorem 3.

For any binary-input symmetric discrete memoryless channel $W$,

$$\mathrm{Cap}(W) - \frac{1}{r+1}\left(1 - h\left(\frac{1-\left(1-2h^{-1}(1-\mathrm{Cap}(W))\right)^{r+1}}{2}\right)\right) \le \mathrm{Cap}(W, r) \le \mathrm{Cap}(W) - \frac{\mathrm{Cap}(W)^{r+1}}{r+1}.$$

###### Proof.

For a channel $W$, suppose $\mathrm{Cap}(W) = 1 - p$. Therefore, a BEC with erasure probability $p$ must be more capable than the channel $W$. Take an LRC of rate $R$ that achieves a vanishing probability of error over the channel $W$. Then there exists an LRC of rate $R$ that achieves a vanishing probability of error over the BEC with erasure probability $p$. This implies

$$\mathrm{Cap}(W, r) \le 1 - p - \frac{(1-p)^{r+1}}{r+1},$$

which, with $p = 1 - \mathrm{Cap}(W)$, proves the upper bound.

On the other hand, suppose $\mathrm{Cap}(W) = 1 - h(p')$. Therefore, a BSC with flip probability $p'$ must be less capable than the channel $W$. We know that there exists an LRC of rate

$$1 - h(p') - \frac{1}{r+1}\left(1 - h\left(\frac{1-(1-2p')^{r+1}}{2}\right)\right),$$

that achieves a vanishing probability of error over the BSC with error probability $p'$. Therefore there must exist an LRC of the same rate that achieves a vanishing probability of error over the channel $W$. ∎
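Given only the capacity of a binary-input symmetric channel, the two bounds of Theorem 3 can be evaluated as follows. This is a sketch assuming base-2 logarithms and a bisection-based inverse of the binary entropy function on $[0, 1/2]$; the function names are hypothetical.

```python
from math import log2


def h2(x: float) -> float:
    """Binary entropy function in bits."""
    return 0.0 if x <= 0.0 or x >= 1.0 else -x * log2(x) - (1 - x) * log2(1 - x)


def h2_inv(y: float, tol: float = 1e-12) -> float:
    """Inverse of h2 on [0, 1/2], computed by bisection."""
    lo, hi = 0.0, 0.5
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if h2(mid) < y:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def lrc_capacity_bounds(cap: float, r: int):
    """(lower, upper) bounds of Theorem 3 on Cap(W, r), given Cap(W) = cap."""
    p_bsc = h2_inv(1 - cap)                 # BSC with the same capacity as W
    q = (1 - (1 - 2 * p_bsc) ** (r + 1)) / 2
    lower = cap - (1 - h2(q)) / (r + 1)
    upper = cap - cap ** (r + 1) / (r + 1)
    return lower, upper


if __name__ == "__main__":
    for cap in (0.25, 0.5, 0.75):
        print(cap, lrc_capacity_bounds(cap, r=2))
```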

## 5 Generalizing local repair: repairing multiple failures

It is now a natural question to ask whether our results extend to the general definition of locality. Indeed, the converse result for the BEC extends quite straightforwardly, and $\mathrm{Cap}_{\mathrm{BEC}}(p, \rho, r)$, the capacity of the BEC when we are restricted to use a code with $(\rho, r)$ locality, is bounded by

$$\mathrm{Cap}_{\mathrm{BEC}}(p, \rho, r) \le 1 - p - \frac{(\rho-1)(1-p)^{r+1}}{r+1}.$$

However, it is not straightforward to extend the achievability result for the erasure channel. Indeed, the codewords restricted to each repair group must form a code with minimum distance $\rho$. Therefore it makes sense to choose random codewords of a code $\mathcal{A}$ of distance $\rho$ as disjoint repair blocks to form the overall LRC. For this we need to figure out $H(Y_1^{r+1})$, where $Y_1^{r+1}$ is the output of a BEC whose input is a randomly chosen codeword of a fixed code of distance $\rho$. To evaluate this quantity, we need to know the complete statistics of the distribution of values of the code on every subset of coordinates.

On the other hand, if $\mathcal{A}$ is a linear code and the channel is a BSC, then the entropy of the output of the channel can be computed if we know the coset weight distribution of the code.

To construct a code with $(\rho, r)$ locality we first choose a fixed linear code $\mathcal{A}$ of length $r+1$ and distance $\rho$. Next we construct a random code $\mathcal{C}$ of length $n$. A codeword of $\mathcal{C}$ is formed by concatenating $\frac{n}{r+1}$ randomly and uniformly chosen codewords of $\mathcal{A}$ side by side. Again, if we use joint-typicality decoding, then the achievable rate of transmission is given by

$$\frac{1}{r+1} I(X_1^{r+1}; Y_1^{r+1}),$$

where $X_1^{r+1}$ is a randomly and uniformly chosen codeword of $\mathcal{A}$ and $Y_1^{r+1}$ is the output of a BSC with flip probability $p$ when the input to the BSC is $X_1^{r+1}$. Now we have

$$\frac{1}{r+1} I(X_1^{r+1}; Y_1^{r+1}) = \frac{1}{r+1}\left(H(Y_1^{r+1}) - (r+1)h(p)\right) = \frac{1}{r+1} H(Y_1^{r+1}) - h(p).$$

We can calculate $H(Y_1^{r+1})$ when $\mathcal{A}$ is a linear code.

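$$\begin{aligned}
\Pr(Y_1^{r+1} = y_1^{r+1}) &= \sum_{x_1^{r+1}} \Pr(Y_1^{r+1} = y_1^{r+1} \mid X_1^{r+1} = x_1^{r+1}) \Pr(X_1^{r+1} = x_1^{r+1}) \\
&= \frac{1}{|\mathcal{A}|} \sum_{x_1^{r+1} \in \mathcal{A}} \Pr(Y_1^{r+1} = y_1^{r+1} \mid X_1^{r+1} = x_1^{r+1}) \\
&= \frac{1}{|\mathcal{A}|} \sum_{x_1^{r+1} \in \mathcal{A}} p^{d_H(x_1^{r+1}, y_1^{r+1})} (1-p)^{r+1-d_H(x_1^{r+1}, y_1^{r+1})} \\
&= \frac{1}{|\mathcal{A}|} \sum_{w=0}^{r+1} A_w^{(i)} p^w (1-p)^{r+1-w},
\end{aligned}$$

if $y_1^{r+1}$ belongs to the $i$th coset of the code, where $A_w^{(i)}$ is the number of vectors of Hamming weight $w$ in the $i$th coset of the code $\mathcal{A}$, $i = 0, 1, \dots, \frac{2^{r+1}}{|\mathcal{A}|} - 1$. Let us define the coset weight enumerator of the code,

$$A^{(i)}(x, y) = \sum_{w=0}^{r+1} A_w^{(i)} x^{r+1-w} y^w.$$

Then,

$$\begin{aligned}
H(Y_1^{r+1}) &= -\sum_{i=0}^{\frac{2^{r+1}}{|\mathcal{A}|}-1} |\mathcal{A}| \cdot \frac{1}{|\mathcal{A}|} A^{(i)}(1-p, p) \log\frac{A^{(i)}(1-p, p)}{|\mathcal{A}|} \\
&= -\sum_{i=0}^{\frac{2^{r+1}}{|\mathcal{A}|}-1} A^{(i)}(1-p, p) \log A^{(i)}(1-p, p) + \log|\mathcal{A}| \sum_{i=0}^{\frac{2^{r+1}}{|\mathcal{A}|}-1} A^{(i)}(1-p, p).
\end{aligned}$$

Now,

$$\begin{aligned}
\sum_{i=0}^{\frac{2^{r+1}}{|\mathcal{A}|}-1} A^{(i)}(1-p, p) &= \sum_{i=0}^{\frac{2^{r+1}}{|\mathcal{A}|}-1} \sum_{w=0}^{r+1} A_w^{(i)} p^w (1-p)^{r+1-w} \\
&= \sum_{w=0}^{r+1} \left(\sum_{i=0}^{\frac{2^{r+1}}{|\mathcal{A}|}-1} A_w^{(i)}\right) p^w (1-p)^{r+1-w} \\
&= \sum_{w=0}^{r+1} \binom{r+1}{w} p^w (1-p)^{r+1-w} = 1.
\end{aligned}$$

Therefore,

$$H(Y_1^{r+1}) = \log|\mathcal{A}| + H\left(\left\{A^{(i)}(1-p, p)\right\}_{i=0}^{\frac{2^{r+1}}{|\mathcal{A}|}-1}\right),$$

where $H(\{q_i\}) = -\sum_i q_i \log q_i$ denotes the entropy of the probability distribution $\{q_i\}$. Overall,

$$\frac{1}{r+1} I(X_1^{r+1}; Y_1^{r+1}) = \frac{\log|\mathcal{A}| + H\left(\left\{A^{(i)}(1-p, p)\right\}_{i=0}^{\frac{2^{r+1}}{|\mathcal{A}|}-1}\right)}{r+1} - h(p).$$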
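The last expression can be evaluated by brute force for any small binary linear local code. The sketch below is one way to do it, assuming the local code is given by a list of generator rows; it enumerates the cosets, evaluates each coset weight enumerator at $(1-p, p)$, and returns the resulting rate. The names `span` and `group_rate_bsc` are hypothetical, and the argument `n` plays the role of the local code length $r+1$.

```python
from itertools import product
from math import log2


def h2(x: float) -> float:
    """Binary entropy function in bits."""
    return 0.0 if x <= 0.0 or x >= 1.0 else -x * log2(x) - (1 - x) * log2(1 - x)


def span(generators, n):
    """All codewords of the binary linear code spanned by the generator rows."""
    words = {tuple([0] * n)}
    for g in generators:
        words |= {tuple(a ^ b for a, b in zip(w, g)) for w in words}
    return words


def group_rate_bsc(generators, n, p):
    """(log|A| + H({A^(i)(1-p, p)})) / n - h(p) for the local code A over BSC(p)."""
    code = span(generators, n)
    seen, coset_entropy = set(), 0.0
    for y in product((0, 1), repeat=n):
        if y in seen:
            continue
        coset = {tuple(a ^ b for a, b in zip(y, c)) for c in code}
        seen |= coset
        # Coset weight enumerator evaluated at (1 - p, p).
        mass = sum(p ** sum(v) * (1 - p) ** (n - sum(v)) for v in coset)
        if mass > 0:
            coset_entropy -= mass * log2(mass)
    return (log2(len(code)) + coset_entropy) / n - h2(p)


if __name__ == "__main__":
    # Single parity-check code of length 3 (locality r = 2).
    print(group_rate_bsc([(1, 1, 0), (0, 1, 1)], 3, 0.1))
```

With a single parity-check code as the local code, this reproduces the rate of Lemma 3.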

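#### Hamming codes as local codes: two erasures per block

By taking the code $\mathcal{A}$ to be the Hamming code of length $r+1$, we can therefore obtain the following result for $\rho = 3$, as the coset weight distribution of the Hamming code is known:

$$\begin{aligned}
\mathrm{Cap}_{\mathrm{BSC}}(p, \rho=3, r) \ge{} & 1 - h(p) - \frac{1}{r+2}\left(1 - (1-2p)^{\frac{r+2}{2}}\right) \log\left(1 - (1-2p)^{\frac{r+2}{2}}\right) \\
& - \frac{1 + (r+1)(1-2p)^{\frac{r+2}{2}}}{(r+1)(r+2)} \log\left(1 + (r+1)(1-2p)^{\frac{r+2}{2}}\right).
\end{aligned}$$

This automatically gives a lower bound on $\mathrm{Cap}_{\mathrm{BEC}}(p, \rho=3, r)$, since the BEC is a more capable channel:

$$\begin{aligned}
\mathrm{Cap}_{\mathrm{BEC}}(p, \rho=3, r) \ge{} & 1 - p - \frac{1}{r+2}\left(1 - (1-2h^{-1}(p))^{\frac{r+2}{2}}\right) \log\left(1 - (1-2h^{-1}(p))^{\frac{r+2}{2}}\right) \\
& - \frac{1 + (r+1)(1-2h^{-1}(p))^{\frac{r+2}{2}}}{(r+1)(r+2)} \log\left(1 + (r+1)(1-2h^{-1}(p))^{\frac{r+2}{2}}\right).
\end{aligned}$$

At $p = 0$ this bound evaluates to $1 - \frac{\log(r+2)}{r+1}$. Note that, from the upper bound, we have $\mathrm{Cap}_{\mathrm{BEC}}(0, \rho=3, r) \le 1 - \frac{2}{r+1}$. Therefore the bounds are, in general, not tight even at $p = 0$.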
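The Hamming-code bound for the BSC and its value at $p = 0$ can be evaluated with a few lines of code. The sketch below assumes base-2 logarithms and that $r + 2$ is a power of two, so that a Hamming code of length $r+1$ exists; the function names are hypothetical.

```python
from math import log2


def h2(x: float) -> float:
    """Binary entropy function in bits."""
    return 0.0 if x <= 0.0 or x >= 1.0 else -x * log2(x) - (1 - x) * log2(1 - x)


def hamming_bsc_lower(p: float, r: int) -> float:
    """Lower bound on Cap_BSC(p, rho=3, r) with Hamming local codes of length r+1."""
    d = (1 - 2 * p) ** ((r + 2) / 2)
    # The first term vanishes in the limit d -> 1, i.e. at p = 0.
    t1 = (1 - d) / (r + 2) * log2(1 - d) if d < 1 else 0.0
    t2 = (1 + (r + 1) * d) / ((r + 1) * (r + 2)) * log2(1 + (r + 1) * d)
    return 1 - h2(p) - t1 - t2


def bec_upper(p: float, rho: int, r: int) -> float:
    """Converse bound on Cap_BEC(p, rho, r) stated at the start of this section."""
    return 1 - p - (rho - 1) * (1 - p) ** (r + 1) / (r + 1)


if __name__ == "__main__":
    r = 6                                   # Hamming code of length 7
    print(hamming_bsc_lower(0.0, r), bec_upper(0.0, 3, r))
```

For $r = 6$ the two values at $p = 0$ are $1 - \frac{3}{7}$ and $1 - \frac{2}{7}$, which illustrates the remaining gap between the bounds.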

## 6 Open problems

There are some compelling open problems left to study regarding capacity of LRCs. First of all, for a BSC, the gap to capacity is not exactly characterized. We conjecture that the upper bound on the gap (see Table 1) is tight.

Not much is known regarding the capacity of generalized notions of LRCs. Even for an LRC that corrects two erasures per repair group, the capacity is unknown in the BEC (the bounds are not tight even when the erasure probability is zero).

Finally, while we do not foresee an obstacle to extending the results to larger alphabets, it would be good to have them documented.

Acknowledgement: The author is grateful to Hamed Hassani for letting him know about the notion of ‘more capable’ channels and their potential use in this context, and to Chandra Nair for discussions on the more capable channels.

## References

• [1] A. Agarwal, A. Barg, S. Hu, A. Mazumdar, and I. Tamo. Combinatorial alphabet-dependent bounds for locally recoverable codes. IEEE Transactions on Information Theory, 64(5):3481–3492, 2018.
• [2] A. Barg, K. Haymaker, E. W. Howe, G. L. Matthews, and A. Várilly-Alvarado. Locally recoverable codes from algebraic curves and surfaces. In Algebraic Geometry for Coding Theory and Cryptography, pages 95–127. Springer, 2017.
• [3] V. R. Cadambe and A. Mazumdar. Bounds on the size of locally recoverable codes. IEEE Transactions on Information Theory, 61(11):5787–5794, 2015.
• [4] I. Csiszár and J. Körner. Information Theory: Coding Theorems for Discrete Memoryless Systems. Academic Press, 1981.
• [5] Y. Geng, C. Nair, S. S. Shitz, and Z. V. Wang. On broadcast channels with binary inputs and symmetric outputs. IEEE Transactions on Information Theory, 59(11):6980–6989, 2013.
• [6] P. Gopalan, C. Huang, H. Simitci, and S. Yekhanin. On the locality of codeword symbols. IEEE Transactions on Information Theory, 58(11):6925–6934, 2012.
• [7] J. Körner and K. Marton. Comparison of two noisy channels. In I. Csiszár and P. Elias, editors, Topics in Information Theory, pages 411–423, 1977.
• [8] A. Mazumdar. Storage capacity of repairable networks. IEEE Transactions on Information Theory, 61(11):5810–5821, 2015.
• [9] A. Mazumdar, V. Chandar, and G. W. Wornell. Update-efficiency and local repairability limits for capacity approaching codes. IEEE Journal on Selected Areas in Communications, 32(5), 2014.
• [10] D. S. Papailiopoulos and A. G. Dimakis. Locally repairable codes. In Proc. IEEE International Symposium on Information Theory (ISIT), pages 2771–2775, Cambridge, MA, July 2012.
• [11] N. Prakash, G. M. Kamath, V. Lalitha, and P. V. Kumar. Optimal linear codes with a local-error-correction property. In Proc. IEEE International Symposium on Information Theory (ISIT), pages 2776–2780, 2012.
• [12] A. S. Rawat, A. Mazumdar, and S. Vishwanath. Cooperative local repair in distributed storage. EURASIP Journal on Advances in Signal Processing, 2015(1):107, 2015.
• [13] A. S. Rawat, D. S. Papailiopoulos, A. G. Dimakis, and S. Vishwanath. Locality and availability in distributed storage. In Proc. IEEE International Symposium on Information Theory (ISIT), pages 681–685, 2014.
• [14] I. Tamo and A. Barg. A family of optimal locally recoverable codes. IEEE Transactions on Information Theory, 60(8):4661–4676, 2014.
• [15] I. Tamo, A. Barg, and A. Frolov. Bounds on the parameters of locally recoverable codes. IEEE Transactions on Information Theory, 62(6):3070–3083, 2016.
• [16] I. Tamo, D. S. Papailiopoulos, and A. G. Dimakis. Optimal locally repairable codes and connections to matroid theory. IEEE Transactions on Information Theory, 62(12):6661–6671, 2016.
• [17] A. Wang and Z. Zhang. Repair locality with multiple erasure tolerance. IEEE Transactions on Information Theory, 60(11):6979–6987, 2014.