# Double and Triple Erasure-Correcting-Codes over Graphs

In this paper we study array-based codes over graphs for correcting multiple node failures, with applications to neural networks, associative memories, and distributed storage systems. We assume that the information is stored on the edges of a complete undirected graph and that a node failure is the event where all the edges in the neighborhood of a given node have been erased. A code over graphs is called ρ-node-erasure-correcting if it allows the erased edges to be reconstructed upon the failure of any ρ nodes or fewer. We present a binary optimal construction for double-node-erasure correction together with an efficient decoding algorithm when the number of nodes is a prime number. Furthermore, we extend this construction to triple-node-erasure-correcting codes when the number of nodes $n$ is a prime number and two is a primitive element in $\mathbb{Z}_n$. These codes are at most a single bit away from optimality.

12/02/2018

## I Introduction

Networks and distributed storage systems are usually represented as graphs with the information stored in the nodes (vertices) of the graph. In our recent work [YY17, YY18, YY18b], we have introduced a new model which assumes that the information is stored on the edges. This setup is motivated by several information systems. For example, in neural networks, the neural units are connected via links which store and transmit information between the neural units [Hopfield:1988:NNP:65669.104422]. Similarly, in associative memories, the information is stored by associations between different data items [6283016]. Furthermore, representing information in a graph can model a distributed storage system [NetworkCoding] while every two nodes can be connected by a link that represents the information that is shared by the nodes.

In [YY17, YY18, YY18b], we introduced the notion of codes over graphs, a class of codes that store the information on the edges of a complete undirected graph (including self-loops). Thus, each codeword is a labeled graph with $n$ nodes (vertices), and each of its edges stores a symbol over an alphabet $\Sigma$. A node failure is the event where all the edges incident with a given node have been erased, and a code over graphs is called $\rho$-node-erasure-correcting if it allows the contents of the erased edges to be reconstructed upon the failure of any $\rho$ nodes or fewer.

The information stored in a complete undirected graph can be represented by an $n\times n$ symmetric array, and a failure of the $i$th node corresponds to the erasure of the $i$th row and $i$th column in the array. Hence, this problem translates to the problem of correcting symmetric crisscross erasures in square symmetric arrays [DBLP:journals/tit/Roth91]. By the Singleton bound, the number of redundancy edges (i.e., redundancy symbols in the array) of every $\rho$-node-erasure-correcting code must be at least $n\rho-\binom{\rho}{2}$, and a code meeting this bound will be referred to as optimal. While optimal codes are easily constructed from MDS codes, their alphabet size must be at least on the order of $n^2$, and the task of constructing optimal (or close to optimal) codes over graphs over smaller alphabets remains an intriguing problem.
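As a worked count behind this bound (a sketch, assuming the complete graph with self-loops described above): the redundancy must be at least the number of edges erased by $\rho$ node failures, which is the total number of edges minus the edges that survive among the remaining $n-\rho$ nodes.

```latex
\binom{n+1}{2}-\binom{n-\rho+1}{2}
  = \frac{(n+1)n-(n-\rho+1)(n-\rho)}{2}
  = \frac{2n\rho-\rho^{2}+\rho}{2}
  = n\rho-\binom{\rho}{2}.
% For rho = 2 this gives 2n - 1, and for rho = 3 it gives 3n - 3.
```

For two node failures the count is $2n-1$, which matches the redundancy of the double-erasure construction in Section III.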

A natural approach to address this problem is to draw on the extensive existing knowledge of array code constructions such as [DBLP:journals/tc/BlaumBBM95, ButterflyISIT, DBLP:journals/tc/HuangX08, journals/jct/Schmidt10, RDP, DBLP:conf/icc/MohammedVHC10, Raviv16, DBLP:journals/tit/Roth91, S15, DBLP:journals/tit/TamoWB13, DBLP:journals/tit/XuBBW99]. However, the setup of codes over graphs differs from that of classical array codes in two respects. First, the arrays are symmetric, and second, a failure of the $i$th node in the graph corresponds to the failure of the $i$th row and the $i$th column (for the same $i$) in the array. Most existing constructions of array codes are not designed for symmetric arrays, and they do not support this special row–column failure model. Nevertheless, it is still possible to adapt existing code constructions to the special structure of the above erasure model in graphs, as was done in [YY17],[YY18b]. More specifically, based upon product codes [ProductCode1],[ProductCode2], a construction of optimal codes whose alphabet size grows only linearly with $n$ has been proposed. Additionally, binary codes over graphs were designed using rank-metric codes [DBLP:journals/tit/Roth91, journals/jct/Schmidt10, S15]; these codes come relatively close to the Singleton bound but do not attain it. In [YY17],[YY18], a construction of optimal binary codes for two node failures was also presented, based upon ideas from EVENODD codes [DBLP:journals/tc/BlaumBBM95].

Another approach for handling symmetric crisscross erasures (in symmetric arrays) is by using symmetric rank-metric codes. In [journals/jct/Schmidt10], Schmidt presented a construction of linear symmetric binary array codes with minimum rank , where if is even, and otherwise. Such codes can correct any column or row erasures. Hence, it is possible to use these codes to derive -node-failure-correcting codes while setting , as the node failures translate into the erasure of columns and rows. However, the redundancy of these codes is symbols away from the Singleton bound for symmetric crisscross erasures (e.g., for , their redundancy is while the Singleton lower bound is ).

In this paper we take an algebraic approach, such as the one presented in [DBLP:journals/tit/BlaumR93], in order to propose new constructions of binary codes over graphs. In Section II, we formally define codes over graphs and review several basic properties from [YY17, YY18b] that will be used in the paper. In Section III, we present our optimal binary construction for two node failures, and in Section IV its decoding procedure. This construction is simpler than our optimal construction from [YY17],[YY18b]. Then, in Section V, we extend this construction to the case of three node failures. This new construction is at most a single bit away from the Singleton bound, thereby outperforming the construction obtained from [journals/jct/Schmidt10]. Lastly, Section VI concludes the paper.

## II Definitions and Preliminaries

For a positive integer $n$, the set $\{0,1,\ldots,n-1\}$ will be denoted by $[n]$, and for a prime power $q$, $\mathbb{F}_q$ is the finite field of size $q$. A linear code of length $n$ and dimension $k$ over $\mathbb{F}_q$ will be denoted by $[n,k]_q$ or $[n,k,d]_q$, where $d$ denotes its minimum distance. In the rest of this section, we follow the definitions of our previous work [YY17] for codes over graphs.

A graph will be denoted by , where is its set of nodes (vertices) and is its edge set. In this paper, we only study complete undirected graphs with self-loops, and in this case, the edge set of an undirected graph over an alphabet is defined by , with a labeling function . By a slight abuse of notation, every undirected edge in the graph will be denoted by where the order in this pair does not matter, that is, the notation is identical to the notation , and thus there are edges. We will use the notation for such graphs. For the rest of the paper, whenever we refer to a graph we refer to an undirected graph.

The labeling matrix of an undirected graph is an symmetric matrix over denoted by , where . We also use the lower-triangle-labeling matrix of to be the matrix such that if and otherwise . The zero graph will be denoted by where for all , .

## III Optimal Binary Double-Node-Erasure-Correcting Codes

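In this section we present a family of optimal binary linear double-node-erasure-correcting codes with $n$ nodes, where $n$ is a prime number.

Recall that for $i\in[n]$ the neighborhood set of the $i$th node consists of all edges incident with $v_i$, including its self-loop. Let $n$ be a prime number and let $G=(V_n,L)$ be a graph with $n$ vertices. For $h\in[n]$ we define the neighborhood of the $h$th node without its self-loop by

$$S_h=\{\langle v_h,v_\ell\rangle \mid \ell\in[n],\ h\neq\ell\}. \tag{1}$$

We also define for $m\in[n]$ the $m$th diagonal set by

$$D_m=\{\langle v_k,v_\ell\rangle \mid k,\ell\in[n],\ \langle k+\ell\rangle_n=m\}, \tag{2}$$

where $\langle x\rangle_n$ denotes $x \bmod n$.

The sets $S_h$ for $h\in[n]$ will be used to represent parity constraints on the neighborhood of each node, and similarly the sets $D_m$ for $m\in[n]$ will be used to represent parity constraints on the diagonals with slope one in the labeling matrix of $G$. Note that for all $m\in[n]$, the size of $D_m$ is $\frac{n+1}{2}$. This holds since the neighborhood of each node $v_h$ contains only a single edge which belongs to $D_m$, namely the edge $\langle v_h,v_{\langle m-h\rangle_n}\rangle$. Another important observation is that $D_m$ contains only a single self-loop, which is the edge $\langle v_k,v_k\rangle$ with $\langle 2k\rangle_n=m$.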
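The two edge sets can be enumerated directly for a small prime. The following is a minimal sketch, not taken from the paper: the function names `neighborhood_set` and `diagonal_set` are hypothetical, an undirected edge is modeled as a `frozenset` of node indices (so a self-loop has one element), and `n = 7` is an arbitrary choice.

```python
def neighborhood_set(n, h):
    """S_h: edges incident with v_h, excluding the self-loop (hypothetical helper)."""
    return {frozenset((h, l)) for l in range(n) if l != h}


def diagonal_set(n, m):
    """D_m: edges <v_k, v_l> with (k + l) mod n == m, self-loop included."""
    return {frozenset((k, (m - k) % n)) for k in range(n)}


n = 7  # any odd prime; 7 is an arbitrary illustrative choice
for m in range(n):
    D = diagonal_set(n, m)
    # |D_m| = (n + 1) / 2, and exactly one element of D_m is a self-loop.
    assert len(D) == (n + 1) // 2
    assert sum(1 for e in D if len(e) == 1) == 1
    for h in range(n):
        # S_h drops the self-loop, so it meets D_m in one edge unless
        # the self-loop of v_h is the edge of D_m at v_h (2h = m mod n).
        expected = 0 if (2 * h) % n == m else 1
        assert len(neighborhood_set(n, h) & D) == expected
```

The inner check mirrors the counting argument in the text: every neighborhood contributes one edge to each diagonal, and the single self-loop on a diagonal is the only case where that edge lies outside $S_h$.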

###### Example 1

. In Fig. 1 we demonstrate the sets and , where , of a graph on its lower-triangle-labeling matrix .

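We introduce one more useful notation for graphs. Let $G=(V_n,L)$ be a graph. For $i\in[n]$ we denote the neighborhood-polynomials of $G$ to be

$$a'_i(x)=e_{i,0}+e_{i,1}x+e_{i,2}x^2+\cdots+e_{i,n-1}x^{n-1},$$

where for $i,j\in[n]$, $e_{i,j}$ is the label of the edge $\langle v_i,v_j\rangle$. We also denote the neighborhood-polynomial without self-loops of $G$ to be

$$a_i(x)=a'_i(x)-e_{i,i}x^i.$$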

We are now ready to present the construction of optimal double-node-erasure-correcting codes.

###### Construction 1

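Let $n$ be a prime number. The code $\mathcal{C}_2$ over graphs is defined as follows,

$$\mathcal{C}_2=\left\{G=(V_n,L)\ \middle|\ \begin{array}{ll}(a)\ \sum_{\langle v_i,v_j\rangle\in S_h}e_{i,j}=0, & h\in[n]\\ (b)\ \sum_{\langle v_i,v_j\rangle\in D_m}e_{i,j}=0, & m\in[n]\end{array}\right\}.$$

Note that for any graph $G$ over the binary field, it holds that

$$\sum_{h\in[n]}\sum_{\langle v_i,v_j\rangle\in S_h}e_{i,j}=\sum_{h=0}^{n-1}\sum_{\ell=0,\ell\neq h}^{n-1}e_{h,\ell}=2\sum_{h=0}^{n-1}\sum_{\ell=0}^{h-1}e_{h,\ell}=0. \tag{3}$$

Therefore the code $\mathcal{C}_2$ has at most $2n-1$ linearly independent constraints, which implies that its redundancy is at most $2n-1$. Since we will prove in Theorem 1 that $\mathcal{C}_2$ is a double-node-erasure-correcting code, according to the Singleton bound the redundancy of the code is exactly $2n-1$, and thus it is an optimal code.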
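The redundancy claim can be checked numerically for small primes by building the $2n$ parity constraints and computing their rank over $\mathbb{F}_2$. This is a minimal sketch under assumptions of our own: each constraint is stored as an integer bitmask over the edges, and `edges`, `parity_checks`, and `gf2_rank` are hypothetical helper names, not part of the paper.

```python
from itertools import combinations_with_replacement


def edges(n):
    """All undirected edges (i, j) with i <= j, self-loops included."""
    return list(combinations_with_replacement(range(n), 2))


def parity_checks(n):
    """Rows of constraints (a) and (b) as bitmasks over the edge list."""
    idx = {e: t for t, e in enumerate(edges(n))}
    rows = []
    for h in range(n):  # (a): neighborhood S_h, self-loop excluded
        rows.append(sum(1 << idx[tuple(sorted((h, l)))]
                        for l in range(n) if l != h))
    for m in range(n):  # (b): diagonal D_m, self-loop included
        rows.append(sum(1 << idx[(i, j)]
                        for (i, j) in idx if (i + j) % n == m))
    return rows


def gf2_rank(rows):
    """Rank over GF(2) by eliminating the lowest set bit of each pivot."""
    rows, rank = list(rows), 0
    while rows:
        pivot = rows.pop()
        if pivot:
            rank += 1
            low = pivot & -pivot
            rows = [r ^ pivot if r & low else r for r in rows]
    return rank


for n in (3, 5, 7):
    # 2n constraints, exactly one dependency among the neighborhood rows.
    assert gf2_rank(parity_checks(n)) == 2 * n - 1
```

The single lost rank comes from the dependency in (3): the $n$ neighborhood rows sum to zero, while each diagonal row contains a self-loop that appears in no other constraint.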

According to Theorem LABEL:th:mid_dist, in order to prove that is a double-node-erasure-correcting code, we need to show that , that is, for every , . This will be proved in the next theorem.

###### Theorem 1

For every prime number $n$, the code $\mathcal{C}_2$ is an optimal double-node-erasure-correcting code.

###### Proof:

Assume to the contrary that the claim fails, and let $G\in\mathcal{C}_2$ be a nonzero graph that has a vertex cover of size 2, that is, all its nonzero edges are confined to the neighborhoods of some two nodes (a similar proof holds when a single node covers all nonzero edges). By symmetry of the graph, it suffices to prove the above property for the case where the two nodes are $v_0,v_i$ for some $i\in[n]\setminus\{0\}$. During the proof, we assume that $a_\ell(x)$, for $\ell\in[n]$, are the neighborhood polynomials of the graph $G$. We first prove the following two claims.

###### Claim 1

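The following properties hold on the graph $G$:

1. For all $h\in[n]\setminus\{0,i\}$, $e_{h,0}+e_{h,i}=0$.

2. For all $h\in[n]\setminus\{i\}$, $e_{0,h}+e_{i,\langle h-i\rangle_n}=0$.

3. $e_{0,i}=0$.

###### Proof:
1. According to the neighborhood constraint $S_h$ for all $h\in[n]\setminus\{0,i\}$, we have that

$$0=\sum_{\langle v_h,v_\ell\rangle\in S_h}e_{h,\ell}=\sum_{\ell=0,\ell\neq h}^{n-1}e_{h,\ell}=e_{h,0}+e_{h,i},$$

and since $e_{h,\ell}=0$ for all $\ell\in[n]\setminus\{0,i\}$, we get that $e_{h,0}+e_{h,i}=0$.

2. For $h\in[n]\setminus\{i\}$, denote the set $D_h\setminus\{\langle v_0,v_h\rangle,\langle v_i,v_{\langle h-i\rangle_n}\rangle\}$ by $D'_h$. Therefore, we have that

$$0=\sum_{\langle v_\ell,v_{\langle h-\ell\rangle_n}\rangle\in D_h}e_{\ell,\langle h-\ell\rangle_n}=\sum_{\langle v_\ell,v_{\langle h-\ell\rangle_n}\rangle\in D'_h}e_{\ell,\langle h-\ell\rangle_n}+e_{0,h}+e_{i,\langle h-i\rangle_n},$$

and since $e_{\ell,\langle h-\ell\rangle_n}=0$ for all $\langle v_\ell,v_{\langle h-\ell\rangle_n}\rangle\in D'_h$, we get that $e_{0,h}+e_{i,\langle h-i\rangle_n}=0$.

3. According to the diagonal constraint $D_i$ we get that

$$0=\sum_{\langle v_s,v_\ell\rangle\in D_i}e_{s,\ell}=e_{0,i}+\sum_{\langle v_s,v_\ell\rangle\in D_i\setminus\{\langle v_0,v_i\rangle\}}e_{s,\ell},$$

and since $e_{s,\ell}=0$ for all $\langle v_s,v_\ell\rangle\in D_i\setminus\{\langle v_0,v_i\rangle\}$, we get that $e_{0,i}=0$.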

###### Claim 2

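The following properties hold on the graph $G$:

1. For all $h\in[n]$, $a_h(1)=0$.

2. $a_0(x)+a_i(x)=0$.

3. $a_0(x)+a_i(x)x^i\equiv e_{0,0}+e_{i,i}x^{2i}\pmod{x^n-1}$.

###### Proof:
1. By the definition of the neighborhood constraints, for all $h\in[n]$, $\sum_{\langle v_h,v_\ell\rangle\in S_h}e_{h,\ell}=0$, and therefore

$$a_h(1)=\sum_{\ell=0,\ell\neq h}^{n-1}e_{h,\ell}=\sum_{\langle v_h,v_\ell\rangle\in S_h}e_{h,\ell}=0.$$

2. We have

$$\begin{aligned}a_0(x)+a_i(x)&=e_{0,0}+e_{i,i}x^i+\sum_{\ell=0}^{n-1}e_{0,\ell}x^\ell+\sum_{\ell=0}^{n-1}e_{i,\ell}x^\ell\\&=e_{0,0}+e_{i,i}x^i+\sum_{\ell=0}^{n-1}(e_{0,\ell}+e_{i,\ell})x^\ell\\&\overset{(a)}{=}e_{i,0}(1+x^i)\overset{(b)}{=}0,\end{aligned}$$

where Step (a) holds since by Claim 1(a), $e_{0,h}+e_{i,h}=0$ for all $h\in[n]\setminus\{0,i\}$, and Step (b) holds since by Claim 1(c), $e_{0,i}=0$.

3. We have

$$\begin{aligned}a_0(x)+a_i(x)x^i&=e_{0,0}+e_{i,i}x^{2i}+\sum_{\ell=0}^{n-1}e_{0,\ell}x^\ell+\sum_{\ell=0}^{n-1}e_{i,\ell}x^{\ell+i}\\&\equiv e_{0,0}+e_{i,i}x^{2i}+\sum_{\ell=0}^{n-1}e_{0,\ell}x^\ell+\sum_{\ell=0}^{n-1}e_{i,\langle\ell-i\rangle_n}x^\ell\pmod{x^n-1}\\&\equiv e_{0,0}+e_{i,i}x^{2i}+\sum_{\ell=0}^{n-1}(e_{0,\ell}+e_{i,\langle\ell-i\rangle_n})x^\ell\pmod{x^n-1}\\&\overset{(a)}{\equiv}e_{0,0}+e_{i,i}x^{2i}+(e_{0,i}+e_{i,0})x^i\pmod{x^n-1}\\&\equiv e_{0,0}+e_{i,i}x^{2i}\pmod{x^n-1},\end{aligned}$$

where Step (a) holds since by Claim 1(b), $e_{0,\ell}+e_{i,\langle\ell-i\rangle_n}=0$ for all $\ell\in[n]\setminus\{i\}$.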

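The summation of the equations from Claims 2(b) and 2(c) results in

$$a_i(x)(1+x^i)\equiv e_{0,0}+e_{i,i}x^{2i}\pmod{x^n-1}.$$

It holds that $e_{0,0}=e_{i,i}$ by substituting $x=1$ in the last equation. Assume that $e_{0,0}=e_{i,i}=1$, so we get that

$$a_i(x)(1+x^i)\equiv 1+x^{2i}\pmod{x^n-1}.$$

Since $1+x^{2i}=(1+x^i)^2$ over $\mathbb{F}_2$, it holds that

$$(1+x^i)(1+x^i+a_i(x))\equiv 0\pmod{x^n-1}.$$

Denote by $p(x)$ the polynomial $1+x^i+a_i(x)$, and since $a_i(1)=0$, it holds that $p(1)=0$. Recall that $\gcd(1+x^i,M_n(x))=1$, where $M_n(x)=(x^n-1)/(x+1)$, and since

$$(1+x^i)p(x)=(x^n-1)s(x)=M_n(x)(x+1)s(x)$$

for some polynomial $s(x)$ over $\mathbb{F}_2$, we deduce that $M_n(x)\mid p(x)$. Together with $p(1)=0$ we get that $(x^n-1)\mid p(x)$, however $\deg p(x)\le n-1$, and so we deduce that $p(x)=0$, that is, $a_i(x)=1+x^i$. This results in a contradiction since the coefficient of $x^i$ in $a_i(x)$ is 0. Thus $e_{0,0}=e_{i,i}=0$ and

$$a_i(x)(1+x^i)\equiv 0\pmod{x^n-1}.$$

Notice that, by the same argument, $M_n(x)\mid a_i(x)$, and by Claim 2(a) it also holds that $a_i(1)=0$, that is, $(x+1)\mid a_i(x)$. Since $\gcd(M_n(x),x+1)=1$, we derive that $(x^n-1)\mid a_i(x)$, and since $\deg a_i(x)\le n-1$, we immediately get that $a_i(x)=0$. Finally, from Claim 2(b) we get also that $a_0(x)=0$, and together we get that $G$ is the zero graph, which is a contradiction. This completes the proof. ∎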
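For small primes, the statement of Theorem 1 can also be checked exhaustively: two node failures are correctable exactly when the parity-check columns of the $2n-1$ erased edges are linearly independent over $\mathbb{F}_2$. The sketch below is an illustration under our own encoding, not the paper's proof technique; `edge_column`, `gf2_rank`, and `corrects_two_node_erasures` are hypothetical names.

```python
from itertools import combinations, combinations_with_replacement


def edge_column(n, i, j):
    """Parity-check column of edge <v_i, v_j> as a bitmask:
    bit h (h < n) marks constraint S_h, bit n + m marks constraint D_m."""
    col = 1 << (n + (i + j) % n)
    if i != j:  # self-loops take part only in the diagonal constraint
        col |= (1 << i) | (1 << j)
    return col


def gf2_rank(vectors):
    """Rank over GF(2) by eliminating the lowest set bit of each pivot."""
    vectors, rank = list(vectors), 0
    while vectors:
        pivot = vectors.pop()
        if pivot:
            rank += 1
            low = pivot & -pivot
            vectors = [v ^ pivot if v & low else v for v in vectors]
    return rank


def corrects_two_node_erasures(n):
    """True if the erased columns are independent for every pair of nodes."""
    all_edges = list(combinations_with_replacement(range(n), 2))
    for u, v in combinations(range(n), 2):
        erased = [edge_column(n, i, j) for i, j in all_edges
                  if {i, j} & {u, v}]
        if gf2_rank(erased) != len(erased):
            return False
    return True


for n in (3, 5, 7):
    assert corrects_two_node_erasures(n)
```

Independence of the erased columns means the only codeword supported on those edges is the zero graph, which is the property established in the proof above.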

## IV Decoding of the Double-Node-Erasure-Correcting Codes

In Section III, we proved that the code $\mathcal{C}_2$ can correct the failure of any two nodes in the graph. Note that whenever two nodes fail, the number of unknown variables is $2n-1$, and so a naive decoding solution for the code $\mathcal{C}_2$ is to solve the linear system of the $2n-1$ independent constraints in these $2n-1$ variables. However, the complexity of such a solution is $O(n^{\omega})$, where it is only known that $2\le\omega<2.373$, as it requires the inversion of a $(2n-1)\times(2n-1)$ matrix [LG14]. Our main goal in this section is a decoding algorithm for $\mathcal{C}_2$ of time complexity $\Theta(n^2)$. Clearly, this time complexity is optimal since the input size of the graph is $\Theta(n^2)$.

Throughout this section we assume that $G$ is a graph in the code $\mathcal{C}_2$ and that $a_\ell(x)$ for $\ell\in[n]$ are its neighborhood polynomials. We also assume that the failed nodes are $v_0,v_i$ for some $i\in[n]\setminus\{0\}$. First, we define the following two polynomials $S_1(x),S_2(x)$, which will be called the syndrome polynomials

$$S_1(x)=\sum_{\ell=1,\ell\neq i}^{n-1}a_\ell(x),\qquad S_2(x)\equiv\sum_{\ell=1,\ell\neq i}^{n-1}a_\ell(x)x^\ell\pmod{x^n-1}.$$

Note that if no nodes have failed in the graph $G$, then we can easily compute both of these polynomials since we know the values of all the edges. When both $v_0$ and $v_i$ have failed, however, this becomes a far less trivial problem. Nevertheless, using several properties that will be proved in this section, we will show that it is still possible to compute $S_1(x)$ entirely, and to compute all the coefficients of $S_2(x)$ except those of $x^0$ and $x^{\langle 2i\rangle_n}$, even though the nodes $v_0,v_i$ failed.

Our goal in this section is to prove the following theorem.

###### Theorem 2

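There exists an efficient decoding procedure for the code $\mathcal{C}_2$ given any two node failures. Its complexity is $\Theta(n^2)$, where $n$ is the number of nodes.

Before we present the proof of Theorem 2, we prove a few properties of the code $\mathcal{C}_2$ that will help us to present the decoding procedure.

###### Claim 3

It holds that

1. $\sum_{\ell=0}^{n-1}a_\ell(x)=0$.

2. $\sum_{\ell=0}^{n-1}a_\ell(x)x^\ell=0$.

###### Proof:
1. The coefficient of some $x^k$ is the sum of all edges $\langle v_k,v_\ell\rangle$, where $\ell\in[n]\setminus\{k\}$, and so we get that

$$\sum_{\ell=0}^{n-1}a_\ell(x)=\sum_{\ell=0}^{n-1}\left(\sum_{k=0,k\neq\ell}^{n-1}e_{k,\ell}x^k\right)=\sum_{k=0}^{n-1}\left(\sum_{\ell=0,\ell\neq k}^{n-1}e_{k,\ell}\right)x^k=0,$$

where the second transition is a result of changing the order of the sum and the last equality holds by the neighborhood constraint on the $k$th node.

2. Note that

$$\begin{aligned}\sum_{\ell=0}^{n-1}a_\ell(x)x^\ell&=\sum_{\ell=0}^{n-1}\left(\sum_{k=0,k\neq\ell}^{n-1}e_{k,\ell}x^k\right)x^\ell=\sum_{\ell=0}^{n-1}\sum_{k=0,k\neq\ell}^{n-1}e_{k,\ell}x^{\ell+k}\\&\overset{(a)}{=}\sum_{\ell=0}^{n-1}\sum_{k=0}^{\ell-1}e_{k,\ell}x^{\ell+k}+\sum_{\ell=0}^{n-1}\sum_{k=\ell+1}^{n-1}e_{k,\ell}x^{\ell+k}\\&\overset{(b)}{=}\sum_{\ell=1}^{n-1}\sum_{k=0}^{\ell-1}e_{k,\ell}x^{\ell+k}+\sum_{k=1}^{n-1}\sum_{\ell=0}^{k-1}e_{k,\ell}x^{\ell+k}\\&\overset{(c)}{=}\sum_{\ell=1}^{n-1}\sum_{k=0}^{\ell-1}e_{k,\ell}x^{\ell+k}+\sum_{\ell=1}^{n-1}\sum_{k=0}^{\ell-1}e_{k,\ell}x^{\ell+k}=0.\end{aligned}$$

Step (a) holds by splitting the sum, Step (b) is a result of changing the summation order in the second sum and noticing that in the first sum the iteration $\ell=0$ is empty, and lastly in Step (c) we simply exchanged the variables $k$ and $\ell$ in the second sum and used $e_{\ell,k}=e_{k,\ell}$.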
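Both identities can be observed on small examples. The sketch below is an illustration under stated assumptions rather than a sample from the code itself: the first identity only uses the neighborhood constraints, so the test graph is built as a XOR of random triangles (every node then has even degree), and the second identity is checked modulo $x^n-1$ by cyclic shifts. All function names are hypothetical.

```python
import random


def neighborhood_poly(E, i):
    """Coefficient list of a_i(x): entry k is e_{i,k}, self-loop term dropped."""
    return [E[i][k] if k != i else 0 for k in range(len(E))]


def shift(p, s):
    """Multiply p(x) by x^s modulo x^n - 1 (cyclic shift of coefficients)."""
    n = len(p)
    return [p[(k - s) % n] for k in range(n)]


def xor_all(polys):
    """Sum of polynomials over GF(2), coefficient by coefficient."""
    return [sum(col) % 2 for col in zip(*polys)]


def random_even_degree_graph(n, triangles, rng):
    """Symmetric 0/1 label matrix in which every node has even degree
    (self-loops are set freely, they do not enter a_i(x))."""
    E = [[0] * n for _ in range(n)]
    for _ in range(triangles):
        a, b, c = rng.sample(range(n), 3)
        for u, v in ((a, b), (b, c), (a, c)):
            E[u][v] ^= 1
            E[v][u] ^= 1
    for v in range(n):
        E[v][v] = rng.randint(0, 1)
    return E


n = 7
E = random_even_degree_graph(n, triangles=5, rng=random.Random(0))
polys = [neighborhood_poly(E, l) for l in range(n)]
assert xor_all(polys) == [0] * n                       # part 1 of the claim
assert xor_all([shift(polys[l], l) for l in range(n)]) == [0] * n  # part 2
```

The second check succeeds for any symmetric labeling, since each edge $\langle v_k,v_\ell\rangle$ with $k\neq\ell$ contributes to $x^{k+\ell}$ exactly twice.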

As an immediate result of Claim 3, we get the following corollary.

###### Corollary 3

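It holds that

$$S_1(x)=a_0(x)+a_i(x),\qquad S_2(x)\equiv a_0(x)+a_i(x)x^i\pmod{x^n-1}.$$

###### Proof:

According to Claim 3 we get that

$$\begin{aligned}S_1(x)&=\sum_{\ell=1,\ell\neq i}^{n-1}a_\ell(x)\\&=a_0(x)+a_i(x)+\sum_{\ell=0}^{n-1}a_\ell(x)\\&=a_0(x)+a_i(x),\end{aligned}$$

and also,

$$\begin{aligned}S_2(x)&\equiv\sum_{\ell=1,\ell\neq i}^{n-1}a_\ell(x)x^\ell\pmod{x^n-1}\\&\equiv a_0(x)+a_i(x)x^i+\sum_{\ell=0}^{n-1}a_\ell(x)x^\ell\pmod{x^n-1}\\&\equiv a_0(x)+a_i(x)x^i\pmod{x^n-1}.\end{aligned}$$

Now we show that it is possible to compute $S_1(x)$, and almost compute $S_2(x)$, as explained above.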

###### Claim 4

Given the two node failures $v_0,v_i$, it is possible to exactly compute the polynomial $S_1(x)$.

###### Proof:

Let us consider the coefficient of $x^k$ in $S_1(x)$ for all $k\in[n]$. For each $\ell$ in the sum, the edge $e_{k,\ell}$ is added to the coefficient of $x^k$, and so we get that

$$S_1(x)=\sum_{\ell=1,\ell\neq i}^{n-1}a_\ell(x)=\sum_{\ell=1,\ell\neq i}^{n-1}\left(\sum_{k=0,k\neq\ell}^{n-1}e_{k,\ell}x^k\right)=\sum_{k=0}^{n-1}\left(\sum_{\ell=1,\ell\neq k,i}^{n-1}e_{k,\ell}\right)x^k,$$

where in the last transition the summation order has been changed. For any $k\in[n]\setminus\{0,i\}$, we can compute the coefficient of $x^k$ since we know all the edges in the sum. In case that $k=0$, we get that the coefficient of $x^0$ is

$$\sum_{\ell=1,\ell\neq i}^{n-1}e_{\ell,0}=e_{i,0},$$

by the constraint $S_0$. For $k=i$, we get that the coefficient of $x^i$ is

$$\sum_{\ell=1,\ell\neq i}^{n-1}e_{\ell,i}=e_{0,i},$$

by the constraint $S_i$. Lastly, we know the value of $e_{0,i}$ by the diagonal constraint $D_i$, and therefore we know all the coefficients of $S_1(x)$. ∎

###### Claim 5

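It is possible to compute all of the coefficients of the polynomial $S_2(x)$ except for the coefficients of $x^0$ and $x^{\langle 2i\rangle_n}$. Furthermore, the coefficients of these monomials are $e_{i,\langle -i\rangle_n}$ and $e_{0,\langle 2i\rangle_n}$, respectively.

###### Proof:

According to the definition of the polynomial $S_2(x)$ and Corollary 3 we know that

$$S_2(x)\equiv a_0(x)+a_i(x)x^i\pmod{x^n-1},$$

which implies that

$$\begin{aligned}S_2(x)&\equiv e_{0,0}+e_{i,i}x^{\langle 2i\rangle_n}+\sum_{\ell=0}^{n-1}e_{0,\ell}x^\ell+\sum_{\ell=0}^{n-1}e_{i,\ell}x^{\ell+i}\pmod{x^n-1}\\&\equiv e_{0,0}+e_{i,i}x^{\langle 2i\rangle_n}+\sum_{\ell=0}^{n-1}e_{0,\ell}x^\ell+\sum_{\ell=0}^{n-1}e_{i,\langle\ell-i\rangle_n}x^\ell\pmod{x^n-1}\\&\equiv e_{0,0}+e_{i,i}x^{\langle 2i\rangle_n}+\sum_{\ell=0}^{n-1}(e_{0,\ell}+e_{i,\langle\ell-i\rangle_n})x^\ell\pmod{x^n-1}.\end{aligned}$$

Notice that for all $\ell\in[n]\setminus\{0,i,\langle 2i\rangle_n\}$, all the values of the edges in the set $D'_\ell$ are known, and therefore, according to the diagonal constraint $D_\ell$, the value of $e_{0,\ell}+e_{i,\langle\ell-i\rangle_n}$ is calculated by

$$e_{0,\ell}+e_{i,\langle\ell-i\rangle_n}=\sum_{\langle v_k,v_{\langle\ell-k\rangle_n}\rangle\in D'_\ell}e_{k,\langle\ell-k\rangle_n}.$$

The coefficient of $x^i$ is $e_{0,i}+e_{i,0}=0$. Finally, the only coefficients in this polynomial that we cannot compute are the coefficients of $x^0$ and $x^{\langle 2i\rangle_n}$, which are $e_{i,\langle -i\rangle_n}$ and $e_{0,\langle 2i\rangle_n}$, respectively.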

###### Claim 6

Given the values of $e_{i,\langle -i\rangle_n}$ and $e_{0,\langle 2i\rangle_n}$, we can compute the polynomials $a_0(x)$ and $a_i(x)$, i.e., decode the failed nodes $v_0,v_i$.

###### Proof:

Assume that the values of $e_{i,\langle -i\rangle_n}$ and $e_{0,\langle 2i\rangle_n}$ are known. This implies that we can compute exactly the polynomial $S_1(x)$ as well as $S_2(x)$, and let us denote

$$S_1(x)+S_2(x)\equiv\sum_{k=0}^{n-1}s_kx^k\pmod{x^n-1},$$

that is, the coefficients $s_k$ for $k\in[n]$ are known. According to Corollary 3 we have that

$$S_1(x)=a_0(x)+a_i(x),\qquad S_2(x)\equiv a_0(x)+a_i(x)x^i\pmod{x^n-1}.$$

Adding up these two equations results in

$$S_1(x)+S_2(x)\equiv a_i(x)+a_i(x)x^i\pmod{x^n-1}.$$

Thus, we get the following equations with the variables $e_{i,k}$ for $k\in[n]\setminus\{i\}$. For all $k\in[n]\setminus\{i,\langle 2i\rangle_n\}$ we get the equation

$$e_{i,k}+e_{i,\langle k-i\rangle_n}=s_k,$$

for $k=i$ we get the equation

$$e_{i,0}=s_i,$$

and lastly for $k=\langle 2i\rangle_n$ we get the equation

$$e_{i,\langle 2i\rangle_n}=s_{\langle 2i\rangle_n}.$$

We know that this linear system of equations has a single solution by Theorem 3. Hence, by solving it, we decode the polynomial $a_i(x)$, and by the equality $S_1(x)=a_0(x)+a_i(x)$ we can decode $a_0(x)$ as well. An important observation is that the number of nonzero entries in our linear system of equations is exactly $2n-2$, thus the time complexity to solve this linear system of equations is $O(n)$ [GostavsonMatrix]. ∎

To summarize, given the values of $e_{i,\langle -i\rangle_n}$ and $e_{0,\langle 2i\rangle_n}$, an efficient decoding procedure with time complexity $\Theta(n^2)$ works as follows:

1. Compute $S_1(x)$.

2. Compute $S_2(x)$.

3. Solve the linear system of equations induced from the equality

$$S_1(x)+S_2(x)\equiv a_i(x)+a_i(x)x^i\pmod{x^n-1}$$

in order to decode $a_i(x)$.

4. Use the equality $S_1(x)=a_0(x)+a_i(x)$ in order to decode $a_0(x)$.
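Step 3 can be carried out without a general sparse solver. The sketch below uses plain forward substitution along the multiples of $i$, which is our own choice and not necessarily the solver cited in the text; it assumes that $n$ is an odd prime, that $0<i<n$, and that the coefficients $s_k$ of $S_1(x)+S_2(x)$ are already known. The names `solve_neighborhood` and `times_one_plus_x_i` are hypothetical.

```python
def solve_neighborhood(s, i):
    """Recover the coefficients e_{i,k} of a_i(x) from the coefficients s_k of
    a_i(x)(1 + x^i) mod x^n - 1. Assumes n = len(s) is an odd prime, 0 < i < n.
    The returned list e has e[i] = 0 since a_i(x) has no self-loop term."""
    n = len(s)
    e = [0] * n
    e[0] = s[i]                  # equation for k = i:      e_{i,0} = s_i
    prev = (2 * i) % n
    e[prev] = s[prev]            # equation for k = <2i>_n: e_{i,<2i>} = s_<2i>
    for _ in range(n - 3):       # k = <3i>_n, <4i>_n, ..., <(n-1)i>_n
        k = (prev + i) % n
        e[k] = s[k] ^ e[prev]    # e_{i,k} + e_{i,<k-i>_n} = s_k
        prev = k
    return e


def times_one_plus_x_i(a, i):
    """Coefficients of a(x)(1 + x^i) mod x^n - 1 over GF(2) (test helper)."""
    n = len(a)
    return [a[k] ^ a[(k - i) % n] for k in range(n)]


# Round trip on an arbitrary a_i(x) with zero coefficient at x^i.
a_i = [1, 0, 1, 0, 1, 1, 0]      # n = 7, i = 3
s = times_one_plus_x_i(a_i, 3)
assert solve_neighborhood(s, 3) == a_i
```

Because $n$ is prime, the multiples of $i$ visit every index, so each unknown is fixed by one XOR and the whole step takes $O(n)$ operations. The remaining equation for $k=0$ is not needed and can serve as a consistency check.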

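Now all that is left to show in order to prove Theorem 2 is the decoding of the edges $e_{0,\langle 2i\rangle_n},e_{i,\langle -i\rangle_n}$ and of the self-loops $e_{0,0},e_{i,i}$. This will be done in two steps; first we will decode the values of $e_{0,\langle 2i\rangle_n},e_{i,\langle -i\rangle_n}$ and then we will derive the values of $e_{0,0},e_{i,i}$. The former edges will be decoded using the following algorithm.

Using a similar algorithm we decode the value $e_{i,\langle -i\rangle_n}$ as well. To prove the correctness of Algorithm 1, it suffices that we prove the following claim.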

###### Claim 7

All steps in Algorithm 1 can be computed and, furthermore, $\mathrm{sum}=e_{0,\langle 2i\rangle_n}$.

###### Proof:

First note that the edge can be decoded according to the diagonal constraint since all the edges in this constraint are known besides . The values receives in the while loop of the algorithm are and for every value of it is possible to compute by the neighborhood constraint of . Similarly, the value of is computed by the diagonal constraint .

From the while loop of Algorithm 1, we have that

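$$\begin{aligned}\mathrm{sum}&=e_{0,i}+\sum_{k=1}^{\frac{n-3}{2}}(d_{2k+1}+f_{2k+1})\\&=e_{0,i}+\sum_{k=1}^{\frac{n-3}{2}}\left(e_{0,\langle(2k+1)\cdot i\rangle_n}+e_{0,\langle(2k+2)\cdot i\rangle_n}\right)\\&\overset{(a)}{=}\sum_{\ell\in[n]\setminus\{0,\langle 2i\rangle_n\}}e_{0,\ell}\overset{(b)}{=}e_{0,\langle 2i\rangle_n}.\end{aligned}$$

Step (a) holds since $i$ is a generator of the group $\mathbb{Z}_n$, and thus $\langle 3i\rangle_n,\langle 4i\rangle_n,\ldots,\langle(n-1)i\rangle_n$ are all distinct elements in $[n]\setminus\{0,i,\langle 2i\rangle_n\}$, and since we also added the term $e_{0,i}$ to this summation. Lastly, Step (b) holds by the neighborhood constraint of $v_0$, and we get that $\mathrm{sum}=e_{0,\langle 2i\rangle_n}$. ∎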

We are now ready to conclude with the proof of Theorem 2.

###### Proof:

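Using a similar algorithm to Algorithm 1 we can decode the edge $e_{i,\langle -i\rangle_n}$, and using the diagonal constraints $D_0,D_{\langle 2i\rangle_n}$ we can lastly decode $e_{0,0},e_{i,i}$ from $e_{i,\langle -i\rangle_n},e_{0,\langle 2i\rangle_n}$, respectively. This concludes the proof of the decoding procedure and of Theorem 2. ∎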

## V Binary Triple-Node-Erasure-Correcting Codes

In this section we present a construction of binary triple-node-erasure-correcting codes for undirected graphs. Let $n$ be a prime number such that $2$ is a primitive element in $\mathbb{Z}_n$. Let $G=(V_n,L)$ be a graph with $n$ vertices. We will use in this construction the edge sets $S_h,D_m$ for $h,m\in[n]$, which were defined in (1),(2), respectively. In addition, for $s\in[n]$ we define the edge set

$$T_s=\{\langle v_k,v_\ell\rangle \mid k,\ell\in[n],\ \langle k+2\ell\rangle_n=s,\ k\neq\ell\}.$$

In this construction we impose the same constraints from Construction 1, that is, the sets $S_h$ will be used to represent parity constraints on the neighborhood of each node, the sets $D_m$ will represent parity constraints on the diagonals with slope one of the labeling matrix, and furthermore the sets $T_s$ will represent parity constraints on the diagonals with slope two of the labeling matrix.
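The two ingredients of this section, the condition on $n$ and the slope-two sets, can be explored for small values. This is a minimal sketch with hypothetical helper names (`two_is_primitive`, `slope_two_set`); it assumes $n$ is an odd prime, and edges are again modeled as `frozenset`s of node indices.

```python
def two_is_primitive(n):
    """True if 2 generates the multiplicative group modulo the odd prime n,
    i.e. the multiplicative order of 2 modulo n equals n - 1."""
    order, x = 1, 2 % n
    while x != 1:
        x = (2 * x) % n
        order += 1
    return order == n - 1


def slope_two_set(n, s):
    """T_s: edges <v_k, v_l>, k != l, with (k + 2*l) mod n == s."""
    return {frozenset(((s - 2 * l) % n, l))
            for l in range(n) if (s - 2 * l) % n != l}


# Small primes for which 2 is primitive; for example 7 is excluded
# because 2 has order 3 modulo 7.
assert [p for p in (3, 5, 7, 11, 13, 17) if two_is_primitive(p)] == [3, 5, 11, 13]

n = 11
for s in range(n):
    T = slope_two_set(n, s)
    assert all(len(e) == 2 for e in T)   # no self-loops by definition
    assert len(T) == n - 1               # one value of l is dropped (k == l)
```

For a prime $n>3$ each $T_s$ has exactly $n-1$ edges: every $\ell$ determines $k=\langle s-2\ell\rangle_n$, the single $\ell$ with $\langle 3\ell\rangle_n=s$ is excluded, and no edge is produced twice.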

###### Example 2

. In Fig. 2 we present the sets , of a graph on its labeling matrix , and its lower-triangle-labeling matrix .

We are now ready to show the following construction.

###### Construction 2

For every prime number $n$ such that $2$ is primitive in $\mathbb{Z}_n$, let $\mathcal{C}_3$ be the following code:

 C3=⎧⎪ ⎪ ⎪⎨⎪ ⎪ ⎪⎩G=(Vn,L)∣∣ ∣ ∣ ∣∣(a)∑⟨vi,vj⟩∈Shei,j=0,