# The decoding failure probability of MDPC codes

Moderate Density Parity Check (MDPC) codes are defined here as codes which have a parity-check matrix whose row weight is $O(\sqrt{n})$, where $n$ is the length of the code. They can be decoded like LDPC codes, but they decode far fewer errors than LDPC codes: the number of errors they can decode in this case is of order $\Theta(\sqrt{n})$. Despite this fact, they have proved very useful in cryptography for devising key exchange mechanisms. They have also been proposed in McEliece-type cryptosystems. However, in this case, the parameters that were proposed in [MTSB13] were broken in [GJS16]. This attack exploits the fact that the decoding failure probability is non-negligible. We show here that this attack can be thwarted by choosing the parameters in a more conservative way. We first show that such codes can decode, with a simple bit-flipping decoder, any pattern of $O\left(\frac{\sqrt{n}\log\log n}{\log n}\right)$ errors. This avoids the previous attack at the cost of significantly increasing the key size of the scheme. We then show that under a very reasonable assumption the decoding failure probability decays almost exponentially with the codelength with just two iterations of bit-flipping. With an additional assumption it has even been proved that it decays exponentially with an unbounded number of iterations, and we show that in this case the increase of the key size which is required for resisting the attack of [GJS16] is only moderate.


## 1 Introduction

Virtually all the public key cryptography used in practice today can be attacked in polynomial time by a quantum computer [Sho94]. Even if such a quantum computer does not exist yet, finding viable solutions which would be resistant to a quantum computer is expected to be a lengthy process. This is one of the reasons why the NIST has recently launched a process for standardizing public key cryptographic algorithms that would be safe against a quantum adversary. Code-based cryptography is believed to be quantum resistant and is therefore considered as a viable solution. The McEliece system [McE78] based on binary Goppa codes, which is almost as old as RSA, is a public key cryptosystem that falls into this category. It has withstood all cryptanalyses up to now. It is well known to provide extremely fast encryption and fast decryption [BS08, BCS13], but has large public keys, about 200 kilobytes for 128 bits of security and slightly less than one megabyte for 256 bits of security [BLP08].

There have been many attempts to decrease the key size of this system. One of the most satisfying answers up to now has been to use Moderate Density Parity Check (MDPC) codes. The rows of the parity-check matrix that defines them have weight of order $O(\sqrt{n})$, where $n$ is the length of the code. This family is very attractive since (i) the decryption algorithm is extremely simple and is based on a very simple bit-flipping decoding algorithm, (ii) direct attacks on the key really amount to a problem of the same nature as decoding a linear code. This can be used to give a security proof [MTSB13].

This work builds upon the following observation: decoding $t$ errors in a generic linear code, or finding a codeword of weight $w$ in the linear code to which this word has been added, are problems that are polynomially equivalent. This problem is considered very hard when $t$ or $w$ is large enough: after decades of active research [Pra62, LB88, Leo88, Ste88, Dum91, Bar97, FS09, BLP11, MMT11, BJMM12, MO15, DT17, BM17] the best algorithms solving it are still exponential in $t$ (or $w$), with a complexity (replace $t$ by $w$ in the low-weight search problem) that depends on the number of solutions of the problem and on the rate of the code. This holds even for algorithms in the quantum computing model [Ber11, KT17]. Moreover, the relative exponent has decreased only very slowly after years of active research on the topic.

The proposal made in [MTSB13] exploits this. It suggests using MDPC codes of rate $\frac{1}{2}$ and of a certain length $n$ in a McEliece scheme, which are able to decode $t = \Theta(\sqrt{n})$ errors with parity-check equations of weight $w = \Theta(\sqrt{n})$. Recovering the plaintext without knowing the secret parity-check matrix of the code amounts to decoding $t$ errors in a linear code, which is conjectured to be hard [Ale11], whereas recovering the secret MDPC structure amounts to finding codewords of roughly the same weight as $t$ in the dual code, which has the same dimension as the code that is decoded. Both problems are equally hard here. Note that this would not be the case had we taken LDPC codes. They can decode many more errors; however, in this case finding the low-weight parity-checks can be done in polynomial time, which breaks the system completely, as observed in [MRAS00].

However, there is a problem with the security proof of [MTSB13] because it does not take into account the decoding failure probability. This is not necessarily a problem in a setting where the scheme is used to devise ephemeral keys [BGG17, ABB17]. However, in security models where an attacker is allowed to query the decryption oracle many times, this can be a problem, as observed by [GJS16], which showed how to attack the parameters proposed in [MTSB13]. This attack really exploits the non-negligible decoding failure probability of the MDPC codes chosen in [MTSB13]. If this probability were as low as $2^{-\lambda}$, where $2^{\lambda}$ is the complexity of the best attack that the scheme has to sustain, then this would not be a problem and the security proof of [BGG17] could be used to show the security of the scheme under this stronger attack model. This raises the question of whether the error probability of MDPC codes can be made extremely small for affordable parameters.

We give several different answers to this question. We study it in depth in the regime which is particularly interesting for these cryptographic applications, namely when the weight of the parity-check equations is of order $\sqrt{n}$, where $n$ is the length of the MDPC code. Throughout the article, MDPC codes are defined as follows.

###### Definition 1 (MDPC code).

An MDPC code is a binary linear code that admits a parity-check matrix whose rows are all of weight $O(\sqrt{n})$, where $n$ is the length of the code. In the case where this parity-check matrix has rows of the same weight $w$ and columns of the same weight $v$, we say that the parity-check matrix is of type $(v,w)$. By some abuse of terminology, we will also call the corresponding code a code of type $(v,w)$.

To simplify the analysis, we will decode these codes with an even simpler bit-flipping decoding algorithm than the one considered in [MTSB13]. One round of decoding is just majority-logic decoding based on a sparse parity-check matrix of the code. When we perform just one round of bit-flipping we call this decoder a majority-logic decoder. Recall that a majority-logic decoder based on a certain parity-check matrix computes for every bit $i$ the number $u_i$ of parity-checks that involve the bit and are unsatisfied. Let $n_i$ be the number of parity-checks involving bit $i$. If for a bit $i$ we have $u_i > n_i/2$ (i.e. if a strict majority of such parity-checks is violated) the bit gets flipped. The parity-check equations used for making the decision on a given bit may depend on the bit (in particular they may have disjoint supports outside the bit they help to decode). This is not the path we follow here. It turns out that for an MDPC code, we can use all the parity-check equations defining the MDPC code without too much penalty in doing so. We will assume here that the computation of the $u_i$'s is done in parallel, so that flipping one bit does not affect the other $u_i$'s. In other words, the decoder works as given in Algorithm 1 for a prescribed number of iterations.
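The decoder described above can be sketched in a few lines of Python. This is only a minimal illustration of the parallel bit-flipping rule with threshold $n_i/2$, not a transcription of Algorithm 1: the function name `bit_flip_decode`, the dense list-of-lists representation of the parity-check matrix and the toy matrix used at the end are assumptions made for the example.

```python
from typing import List, Sequence


def bit_flip_decode(H: Sequence[Sequence[int]], y: Sequence[int],
                    iterations: int = 1) -> List[int]:
    """Parallel bit-flipping decoding of the binary word y with parity-check matrix H."""
    r, n = len(H), len(H[0])
    word = list(y)
    # n_i: number of parity-checks involving bit i (the weight of column i of H).
    checks_per_bit = [sum(H[row][i] for row in range(r)) for i in range(n)]
    for _ in range(iterations):
        # A parity-check is unsatisfied when its syndrome bit equals 1.
        syndrome = [sum(H[row][i] & word[i] for i in range(n)) % 2 for row in range(r)]
        if not any(syndrome):
            break  # every parity-check is satisfied, nothing would be flipped
        # u_i: number of unsatisfied parity-checks involving bit i.
        # All counters are computed from the same word, before any bit is flipped.
        counters = [sum(H[row][i] & syndrome[row] for row in range(r)) for i in range(n)]
        # Flip bit i when a strict majority of its parity-checks is unsatisfied.
        word = [(word[i] ^ 1) if 2 * counters[i] > checks_per_bit[i] else word[i]
                for i in range(n)]
    return word


# Toy usage (hypothetical example): 9 bits arranged in a 3 x 3 grid,
# with one parity-check per grid row and one per grid column.
H = [[1 if i // 3 == k else 0 for i in range(9)] for k in range(3)] + \
    [[1 if i % 3 == k else 0 for i in range(9)] for k in range(3)]
received = [0, 0, 0, 0, 1, 0, 0, 0, 0]  # all-zero codeword with a single error
decoded = bit_flip_decode(H, received, iterations=1)
```

With `iterations=1` the function performs majority-logic decoding; larger values give further rounds of bit-flipping on the updated word.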

A crucial quantity will play an important role, namely

###### Definition 2 (maximum column intersection).

Let $H = (h_{ij})$ be a binary matrix. The intersection number of two different columns $j$ and $j'$ of $H$ is equal to the number of rows $i$ for which $h_{ij} = h_{ij'} = 1$. The maximum column intersection of $H$ is equal to the maximum intersection number of two different columns of $H$.

The point is that it is readily verified (see Proposition 1) that an MDPC code of type $(v,w)$ corrects all errors of weight $\le \lfloor \frac{v}{2s} \rfloor$ by majority-logic decoding (i.e. Algorithm 1 with a single iteration) when the maximum column intersection is $s$. What makes this result interesting is that for most MDPC codes the maximum column intersection is really small. Namely, we prove that for a natural random MDPC code model, the maximum column intersection of the parity-check matrix defining the MDPC code is typically of order $O\left(\frac{\log n}{\log\log n}\right)$. Computing the maximum intersection number can obviously be performed in polynomial time, and this allows us to give a randomized polynomial-time algorithm for constructing MDPC codes of length $n$ and fixed rate that correct any pattern of $O\left(\frac{\sqrt{n}\log\log n}{\log n}\right)$ errors with the majority-logic decoder.

Moreover, under a reasonable assumption on the first round of the bit-flipping decoder, the same MDPC codes correct $\Theta(\sqrt{n})$ errors with two iterations of a bit-flipping decoder with decoding failure probability of order $e^{-\Omega\left(\frac{n\log\log n}{\log n}\right)}$. It should be noted that under an additional assumption on the subsequent iterations of the bit-flipping decoder, it has been proved in [ABB17] that MDPC codes correct $\Theta(\sqrt{n})$ errors by performing an unbounded number of bit-flipping iterations with a probability of error that decays exponentially in $n$. We also provide some concrete numbers to show that it is possible to construct MDPC codes that completely avoid the attack of [GJS16] and for which it is possible to provide a security proof in strong security models: with a significant key size overhead compared to the parameters proposed in [MTSB13] if we want to stay in the no-error scenario, with a reasonable overhead if we make the first assumption mentioned above, and with a moderate overhead if we make both assumptions mentioned above.

## 2 Majority-logic decoding and its performance for MDPC codes

We start this section by relating the error-correction capacity of an MDPC code to the maximum column intersection of its defining parity-check matrix. We then show that for typical MDPC codes the intersection number is small, and that this allows us to efficiently construct MDPC codes that correct all patterns of $O\left(\frac{\sqrt{n}\log\log n}{\log n}\right)$ errors.

### 2.1 Error correction capacity of an MDPC code vs. maximum column intersection

As explained in the introduction, the maximum column intersection can be used to lower-bound the worst-case error-correction performance of the majority-logic decoder (Algorithm 1 with a single iteration). The precise statement is given by the following proposition.

###### Proposition 1.

Consider a code with a parity-check matrix for which every column has weight at least $v$ and whose maximum column intersection is $s$. Performing majority-logic decoding based on this matrix (i.e. Algorithm 1 with a single iteration) corrects all errors of weight $\le \lfloor \frac{v}{2s} \rfloor$.

###### Proof.

We denote by $H = (h_{ij})$ the parity-check matrix we use for performing majority-logic decoding and by $t$ the number of errors. We assume that $t \le \lfloor \frac{v}{2s} \rfloor$. We number the parity-check equations of the code from $1$ to $r$. For $i$ in $\{1,\dots,r\}$ denote by $E_i$ the subset of positions which are in error and in the support of the $i$-th parity-check equation (i.e. the positions $j$ in error for which $h_{ij} = 1$). We consider now what happens to a bit $j$ in the algorithm. There are two cases to consider.

Case 1: $j$ is erroneous. We can upper-bound the number $s_j$ of satisfied parity-check equations involving this bit by the number of parity-check equations involving this bit whose support contains at least $2$ errors. We consider now the bipartite graph $G_j$ associated to $j$, which is constructed as follows. Its set of vertices is the union of the set $A_j$ of positions different from $j$ which are in error and the set $B_j$ of parity-check equations that involve the position $j$ and whose support contains at least $2$ errors. There is an edge between a position $j'$ in $A_j$ and a parity-check equation $i$ in $B_j$ if and only if the parity-check equation involves $j'$, that is $h_{ij'} = 1$. Let $e_j$ be the number of edges of $G_j$ and let $n_j$ be the number of parity-check equations involving $j$. We observe now that

$$s_j \le \#\{i : h_{ij} = 1, |E_i| \ge 2\} \le e_j \le s\,\#A_j \le s(t-1) \le s\left(\left\lfloor \frac{v}{2s} \right\rfloor - 1\right) < \frac{v}{2}.$$

The first inequality is just the first observation, the second holds because every parity-check equation in $B_j$ is incident to at least one edge, and the third follows from the fact that the degree in $G_j$ of any vertex of $A_j$ is at most $s$ by the assumption on the maximum intersection number of $H$. Since $s_j < v/2 \le n_j/2$, strictly more than half of the parity-check equations involving $j$ are unsatisfied. It follows that the majority-logic decoder necessarily flips the bit and therefore corrects the corresponding error.

Case 2: there is no error in position $j$. We can upper-bound the number $u_j$ of unsatisfied parity-check equations involving this position in a similar way. This time we consider the graph $G'_j$ whose vertex set is the union of $A'_j$, which is the set of positions which are in error, and the set $B'_j$ of parity-check equations involving $j$ whose support contains this time at least one error. We put an edge between a position $j'$ in $A'_j$ and a parity-check equation $i$ in $B'_j$ if and only if the parity-check equation involves $j'$. Let $e'_j$ be the number of edges of $G'_j$. Similarly to what we did above, we observe now that

$$u_j \le \#\{i : h_{ij} = 1, |E_i| \ge 1\} \le e'_j \le s\,\#A'_j \le st \le s\left\lfloor \frac{v}{2s} \right\rfloor \le \frac{v}{2} \le \frac{n_j}{2}.$$

In other words we will not flip this bit. ∎
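As a sanity check of Proposition 1, the statement can be verified exhaustively on a small matrix. The sketch below is only an illustration: the matrix (incidence matrix of four parallel classes of lines in the affine plane over $\mathbb{Z}_5$) and all function names are choices made for this example, not part of the paper. Its columns have weight $v = 4$ and any two columns share at most $s = 1$ row, so one round of majority-logic decoding should correct every error of weight at most $\lfloor v/(2s) \rfloor = 2$.

```python
from itertools import combinations


def affine_line_matrix(p=5, slopes=(0, 1, 2)):
    """Rows are lines y = a*x + b (a in slopes) and vertical lines x = c over Z_p;
    columns are the p*p points (x, y)."""
    points = [(x, y) for x in range(p) for y in range(p)]
    rows = [[1 if (y - a * x - b) % p == 0 else 0 for (x, y) in points]
            for a in slopes for b in range(p)]
    rows += [[1 if x == c else 0 for (x, y) in points] for c in range(p)]
    return rows


def max_column_intersection(H):
    """Largest number of rows in which two distinct columns both have a 1."""
    n = len(H[0])
    supports = [{i for i, row in enumerate(H) if row[j]} for j in range(n)]
    return max(len(supports[j] & supports[k]) for j, k in combinations(range(n), 2))


def majority_logic_round(H, word):
    """One parallel round: flip bit j iff a strict majority of its checks is unsatisfied."""
    r, n = len(H), len(H[0])
    syndrome = [sum(H[i][j] & word[j] for j in range(n)) % 2 for i in range(r)]
    result = []
    for j in range(n):
        involved = [i for i in range(r) if H[i][j]]
        unsatisfied = sum(syndrome[i] for i in involved)
        result.append((word[j] ^ 1) if 2 * unsatisfied > len(involved) else word[j])
    return result


def corrects_all_errors(H, max_weight):
    """Check every error pattern of weight <= max_weight added to the zero codeword."""
    n = len(H[0])
    for weight in range(max_weight + 1):
        for positions in combinations(range(n), weight):
            error = [1 if j in positions else 0 for j in range(n)]
            if any(majority_logic_round(H, error)):
                return False
    return True


H = affine_line_matrix()
v = min(sum(row[j] for row in H) for j in range(len(H[0])))  # minimum column weight
s = max_column_intersection(H)
t = v // (2 * s)  # error weight guaranteed by Proposition 1
all_corrected = corrects_all_errors(H, t)
```

Testing only the all-zero codeword is enough here because the decision of the decoder depends on the received word only through its syndrome, which is determined by the error.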

### 2.2 A random model for MDPC codes of type (v,w)

There are several ways to build random MDPC codes of type $(v,w)$. The one which is used in cryptography [BBC08, MTSB13, DGZ17, BGG17, ABB17, BBC17] is to construct them as quasi-cyclic codes. Our proof technique can also be applied to this case, but since there are several different constructions of this kind, each of which would require adapting the proof technique, we will consider a more general model here. It is based on Gallager's construction of LDPC codes [Gal63]. We will construct an $r \times n$ random parity-check matrix of type $(v,w)$ by assuming that $n$ is a multiple of $w$, that $r$ is a multiple of $v$, and that $rw = nv$ (this condition is necessary in order to obtain a matrix of type $(v,w)$). Let $P_{n,w}$ be a matrix of size $\frac{n}{w} \times n$ constructed as follows

$$P_{n,w} = I_{n/w} \otimes \mathbf{1}_w = \begin{pmatrix} \mathbf{1}_w & & \\ & \ddots & \\ & & \mathbf{1}_w \end{pmatrix},$$

where $I_{n/w}$ denotes the identity matrix of size $n/w$ and $\mathbf{1}_w$ a row vector of length $w$ whose entries are all equal to $1$, that is $\mathbf{1}_w = (1,\dots,1)$. We then choose $v$ permutations $\pi_1,\dots,\pi_v$ of length $n$ at random, and they define a parity-check matrix $H(\pi_1,\dots,\pi_v)$ of size $r \times n$ of type $(v,w)$ as

$$H(\pi_1,\dots,\pi_v) = \begin{pmatrix} P^{\pi_1}_{n,w} \\ P^{\pi_2}_{n,w} \\ \vdots \\ P^{\pi_v}_{n,w} \end{pmatrix},$$

where $P^{\pi}_{n,w}$ denotes the matrix $P_{n,w}$ whose columns have been permuted with $\pi$. We denote by $\mathcal{D}_{r,n,v,w}$ the associated probability distribution of binary matrices of size $r \times n$ and type $(v,w)$ we obtain when the $\pi_i$'s are chosen uniformly at random.
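The construction above translates directly into code. The following Python sketch is one possible way to sample such a matrix and to measure its maximum column intersection; the function names, the dense row representation and the parameters in the usage lines are illustrative assumptions.

```python
import random
from itertools import combinations


def sample_parity_check(n, v, w, rng=random):
    """Stack v column-permuted copies of the (n/w) x n block whose k-th row
    covers columns k*w, ..., k*w + w - 1 (Gallager-type construction)."""
    if n % w != 0:
        raise ValueError("n must be a multiple of w")
    H = []
    for _ in range(v):
        perm = list(range(n))
        rng.shuffle(perm)  # one uniformly random permutation per block
        for k in range(n // w):
            row = [0] * n
            # After permuting the columns, row k covers perm[k*w : (k+1)*w].
            for column in perm[k * w:(k + 1) * w]:
                row[column] = 1
            H.append(row)
    return H


def max_column_intersection(H):
    """Largest number of rows in which two distinct columns both have a 1."""
    n = len(H[0])
    supports = [{i for i, row in enumerate(H) if row[j]} for j in range(n)]
    return max(len(supports[j] & supports[k]) for j, k in combinations(range(n), 2))


# Hypothetical small parameters: n = 100, type (v, w) = (5, 10), hence r = 50 rows.
H = sample_parity_check(100, 5, 10)
s = max_column_intersection(H)
```

Each block contributes exactly one $1$ to every column, which is why the resulting matrix has column weight $v$ and row weight $w$, and why two columns can meet in at most one row per block.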

### 2.3 The maximum intersection number of matrices drawn according to $\mathcal{D}_{r,n,v,w}$

The maximum intersection number of matrices drawn according to $\mathcal{D}_{r,n,v,w}$ turns out to be remarkably small when $v$ and $w$ are of order $\sqrt{n}$: it is typically of order $O\left(\frac{\log n}{\log\log n}\right)$. To prove this claim we first observe that

###### Lemma 1.

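Consider a matrix $H$ drawn at random according to the distribution $\mathcal{D}_{r,n,v,w}$. Take two arbitrary columns $j$ and $j'$ of $H$ and let $n_{jj'}$ be the intersection number of $j$ and $j'$. We have for all $t \in \{0,\dots,v\}$

$$\mathbb{P}(n_{jj'} = t) = \binom{v}{t}\left(\frac{w-1}{n-1}\right)^{t}\left(1 - \frac{w-1}{n-1}\right)^{v-t}.$$

###### Proof.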

Recall that $H$ is of the form

$$H(\pi_1,\dots,\pi_v) = \begin{pmatrix} P^{\pi_1}_{n,w} \\ P^{\pi_2}_{n,w} \\ \vdots \\ P^{\pi_v}_{n,w} \end{pmatrix},$$

for some permutations $\pi_1,\dots,\pi_v$ chosen uniformly at random in $S_n$. A row $i$ of $H$ is called a coincidence if and only if $h_{ij} = h_{ij'} = 1$. There is obviously at most one coincidence in each of the blocks $P^{\pi_k}_{n,w}$. We claim now that the probability of a coincidence in each of these blocks is $\frac{w-1}{n-1}$. To verify this, consider the row $i$ of block $P^{\pi_k}_{n,w}$ which is such that $h_{ij} = 1$. The probability that there is a coincidence for this block is the probability that $h_{ij'} = 1$, which amounts to the fact that $\pi_k(j')$ takes its values in a subset of $w-1$ values among $n-1$ possible values. All of these are equiprobable. This shows the claim. Since the coincidences that occur in the blocks are all independent (since the $\pi_k$'s are independent) we obtain the aforementioned formula. ∎

We use this to prove the following result

###### Proposition 2.

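Let $\alpha$ and $\beta$ be two constants such that $0 < \alpha < \beta$. Assume we draw a parity-check matrix $H$ at random according to the distribution $\mathcal{D}_{r,n,v,w}$ where we assume that both $v$ and $w$ satisfy $\alpha\sqrt{n} \le v, w \le \beta\sqrt{n}$. Then for any $\varepsilon > 0$ the maximum intersection number of $H$ is smaller than $(2+\varepsilon)\frac{\ln n}{\ln\ln n}$ with probability $1 - o(1)$ as $n$ tends to infinity.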

###### Proof.

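Let us number the columns of $H$ from $1$ to $n$. For $i$ and $j$ in $\{1,\dots,n\}$ two different columns of $H$, we denote by $E_{ij}$ the event that the intersection number of $i$ and $j$ is $\ge t$. Let $\mathbb{P}(E_t)$ be the probability that the maximum intersection number is larger than or equal to $t$. By the union bound, and then Lemma 1, we obtain

$$\mathbb{P}(E_t) = \mathbb{P}\Big(\bigcup_{1 \le i < j \le n} E_{ij}\Big) \le \sum_{1 \le i < j \le n} \mathbb{P}(E_{ij}) \le n^2 \sum_{s=t}^{v} \binom{v}{s}\left(\frac{w-1}{n-1}\right)^{s}\left(1 - \frac{w-1}{n-1}\right)^{v-s}.$$

From this we deduce

$$\mathbb{P}(E_t) \le n^2 \sum_{s=t}^{v} \frac{v^v}{s^s (v-s)^{v-s}}\left(\frac{w-1}{n-1}\right)^{s}\left(1 - \frac{w-1}{n-1}\right)^{v-s},$$

where we use the well-known upper bound $\binom{v}{s} \le \frac{v^v}{s^s (v-s)^{v-s}}$. Using $\frac{w-1}{n-1} \le \frac{w}{n}$ and bounding the last factor by $1$, this allows us to write

$$\begin{aligned}
\mathbb{P}(E_t) &\le n^2 \sum_{s=t}^{v} \frac{v^v}{s^s (v-s)^{v-s}}\left(\frac{w}{n}\right)^{s} \\
&\le n^2 \sum_{s=t}^{v} \frac{v^{v-s}}{(v-s)^{v-s}}\left(\frac{v \cdot w}{s \cdot n}\right)^{s} \\
&\le n^2 \sum_{s=t}^{v} \left(1 + \frac{s}{v-s}\right)^{v-s}\left(\frac{v \cdot w}{s \cdot n}\right)^{s} \\
&\le n^2 \sum_{s=t}^{v} \left(\frac{e \cdot v \cdot w}{s \cdot n}\right)^{s} \\
&\le n^2 \sum_{s=t}^{v} \left(\frac{e\beta^2}{s}\right)^{s}.
\end{aligned}$$

Choose now $t = \gamma\frac{\ln n}{\ln\ln n}$ with $\gamma = 2 + \varepsilon$ for some $\varepsilon > 0$. When $n$ is large enough, we have that $\frac{e\beta^2}{t} < 1$. In such a case we can write

$$\mathbb{P}(E_t) \le n^2 \sum_{s=t}^{v} \left(\frac{e\beta^2}{t}\right)^{s} \le n^2 \frac{\left(\frac{e\beta^2}{t}\right)^{t}}{1 - \frac{e\beta^2}{t}} \le n^2 \left(\frac{K}{t}\right)^{t}$$

for some constant $K$. This implies that

$$\mathbb{P}(E_t) \le n^2 e^{(2+\varepsilon)\frac{\ln n}{\ln\ln n}\ln\left(\frac{K\ln\ln n}{\gamma\ln n}\right)} \le e^{-\varepsilon\ln n + \gamma\frac{\ln n\,\ln\left(\frac{K\ln\ln n}{\gamma}\right)}{\ln\ln n}} = o(1)$$

as $n$ tends to infinity. ∎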
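For concrete parameters, the union bound used at the start of this proof can be evaluated numerically from the formula of Lemma 1. The Python sketch below does this; the function names, the `target` probability and the parameters in the usage lines are hypothetical choices for illustration, and the bound is applied to the random model of Section 2.2 only.

```python
from math import comb


def intersection_pmf(n, v, w, t):
    """P(n_jj' = t) from Lemma 1: binomial with v trials and parameter (w-1)/(n-1)."""
    p = (w - 1) / (n - 1)
    return comb(v, t) * p ** t * (1 - p) ** (v - t)


def union_bound(n, v, w, t):
    """Upper bound on P(maximum column intersection >= t), summed over all column pairs."""
    pairs = n * (n - 1) // 2
    tail = sum(intersection_pmf(n, v, w, s) for s in range(t, v + 1))
    return min(1.0, pairs * tail)


def smallest_safe_threshold(n, v, w, target=0.5):
    """Smallest t for which the union bound on P(max intersection >= t) is <= target."""
    for t in range(v + 2):
        if union_bound(n, v, w, t) <= target:
            return t
    return v + 1  # not reached: the bound is 0 for t = v + 1


# Hypothetical parameters with v and w of order sqrt(n).
n, v, w = 10000, 50, 100
threshold = smallest_safe_threshold(n, v, w)
```

With `target=0.5`, at least half of the matrices drawn from the model have a maximum column intersection strictly below the returned threshold, so the rejection sampling of Corollary 1 needs two draws on average at most for that threshold.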

This together with Proposition 1 implies directly the following corollary

###### Corollary 1.

There exists a randomized algorithm working in expected polynomial time that outputs, for any designed rate $R \in (0,1)$, an MDPC code of rate at least $R$, of arbitrarily large length $n$ and with parity-check equations of weight $\Theta(\sqrt{n})$, that corrects all patterns of errors of size less than $c\,\frac{\sqrt{n}\ln\ln n}{\ln n}$ for $n$ large enough, where $c$ is some absolute constant.

###### Proof.

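The randomized algorithm is very simple. We choose $n$ to be a square, $n = m^2$ for some integer $m$, and choose $w$ and $v$ of order $\sqrt{n}$, with $w$ dividing $n$ and the ratio $v/w$ fixed by the designed rate. Then we draw a parity-check matrix $H$ at random according to the distribution $\mathcal{D}_{r,n,v,w}$. The corresponding code clearly has rate at least $1 - \frac{r}{n} = 1 - \frac{v}{w}$. We compute the maximum column intersection of $H$. This can be done in polynomial time. If this column intersection is smaller than the bound $s = (2+\varepsilon)\frac{\ln n}{\ln\ln n}$ of Proposition 2, we output the corresponding MDPC code; if not, we draw $H$ at random again until finding a suitable matrix. By Proposition 2 each draw succeeds with probability $1 - o(1)$, and by Proposition 1 we know that such a code can correct all patterns of at most $\lfloor \frac{v}{2s} \rfloor$ errors. This implies the corollary. ∎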

## 3 Analysis of two iterations of bit-flipping

We derived in the previous section a condition ensuring that one round of bit-flipping corrects all the errors. We will now estimate the probability that performing one round of bit-flipping corrects enough errors so that another round of bit-flipping will correct all remaining errors. To analyze the first round of decoding we will model intermediate quantities of the bit-flipping algorithm by binomial distributions. More precisely, consider an MDPC code of type $(v,w)$ and length $n$. The noise model is the following: an error of weight $t$ was chosen uniformly at random and added to the codeword of the MDPC code. For $i \in \{1,\dots,n-t\}$, let $E^0_i$ be the Bernoulli random variable which is equal to $1$ iff the $i$-th position that was not in error initially is in error after the first round of iterative decoding. We also denote by $U^0_i$ the counter associated to the $i$-th position which was not in error. $U^0_i$ is the sum of $v$ Bernoulli random variables $V^0_{ij}$ associated to the parity-check equations involving this bit. The Bernoulli random variable $V^0_{ij}$ is equal to $1$ if and only if the corresponding parity-check is equal to $1$. Note that by definition of the bit-flipping decoder

$$E^0_i = \mathbb{1}_{\{U^0_i > v/2\}}.$$

Similarly, for $i \in \{1,\dots,t\}$ we denote by $E^1_i$ the Bernoulli random variable that is equal to $1$ iff the $i$-th bit that was in error initially stays in error after the first round of Algorithm 1. We also define the $U^1_i$'s and the $V^1_{ij}$'s similarly. In this case

$$E^1_i = \mathbb{1}_{\{U^1_i \le v/2\}}.$$

Let us bring in for $b \in \{0,1\}$:

$$p_b \stackrel{\text{def}}{=} \mathbb{P}(V^b_{ij} = 1). \qquad (5)$$

It is clear that these probabilities do not depend on $i$ and $j$ and that this definition is consistent. It is (essentially) proved in [ABB17] that

###### Lemma 2.

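Assume that $w = O(\sqrt{n})$ and $t = O(\sqrt{n})$. Then

$$p_b = \frac{1}{2} - (-1)^b \varepsilon\left(\frac{1}{2} + O\left(\frac{1}{\sqrt{n}}\right)\right), \qquad (6)$$

where $\varepsilon \stackrel{\text{def}}{=} e^{-\frac{2wt}{n}}$.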
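The estimate of Lemma 2 can be compared numerically with an exact computation. The Python sketch below assumes that, conditioned on containing a given bit, the other $w-1$ positions of a parity-check form a uniformly random subset of the remaining $n-1$ positions, and it takes $\varepsilon = e^{-2wt/n}$; both the model and the function names are assumptions of this illustration, and the parameters are hypothetical.

```python
from math import comb, exp


def odd_probability(population, marked, draws):
    """P(odd number of marked items) when drawing `draws` items without
    replacement from `population` items, `marked` of which are marked."""
    total = comb(population, draws)
    odd = sum(comb(marked, k) * comb(population - marked, draws - k)
              for k in range(1, min(marked, draws) + 1, 2))
    return odd / total


def exact_check_probabilities(n, w, t):
    """(p_0, p_1): probability that a parity-check involving a bit is unsatisfied,
    when the bit is correct (p_0) or in error (p_1)."""
    # Correct bit: unsatisfied iff an odd number of the t errors hit the w-1 other positions.
    p0 = odd_probability(n - 1, t, w - 1)
    # Erroneous bit: unsatisfied iff an even number of the t-1 other errors hit them.
    p1 = 1.0 - odd_probability(n - 1, t - 1, w - 1)
    return p0, p1


def approximate_check_probabilities(n, w, t):
    """Leading term of Lemma 2: p_b is close to 1/2 - (-1)^b * eps / 2."""
    eps = exp(-2 * w * t / n)
    return 0.5 - eps / 2, 0.5 + eps / 2


# Hypothetical parameters with w and t of order sqrt(n).
n, w, t = 10000, 100, 100
p0, p1 = exact_check_probabilities(n, w, t)
a0, a1 = approximate_check_probabilities(n, w, t)
```

The exact values use Python's arbitrary-precision integers, so the binomial coefficients involved do not overflow even though they are very large.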

We will recall a proof of this statement in the appendix. We will now make the following assumption that simplifies the analysis

###### Assumption 1.

When we use Algorithm 1 on an MDPC code of type $(v,w)$, we assume that

• for all $i \in \{1,\dots,n-t\}$ the counters $U^0_i$ of Algorithm 1 are distributed like sums of $v$ independent Bernoulli random variables of parameter $p_0$ at the first iteration, and the $E^0_i$'s are independent;

• for all $i \in \{1,\dots,t\}$ the counters $U^1_i$ of Algorithm 1 are distributed like sums of $v$ independent Bernoulli random variables of parameter $p_1$ at the first iteration, and the $E^1_i$'s are independent.

The experiments performed in [Cha17] corroborate this assumption for the first iteration of the bit-flipping decoder. To analyze the behavior of Algorithm 1 we will use the following lemma which is just a slight generalization of Lemma 6 in [ABB17]

###### Lemma 3.

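Under Assumption 1 used for an MDPC code of type $(v,w)$, and when the error is chosen uniformly at random among the errors of weight $t$, we have for all $b \in \{0,1\}$,

$$\mathbb{P}(E^b_i = 1) = O\left(\frac{(1-\varepsilon^2)^{v/2}}{\sqrt{v}\,\varepsilon}\right),$$

where $\varepsilon \stackrel{\text{def}}{=} e^{-\frac{2wt}{n}}$.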

For ease of reading, the proof of this lemma is also recalled in the appendix. Under Assumption 1, $\mathbb{P}(E^b_i = 1)$ does not depend on $i$; we will denote it by

$$q_b \stackrel{\text{def}}{=} \mathbb{P}(E^b_i = 1).$$

We let

$$S_0 \stackrel{\text{def}}{=} E^0_1 + \dots + E^0_{n-t}, \qquad S_1 \stackrel{\text{def}}{=} E^1_1 + \dots + E^1_t.$$

$S_0$ is the number of errors that were introduced after one round of iterative decoding by flipping bits that were initially correct. Similarly, $S_1$ is the number of errors that are left after one round of iterative decoding because bits that were initially incorrect were not flipped. Let $S = S_0 + S_1$, which represents the total number of errors that are left after the first round of iterative decoding. We quantify the probability that this quantity does not decay enough by the following theorem, which holds under Assumption 1.
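Under Assumption 1, the quantities $q_0$ and $q_1$ are plain binomial tail probabilities and can be evaluated directly. The Python sketch below does so, using the leading term $p_b \approx \frac{1}{2} - (-1)^b \frac{\varepsilon}{2}$ with $\varepsilon = e^{-2wt/n}$ in place of the exact $p_b$, and compares the result with the quantity $(1-\varepsilon^2)^{v/2}/(\sqrt{v}\,\varepsilon)$ appearing in Lemma 3. The function names and the parameters are hypothetical choices for this illustration.

```python
from math import comb, exp, sqrt


def binomial_pmf(v, p, k):
    """P(X = k) for X binomial with v trials and parameter p."""
    return comb(v, k) * p ** k * (1 - p) ** (v - k)


def flip_error_probabilities(v, eps):
    """(q_0, q_1) under Assumption 1, with p_b approximated by 1/2 -/+ eps/2."""
    p0, p1 = 0.5 - eps / 2, 0.5 + eps / 2
    # q_0: a correct bit is flipped, i.e. its counter exceeds v/2.
    q0 = sum(binomial_pmf(v, p0, k) for k in range(v + 1) if 2 * k > v)
    # q_1: an erroneous bit is not flipped, i.e. its counter is at most v/2.
    q1 = sum(binomial_pmf(v, p1, k) for k in range(v + 1) if 2 * k <= v)
    return q0, q1


def lemma3_scale(v, eps):
    """The quantity inside the O(.) of Lemma 3 (without any hidden constant)."""
    return (1 - eps ** 2) ** (v / 2) / (sqrt(v) * eps)


def expected_remaining_errors(n, t, v, eps):
    """E[S] = (n - t) * q_0 + t * q_1, by linearity of expectation."""
    q0, q1 = flip_error_probabilities(v, eps)
    return (n - t) * q0 + t * q1


# Hypothetical parameters: code of type (v, w) = (50, 100), length n, t errors.
n, v, w, t = 10000, 50, 100, 30
eps = exp(-2 * w * t / n)
q0, q1 = flip_error_probabilities(v, eps)
scale = lemma3_scale(v, eps)
mean_S = expected_remaining_errors(n, t, v, eps)
```

The expectation only needs linearity, not the independence part of Assumption 1; independence matters for tail bounds on $S$ such as the one stated next.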

###### Theorem 1.

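Under Assumption 1, we have for an MDPC code of type $(v,w)$ where $v = \Theta(\sqrt{n})$ and $w = \Theta(\sqrt{n})$:

$$\mathbb{P}(S \ge t') \le \frac{1}{\sqrt{t'}}\, e^{\frac{t'v}{4}\ln(1-\varepsilon^2) + \frac{t'}{8}\ln(n) + O\left(t'\ln(t'/t)\right)},$$

where $\varepsilon \stackrel{\text{def}}{=} e^{-\frac{2wt}{n}}$.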

From this theorem we deduce that

###### Corollary 2.

Provided that Assumption 1 holds, we can construct in expected polynomial time, for any designed rate $R \in (0,1)$, an MDPC code of rate at least $R$, of arbitrarily large length $n$ and with parity-check equations of weight $\Theta(\sqrt{n})$, where the probability of error after two iterations of bit-flipping is upper-bounded by $e^{-\Omega\left(\frac{n\ln\ln n}{\ln n}\right)}$ when there are $\Theta(\sqrt{n})$ errors.

###### Proof.

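We use the construction given in the proof of Corollary 1 to construct an MDPC code of type $(v,w)$ of length $n$ with $v = \Theta(\sqrt{n})$ and $w = \Theta(\sqrt{n})$ that corrects all patterns of errors of size less than $c\,\frac{\sqrt{n}\ln\ln n}{\ln n}$ for $n$ large enough, where $c$ is some absolute constant, with just one round of the bit-flipping decoder of Algorithm 1. Then we use Theorem 1 with $t' = c\,\frac{\sqrt{n}\ln\ln n}{\ln n}$: since $\varepsilon$ is bounded away from $0$ when $t = \Theta(\sqrt{n})$, the exponent is dominated by the term $\frac{t'v}{4}\ln(1-\varepsilon^2) = -\Omega\left(\frac{n\ln\ln n}{\ln n}\right)$. Hence, except with probability upper-bounded by $e^{-\Omega\left(\frac{n\ln\ln n}{\ln n}\right)}$, there remain fewer than $t'$ errors after one round of Algorithm 1, and the second round corrects them. This proves the corollary. ∎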

In [ABB17] an additional assumption is made, namely that the probability of error is dominated by the probability that the first round of decoding is not able to decrease the number of errors by some multiplicative factor $\alpha$. With the notation of this section, this assumption can be described as follows.

###### Assumption 2.

There exists some constant $\alpha > 0$ such that the probability of error for an unbounded number of iterations of Algorithm 1 is upper-bounded by $\mathbb{P}(S \ge \alpha t)$, where $S$ is the number of errors that are left after the first round of Algorithm 1 and $t$ is the initial number of errors.

This assumption also agrees with the experiments performed in [Cha17]. With this additional assumption (Assumption 1 is also made) it is proven in [ABB17] that the probability of error decays exponentially in the codelength when the number of errors and the weight of the parity-check equations are of order $\sqrt{n}$. This is actually obtained by a slightly less general version of Theorem 1 (see [ABB17, Theorem 1]).

## 4 Numerical results

In this section we provide numerical results showing how much we have to increase the parameters proposed in [MTSB13] in order to obtain a probability of error which is below $2^{-\lambda}$, where $\lambda$ is the security parameter (i.e. $2^{\lambda}$ should be the complexity of the best attacks on the scheme). The upper bound on the probability that the maximum column intersection is larger than some bound, obtained from Lemma 1 together with an obvious union bound, is a little bit loose, and we performed numerical tests in order to estimate the maximum column intersection. To speed up the calculation, and at the same time to be closer to the cryptographic applications, we considered the particular code structure used in [MTSB13, BGG17, ABB17], namely a quasi-cyclic code whose parity-check matrix $H$ is formed by two circulant blocks $H_0$ and $H_1$, i.e. $H = (H_0 \mid H_1)$. The weight of the rows of $H_0$ and $H_1$ was chosen to be $w/2$ (with $w$ even) so that we have a code of type $(w/2, w)$. The maximum column intersection given in Table 1 corresponds to the smallest number $s$ such that more than a fixed fraction of the sampled parity-check matrices had a maximum column intersection $\le s$. We considered several scenarios:
- Scenario I: Algorithm 1 with a single iteration and a zero decoding failure probability when there are $t$ errors;
- Scenario II: Algorithm 1 with two iterations and a non-zero decoding failure probability, upper-bounded by $2^{-\lambda}$ under Assumption 1, when there are $t$ errors;
- Scenario III: Algorithm 1 with an unbounded number of iterations and a non-zero decoding failure probability when there are $t$ errors, upper-bounded by $2^{-\lambda}$ under Assumption 1 and Assumption 2 for a first value of the constant of Assumption 2;
- Scenario IV: same setting as Scenario III, for a second value of the constant of Assumption 2.

This should be compared with the original parameters proposed in [MTSB13] that were broken in [GJS16].
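The numerical estimation of the maximum column intersection for the double-circulant structure can be sketched as follows in Python. This is only an illustration of the kind of experiment described above, not the code used to produce Table 1: the function names, the column-support representation, the block size `p`, the number of trials and the fraction are all hypothetical.

```python
import random
from itertools import combinations


def circulant_columns(p, support):
    """Supports of the p columns of the p x p circulant matrix whose first
    column has the given support: column j is the first column shifted by j."""
    return [frozenset((i + j) % p for i in support) for j in range(p)]


def sample_double_circulant(p, block_weight, rng=random):
    """Column supports of H = (H_0 | H_1), with two random circulant blocks
    of row and column weight block_weight."""
    columns = []
    for _ in range(2):
        first_column = rng.sample(range(p), block_weight)
        columns.extend(circulant_columns(p, first_column))
    return columns


def max_column_intersection(columns):
    """Largest number of rows shared by two distinct columns."""
    return max(len(a & b) for a, b in combinations(columns, 2))


def empirical_threshold(p, block_weight, trials, fraction, rng=random):
    """Smallest s such that more than `fraction` of the sampled matrices
    have a maximum column intersection <= s."""
    values = [max_column_intersection(sample_double_circulant(p, block_weight, rng))
              for _ in range(trials)]
    for s in sorted(set(values)):
        if sum(1 for x in values if x <= s) > fraction * trials:
            return s
    return max(values)


# Hypothetical toy parameters: blocks of size 101, code of type (9, 18), length 202.
rng = random.Random(2018)
s_estimate = empirical_threshold(101, 9, trials=20, fraction=0.5, rng=rng)
```

Because the blocks are circulant, the intersection number of two columns depends only on their blocks and on the difference of their indices, so a faster implementation could compare one representative column per block against all others instead of all pairs.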

## 5 Concluding remarks

This study shows that it is possible to devise MDPC codes with zero or very small error probability. In the latter case this comes at an affordable cost for cryptographic applications, under assumptions that are corroborated by experimental evidence [Cha17]. There are obviously several ways to improve our results. The first would be to use slightly more sophisticated decoding techniques and/or more sophisticated analyses when we want a zero error probability. The maximum column intersection gives a lower bound on the expansion of the Tanner graph, and this can be used to study the bit-flipping algorithm considered in [SS96]. This would not improve the lower bound on the error-correction capacity, but it suggests that refined considerations and decoding algorithms should be able to improve the error-correction capacity in the worst case. Moreover, in order to simplify the analysis and the discussion we have considered a very simple decoder. The probability of error can already be lowered rather significantly by choosing the threshold in Step 7 of Algorithm 1 in a slightly better way, and it is clear that more sophisticated decoding techniques will be able to lower the probability of error significantly (see [Cha17] for instance). This suggests that it should be possible to improve rather significantly the parameters proposed in Table 1.

## References

• [ABB17] Nicolas Aragon, Paulo Barreto, Slim Bettaieb, Loïc Bidoux, Olivier Blazy, Jean-Christophe Deneuville, Philippe Gaborit, Shay Gueron, Tim Güneysu, Carlos Aguilar Melchor, Rafael Misoczki, Edoardo Persichetti, Nicolas Sendrier, Jean-Pierre Tillich, and Gilles Zémor. BIKE. first round submission to the NIST post-quantum cryptography call, November 2017.
• [Ale11] Michael Alekhnovich. More on average case vs approximation complexity. Computational Complexity, 20(4):755–786, 2011.
• [Bar97] Alexander Barg. Complexity issues in coding theory. Electronic Colloquium on Computational Complexity, October 1997.
• [BBC08] Marco Baldi, Marco Bodrato, and Franco Chiaraluce. A new analysis of the McEliece cryptosystem based on QC-LDPC codes. In Proceedings of the 6th international conference on Security and Cryptography for Networks, SCN ’08, pages 246–262. Springer-Verlag, 2008.
• [BBC17] Marco Baldi, Alessandro Barenghi, Franco Chiaraluce, Gerardo Pelosi, and Paolo Santini. LEDAkem. first round submission to the NIST post-quantum cryptography call, November 2017.
• [BCS13] Daniel J. Bernstein, Tung Chou, and Peter Schwabe. Mcbits: Fast constant-time code-based cryptography. In Guido Bertoni and Jean-Sébastien Coron, editors, Cryptographic Hardware and Embedded Systems - CHES 2013, volume 8086 of LNCS, pages 250–272. Springer, 2013.
• [Ber11] Daniel J. Bernstein. List decoding for binary Goppa codes. In Yeow Meng Chee, Zhenbo Gu, Sang Ling, Fengjing Shao, Yuansheng Tang, Huaxiong Wang, and Chaoping Xing, editors, Coding and cryptology—third international workshop, IWCC 2011, volume 6639 of LNCS, pages 62–80, Qingdao, China, June 2011. Springer.
• [BGG17] Paulo S. L. M. Barreto, Shay Gueron, Tim Güneysu, Rafael Misoczki, Edoardo Persichetti, Nicolas Sendrier, and Jean-Pierre Tillich. CAKE: code-based algorithm for key encapsulation. In Cryptography and Coding - 16th IMA International Conference, IMACC 2017, Oxford, UK, December 12-14, 2017, Proceedings, volume 10655 of LNCS, pages 207–226. Springer, 2017.
• [BGT11] Céline Blondeau, Benoît Gérard, and Jean-Pierre Tillich. Accurate estimates of the data complexity and success probability for various cryptanalyses. Des. Codes Cryptogr., 59(1-3):3–34, 2011.
• [BJMM12] Anja Becker, Antoine Joux, Alexander May, and Alexander Meurer. Decoding random binary linear codes in $2^{n/20}$: How $1+1=0$ improves information set decoding. In Advances in Cryptology - EUROCRYPT 2012, LNCS. Springer, 2012.
• [BLP08] Daniel J. Bernstein, Tanja Lange, and Christiane Peters. Attacking and defending the McEliece cryptosystem. In Post-Quantum Cryptography 2008, volume 5299 of LNCS, pages 31–46, 2008.
• [BLP11] Daniel J. Bernstein, Tanja Lange, and Christiane Peters. Smaller decoding exponents: ball-collision decoding. In Advances in Cryptology - CRYPTO 2011, volume 6841 of LNCS, pages 743–760, 2011.
• [BM17] Leif Both and Alexander May. Optimizing BJMM with Nearest Neighbors: Full Decoding in $2^{2/21 n}$ and McEliece Security. In WCC Workshop on Coding and Cryptography, September 2017. To appear, see https://www.google.fr/?gfe_rd=cr&ei=lEyVWcPPBuXU8gfAj5ygBg.
• [BS08] Bhaskar Biswas and Nicolas Sendrier. McEliece cryptosystem implementation: theory and practice. In Johannes Buchmann and Jintai Ding, editors, Post-Quantum Cryptography, Second International Workshop, PQCrypto 2008, Cincinnati, OH, USA, October 17-19, 2008, Proceedings, volume 5299 of LNCS, pages 47–62. Springer, 2008.
• [Cha17] Julia Chaulet. Étude de cryptosystèmes à clé publique basés sur les codes MDPC quasi-cycliques. PhD thesis, University Pierre et Marie Curie, March 2017.
• [CT91] Thomas M. Cover and Joy A. Thomas. Information Theory. Wiley Series in Telecommunications. Wiley, 1991.
• [DGZ17] Jean-Christophe Deneuville, Philippe Gaborit, and Gilles Zémor. Ouroboros: A simple, secure and efficient key exchange protocol based on coding theory. In Post-Quantum Cryptography - 8th International Workshop, PQCrypto 2017, Utrecht, The Netherlands, June 26-28, 2017, Proceedings, volume 10346 of LNCS, pages 18–34. Springer, 2017.
• [DT17] Thomas Debris-Alazard and Jean-Pierre Tillich. Statistical decoding. preprint, January 2017. arXiv:1701.07416.
• [Dum91] Ilya Dumer. On minimum distance decoding of linear codes. In Proc. 5th Joint Soviet-Swedish Int. Workshop Inform. Theory, pages 50–52, Moscow, 1991.
• [FS09] Matthieu Finiasz and Nicolas Sendrier. Security bounds for the design of code-based cryptosystems. In M. Matsui, editor, Advances in Cryptology - ASIACRYPT 2009, volume 5912 of LNCS, pages 88–105. Springer, 2009.
• [Gal63] Robert G. Gallager. Low Density Parity Check Codes. M.I.T. Press, Cambridge, Massachusetts, 1963.
• [GJS16] Qian Guo, Thomas Johansson, and Paul Stankovski. A key recovery attack on MDPC with CCA security using decoding errors. In Jung Hee Cheon and Tsuyoshi Takagi, editors, Advances in Cryptology - ASIACRYPT 2016, volume 10031 of LNCS, pages 789–815, 2016.
• [KL95] Gil Kalai and Nathan Linial. On the distance distribution of codes. IEEE Trans. Inform. Theory, 41(5):1467–1472, September 1995.
• [KT17] Ghazal Kachigar and Jean-Pierre Tillich. Quantum information set decoding algorithms. In Post-Quantum Cryptography 2017, volume 10346 of LNCS, Utrecht, The Netherlands, June 2017. Springer.
• [LB88] Pil J. Lee and Ernest F. Brickell. An observation on the security of McEliece’s public-key cryptosystem. In Advances in Cryptology - EUROCRYPT’88, volume 330 of LNCS, pages 275–280. Springer, 1988.
• [Leo88] Jeffrey Leon. A probabilistic algorithm for computing minimum weights of large error-correcting codes. IEEE Trans. Inform. Theory, 34(5):1354–1359, 1988.
• [Mat93] Mitsuru Matsui. Linear cryptanalysis method for DES cipher. In Advances in Cryptology - EUROCRYPT’93, volume 765 of LNCS, pages 386–397, Lofthus, Norway, May 1993. Springer.
• [McE78] Robert J. McEliece. A Public-Key System Based on Algebraic Coding Theory, pages 114–116. Jet Propulsion Lab, 1978. DSN Progress Report 44.
• [MMT11] Alexander May, Alexander Meurer, and Enrico Thomae. Decoding random linear codes in $\tilde{\mathcal{O}}(2^{0.054n})$. In Dong Hoon Lee and Xiaoyun Wang, editors, Advances in Cryptology - ASIACRYPT 2011, volume 7073 of LNCS, pages 107–124. Springer, 2011.
• [MO15] Alexander May and Ilya Ozerov. On computing nearest neighbors with applications to decoding of binary linear codes. In E. Oswald and M. Fischlin, editors, Advances in Cryptology - EUROCRYPT 2015, volume 9056 of LNCS, pages 203–228. Springer, 2015.
• [MRAS00] Chris Monico, Joachim Rosenthal, and Amin A. Shokrollahi. Using low density parity check codes in the McEliece cryptosystem. In Proc. IEEE Int. Symposium Inf. Theory - ISIT, page 215, Sorrento, Italy, 2000.
• [MS86] Florence J. MacWilliams and Neil J. A. Sloane. The Theory of Error-Correcting Codes. North–Holland, Amsterdam, fifth edition, 1986.
• [MTSB13] Rafael Misoczki, Jean-Pierre Tillich, Nicolas Sendrier, and Paulo S. L. M. Barreto. MDPC-McEliece: New McEliece variants from moderate density parity-check codes. In Proc. IEEE Int. Symposium Inf. Theory - ISIT, pages 2069–2073, 2013.
• [Pra62] Eugene Prange. The use of information sets in decoding cyclic codes. IRE Transactions on Information Theory, 8(5):5–9, 1962.
• [Sho94] Peter W. Shor. Algorithms for quantum computation: Discrete logarithms and factoring. In S. Goldwasser, editor, FOCS, pages 124–134, 1994.
• [SS96] Michael Sipser and Daniel A. Spielman. Expander codes. IEEE Trans. Inform. Theory, 42(6):1710–1722, 1996.
• [Ste88] Jacques Stern. A method for finding codewords of small weight. In G. D. Cohen and J. Wolfmann, editors, Coding Theory and Applications, volume 388 of LNCS, pages 106–113. Springer, 1988.

## Appendix A The Kullback-Leibler divergence

The proofs in this appendix use the Kullback-Leibler divergence (see for instance [CT91]) and some of its properties, which we now recall.

###### Definition 3.

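(Kullback-Leibler divergence)
Consider two discrete probability distributions $p$ and $q$ defined over the same discrete space $\mathcal{X}$. The Kullback-Leibler divergence between $p$ and $q$ is defined by

$$D(p\|q) = \sum_{x \in \mathcal{X}} p(x) \ln\frac{p(x)}{q(x)}.$$

We overload this notation by defining, for two Bernoulli distributions $\mathcal{B}(p)$ and $\mathcal{B}(q)$ of respective parameters $p$ and $q$,

$$D(p\|q) \stackrel{\text{def}}{=} D(\mathcal{B}(p)\|\mathcal{B}(q)) = p\ln\left(\frac{p}{q}\right) + (1-p)\ln\left(\frac{1-p}{1-q}\right).$$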

We use the convention (based on continuity arguments) that $0\ln\frac{0}{p} = 0$ and $p\ln\frac{p}{0} = \infty$.

We will need the following results on the Kullback-Leibler divergence.

###### Lemma 4.

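For any $\delta \in \left(-\frac{1}{2}, \frac{1}{2}\right)$ we have

$$D\left(\tfrac{1}{2}\,\middle\|\,\tfrac{1}{2}+\delta\right) = -\tfrac{1}{2}\ln(1-4\delta^2). \tag{7}$$

For constant $\alpha$ and $\delta$ going to $0$ while staying positive, we have

$$D(\alpha\|\delta) = -h(\alpha) - \alpha\ln\delta + O(\delta). \tag{8}$$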

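For $0 < y < x$ and $x$ going to $0$ we have

$$D(x\|y) = x\ln\frac{x}{y} - x + y + O(x^2). \tag{9}$$

###### Proof.

Let us first prove (7):

$$\begin{aligned}
D\left(\tfrac{1}{2}\,\middle\|\,\tfrac{1}{2}+\delta\right) &= \frac{1}{2}\ln\frac{1/2}{1/2+\delta} + \frac{1}{2}\ln\frac{1/2}{1/2-\delta} \\
&= -\frac{1}{2}\ln(1+2\delta) - \frac{1}{2}\ln(1-2\delta) \\
&= -\frac{1}{2}\ln(1-4\delta^2).
\end{aligned}$$

To prove (8) we observe that, with $h(\alpha) = -\alpha\ln\alpha - (1-\alpha)\ln(1-\alpha)$,

$$\begin{aligned}
D(\alpha\|\delta) &= \alpha\ln\left(\frac{\alpha}{\delta}\right) + (1-\alpha)\ln\left(\frac{1-\alpha}{1-\delta}\right) \\
&= -h(\alpha) - \alpha\ln\delta - (1-\alpha)\ln(1-\delta) \\
&= -h(\alpha) - \alpha\ln\delta + O(\delta).
\end{aligned}$$

For the last estimate we use $\ln(1-x) - \ln(1-y) = -x + y + O(x^2)$, which holds because $0 < y < x$:

$$\begin{aligned}
D(x\|y) &= x\ln\frac{x}{y} + (1-x)\ln\frac{1-x}{1-y} \\
&= x\ln\frac{x}{y} + (1-x)\left(-x + y + O(x^2)\right) \\
&= x\ln\frac{x}{y} - x + y + O(x^2).
\end{aligned}$$

∎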
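The first two estimates of Lemma 4 are easy to check numerically. The following is a minimal sketch, not part of the original proof: the helper names `kl_bernoulli` and `entropy_nats` and the sample values of $\delta$ and $\alpha$ are illustrative assumptions. It evaluates the Bernoulli divergence in nats, with the convention $0\ln 0 = 0$, and compares it with the right-hand sides of (7) and (8).

```python
import math

def kl_bernoulli(p: float, q: float) -> float:
    """D(p||q) in nats between Bernoulli(p) and Bernoulli(q)."""
    def term(a: float, b: float) -> float:
        if a == 0.0:
            return 0.0          # convention 0 * ln(0/b) = 0
        if b == 0.0:
            return math.inf     # convention a * ln(a/0) = +infinity
        return a * math.log(a / b)
    return term(p, q) + term(1.0 - p, 1.0 - q)

def entropy_nats(a: float) -> float:
    """Binary entropy with natural logarithms, as used in (8)."""
    return -a * math.log(a) - (1.0 - a) * math.log(1.0 - a)

# (7) is an exact identity.
delta = 0.1  # illustrative value
gap7 = kl_bernoulli(0.5, 0.5 + delta) - (-0.5 * math.log(1.0 - 4.0 * delta**2))

# (8) holds up to an additive O(delta) term, namely -(1 - alpha) * ln(1 - delta).
alpha, small = 0.3, 1e-4  # illustrative values
gap8 = kl_bernoulli(alpha, small) - (-entropy_nats(alpha) - alpha * math.log(small))

print(f"(7) gap: {gap7:.2e}")
print(f"(8) gap: {gap8:.2e}  (delta = {small:.0e})")
```

The gap for (7) should vanish up to floating-point rounding, while the gap for (8) should be of the same order as the chosen $\delta$.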

The Kullback-Leibler divergence appears in the computation of large deviation exponents. In our case, we will use the following well-known estimate, which can be found for instance in [BGT11].

###### Lemma 5.

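Let $p$ be a real number in $(0,1)$ and $X_1, \dots, X_n$ be $n$ independent Bernoulli random variables of parameter $p$. Then, as $n$ tends to infinity:

$$\mathbb{P}(X_1 + \dots + X_n \geq \tau n) = \frac{(1-p)\sqrt{\tau}}{(\tau-p)\sqrt{2\pi n(1-\tau)}}\, e^{-nD(\tau\|p)}(1+o(1)) \quad \text{for } p < \tau < 1, \tag{10}$$

$$\mathbb{P}(X_1 + \dots + X_n \leq \tau n) = \frac{p\sqrt{1-\tau}}{(p-\tau)\sqrt{2\pi n\tau}}\, e^{-nD(\tau\|p)}(1+o(1)) \quad \text{for } 0 < \tau < p. \tag{11}$$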
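To get a feeling for how sharp the estimates of Lemma 5 are, one can compare them with exact binomial tails. The sketch below is only an illustration under assumed parameters ($n = 2000$, $p = 3/10$, and $\tau \in \{0.4, 0.2\}$ chosen so that $\tau n$ is an integer); the function names are hypothetical. The exact tails are computed with integer arithmetic to avoid underflow.

```python
import math

def kl(t: float, p: float) -> float:
    """Bernoulli Kullback-Leibler divergence D(t||p) in nats, for 0 < t, p < 1."""
    return t * math.log(t / p) + (1 - t) * math.log((1 - t) / (1 - p))

def exact_tail(n: int, lo: int, hi: int, a: int, b: int) -> float:
    """P(lo <= X_1 + ... + X_n <= hi) for parameter p = a/b, via exact integers."""
    num = sum(math.comb(n, j) * a**j * (b - a)**(n - j) for j in range(lo, hi + 1))
    return num / b**n  # correctly rounded big-integer division

def upper_estimate(n: int, tau: float, p: float) -> float:
    """Main term of the upper-tail estimate, valid for p < tau < 1."""
    pref = (1 - p) * math.sqrt(tau) / ((tau - p) * math.sqrt(2 * math.pi * n * (1 - tau)))
    return pref * math.exp(-n * kl(tau, p))

def lower_estimate(n: int, tau: float, p: float) -> float:
    """Main term of the lower-tail estimate, valid for 0 < tau < p."""
    pref = p * math.sqrt(1 - tau) / ((p - tau) * math.sqrt(2 * math.pi * n * tau))
    return pref * math.exp(-n * kl(tau, p))

# Illustrative parameters (assumptions, not taken from the paper).
n, a, b = 2000, 3, 10
p = a / b
ratio_upper = exact_tail(n, 800, n, a, b) / upper_estimate(n, 0.4, p)
ratio_lower = exact_tail(n, 0, 400, a, b) / lower_estimate(n, 0.2, p)

print(f"exact / estimate, upper tail: {ratio_upper:.4f}")
print(f"exact / estimate, lower tail: {ratio_lower:.4f}")
```

Both ratios should be close to 1, in line with the $(1+o(1))$ factor in the lemma.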

## Appendix B Proof of Lemma 2

We first recall this lemma; see Lemma 2.

Before giving the proof of this lemma, observe that $p_b$ can be viewed as the probability that a parity-check equation involving a given bit gives incorrect information about that bit. This is obtained through the following lemma.

###### Lemma 6.

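Consider a word $h$ of weight $w$ and an error $e$ of weight $t$ chosen uniformly at random. Assume that both $w$ and $t$ are of order $\sqrt{n}$: $w = \Theta(\sqrt{n})$ and $t = \Theta(\sqrt{n})$. We have

$$\mathbb{P}_e(\langle h, e\rangle = 1) = \frac{1}{2} - \frac{1}{2}e^{-\frac{2wt}{n}}\left(1 + O\left(\frac{1}{\sqrt{n}}\right)\right).$$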
###### Remark 1.

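Note that, in this case, this probability is of the same order as the probability taken over errors whose coordinates are drawn independently from a Bernoulli distribution of parameter $t/n$. In that case, the piling-up lemma [Mat93] gives

$$\begin{aligned}
\mathbb{P}_e(\langle h, e\rangle = 1) &= \frac{1 - \left(1 - \frac{2t}{n}\right)^w}{2} \\
&= \frac{1}{2} - \frac{1}{2}e^{w\ln(1 - 2t/n)} \\
&= \frac{1}{2} - \frac{1}{2}e^{-\frac{2wt}{n}}\left(1 + O\left(\frac{1}{\sqrt{n}}\right)\right).
\end{aligned}$$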
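The comparison made in this remark can be illustrated numerically. The following sketch assumes illustrative parameters $n = 10000$, $w = 90$, $t = 84$ (chosen of order $\sqrt{n}$, not taken from the paper) and hypothetical function names. It computes the exact probability for an error of fixed weight $t$, which is the probability that the supports of $h$ and $e$ meet in an odd number of positions, and compares it with the piling-up value and with the asymptotic expression $\frac{1}{2} - \frac{1}{2}e^{-2wt/n}$.

```python
import math
from fractions import Fraction

def prob_odd_overlap(n: int, w: int, t: int) -> Fraction:
    """Exact P(<h,e> = 1) for a fixed h of weight w and a uniform e of weight t.

    <h,e> = 1 exactly when the supports of h and e share an odd number j of positions.
    """
    odd = sum(math.comb(t, j) * math.comb(n - t, w - j)
              for j in range(1, min(w, t) + 1, 2))
    return Fraction(odd, math.comb(n, w))

def piling_up(n: int, w: int, t: int) -> float:
    """Same probability when the coordinates of e are i.i.d. Bernoulli(t/n)."""
    return (1 - (1 - 2 * t / n) ** w) / 2

def asymptotic(n: int, w: int, t: int) -> float:
    """Main term 1/2 - 1/2 * exp(-2wt/n)."""
    return 0.5 - 0.5 * math.exp(-2 * w * t / n)

# Illustrative parameters (assumptions): w and t of order sqrt(n).
n, w, t = 10_000, 90, 84
exact = float(prob_odd_overlap(n, w, t))

print(f"exact (fixed weight): {exact:.5f}")
print(f"piling-up (i.i.d.):   {piling_up(n, w, t):.5f}")
print(f"asymptotic main term: {asymptotic(n, w, t):.5f}")
```

The three values should agree up to a relative error of order $1/\sqrt{n}$ on the exponential term.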

The proof of this lemma will be done in the following subsection. Lemma 2 is a corollary of this lemma since we have

$$p_b = \mathbb{P}(\langle h, e\rangle = 1 \mid e_1 = b). \tag{12}$$

### B.1 Proof of Lemma 6

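The proof involves properties of the Krawtchouk polynomials. We recall that the (binary) Krawtchouk polynomial of degree $i$ and order $n$ (which is an integer), $P_i^n(X)$, is defined for $i \in \{0, \dots, n\}$ by:

$$P_i^n(X) \stackrel{\text{def}}{=} \frac{(-1)^i}{2^i}\sum_{j=0}^{i}(-1)^j\binom{X}{j}\binom{n-X}{i-j} \quad \text{where } \binom{X}{j} \stackrel{\text{def}}{=} \frac{1}{j!}X(X-1)\cdots(X-j+1). \tag{13}$$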

It follows immediately from the definition of a Krawtchouk polynomial that

$$P_k^n(0) = \frac{(-1)^k\binom{n}{k}}{2^k}. \tag{14}$$

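Let us define the bias $\delta$ by

$$\delta \stackrel{\text{def}}{=} 1 - 2\,\mathbb{P}_e(\langle h, e\rangle = 1).$$

In other words $\mathbb{P}_e(\langle h, e\rangle = 1) = \frac{1}{2}(1-\delta)$. These Krawtchouk polynomials are readily related to $\delta$. We first observe that

$$\mathbb{P}_e(\langle h, e\rangle = 1) = \frac{\sum_{j=1,\, j \text{ odd}}^{w}\binom{t}{j}\binom{n-t}{w-j}}{\binom{n}{w}}.$$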

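Moreover, by observing that $\sum_{j=0}^{w}\binom{t}{j}\binom{n-t}{w-j} = \binom{n}{w}$, we can recast the following evaluation of a Krawtchouk polynomial as

$$\begin{aligned}
\frac{(-2)^w}{\binom{n}{w}}P_w^n(t) &= \frac{\sum_{j=0}^{w}(-1)^j\binom{t}{j}\binom{n-t}{w-j}}{\binom{n}{w}} \\
&= \frac{\sum_{j=0,\, j \text{ even}}^{w}\binom{t}{j}\binom{n-t}{w-j} - \sum_{j=1,\, j \text{ odd}}^{w}\binom{t}{j}\binom{n-t}{w-j}}{\binom{n}{w}} \\
&= \frac{\binom{n}{w} - 2\sum_{j=1,\, j \text{ odd}}^{w}\binom{t}{j}\binom{n-t}{w-j}}{\binom{n}{w}} \\
&= 1 - 2\,\mathbb{P}_e(\langle h, e\rangle = 1) \\
&= \delta.
\end{aligned} \tag{15}$$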
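Identities (14) and (15) are exact, so they can be checked with rational arithmetic on small instances. The sketch below is an illustration only: the function names and the parameters $n = 50$, $w = 7$, $t = 5$ are assumptions, and `krawtchouk` follows the normalisation of definition (13) for integer arguments.

```python
from fractions import Fraction
from math import comb

def krawtchouk(n: int, i: int, x: int) -> Fraction:
    """P_i^n(x) for an integer 0 <= x <= n, with the normalisation of (13)."""
    s = sum((-1) ** j * comb(x, j) * comb(n - x, i - j) for j in range(i + 1))
    return Fraction((-1) ** i * s, 2 ** i)

def prob_odd_overlap(n: int, w: int, t: int) -> Fraction:
    """P(<h,e> = 1): the supports of h (weight w) and e (weight t) meet oddly."""
    odd = sum(comb(t, j) * comb(n - t, w - j) for j in range(1, min(w, t) + 1, 2))
    return Fraction(odd, comb(n, w))

# Small illustrative parameters (assumptions).
n, w, t = 50, 7, 5

# (14): value of the Krawtchouk polynomial at 0.
value_at_zero = krawtchouk(n, w, 0)
expected_at_zero = Fraction((-1) ** w * comb(n, w), 2 ** w)

# (15): the bias expressed through a Krawtchouk polynomial.
bias_from_krawtchouk = Fraction((-2) ** w, comb(n, w)) * krawtchouk(n, w, t)
bias_from_definition = 1 - 2 * prob_odd_overlap(n, w, t)

print(value_at_zero == expected_at_zero)
print(bias_from_krawtchouk == bias_from_definition)
```

Because `Fraction` is exact, the two comparisons test the identities themselves rather than a floating-point approximation of them.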

To simplify notation we will drop the superscript $n$ in the Krawtchouk polynomial notation. It will be chosen as the length of the MDPC code when we use it in our case. An important lemma that we will need is the following one.

###### Lemma 7.

For all $x$ in $\{1, \dots, t\}$, we have

$$\frac{P_w(x)}{P_w(x-1)} = \left(1 + O\left(\frac{1}{n}\right)\right)\frac{n - 2w + \sqrt{(n-2w)^2 - 4w(n-w)}}{2(n-w)}.$$
###### Proof.

This follows essentially from arguments taken in the proof of [MS86][Lemma 36, §7, Ch. 17]. The result we use appears however more explicitly in [KL95][Sec. IV] where it is proved that if is in an interval of the form for some constant independent of , and , then

$$\frac{P_w(x+1)}{P_w(x)} = \left(1 + O\left(\frac{1}{n}\right)\right)\frac{n - 2w + \sqrt{(n-2w)^2 - 4w(n-w)}}{2(n-w)}.$$

For our choice of this condition is met for and the lemma follows immediately. ∎
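The ratio estimate of Lemma 7 can also be observed on a concrete instance. The following sketch assumes illustrative parameters $n = 10000$, $w = 90$, $t = 84$ (of order $\sqrt{n}$, not taken from the paper) and hypothetical function names. It computes the ratios $P_w(x)/P_w(x-1)$ exactly for $x = 1, \dots, t$ and measures their largest relative deviation from the closed-form expression of the lemma.

```python
import math
from fractions import Fraction

def krawtchouk(n: int, i: int, x: int) -> Fraction:
    """P_i^n(x) for an integer 0 <= x <= n, with the normalisation of (13)."""
    s = sum((-1) ** j * math.comb(x, j) * math.comb(n - x, i - j)
            for j in range(i + 1))
    return Fraction((-1) ** i * s, 2 ** i)

def lemma7_closed_form(n: int, w: int) -> float:
    """Closed-form main term of the ratio in Lemma 7 (independent of x)."""
    disc = (n - 2 * w) ** 2 - 4 * w * (n - w)
    return (n - 2 * w + math.sqrt(disc)) / (2 * (n - w))

# Illustrative parameters (assumptions): w and t of order sqrt(n).
n, w, t = 10_000, 90, 84
values = [krawtchouk(n, w, x) for x in range(t + 1)]
closed = lemma7_closed_form(n, w)

# Largest relative deviation of the exact ratios from the closed form.
max_rel_dev = max(abs(float(values[x] / values[x - 1]) / closed - 1)
                  for x in range(1, t + 1))

print(f"closed form:            {closed:.6f}")
print(f"max relative deviation: {max_rel_dev:.2e}  (1/n = {1 / n:.0e})")
```

For parameters in this range the deviation should be a small multiple of $1/n$, consistent with the $1 + O(1/n)$ factor.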

We are ready now to prove Lemma 6.

###### Proof of Lemma 6.

We start the proof by using (15) which says that

$$\delta = \frac{(-2)^w}{\binom{n}{w}}P_w^n(t).$$

We then observe that

 (−2)w(nw)Pnw(t) = (−2)w(nw)Pnw(t)Pnw(t−1)Pnw(t−1)Pnw(t−2)…Pnw(1)Pnw(0)Pnw(0) = (−2)w(nw)((1+O(1n))n−2w+√(n−2w)2−4w(n−w)2(