1 Introduction
To compare the hardness of mathematical problems, complexity theory introduces the classes P, NP, NP-hard and NP-complete. A problem belongs to P if it can be solved by a deterministic Turing machine in polynomial time, whereas a problem belongs to NP if it can be solved by a nondeterministic Turing machine in polynomial time or, equivalently, if one can check in polynomial time whether a candidate solution is indeed a solution. Thus, clearly, P lies inside NP. A problem is said to be NP-hard if any problem in NP can be reduced to it in polynomial time; thus, in some sense, NP-hard problems mark the hardest problems in NP. To show that a new problem is NP-hard it suffices to find a polynomial time reduction from a known NP-hard problem to the new problem. In addition, a problem is said to be NP-complete if it is NP-hard and lies in NP.
NP-complete problems play a fundamental role in cryptography, as systems based on them are promising candidates for post-quantum cryptography. In particular, NP-complete problems in coding theory are the basis of code-based cryptography. Historically, code-based cryptography was initiated by the seminal works of McEliece in 1978 [2] and Niederreiter in 1986 [3]. This area is currently regarded as one of the most consolidated and assessed ones in public-key cryptography [4]. Code-based schemes are usually built upon the SDP, which is equivalent to the problem of decoding a random linear code. In [5] and [6], the SDP has been proven to be NP-complete for codes defined over some finite field and endowed with the Hamming metric. An adversary could still apply the best non-structural algorithm to attack the cryptosystem, which in the case of the SDP is an information set decoding (ISD) algorithm. These algorithms are hence important to determine the size of the public key needed to achieve a given security level. The first ISD algorithm was proposed by Prange in 1962 [7]. Besides these classical results, there has recently been a growing interest in changing the underlying metric or the underlying algebraic structure (like finite rings). This is the case of the rank version of the SDP, which, analogously to the Hamming metric case, has been proven to be NP-complete [8]. Code-based cryptosystems using the rank metric provide surprisingly low key sizes (see for example [9]). This change from the classical Hamming metric to other metrics seems promising; hence we want to study the impact of the Lee metric on code-based cryptography. Some cryptosystems have already been proposed over finite rings (see [10, 11, 12, 13]); in particular, Horlemann-Trautmann and Weger in [13] have considered the use of codes defined over , endowed with the Lee metric.
In this paper we prove the NP-completeness of the SDP for codes over finite rings equipped with the Lee metric by showing that the shortest path decision problem, which has been proven to be NP-complete in [14], can be reduced (in polynomial time) to our problem.
Moreover, we extend the work in [13] and propose original algorithms that are inspired by Stern’s [15], Lee-Brickell’s [16] and Prange’s [7] ISD algorithms and that solve the Lee metric variant of the SDP for any Galois ring. A detailed complexity analysis of the proposed algorithms is carried out and a comparison with the Hamming case is provided.
The paper is organized as follows. In Section 2 we introduce the notation used throughout the paper, give some preliminary notions on the Lee metric and formulate some of its general properties. In Section 3 we prove the NP-completeness of the Lee metric version of the SDP. In Section 4 we extend several information set decoding algorithms to , considering the Lee metric, and carry out a complexity analysis of these algorithms. We provide a comparison of the ISD algorithms in the Lee metric and in the Hamming metric in Section 5. In Section 6 we draw some concluding remarks and formulate some open problems.
2 Notation and preliminaries
Let be a prime power and be a positive integer. We denote by the ring of integers modulo , and by the finite field with elements, as usual. Given an integer , we denote its absolute value by . We use capital letters to denote sets of integers; for an ordered set , we refer to its th element as . The cardinality of a set is denoted by . We use bold lower case (respectively upper case) letters to denote vectors (respectively matrices). The identity matrix of size is denoted by . Given a vector and a set , we denote by the vector consisting of the entries of indexed by . In the same way, for a matrix , denotes the matrix obtained by taking the columns of that are indexed by . This, of course, can be easily generalized to . The support of a vector is defined as . For , we denote by the vectors in having support in .

2.1 Coding Theoretic Preliminaries
In this subsection we recall the definitions and main properties of linear codes over finite fields endowed with the Hamming metric, as well as linear codes over finite rings endowed with the Lee metric.
Definition 1
An linear code over is a linear subspace of of dimension .
The size of the code, denoted as , is the number of its codewords. Notice that, for an linear code over , we have . A generator matrix of is a matrix whose row space is . Moreover, is the null space of an parity-check matrix, where . In classical coding theory one considers codes endowed with the Hamming metric, formally defined as follows.
Definition 2
The Hamming weight of is equal to the size of its support, i.e.,
The Hamming distance of , is defined as the Hamming weight of their difference, i.e.,
Definition 3
Let be an linear code, then we call its minimum distance the minimum Hamming weight of a nonzero codeword, i.e.,
We will sometimes refer to as an code. For a linear code over and we denote by
We will use the following definition of information set, which fits perfectly in the context of ring-linear codes.
Definition 4
For a code over of length and dimension , we call a set of size an information set if .
These definitions can be extended to finite rings.
Definition 5
Let and be positive integers and let be a finite ring. is called an linear code of length and type if is a submodule of , with .
We will restrict to the most common case of Galois rings , for some prime and a positive integer .
Definition 6
We say that is a ring-linear code of length if is an additive subgroup of .
can be endowed with several metrics, e.g., the Hamming metric, the Lee metric, the homogeneous metric, the Euclidean metric and so on; for an overview see [17].
Definition 7
For we define the Lee value to be
Then, for , we define the Lee weight to be the sum of the Lee values of its coordinates:
As for the Hamming case, we then get a distance.
Definition 8
For , the Lee distance is defined as
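To make the definitions above concrete, here is a minimal Python sketch (all helper names are our own) of the Lee value, Lee weight and Lee distance over the integers modulo a modulus `m`:

```python
def lee_value(x: int, m: int) -> int:
    """Lee value of x in Z_m: min of x mod m and its additive inverse."""
    x %= m
    return min(x, m - x)

def lee_weight(v, m: int) -> int:
    """Lee weight of a vector over Z_m: sum of the Lee values of its entries."""
    return sum(lee_value(x, m) for x in v)

def lee_distance(u, v, m: int) -> int:
    """Lee distance of two vectors: Lee weight of their difference."""
    return lee_weight(((a - b) % m for a, b in zip(u, v)), m)
```

For instance, over the integers modulo 7 the elements 1 and 6 both have Lee value 1, and the Lee value never exceeds half the modulus.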
Definition 9
We say that is a Lee metric code of length if is an additive subgroup of of type endowed with the Lee metric.
We can define the minimum distance and the concept of information set for Lee metric codes.
Definition 10
Let be a Lee metric code over of length ; then, we call its minimum Lee distance the minimum Lee weight of a nonzero codeword:
Definition 11
For a Lee metric code over of length and type
we call a set of size a (ring-linear) information set if .
This definition makes more sense when we look at the generator matrix and the parity-check matrix of ring-linear codes.
Definition 12
Let be a linear code over of length and type . Then is permutation equivalent to a code having the following generator matrix of size , where .
Similarly, is permutation equivalent to a code that has the following parity-check matrix of size
(1) 
2.2 Properties of the Lee metric
In this subsection we derive some general properties of the Lee metric that will be useful in the rest of the paper. In the following lemma, resulting from a Plotkin-type bound in the Lee metric (see [18, Problem 10.15]), we compute the average Lee weight of an element of .
Lemma 1
Let chosen randomly; then the expected Lee weight of is given by
Proof.
If is even, then summing up all weights gives
If is odd, then we get
To get the average, we divide both cases by and obtain the desired formula. ∎
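The two cases of the proof yield the closed forms m/4 for even m and (m² − 1)/(4m) for odd m; the snippet below (our own code) checks this exhaustively for small moduli:

```python
from fractions import Fraction

def lee_value(x: int, m: int) -> int:
    return min(x % m, (-x) % m)

def avg_lee_weight(m: int) -> Fraction:
    """Exact average Lee weight of a uniformly random element of Z_m."""
    return Fraction(sum(lee_value(x, m) for x in range(m)), m)

# Compare the brute-force average with the closed forms from the proof.
for m in range(2, 50):
    expected = Fraction(m, 4) if m % 2 == 0 else Fraction(m * m - 1, 4 * m)
    assert avg_lee_weight(m) == expected
```

Exact rational arithmetic (via `Fraction`) avoids any floating-point comparison issues in the check.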
Next, we want to count the vectors in having Lee weight , i.e.,
We will consider two cases: either is even, or is odd. Indeed, in the former case there exists only one element in having Lee value , whereas in the latter case there exist two such elements. We will first count the vectors in having Lee weight and a fixed size of support . For this, we introduce
Proposition 1
Let , let and , such that . Then

if is even:

if is odd:
Proof.
A vector having a support of size has Lee weight at least and at most , which implies that there are no vectors such that .
In the case where is even, there exists only one element in having Lee value , thus if , we can only choose this element in the nonzero positions, which can be done in different ways.
Now we check whether or . In the first case the vector cannot have an entry of Lee value , thus we can choose the nonzero positions, compose the wanted Lee weight into parts and, for each choice of a part , there exists also the choice , hence many. In the other case, firstly, an entry of the vector could have Lee value , so we can no longer simply multiply by and, secondly, the compositions of into parts also contain parts greater than , which, however, is the largest possible Lee value. For this reason, we have to define recursively. We start with all possible orderings of the desired Lee weight into parts and then subtract the orderings that cannot occur, starting from those with a part equal to and proceeding until the largest part is . Thus, we have to subtract , repeating this times: the factor 2 is justified by the fact that there are always two choices for an element having Lee value , and the factor accounts for the position of the entry having Lee value . The case has to be subtracted only once since, when is even, there is only one element having Lee value .
The case in which is odd is simpler, since an element having Lee value does not need to be treated as a special case. ∎
Finally, to get the number of vectors in having Lee weight , we only have to sum all from to .
Corollary 1
Let let and let . Then
(2) 
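The count in (2) can be sanity-checked by exhaustive enumeration for small parameters; the helper below is our own and simply tallies all vectors by Lee weight:

```python
from itertools import product

def lee_value(x: int, m: int) -> int:
    return min(x % m, (-x) % m)

def count_by_lee_weight(n: int, m: int) -> dict:
    """Number of vectors in Z_m^n of each Lee weight, by brute force."""
    counts = {}
    for v in product(range(m), repeat=n):
        w = sum(lee_value(x, m) for x in v)
        counts[w] = counts.get(w, 0) + 1
    return counts

counts = count_by_lee_weight(2, 5)
# Over Z_5, each of 1,2 Lee values is attained by two ring elements,
# so e.g. weight 1 is achieved by 4 of the 25 vectors of length 2.
```

Summing the counts over all weights must of course recover the total number of vectors, m^n.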
An upper bound, also observed in [18, Proposition 10.10], and a lower bound on (2) can easily be derived as reported next.
Corollary 2
Let and . Then, is at most
(3) 
and at least
(4) 
Proof.
The proof of the upper bound is given in [18, Proposition 10.10]. For the lower bound, if , we count the vectors in with entries in . If , we count the vectors in . ∎
Simple computations show that the addends of the sum in (3) are monotonically increasing if and only if, for ,
(5) 
Under these assumptions, the following relation holds
3 An NP-complete coding-theory problem for the Lee metric
In this section we prove the NP-completeness of the Decisional Lee Syndrome Decoding Problem (DLSDP) and the Computational Lee Syndrome Decoding Problem (CLSDP), which are formalized in the following.
Problem 1
Decisional Lee Syndrome Decoding Problem (DLSDP)
Let and be positive integers. Given , and , does there exist a vector such that and ?
Problem 2
Computational Lee Syndrome Decoding Problem (CLSDP)
Let and be positive integers. Given , and , find a vector , such that and .
Notice that we consider finite rings whose size is not necessarily a prime power, hence in order to avoid confusion with the variable , where is a prime number and a positive integer, we use a to denote the size of the considered ring.
Clearly, checking whether a vector is in fact a solution of the CLSDP can be done in polynomial time. Hence, for NP-completeness, it is enough to show that the CLSDP is NP-hard.
Proving that there does not exist a polynomial time algorithm solving the LSDP for all choices of is straightforward (assuming P ≠ NP), since for and the Lee metric on , respectively on , coincides with the Hamming metric, for which the corresponding decoding problem is proven to be NP-complete. The more interesting question is whether there exists a polynomial time algorithm that solves the LSDP for an arbitrary but fixed .
3.1 The shortest path problem in circulant graphs and its connection with the Lee metric
In this section we introduce the Shortest Path Problem (SPP), proven NP-complete for the class of circulant graphs [14, Theorem 5], upon which we mainly rely for our reduction to the DLSDP.
Let be positive integers. Let be of size and be a graph with nodes and edges , such that
Observe that the considered graph is circulant, i.e., its adjacency matrix is circulant. A path from to of length is a vector , such that and for all . In our case a path from to is associated to a vector , such that there are steps of the form i) if , or ii) if . In other words, we can write
(6) 
Then, the length of the path corresponds to the norm of the associated vector , that is . In particular, (6) depends only on the difference , rather than on the particular values and . Then, for , we define the set of all possible paths connecting two nodes having label difference , that is
(7) 
We may then be interested in finding the shortest length of such paths, that is
(8) 
The (decisional) shortest path problem on a circulant graph is then formalized as follows.
Problem 3
Circulant Shortest Path Problem (CSPP)
Given the positive integers , a set , and a bound , is ?
The above problem is NP-complete [14, Theorem 5]. We remark that the hardness of the problem comes from the circulant structure of the considered graph. Indeed, the shortest path problem for undirected unweighted graphs is not NP-complete in general, i.e., when the graph is not necessarily circulant; an efficient solver is known, running with time complexity that grows with the graph size, that is, . A circulant graph, instead, is unambiguously described by the set , which can be represented with bits. A graph representation that grows as the logarithm of the number of nodes (i.e., that allows a logarithmic reduction in the graph representation) is what differentiates the variant of the problem on circulant graphs from its general formulation on standard graphs.
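To make the correspondence between path lengths and ℓ1-norms concrete, the sketch below (our own code, small parameters) computes distances from node 0 in a circulant graph by breadth-first search and checks them against the minimum ℓ1-norm of an integer vector x with Σ aᵢxᵢ ≡ δ (mod n), in the spirit of (6)–(8):

```python
from collections import deque
from itertools import product

def circulant_distances(n, S):
    """BFS distances from node 0 in the circulant graph on Z_n with connection set S."""
    dist = {0: 0}
    queue = deque([0])
    while queue:
        u = queue.popleft()
        for a in S:
            for v in ((u + a) % n, (u - a) % n):
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
    return dist

def min_l1_solution(n, S, delta):
    """Minimum l1-norm of x with sum(a_i * x_i) = delta (mod n), by brute force."""
    best = None
    for x in product(range(-n, n + 1), repeat=len(S)):
        if sum(a * xi for a, xi in zip(S, x)) % n == delta % n:
            norm = sum(abs(xi) for xi in x)
            best = norm if best is None else min(best, norm)
    return best

n, S = 13, [1, 5]
dist = circulant_distances(n, S)
for delta in range(n):
    assert dist[delta] == min_l1_solution(n, S, delta)
```

Each BFS step moves by ±aᵢ modulo n, so a path of length L corresponds exactly to a solution vector x of ℓ1-norm L.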
In the following lemma, we provide an important analogy between the Lee metric and the norm.
Lemma 2
Let be positive integers, such that , and , then
where, with a slight abuse of notation (which will be kept throughout the paper), we consider
Proof.
Since the norm (resp. the Lee weight) of a vector is defined as the sum of the absolute value (resp. the Lee value) of its entries, it is enough to prove the claim for . Let and , then
If , then for and for all it holds that . Therefore, there exists at least one with . Since we are interested in the minimal absolute value of the elements in , it is enough to consider elements in . Notice that on the set the norm and the Lee value of an element coincide. In fact, if is such that is minimal, then
∎
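The one-coordinate statement at the heart of the proof can be verified directly: the Lee value of a residue modulo m equals the minimum absolute value over its integer representatives (a small self-check, in our own notation):

```python
def lee_value(x: int, m: int) -> int:
    return min(x % m, (-x) % m)

# The Lee value of x in Z_m equals the smallest |y| over all integers y
# congruent to x modulo m; representatives in [-m, m] suffice.
for m in range(2, 30):
    for x in range(m):
        best = min(abs(y) for y in range(-m, m + 1) if (y - x) % m == 0)
        assert lee_value(x, m) == best
```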
Problem 4
Lowest Lee Subset Sum Problem (LLSSP)
Given the positive integers , a set , and a bound , decide whether the following relation holds
Finally, we introduce a general version of Problem 4, which is described as follows. We consider the collection of sets
and a vector , and define
We then define the following problem, strongly related to LLSSP.
Problem 5
Multiple Lowest Lee Subset Sum Problem (MLLSSP)
Let and be positive integers, let be a collection of length sets over and .
Given a bound , decide whether the following relation holds
Theorem 3.1
The MLLSSP is NP-hard.
Proof.
We reduce the NP-hard problem LLSSP to MLLSSP.
Given an instance of LLSSP with input , and , we can construct an MLLSSP instance with an arbitrary value of , and such that , and . Thus, solving MLLSSP in polynomial time allows an efficient solution of LLSSP. ∎
Remark 1
Observe that does not need to consist of distinct elements, since we can clearly transform that instance in polynomial time into one with a set formed by the distinct elements of . It is easy to see that, as , we have .
3.2 NP-completeness of DLSDP and CLSDP
In this section we prove the NP-completeness of the Lee metric syndrome decoding problems DLSDP and CLSDP by using the results of the previous subsection. We first provide some additional notation.
Let and be positive integers, and , we define
(9) 
Furthermore, let
(10) 
Then, we introduce the following problem.
Problem 6
Decisional Minimum Lee Syndrome Decoding Problem (DMLSDP)
Let and be positive integers; given , and , is ?
Theorem 3.2
The DMLSDP, the DLSDP and the CLSDP are NP-hard.
Proof.
We first reduce the NP-hard problem MLLSSP to DMLSDP.
Let be a given instance of MLLSSP. Define , and as and . It is obvious that a solution of the DMLSDP on provides a solution for the initial instance of MLLSSP. Since MLLSSP is NPhard, DMLSDP is NPhard as well.
As a next step, we reduce the NP-hard problem DMLSDP to DLSDP.
Starting from a DMLSDP instance , we can consider an instance of DLSDP with the same input. A yes (resp. no) answer to the DLSDP implies a yes (resp. no) answer to the DMLSDP. Thus, the NP-hardness of DMLSDP implies the NP-hardness of DLSDP.
Clearly, if the decisional problem DLSDP is NP-hard, then the computational problem CLSDP is NP-hard as well. ∎
4 Information set decoding over : adaptation to the Lee metric
The first ISD algorithm was proposed by Prange in 1962 [7] and can be summarized as follows. As a first step, one chooses an information set and then brings the parity-check matrix into standard form through Gaussian elimination. Assuming that the errors lie outside of the information set, we perform the same row operations on the syndrome and check whether the weight of the transformed syndrome equals the given weight (usually the error correction capacity of the code). If this is the case, the transformed syndrome is indeed the error vector. Notice that, in this formulation, we only consider a particular pattern for the error vector; this restriction plays an important role in all ISD algorithms. The weight distribution of the error vector assumed in Prange’s algorithm is indeed not very likely and, even though the cost of one iteration is low, the overall cost of the algorithm, which is, in general, given by the product of the cost of one iteration and the inverse of the success probability of one iteration, is huge, due to the relatively large number of iterations needed.
Observe that ISD algorithms are not brute-force algorithms: in brute-force algorithms one fixes an information set and goes through all possible error patterns; in ISD algorithms, on the other hand, one fixes an error pattern and goes through all information sets. As a result, ISD algorithms are not deterministic. There have been many improvements upon the original algorithm by Prange, focusing on a more likely error pattern. These approaches increase the cost of one iteration but, on average, require a smaller number of iterations (see [16, 19, 15, 20, 21, 22, 23, 24, 25, 26, 27]). For a complete overview of the binary case see [28]. With new cryptographic schemes proposed over general finite fields, most of these algorithms have been generalized (see [29, 30, 31, 32, 33]).
All ISD algorithms are characterized by the same approach: one first randomly chooses a set of positions in the code and then applies some operations that, if the chosen set has a relatively small intersection with the support of the error vector, allow one to retrieve the error vector itself. For each ISD variant, the average computational cost is estimated by multiplying the complexity of each iteration by the expected number of performed iterations; the latter quantity corresponds to the reciprocal of the probability that a random choice of the set leads to a successful iteration. Then, for all ISD algorithms, we have a computational cost estimated as , where is the expected number of (binary) operations performed in each iteration and is the probability that the choice of the set of positions is indeed successful. We now derive complexity formulas for Prange’s, Stern’s and Lee-Brickell’s ISD algorithms, adapted to the Lee metric. Notice that, in Definition 12, we observed that Lee-linear codes over of length and type admit a systematic form different from the one in the Hamming metric over finite fields, and that such a code has an information set of size .
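In symbols of our own choosing (the per-iteration cost and the success probability of one iteration, which are not fixed notation in the text above), this generic estimate reads:

```latex
C_{\mathrm{ISD}} \;\approx\; \frac{C_{\mathrm{iter}}}{P_{\mathrm{succ}}}
```

The variants discussed below trade a larger numerator (more work per iteration) for a larger denominator (a more likely error pattern).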
4.1 Prange’s ISD adaptation to the Lee metric
The idea of Prange’s algorithm is to first find an information set that does not overlap with the support of the searched error vector ; once such a set is found, permuting and computing its row echelon form is enough to reveal the error vector. In the Lee analogue of this algorithm we use the same idea. Our proposed adaptation of Prange’s ISD is reported in Algorithm 1. We first find an information set , and then bring the matrix into a systematic form, by multiplying it by an invertible matrix . For the sake of clarity, we assume that the information set is , such that

where and . Since we assume that no errors occur in the information set, we have that , with . Thus, if we also partition the new syndrome into parts of the same sizes as the (row-)parts of , and we multiply by the unknown , we get the following situation

It follows that , hence we are only left to check the weight of .
Input: , , .
Output: with and .
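A simplified, self-contained sketch of this procedure for the special case of a prime modulus p (so that Gaussian elimination behaves as over a field; the general Galois-ring case requires pivoting on units) might look as follows; all function names are our own:

```python
import random

def lee_weight(v, m):
    return sum(min(x % m, (-x) % m) for x in v)

def prange_lee(H, s, t, p, max_iters=20000):
    """Prange-style ISD in the Lee metric over Z_p, p prime (sketch).

    Searches for e with H e = s (mod p) and Lee weight t, assuming the
    error support lies outside some information set."""
    r, n = len(H), len(H[0])              # r = n - k parity checks
    for _ in range(max_iters):
        perm = random.sample(range(n), n)  # random column permutation
        # Augmented, column-permuted matrix [H_P | s].
        M = [[H[i][perm[j]] for j in range(n)] + [s[i]] for i in range(r)]
        # Reduce the last r columns to the identity by Gaussian elimination.
        ok = True
        for col in range(r):
            j = n - r + col
            piv = next((i for i in range(col, r) if M[i][j] % p), None)
            if piv is None:                # singular choice: resample
                ok = False
                break
            M[col], M[piv] = M[piv], M[col]
            inv = pow(M[col][j], -1, p)
            M[col] = [x * inv % p for x in M[col]]
            for i in range(r):
                if i != col and M[i][j] % p:
                    c = M[i][j]
                    M[i] = [(x - c * y) % p for x, y in zip(M[i], M[col])]
        if not ok:
            continue
        s2 = [M[i][n] for i in range(r)]   # transformed syndrome
        if lee_weight(s2, p) == t:
            # Errors sit on the redundancy positions of the permutation.
            e = [0] * n
            for i in range(r):
                e[perm[n - r + i]] = s2[i]
            return e
    return None
```

On random instances the loop succeeds once the sampled permutation places the error support outside the information set, which is exactly the event whose probability is analysed below.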
4.2 Complexity analysis: Prange’s ISD in the Lee metric
In this section we provide a complexity estimate of our adaptation of Prange’s ISD to the Lee metric. First of all, we assume that adding two elements in costs binary operations and that multiplying two elements costs binary operations [34, 35]. An iteration of Prange’s ISD consists only in bringing into systematic form and applying the same row operations to the syndrome; thus, the cost can be assumed equal to that of computing , from which we obtain a broad estimate as
(11) 
The success probability is that of having chosen the correct weight distribution of ; in this case, we require that does not overlap with the chosen information set, hence
(12) 
The estimated overall computational cost of Prange’s ISD in the Lee metric is
(13) 
We now analytically compare the complexity of Prange’s ISD in the Lee and Hamming metric, exploiting the properties derived in Section 2. Under the assumption that , with , from Corollary 2 we derive the following chain of inequalities
(14) 
where corresponds to the success probability of an iteration of Prange’s ISD in the Hamming metric, searching for an error vector of Hamming weight , in a code with length and dimension . A crude approximation, which however is particularly tight when , shows that [36]. Then, we have
Since does not depend on the considered metric, this simple analysis shows that the complexities of Prange’s algorithm in the Lee metric and in the Hamming metric differ at most by a polynomial factor. For all known ISD variants, the complexity grows asymptotically as , where is a constant depending on the code rate [37]; different ISD variants essentially differ only in the value of . Our analysis shows that, for the Lee metric, Prange’s algorithm leads to an analogous expression. Thus, our results confirm that instances of the SDP in the Lee metric are as hard as their Hamming counterparts, up to a relatively small polynomial factor.
4.3 Stern’s ISD adaptation to the Lee metric
As a further contribution of this paper, we improve upon the basic algorithm of Prange by adapting the idea of Stern’s ISD to the Lee metric. In this algorithm, we relax the requirements on the weight distribution, by allowing a small Lee weight inside the information set and requiring the existence of a (small) set of size , called the zero-window, within the redundant set, where no errors occur. Our proposed adaptation of Stern’s algorithm to the Lee metric is reported in Algorithm 2.
For the sake of readability, in the following explanation we consider an information set and a zerowindow given by , such that , with and . The systematic form of is obtained as
where and . Using the same rowpartitions for the syndrome , we get
which implies the following three conditions
(15)  
(16)  
(17) 
We want to choose such that it has support in the information set and Lee weight , whereas should have a support disjoint from that of , and the remaining Lee weight . More precisely, we test , where and have disjoint supports of respective maximal sizes and and equal weight . In order for (15) and (17) to be satisfied, we construct two sets and , where contains the equations regarding and contains the equations regarding . For all choices of and , we check whether the entries of and coincide; if they do, we call this a collision. For each collision, we construct from (16) and check whether has the missing Lee weight : if this occurs, we have found the error vector .
All these considerations are incorporated in Algorithm 2, where we allow any choice of and .
Input: , , , such that , , and .
Output: with and .
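The collision step at the heart of Stern-like algorithms can be isolated and illustrated on a toy instance: enumerate the partial syndromes of one half of the positions, store them in a table, and look up matches coming from the other half. The sketch below (our own code, not the full Algorithm 2) applies this meet-in-the-middle idea to a small Lee-weight syndrome equation over the integers modulo m:

```python
from itertools import product

def matvec(A, x, m):
    """Matrix-vector product over Z_m, returned as a hashable tuple."""
    return tuple(sum(a * xi for a, xi in zip(row, x)) % m for row in A)

def lee_weight(v, m):
    return sum(min(x % m, (-x) % m) for x in v)

def collision_search(A, u, m, t):
    """Meet-in-the-middle: find x with A x = u (mod m) and Lee weight <= t.

    Enumerates roughly m^(k/2) candidates per half instead of m^k in total."""
    k = len(A[0])
    h = k // 2
    A1 = [row[:h] for row in A]
    A2 = [row[h:] for row in A]
    # Table of partial syndromes for the left half.
    table = {}
    for x1 in product(range(m), repeat=h):
        table.setdefault(matvec(A1, x1, m), []).append(x1)
    # For each right half, look up the complementary partial syndrome.
    for x2 in product(range(m), repeat=k - h):
        target = tuple((ui - vi) % m for ui, vi in zip(u, matvec(A2, x2, m)))
        for x1 in table.get(target, []):
            x = x1 + x2                     # tuple concatenation: a collision
            if lee_weight(x, m) <= t:
                return list(x)
    return None
```

In Stern's algorithm the two halves additionally carry fixed weight constraints, which shrinks the enumerated lists further; the table lookup principle is the same.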