 # Improved (Provable) Algorithms for the Shortest Vector Problem via Bounded Distance Decoding

The most important computational problem on lattices is the Shortest Vector Problem (SVP). In this paper we present new algorithms that improve the state of the art for provable classical and quantum algorithms for SVP. We present the following results.

- A new algorithm for SVP that provides a smooth tradeoff between time complexity and memory requirement. For any positive integer $q>1$, our algorithm takes $q^{\Theta(n)}$ time and requires $q^{\Theta(n/q)}$ memory. In fact, we give a similar time-memory tradeoff for Discrete Gaussian sampling above the smoothing parameter.
- A quantum algorithm that runs in time $2^{0.9532n+o(n)}$ and requires $2^{0.5n+o(n)}$ classical memory and $\mathrm{poly}(n)$ qubits. This improves over the previously fastest classical (which is also the fastest quantum) algorithm due to [ADRS15], which has time and space complexity $2^{n+o(n)}$.
- A classical algorithm for SVP that runs in time $2^{1.73n+o(n)}$ and space $2^{0.5n+o(n)}$, and improves over an algorithm from [CCL18] that has the same space complexity.

## 1 Introduction

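A lattice $\mathcal{L}$ is the set of all integer combinations of linearly independent vectors $\mathbf{b}_1,\dots,\mathbf{b}_n$,

$$\mathcal{L}=\mathcal{L}(\mathbf{b}_1,\dots,\mathbf{b}_n):=\Big\{\sum_{i=1}^{n} z_i\mathbf{b}_i : z_i\in\mathbb{Z}\Big\}.$$

We call $n$ the rank of the lattice.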

The most important computational problem on lattices is the Shortest Vector Problem (SVP). Given a basis for a lattice $\mathcal{L}$, SVP asks us to compute a non-zero vector in $\mathcal{L}$ with the smallest Euclidean norm.
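To make the problem statement concrete, here is a minimal brute-force sketch for a toy two-dimensional lattice. The function name `shortest_vector` and the coefficient bound `bound` are illustrative assumptions, not part of the paper; restricting coefficients to a small box is only adequate for tiny, well-conditioned examples and is not how the algorithms in this paper work.

```python
from itertools import product
from math import sqrt

def shortest_vector(basis, bound=5):
    """Brute-force SVP: try all non-zero coefficient vectors in [-bound, bound]^n.

    `basis` is a list of integer basis vectors. The bounded search box is a
    simplifying assumption for illustration; it is not guaranteed in general.
    """
    n = len(basis)
    dim = len(basis[0])
    best, best_norm_sq = None, None
    for coeffs in product(range(-bound, bound + 1), repeat=n):
        if not any(coeffs):
            continue  # SVP asks for a non-zero lattice vector
        v = tuple(sum(c * b[k] for c, b in zip(coeffs, basis)) for k in range(dim))
        norm_sq = sum(x * x for x in v)
        if best_norm_sq is None or norm_sq < best_norm_sq:
            best, best_norm_sq = v, norm_sq
    return best, sqrt(best_norm_sq)

basis = [(1, 2), (3, 1)]
vector, length = shortest_vector(basis)
```

The search visits exponentially many coefficient vectors in the rank, which is the cost that the enumeration and sieving algorithms discussed below try to control.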

Starting in the ’80s, the use of approximate and exact solvers for SVP (and other lattice problems) gained prominence for their applications in algorithmic number theory [LLL82], convex optimization [JR.83, KAN87, FT87], coding theory [DE 89], and cryptanalysis [SHA84, BRI84, LO85]. In recent years, many cryptographic primitives have based their security on the worst-case hardness of approximating SVP to within polynomial factors [AJT96, MR04] [REG09, REG06, MR08, GEN09, BV14]. Such cryptosystems have attracted a lot of research interest due to their conjectured resistance to quantum attacks.

SVP is a well-studied computational problem in both its exact and approximate (decision) versions. It is known to be NP-hard to approximate within any constant factor, and hard to approximate within a factor $n^{c/\log\log n}$ for some $c>0$ under reasonable complexity-theoretic assumptions [MIC98, KHO05, HR07]. For an approximation factor $2^{O(n)}$, one can solve SVP in time polynomial in $n$ using the celebrated LLL lattice basis reduction algorithm [LLL82]. In general, the fastest known algorithms for solving a polynomial approximation of SVP rely on (a variant of) the BKZ lattice basis reduction algorithm initiated by Schnorr [SCH87, SE94, BL05, AKS01, GN08, ALN+19], which can be seen as a generalization of the LLL algorithm and gives an $r^{n/r}$ approximation in $2^{O(r)}\mathrm{poly}(n)$ time. All these algorithms internally use an algorithm for solving (near) exact SVP in lower-dimensional lattices. Therefore, finding faster algorithms to solve SVP is critical to choosing security parameters of cryptographic primitives.

As one would expect from the hardness results above, all known algorithms for solving exact SVP, including the ones we present here, require at least exponential time and sometimes also exponential space (and the same is true even for polynomial approximation factors). There has been some recent evidence [AS18a] showing that one cannot hope to get a $2^{o(n)}$-time algorithm for SVP if one believes reasonable complexity-theoretic conjectures such as the (Gap) Exponential Time Hypothesis. Most known algorithms for SVP can be broadly classified into two classes: (i) the algorithms that require memory polynomial in $n$ but run in time $n^{O(n)}$, and (ii) the algorithms that require memory $2^{O(n)}$ and run in time $2^{O(n)}$.

The first class, initiated by Kannan [KAN87, HEL85, HS07, MW15], combines basis reduction with exhaustive enumeration inside Euclidean balls. While enumerating vectors requires $2^{O(n\log n)}$ time, it is much more space-efficient than other kinds of algorithms for exact SVP.

Another class of algorithms, and currently the fastest, is based on sieving. First developed by Ajtai, Kumar, and Sivakumar [AKS01], these algorithms generate many lattice vectors and then divide-and-sieve to create shorter and shorter vectors iteratively. A sequence of improvements [REG04, NV08, PS09, ADR+15, AS18b] has led to a $2^{n+o(n)}$ time and space algorithm by sieving the lattice vectors and carefully controlling the distribution of the output, thereby outputting a set of lattice vectors that contains the shortest vector with overwhelming probability.

An alternative approach using the Voronoi cell of the lattice was proposed by Micciancio and Voulgaris [MV13], and this gives a deterministic $\tilde{O}(2^{2n})$-time and $\tilde{O}(2^{n})$-space algorithm for SVP (and many other lattice problems).

There are variants [NV08, MV10, LMv15, BDG+16, GNR10, AN17, ANS18] of the above-mentioned sieving algorithms that, under some heuristic assumptions, have an asymptotically smaller (but still $2^{\Theta(n)}$) time and space complexity than their provable counterparts.

##### Algorithms giving a time/space tradeoff.

Even though sieving algorithms are asymptotically the fastest known algorithms for SVP, in high dimensions the memory requirement becomes a limiting factor for running these algorithms, sometimes making them uncompetitive with enumeration algorithms despite their superior asymptotic time complexity. Thus, it would be ideal, and has been a long-standing open question, to obtain an algorithm that achieves the “best of both worlds”, i.e., an algorithm that runs in time $2^{O(n)}$ and requires memory polynomial in $n$. In the absence of such an algorithm, it is desirable to have a smooth tradeoff between time and memory requirement for algorithms for SVP that interpolates between the current best sieving algorithms and the current best enumeration algorithms.

To this end, Bai, Laarhoven, and Stehlé [BLS16] proposed the tuple sieving algorithm, providing such a tradeoff based on heuristic assumptions similar in nature to prior sieving algorithms. They conjecture the time and space complexity of their algorithm to be and , respectively, where one can vary the parameter to obtain a smooth time/space tradeoff. Since the time complexity grows with , experimentally they could only verify the above conjecture for small values of . For this reason, it is difficult to have much confidence in this conjectured time and space complexity of tuple lattice sieving. It is therefore desirable to obtain a provable variant of this algorithm, even if the running time for such an algorithm is instead of .

Kirchner and Fouque [KF16] attempted to do this. They claim an algorithm for solving SVP in time $q^{\Theta(n)}$ and in space $q^{\Theta(n/q)}$ for any positive integer $q>1$. Unfortunately, their analysis falls short of supporting their claimed result, and the correctness of the algorithm is not clear. We refer the reader to Section 1.3 for more details.

In addition to the above, Chen, Chung, and Lai [CCL18] propose a variant of the algorithm based on Discrete Gaussian sampling in [ADR+15]. Their algorithm runs in time  and the memory requirement is $2^{0.5n+o(n)}$. The quantum variant of their algorithm runs in time  and has the same space complexity. Their algorithm has the best space complexity among provably correct algorithms that run in time .

A number of works have also investigated potential quantum speedups for lattice algorithms, and for SVP in particular. The landscape is similar to the classical one, although the quantum memory model matters. While quantum enumeration algorithms only require  qubits [ANS18], sieving algorithms require more powerful QRAMs [LMP15, KMP+19].

### 1.1 Our results

In our first result, we present a new algorithm for SVP that provides a smooth tradeoff between time complexity and memory requirement. Our tradeoff, given in Section 3, is the same as what was claimed by Kirchner and Fouque [KF16] and conjectured in [BLS16] (up to a constant in the exponent). This algorithm is obtained by giving a new algorithm for sampling lattice vectors from the Discrete Gaussian distribution that runs in time $q^{\Theta(n)}$.

###### Theorem 1.1 (Time-space tradeoff for smooth discrete Gaussian, informal)

There is an algorithm that takes as input a lattice , a positive integer , and a parameter above the smoothing parameter of , and outputs samples from using time and space.

Using the standard reduction from BDD with preprocessing to DGS from [DRS14] and a reduction from SVP to BDD given in [CCL18], we obtain the following.

###### Theorem 1.2 (Time-space tradeoff for SVP, informal)

There is an algorithm that takes as input a lattice , a positive integer , and outputs the shortest non-zero vector in and runs in time and requires space .

Our second result is a quantum algorithm for SVP that improves over the current fastest quantum algorithm for SVP [ADR+15] (Notice that the algorithm in [ADR+15] is still the fastest classical algorithm for SVP).

###### Theorem 1.3 (Quantum Algorithm for SVP)

There is a quantum algorithm that solves SVP in time $2^{0.9532n+o(n)}$ and classical space $2^{0.5n+o(n)}$ with an additional number of qubits polynomial in $n$.

Our third result is a classical algorithm for SVP that improves over the algorithm from [CCL18] and results in the fastest classical algorithm that has a space complexity of $2^{0.5n+o(n)}$.

###### Theorem 1.4 (Algorithm for SVP with $2^{0.5n+o(n)}$ space)

There is a classical algorithm that solves SVP in time $2^{1.73n+o(n)}$ and space $2^{0.5n+o(n)}$.

We give a high-level overview of our proofs in Section 1.2. In Section 1.3, we compare our results with the previously known algorithms that claim or conjecture a time-space tradeoff for SVP. Section 2 contains some preliminaries on lattices. The proofs of the time-space tradeoff for Discrete Gaussian sampling above the smoothing parameter and the time-space tradeoff for SVP are given in Section 3. Our classical and quantum algorithms for solving SVP with space complexity $2^{0.5n+o(n)}$ are presented in Section 4.

### 1.2 Proof overview

We now include a high-level description of our proofs. Before describing our proof ideas, we emphasize that it was shown in [ADR+15] that we can solve the problem of Bounded Distance Decoding (BDD) where the target vector is a constant factor smaller than given an algorithm for DGS. Additionally, using [CCL18], one can enumerate all lattice points within a distance to a target by queries to the BDD oracle (or queries via a quantum algorithm) with decoding distance . Thus, choosing and , an algorithm for BDD immediately gives us an algorithm for SVP. Thus, it suffices to give an algorithm for DGS above the smoothing parameter.

#### 1.2.1 Time-space tradeoff for DGS above smoothing.

We begin with the ideas towards showing Theorem 1.1. Then, combined with the reduction from BDD with preprocessing to DGS from [DRS14] and the reduction from SVP to BDD from [CCL18], we obtain Theorem 1.2.

Recall that efficient algorithms are known for sampling from the discrete Gaussian at very high parameters [GPV08, BLP+13]. Thus, as was observed in [ADR+15], it suffices to find a way to efficiently convert samples from the discrete Gaussian with a high parameter to samples with a parameter lowered by a constant factor. By repeating this conversion many times, we can obtain samples with much lower parameters. In [ADR+15], the authors begin by sampling exponentially many vectors from the Discrete Gaussian distribution with parameter and then look for pairs of vectors whose sum is in , or equivalently pairs of vectors that lie in the same coset . Since there are cosets, if we take, say, samples from , almost all of the resulting vectors (except at most vectors) will be paired. A lemma due to Micciancio and Peikert ([MP13]) shows that we get more than vectors statistically close to independent samples from the distribution, , provided that the parameter is sufficiently above the smoothing parameter.
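The pairing step described above can be sketched on a toy example. This is a minimal illustration under explicit assumptions: the lattice is taken to be $\mathbb{Z}^n$ (so a vector's coset is its coordinates reduced modulo 2), the modulus 2 is the standard choice for this averaging step and is assumed here because the extracted text omits it, and `pair_and_average` is a hypothetical helper name. The input vectors stand in for Gaussian samples; no sampling is performed.

```python
from collections import defaultdict

def pair_and_average(vectors):
    """Bucket integer vectors by their coset modulo 2, pair vectors inside
    each bucket, and output the average of each pair.

    Assumption: the lattice is Z^n, so (a + b) / 2 is again a lattice vector
    exactly when a and b agree modulo 2 in every coordinate.
    """
    buckets = defaultdict(list)
    for v in vectors:
        buckets[tuple(c % 2 for c in v)].append(v)
    outputs = []
    for members in buckets.values():
        # At most one vector per bucket is left unpaired.
        for a, b in zip(members[0::2], members[1::2]):
            outputs.append(tuple((x + y) // 2 for x, y in zip(a, b)))
    return outputs

samples = [(1, 2), (3, 4), (0, 1), (2, 3), (5, 5)]
averaged = pair_and_average(samples)
```

Since each bucket leaves at most one vector unpaired, the number of wasted samples is bounded by the number of cosets, which is why the number of input samples must exceed the number of cosets for the step to be productive.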

To reduce the space complexity, we modify the idea of the algorithm by generating random samples and checking if the summation of of those samples is in for some integer . Intuitively, if we start with vectors from the , where is sufficiently above the smoothing parameter, each of these vectors is contained in any coset for any with probability roughly . We therefore expect that a generalization of the birthday paradox should show us that, with high probability, there is a set of vectors that sum to a vector in , and hence . The lemma by Micciancio and Peikert ([MP13]) shows that this vector is statistically close to a sample from the distribution, . We can find this combination by trying all subsets of vectors. However, in order to continue the algorithm, we would like to repeat this and find (nearly) independent vectors in . It is not immediately clear how to continue since, in order to guarantee independence, one would not want to reuse the already used vectors and conditioned on the choice of these vectors, the distribution of the cosets containing the remaining vectors is disturbed and is no longer nearly uniform.

Our approach towards showing this is an ad-hoc alternative to the generalized birthday paradox mentioned above where it is shown that if there are elements each of which sampled from a large enough subset of a finite abelian group , then, with high probability, there exist elements that sum to . The proof of this statement is similar to the proof that the inner product is a strong -source extractor [CG88].

#### 1.2.2 A new algorithm for BDD with preprocessing leading to a faster quantum algorithm for SVP.

In this result, we improve upon the quantum algorithm from [CCL18]. As mentioned above, a BDD oracle from discrete Gaussian sampling can have a decoding distance not greater than or equal to , and the search space is at least , which requires at least quantum queries. Thus, towards optimizing the algorithm for SVP, one should aim to solve -BDD for slightly larger than since a larger value of will still lead to the same running time for SVP. Using known bounds, it can be shown that such an algorithm requires independent (preprocessed) samples from . (Footnote 1: The number of samples depends on the kissing number of the lattice, and the number given here is based on the best known upper bound on the kissing number due to [KL78] for for some constant .)

In [ADR+15], the authors gave an algorithm that runs in time that outputs samples from for any (i.e., a factor above the smoothing parameter). In order to obtain samples at the smoothing parameter, we construct a dense lattice (given in [ADR+15]) because the smoothing parameter of is smaller than that of the lattice . We use vectors each sampled independently from and reject the vectors that are not in . By repeating this algorithm, we obtain a time and -space algorithm to solve -BDD with preprocessing, where each call to BDD requires time. The total time complexity of the classical algorithm is , and the corresponding quantum algorithm is thus .

#### 1.2.3 Covering surface of a ball by spherical caps.

As we mentioned above, one can enumerate all lattice points within a distance to a target by queries to the BDD oracle with decoding distance . Our algorithm for BDD is obtained by preparing samples from the discrete Gaussian distribution. However, note that the decoding distance of BDD oracle built by discrete Gaussian samples as shown in [DRS14] is successful if the target vector is within a radius for (there is a tradeoff between and the number of DGS samples needed), and therefore, if we choose to be , as we do in the other algorithms mentioned above, then has to be at least to ensure that the shortest vector is one of the vectors output by the enumeration algorithm mentioned above. We observe here that if we choose a target to be a random vector “close to” but not at the origin, then the shortest vector will be within a radius from the target with some probability , and thus we can find the shortest vector by making calls to the BDD oracle. An appropriate choice of the target and the factor gives an algorithm that runs in time , which is faster than the algorithm (running in time ) mentioned above.

We remark here that the corresponding quantum algorithm runs in time , which is significantly slower than the quantum algorithm mentioned above.

### 1.3 Comparison with previous algorithms giving a time/space tradeoff

Kirchner and Fouque [KF16] begin their algorithm by sampling an exponential number of samples from the Discrete Gaussian distribution and then, using a pigeon-hole principle, show that there is a combination of input lattice vectors of small Hamming weight that results in a vector in , for some large enough integer ; a similar idea was used in [BLS16] to construct their tuple sieving algorithm. In both algorithms, it is difficult to control (i) the distribution of the resulting vectors and (ii) the dependence between resulting vectors.

Bai et al. [BLS16] get around the above issues by making a heuristic assumption that the resulting vectors behave like independent samples from a “nice enough” distribution (and they do not try to analyze this distribution rigorously).

Kirchner and Fouque, on the other hand, use the pigeon-hole principle to argue that there exist coefficients and lattice vectors in the set of input vectors such that for a large enough integer . It is then stated that has a nice enough Discrete Gaussian distribution. We observe that while the resulting distribution obtained will indeed be close to being a discrete Gaussian distribution, we have no control over the parameter of this distribution and it can be anywhere between and depending on the non-zero co-ordinates in . For instance, let be input vectors which are all from for some large and we want to find the collision in for some positive integer . Suppose that we find a combination and another combination , then by Theorem 2.1, one would expect that and . This means that the output of the exhaustive search algorithm by Kirchner and Fouque will behave like samples taken from a mixture of discrete Gaussian distributions with different standard deviations, making it extremely difficult to keep track of the standard deviation after several steps of the algorithm, and to obtain samples from the Discrete Gaussian distribution at the desired parameter above the smoothing parameter.

We overcome this issue by showing that there is a combination of the input vectors with a fixed Hamming weight that is in . To show that we can find such a combination we prove an ad-hoc alternative to the generalized birthday paradox where it is shown that if there are elements each of which sampled from a large enough subset of a finite abelian group , then, with high probability, there exist elements that sum to . The proof of this statement is similar to the proof that the inner product is a strong -source extractor [CG88].

There are other technical details that we needed to be careful about that were overlooked in [KF16]. In particular, our argument requires us to be careful with respect to the errors, both in the probability of failure and the statistical distance of the input/output. Since our algorithm performs an exponential number of steps, it is not enough to show that the algorithm succeeds with “overwhelming probability” and the output has a “negligible statistical distance” from the desired output.

## 2 Preliminaries

Let . We use bold letters for vectors and denote a vector’s coordinates with indices . Throughout the paper, will always be the dimension of the ambient space .

#### 2.0.1 Lattices.

A lattice is a discrete subgroup of , or equivalently the set of all integer combinations of linearly independent vectors . Such ’s form a basis of . The lattice is said to be full-rank if . We denote by the first minimum of , defined as the length of a shortest non-zero vector of .

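For a rank $n$ lattice $\mathcal{L}$, the dual lattice, denoted $\mathcal{L}^*$, is defined as the set of all points in $\mathrm{span}(\mathcal{L})$ that have integer inner products with all lattice points,

$$\mathcal{L}^*=\{\mathbf{w}\in\mathrm{span}(\mathcal{L}) : \forall\,\mathbf{y}\in\mathcal{L},\ \langle\mathbf{w},\mathbf{y}\rangle\in\mathbb{Z}\}.$$

Similarly, for a lattice basis $B=(\mathbf{b}_1,\dots,\mathbf{b}_n)$, we define the dual basis $B^*=(\mathbf{b}_1^*,\dots,\mathbf{b}_n^*)$ to be the unique set of vectors in $\mathrm{span}(\mathcal{L})$ satisfying $\langle\mathbf{b}_i^*,\mathbf{b}_j\rangle=1$ if $i=j$, and $0$ otherwise. It is easy to show that $\mathcal{L}^*$ is itself a rank $n$ lattice and $B^*$ is a basis of $\mathcal{L}^*$.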
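As a small worked instance of the dual basis, the sketch below computes it for a full-rank two-dimensional lattice using exact rational arithmetic. The helper name `dual_basis_2d` is hypothetical, and the closed form used (the rows of the inverse transpose of the basis matrix) is specific to the full-rank case.

```python
from fractions import Fraction

def dual_basis_2d(b1, b2):
    """Dual basis of a full-rank 2D lattice with basis vectors b1, b2.

    Uses the rows of the inverse transpose of the basis matrix, which
    satisfy <b_i*, b_j> = 1 if i == j and 0 otherwise.
    """
    (a, b), (c, d) = b1, b2
    det = Fraction(a * d - b * c)
    if det == 0:
        raise ValueError("basis vectors must be linearly independent")
    return (d / det, -c / det), (-b / det, a / det)

def inner(u, v):
    return sum(x * y for x, y in zip(u, v))

b1, b2 = (1, 2), (3, 1)
d1, d2 = dual_basis_2d(b1, b2)
```

For this basis the dual vectors have denominators dividing the determinant, illustrating that the dual of an integer lattice is generally not an integer lattice.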

#### 2.0.2 Probability distributions.

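Given two random variables $X$ and $Y$ on a set $E$, we denote by $d_{SD}(X,Y)$ the statistical distance between $X$ and $Y$, which is defined by

$$\begin{aligned} d_{SD}(X,Y) &= \frac{1}{2}\sum_{z\in E}\Big|\Pr_X[X=z]-\Pr_Y[Y=z]\Big| \\ &= \sum_{z\in E:\ \Pr_X[X=z]>\Pr_Y[Y=z]}\Big(\Pr_X[X=z]-\Pr_Y[Y=z]\Big). \end{aligned}$$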
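The two expressions for the statistical distance can be checked against each other on small finite distributions. The sketch below represents a distribution as a dictionary from outcomes to probabilities; the function names are illustrative only.

```python
def statistical_distance(p, q):
    """Half the L1 distance between two finite probability distributions."""
    support = set(p) | set(q)
    return 0.5 * sum(abs(p.get(z, 0.0) - q.get(z, 0.0)) for z in support)

def one_sided_distance(p, q):
    """Sum of p(z) - q(z) over the outcomes where p exceeds q."""
    support = set(p) | set(q)
    return sum(p.get(z, 0.0) - q.get(z, 0.0)
               for z in support if p.get(z, 0.0) > q.get(z, 0.0))

p = {0: 0.5, 1: 0.5}
q = {0: 0.25, 1: 0.25, 2: 0.5}
```

Both functions agree because the positive and negative parts of the difference of two probability distributions have the same total mass.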

Given a finite set $E$, we denote by $U_E$ a uniform random variable on $E$, i.e., $\Pr[U_E=x]=1/|E|$ for all $x\in E$.

#### 2.0.3 Discrete Gaussian Distribution.

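For any $s>0$, define $\rho_s(\mathbf{x})=\exp(-\pi\|\mathbf{x}\|^2/s^2)$ for all $\mathbf{x}\in\mathbb{R}^n$. We write $\rho$ for $\rho_1$. For a discrete set $S$, we extend $\rho_s$ to sets by $\rho_s(S)=\sum_{\mathbf{x}\in S}\rho_s(\mathbf{x})$. Given a lattice $\mathcal{L}$, the discrete Gaussian $D_{\mathcal{L},s}$ is the distribution over $\mathcal{L}$ such that the probability of a vector $\mathbf{y}\in\mathcal{L}$ is proportional to $\rho_s(\mathbf{y})$:

$$\Pr_{X\sim D_{\mathcal{L},s}}[X=\mathbf{y}]=\frac{\rho_s(\mathbf{y})}{\rho_s(\mathcal{L})}.$$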
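For intuition, the following sketch evaluates the discrete Gaussian on the one-dimensional lattice $\mathbb{Z}$. It assumes the standard normalization $\rho_s(x)=\exp(-\pi\|x\|^2/s^2)$, which the extracted text does not show explicitly, and it truncates the support to a finite window (the `tail` parameter is a hypothetical cutoff). It is a naive table-based sampler for illustration, not one of the efficient samplers cited in the paper.

```python
import math
import random

def rho(x, s):
    """Gaussian mass exp(-pi * ||x||^2 / s^2); normalization is an assumption."""
    return math.exp(-math.pi * sum(c * c for c in x) / (s * s))

def discrete_gaussian_pmf_Z(s, tail=12):
    """Probability table of D_{Z,s}, truncated to |z| <= ceil(tail * s)."""
    radius = int(math.ceil(tail * s))
    weights = {z: rho((z,), s) for z in range(-radius, radius + 1)}
    total = sum(weights.values())
    return {z: w / total for z, w in weights.items()}

def sample_discrete_gaussian_Z(s, rng, tail=12):
    """Draw one sample from the truncated table using the given random.Random."""
    pmf = discrete_gaussian_pmf_Z(s, tail)
    points = list(pmf)
    return rng.choices(points, weights=[pmf[z] for z in points], k=1)[0]

pmf = discrete_gaussian_pmf_Z(2.0)
draw = sample_discrete_gaussian_Z(2.0, random.Random(0))
```

The ratio of probabilities of two lattice points depends only on their norms, which is what makes the parameter $s$ act as a width.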

### 2.1 Lattice problems

The following problem plays a central role in this paper.

###### Definition 1

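For $\varepsilon\ge 0$, $\sigma$ a function that maps lattices to non-negative real numbers, and $m\in\mathbb{N}$, $\varepsilon$-$\mathrm{DGS}^m_\sigma$ (the Discrete Gaussian Sampling problem) is defined as follows: The input is a basis $B$ for a lattice $\mathcal{L}\subset\mathbb{R}^n$ and a parameter $s>\sigma(\mathcal{L})$. The goal is to output a sequence of $m$ vectors whose joint distribution is $\varepsilon$-close to $m$ independent samples from $D_{\mathcal{L},s}$.

We omit the parameter $\varepsilon$ if $\varepsilon=0$, and the parameter $m$ if $m=1$. We stress that $\varepsilon$ bounds the statistical distance between the joint distribution of the output vectors and $m$ independent samples from $D_{\mathcal{L},s}$.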

For our applications, we consider the following lattice problems.

###### Definition 2

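The search problem SVP (Shortest Vector Problem) is defined as follows: The input is a basis $B$ for a lattice $\mathcal{L}\subset\mathbb{R}^n$. The goal is to output a vector $\mathbf{y}\in\mathcal{L}$ with $\|\mathbf{y}\|=\lambda_1(\mathcal{L})$.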

###### Definition 3

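The search problem CVP (Closest Vector Problem) is defined as follows: The input is a basis $B$ for a lattice $\mathcal{L}\subset\mathbb{R}^n$ and a target vector $\mathbf{t}\in\mathbb{R}^n$. The goal is to output a vector $\mathbf{y}\in\mathcal{L}$ with $\|\mathbf{y}-\mathbf{t}\|=\mathrm{dist}(\mathbf{t},\mathcal{L})$.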

###### Definition 4

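For $\alpha>0$ (the approximation factor), the search problem $\alpha$-BDD (Bounded Distance Decoding) is defined as follows: The input is a basis $B$ for a lattice $\mathcal{L}\subset\mathbb{R}^n$ and a target vector $\mathbf{t}\in\mathbb{R}^n$ with $\mathrm{dist}(\mathbf{t},\mathcal{L})\le\alpha\cdot\lambda_1(\mathcal{L})$. The goal is to output a closest lattice vector to $\mathbf{t}$.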

Note that, while our other problems become more difficult as the approximation factor becomes smaller, $\alpha$-BDD becomes more difficult as $\alpha$ gets larger. For convenience, when we discuss the running time of algorithms solving the above problems, we ignore polynomial factors in the bit-length of the individual input basis vectors (i.e., we consider only the dependence on the ambient dimension $n$).

### 2.2 Some preliminary results

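For a lattice $\mathcal{L}$ and $\varepsilon>0$, the smoothing parameter $\eta_\varepsilon(\mathcal{L})$ is the smallest $s$ such that $\rho_{1/s}(\mathcal{L}^*)=1+\varepsilon$. Recall that if $\mathcal{L}$ is a lattice and $\mathbf{v}\in\mathcal{L}$ then $\rho_s(\mathcal{L}+\mathbf{v})=\rho_s(\mathcal{L})$ for all $s$. The smoothing parameter has the following well-known property.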

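###### Lemma 1 ([REG09, Claim 3.8])

For any lattice $\mathcal{L}\subset\mathbb{R}^n$, $\mathbf{c}\in\mathbb{R}^n$, $\varepsilon\in(0,1)$, and $s\ge\eta_\varepsilon(\mathcal{L})$,

$$\frac{1-\varepsilon}{1+\varepsilon}\le\frac{\rho_s(\mathcal{L}+\mathbf{c})}{\rho_s(\mathcal{L})}\le 1.$$

###### Corollary 1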

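Let $\mathcal{L}$ be a lattice, $q$ be a positive integer, and let $s\ge\eta_\varepsilon(q\mathcal{L})$. Let $C$ be a random coset in $\mathcal{L}/q\mathcal{L}$ sampled such that $\Pr[C=q\mathcal{L}+\mathbf{c}]=\rho_s(q\mathcal{L}+\mathbf{c})/\rho_s(\mathcal{L})$. Also, let $U$ be a coset in $\mathcal{L}/q\mathcal{L}$ sampled uniformly at random. Then

$$d_{SD}(C,U)\le 2\varepsilon.$$

###### Proof

By Lemma 1, we have that

$$\rho_s(q\mathcal{L})\ge\rho_s(q\mathcal{L}+\mathbf{c})\ge\frac{1-\varepsilon}{1+\varepsilon}\,\rho_s(q\mathcal{L}),$$

for any $\mathbf{c}\in\mathcal{L}$ and hence, since $\rho_s(\mathcal{L})$ is the sum of $\rho_s(q\mathcal{L}+\mathbf{c})$ over the $q^n$ cosets,

$$\frac{\rho_s(q\mathcal{L}+\mathbf{c})}{\rho_s(\mathcal{L})}\ge\frac{1-\varepsilon}{1+\varepsilon}\cdot\frac{\rho_s(q\mathcal{L})}{\rho_s(\mathcal{L})}\ge\frac{1-\varepsilon}{1+\varepsilon}\cdot\frac{1}{q^n}.$$

We conclude that

$$d_{SD}(C,U)=\sum_{c\in\mathcal{L}/q\mathcal{L}:\ \Pr[C=c]<\Pr[U=c]}\Big(\Pr[U=c]-\Pr[C=c]\Big)\le\sum_{c\in\mathcal{L}/q\mathcal{L}}\frac{1}{q^n}\Big(1-\frac{1-\varepsilon}{1+\varepsilon}\Big)=\frac{2\varepsilon}{1+\varepsilon}\le 2\varepsilon,$$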

as needed. ∎

The following simple bound shows

###### Lemma 2 ([ADR+15, Lemma 2.7])

For any lattice and , we have

Micciancio and Peikert [MP13] showed the following result about the distribution resulting from the sum of many Gaussian samples.

###### Theorem 2.1 ([MP13, Theorem 3.3])

Let be an dimensional lattice, a nonzero integer vector, , and arbitrary cosets of for . Let be independent vectors with distributions , respectively. Then the distribution of is close to , where , and .

We will need the following reduction from -BDD to DGS that was shown in [DRS14].

###### Theorem 2.2 ([DRS14, Theorem 3.1])

For any , let Then, there exists a reduction from to -, where and is the problem of solving CVP for target vectors that are guaranteed to be within a distance of the lattice. The reduction preserves the dimension, makes a single call to the DGS oracle, and runs in time .

We need the following relation between the first minimum of a lattice and the smoothing parameter of its dual lattice. We will use this to compute the decoding distance of the BDD oracle.

###### Lemma 3 ([ADR+15, Lemma 6.1])

For any lattice and , if , where , we have

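$$\sqrt{\frac{\ln(1/\varepsilon)}{\pi}}<\lambda_1(\mathcal{L})\,\eta_\varepsilon(\mathcal{L}^*)<\sqrt{\frac{\beta^2 n}{2\pi e}}\cdot\varepsilon^{-1/n}\cdot(1+o(1)), \tag{1}$$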

and if , we have

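$$\sqrt{\frac{\ln(1/\varepsilon)}{\pi}}<\lambda_1(\mathcal{L})\,\eta_\varepsilon(\mathcal{L}^*)<\sqrt{\frac{\ln(1/\varepsilon)+n\ln\beta+o(n)}{\pi}}. \tag{2}$$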

The following theorem, proved in [CCL18], is required to solve SVP using an exponential number of calls to an $\alpha$-BDD oracle.

###### Theorem 2.3 ([CCL18, Theorem 8])

Given a basis matrix for lattice , a target vector , an -BDD oracle with , and an integer scalar . Let be

$$f^{\alpha}_{p}(\mathbf{s})=-p\cdot\mathrm{BDD}_\alpha\big(\mathcal{L},(B\mathbf{s}-\mathbf{t})/p\big)+B\mathbf{s}.$$

If ), then the list contains all lattice points within distance to .
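The map $f^{\alpha}_{p}$ can be exercised on a toy lattice. The sketch below makes several assumptions that are not in the extracted text: the lattice is $\mathbb{Z}^n$ with the identity basis (so $B\mathbf{s}=\mathbf{s}$ and $\lambda_1=1$), the argument $\mathbf{s}$ ranges over $\{0,\dots,p-1\}^n$, and coordinate-wise rounding stands in for the BDD oracle (for $\mathbb{Z}^n$ it returns the exact closest vector, which is stronger than what a BDD oracle promises). The helper names are hypothetical.

```python
import math
from itertools import product

def nearest_integer_vector(x):
    """Stand-in BDD oracle for the lattice Z^n: round each coordinate."""
    return tuple(math.floor(c + 0.5) for c in x)

def enumerate_near_target(t, p, n):
    """Collect f(s) = s - p * BDD((s - t) / p) over all s in {0..p-1}^n.

    Each s contributes the point of the coset s + p*Z^n closest to t.
    """
    found = set()
    for s in product(range(p), repeat=n):
        scaled = tuple((s_i - t_i) / p for s_i, t_i in zip(s, t))
        decoded = nearest_integer_vector(scaled)
        found.add(tuple(s_i - p * d_i for s_i, d_i in zip(s, decoded)))
    return found

target = (0.2, 0.3)
p, alpha = 3, 0.49
candidates = enumerate_near_target(target, p, 2)
```

The loop makes $p^n$ oracle calls, one per coset of $p\mathbb{Z}^n$, which matches the exponential number of BDD queries discussed in the text.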

We will need the following theorems to sample the DGS vectors with a large width.

For any , there is an algorithm that takes as input a lattice , (the desired number of output vectors), and outputs independent samples from in time where .

###### Theorem 2.5 ([Adr+15, Theorem 5.11])

For a lattice , let . Then there exists an algorithm that solves - in time with space for any . Moreover, if the input does not satisfy the promise, and the input parameter , then the algorithm may output vectors for some vectors that are -close to independent samples from .

We also need some preliminaries on quantum computing. The following subsection is essentially taken from Section 2.4 in [CCL18].

#### 2.2.1 Quantum Computation.

In this paper we use the Dirac ket-bra notation. A qubit is a unit vector in with two (ordered) basis vectors . and are the Pauli Matrices. A universal set of gates is

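$$H=\frac{1}{\sqrt{2}}\begin{bmatrix}1&1\\1&-1\end{bmatrix},\quad S=\begin{bmatrix}1&0\\0&i\end{bmatrix},\quad T=e^{i\pi/8}\begin{bmatrix}e^{-i\pi/8}&0\\0&e^{i\pi/8}\end{bmatrix},\quad \mathrm{CNOT}=|0\rangle\langle 0|\otimes I+|1\rangle\langle 1|\otimes X.$$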

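The Toffoli gate, a three-qubit gate, is defined by

$$\mathrm{Toffoli}\,|a\rangle|b\rangle|c\rangle=\begin{cases}|a\rangle|b\rangle|1\oplus c\rangle,&\text{if } a=b=1;\\ |a\rangle|b\rangle|c\rangle,&\text{otherwise,}\end{cases}$$

for $a,b,c\in\{0,1\}$. The Toffoli gate can be efficiently decomposed into CNOT, $H$, $S$, and $T$ gates [NC16], and hence it is considered an elementary quantum gate in this paper. It is easy to see that a NAND gate can be implemented by a Toffoli gate: $\mathrm{Toffoli}\,|a\rangle|b\rangle|1\rangle=|a\rangle|b\rangle|\mathrm{NAND}(a,b)\rangle$, where $\mathrm{NAND}(a,b)=0$ if $(a,b)=(1,1)$, and $\mathrm{NAND}(a,b)=1$ otherwise. In particular, the Toffoli gate together with ancilla preparation is universal for classical computation; that is, any classical function can be implemented as a (controlled) quantum one, and the total number of elementary quantum gates is of the same order as the number of classical gates.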
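The NAND construction can be checked on classical basis states. The sketch below simulates the Toffoli gate only on bit values (it is not a quantum simulator), and the function names are illustrative.

```python
def toffoli(a, b, c):
    """Action of the Toffoli gate on computational basis states (bits)."""
    return a, b, c ^ (a & b)

def nand_via_toffoli(a, b):
    """NAND(a, b) read off the target bit after preparing the ancilla in 1."""
    return toffoli(a, b, 1)[2]

truth_table = {(a, b): nand_via_toffoli(a, b) for a in (0, 1) for b in (0, 1)}
```

Because NAND is universal for classical circuits and the Toffoli gate is its own inverse, this is the sense in which classical subroutines can be run reversibly inside a quantum algorithm with comparable gate counts.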

###### Definition 5 (Search problem)

Suppose we have a set of objects named $\{1,2,\dots,N\}$, of which some are targets. Suppose $O$ is an oracle that identifies the targets. The goal of a search problem is to find a target by making queries to the oracle $O$.

Grover provided a quantum algorithm that solves the search problem over $N$ objects with $O(\sqrt{N})$ queries [GRO96]. When the number of targets is unknown, Brassard et al. provided a modified Grover algorithm that solves the search problem with $O(\sqrt{N})$ queries [BBH+05], which is of the same order as the query complexity of the Grover search. Moreover, Dürr and Høyer showed that, given an unsorted table of $N$ values, there exists a quantum algorithm that finds the index of the minimum with only $O(\sqrt{N})$ queries [DH96], with constant probability of error.

###### Theorem 2.6 ([DH96], Theorem 1)

Let $T[1\ldots N]$ be an unsorted table of $N$ items, each holding a value from an ordered set. Suppose that we have a quantum oracle $O_T$ such that $O_T|i\rangle|0\rangle=|i\rangle|T[i]\rangle$. Then there exists a quantum algorithm that finds the index $y$ such that $T[y]$ is the minimum with probability at least $1/2$ and $O(\sqrt{N})$ queries to $O_T$.

## 3 Algorithms with a time-memory tradeoff for lattice problems

In this section, we present a new algorithm for Discrete Gaussian sampling above the smoothing parameter. Efficient algorithms for Discrete Gaussian sampling at a very high parameter [GPV08, BLP+13] are already known. We present the exhaustive search algorithm; applying it iteratively decreases the width of the Gaussian distribution. This algorithm works only if the width of the Gaussian is sufficiently above the smoothing parameter.

The following lemma is crucial for the analysis of our algorithm, and is a variant of the proof that the inner product is a strong 2-source extractor [CG88].

###### Lemma 4

Let be a finite abelian group, and let be a positive integer. Let , and let . Define the inner product by for all . Let be independent and uniformly random variables on , respectively. Then

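$$d_{SD}\big((\langle X,Y\rangle,X),(U_G,X)\big)\le\frac{1}{2}\cdot\sqrt{\frac{|G|^{f+1}}{|\mathcal{X}|\cdot|\mathcal{Y}|}},$$

where $\mathcal{X}$ and $\mathcal{Y}$ denote the sets on which $X$ and $Y$ are uniform, and $U_G$ is uniform in $G$ and independent of $X$.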

###### Proof

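We will use $\Delta$ to represent $d_{SD}\big((\langle X,Y\rangle,X),(U,X)\big)$, where $U$ is uniform in $G$.

Note that if $p$ is a probability distribution on a set $E$, then

$$\sum_{x\in E}\Big(p(x)-\frac{1}{|E|}\Big)^2=\sum_{x\in E}\Big(p(x)^2-\frac{2p(x)}{|E|}+\frac{1}{|E|^2}\Big)=\sum_{x\in E}p(x)^2-\frac{1}{|E|}. \tag{3}$$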

Then

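$$\begin{aligned}
\Delta &= \frac{1}{2}\sum_{z\in G,\,x\in\mathcal{X}}\Big|\Pr[\langle X,Y\rangle=z,\,X=x]-\Pr[U=z,\,X=x]\Big| \\
&= \frac{1}{2}\sum_{z,x}\Big|\Pr[\langle x,Y\rangle=z,\,X=x]-\frac{1}{|G|}\Pr[X=x]\Big| \\
&= \frac{1}{2}\sum_{z,x}\Big|\Pr[X=x]\Pr[\langle x,Y\rangle=z]-\frac{1}{|G|}\Pr[X=x]\Big| && \text{by independence of } X \text{ and } Y \\
&= \frac{1}{2}\sum_{z,x}\Pr[X=x]\,\Big|\Pr[\langle x,Y\rangle=z]-\frac{1}{|G|}\Big| \\
&= \frac{1}{2|\mathcal{X}|}\sum_{z,x}\Big|\Pr[\langle x,Y\rangle=z]-\frac{1}{|G|}\Big|, \\
4\Delta^2 &\le \frac{|G|}{|\mathcal{X}|}\sum_{z,x}\Big(\Pr_Y[\langle x,Y\rangle=z]-\frac{1}{|G|}\Big)^2 && \text{by the Cauchy-Schwarz inequality} \\
&= \frac{|G|}{|\mathcal{X}|}\sum_{x}\Big(\sum_{z}\Pr_Y[\langle x,Y\rangle=z]^2-\frac{1}{|G|}\Big) && \text{by (3)} \\
&= \frac{|G|}{|\mathcal{X}|}\sum_{x}\Big(\Pr_{Y,Z}[\langle x,Y\rangle=\langle x,Z\rangle]-\frac{1}{|G|}\Big) && \text{where } Z \text{ is i.i.d. as } Y \\
&\le \frac{|G|}{|\mathcal{X}|}\sum_{x'\in G^f}\Big(\Pr_{Y,Z}[\langle x',Y\rangle=\langle x',Z\rangle]-\frac{1}{|G|}\Big) \\
&= \frac{|G|^{f+1}}{|\mathcal{X}|}\Big(\Pr_{Y,Z,T}[\langle T,Y-Z\rangle=0]-\frac{1}{|G|}\Big) && \text{where } T \text{ is uniform in } G^f \\
&\le \frac{|G|^{f+1}}{|\mathcal{X}|}\Big(\Pr[Y=Z]+\Pr_{Y,Z,T}[\langle T,Y-Z\rangle=0\mid Y\ne Z]-\frac{1}{|G|}\Big) \\
&\le \frac{|G|^{f+1}}{|\mathcal{X}|}\Big(\frac{1}{|\mathcal{Y}|}+\frac{1}{|G|}-\frac{1}{|G|}\Big) \\
&\le \frac{|G|^{f+1}}{|\mathcal{X}|\cdot|\mathcal{Y}|},
\end{aligned}$$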

as needed. ∎
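The bound of Lemma 4 can be sanity-checked by exact computation on a small instance. The sketch below takes $G=\mathbb{Z}_q$, assumes that the second source consists of 0/1 vectors (as in the application in Section 3.1, where it is uniform over vectors of fixed Hamming weight), and evaluates $\Delta$ through the formula $\frac{1}{2|\mathcal{X}|}\sum_{z,x}|\Pr[\langle x,Y\rangle=z]-1/|G||$ from the proof. The function name and the parameter choices are illustrative.

```python
from fractions import Fraction
from itertools import product
from math import sqrt

def extractor_distance(q, xs, ys):
    """Exact d_SD((<X,Y>, X), (U, X)) for G = Z_q, X uniform on xs,
    Y uniform on ys (0/1 vectors), and U uniform on G."""
    total = Fraction(0)
    for x in xs:
        counts = [0] * q
        for y in ys:
            counts[sum(xi * yi for xi, yi in zip(x, y)) % q] += 1
        for z in range(q):
            total += abs(Fraction(counts[z], len(ys)) - Fraction(1, q))
    return total / (2 * len(xs))

q, f = 3, 4
xs = list(product(range(q), repeat=f))                            # all of G^f
ys = [y for y in product((0, 1), repeat=f) if sum(y) == 2]         # weight-2 vectors
delta = extractor_distance(q, xs, ys)
bound = 0.5 * sqrt(q ** (f + 1) / (len(xs) * len(ys)))
```

On this instance the bound evaluates to roughly 0.354, and the exact distance stays below it; larger sources make the bound shrink, which is how the lemma is used in the analysis of the sampling algorithm.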

### 3.1 Algorithm for Discrete Gaussian Sampling

We now present the main result of this section.

###### Theorem 3.1

Let be positive integers, and let . Let be any positive integer. Let be a lattice of rank , and let . There is an algorithm that, given independent samples from , outputs a list of vectors that is -close to independent vectors from . The algorithm runs in time and requires memory .

###### Proof

We will prove the result for , and the general result is immediate by repeating the algorithm.

Let be the input vectors and let be the corresponding cosets in . The algorithm does the following:

1. Initialize a list with input vectors, and let .

2. Find vectors (by trying all -tuples) such that . If no such vectors exist, then END.

3. Output the vector , and let . If , then END.

4. Remove all vectors in which are in one of . (Footnote 2: This step is for the purpose of simplifying the analysis, and is not necessary.)

5. Repeat Steps (2) to (4).
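The loop in Steps 1 to 5 can be sketched on a toy instance. The code below is an illustration under stated assumptions, not the paper's algorithm with its actual parameters: the lattice is $\mathbb{Z}^n$ (so the coset of a vector modulo $q\mathbb{Z}^n$ is its coordinates reduced modulo $q$), the tuple size `k` and the output cap `max_outputs` are hypothetical parameters standing in for the quantities elided in the extracted text, and the inputs are plain integer vectors rather than Gaussian samples.

```python
from itertools import combinations

def coset(v, q):
    """Coset of an integer vector modulo q * Z^n."""
    return tuple(c % q for c in v)

def combine_tuples(vectors, q, k, max_outputs):
    """Repeatedly find k vectors whose sum lies in q * Z^n, output sum / q,
    and discard every remaining vector lying in a coset that was just used."""
    pool = list(vectors)
    outputs = []
    while len(outputs) < max_outputs:
        hit = None
        for idx in combinations(range(len(pool)), k):   # try all k-tuples
            total = [sum(pool[i][j] for i in idx) for j in range(len(pool[0]))]
            if all(c % q == 0 for c in total):
                hit = (idx, tuple(c // q for c in total))
                break
        if hit is None:
            break                                        # no such tuple: END
        idx, out = hit
        outputs.append(out)
        used = {coset(pool[i], q) for i in idx}
        pool = [v for v in pool if coset(v, q) not in used]
    return outputs

inputs = [(1, 2), (3, 0), (2, 1), (0, 3), (1, 1), (2, 2)]
results = combine_tuples(inputs, q=3, k=3, max_outputs=5)
```

Only the current pool and the outputs are stored, while the search over all `k`-tuples dominates the running time; this is the time-for-memory trade that the theorem quantifies.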

The time complexity and memory requirement of the algorithm are immediate. We now show correctness.

Let , by lemma 2 we get . Without loss of generality, we can assume that the vectors for are sampled by sampling such that and then sampling the vector according to . Moreover, by Corollary 1, this distribution is -close to sampling for , independently and uniformly from , and then sampling the vectors according to . We assume that the input is sampled from this distribution introducing statistical distance at most (using Corollary 1).

Thus, without loss of generality, we can assume that the algorithm initially gets only the corresponding cosets as input, and the vectors for are sampled from immediately after such a tuple is found in Step 2 of the algorithm. These samples are thus independent of all prior steps. By Theorem 2.1, this implies that the vector obtained in Step 3 of the algorithm is -close to being distributed as .

It remains to show that our algorithm finds vectors (with high probability).

Notice that after the algorithm finds vectors for any , the algorithm removes all vectors in cosets. Thus, conditioned on the choice of the cosets removed, for each of the remaining input vectors, the corresponding coset is sampled uniformly from a set of cosets.

In the next iteration of the algorithm, in Step 2, the algorithm finds vectors that will be removed. Any other vector will be removed with probability at most

$$\frac{8d}{q^{n}-8d\,q^{n/d}}.$$

Thus, the probability that more than $4d$ more vectors will be removed is at most

$$\binom{N}{4d}\Big(\frac{8d}{q^{n}-8d\,q^{n/d}}\Big)^{4d}\le\Big(\frac{2N}{q^{n}-8d\,q^{n/d}}\Big)^{4d}\le d^{O(d)}\,q^{4n-4nd}\le q^{-4n},$$

for a large enough value of . Thus, introducing statistical distance at most , we may assume that at each step when a new vector is added, at most vectors are removed. Thus, at any iteration of the algorithm, there are remaining vectors.

It remains to prove that in each of the iterations, with high probability, we find vectors such that the sum of the corresponding cosets is . To see this, consider at any iteration, the cosets corresponding to the first of the remaining vectors which are each sampled uniformly from a set of size at least . Let be a random variable uniform over , and let be a random variable independent of and uniform over vectors in with Hamming weight . The number of such vectors is

$$\binom{M}{8d}\ge\Big(\frac{M}{8d}\Big)^{8d}=q^{8n}.$$

Let be a uniformly random coset of . By Lemma 4, we have that

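$$\begin{aligned}
d_{SD}\big((\langle X,Y\rangle,X),(U,X)\big) &\le \frac{1}{2}\cdot\sqrt{\frac{(q^{n})^{M+1}}{(q^{n}-8d\,q^{n/d})^{M}\cdot q^{8n}}} \\
&\le \frac{1}{2}\cdot\sqrt{\frac{q^{-7n}}{(1-8d\,q^{n/d-n})^{M}}} \\
&= \frac{1}{2}\cdot\sqrt{\frac{q^{-7n}}{(1-8d\,q^{n/d-n})^{8d\,q^{n/d}}}} \\
&\le \frac{1}{2}\cdot\sqrt{\frac{q^{-7n}}{1-64d^{2}\,q^{2n/d-n}}}
\end{aligned}$$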

for a large enough value of .

So, by Markov inequality, with probability