
# Fast Modular Subset Sum using Linear Sketching

Given n positive integers, the Modular Subset Sum problem asks if a subset adds up to a given target t modulo a given integer m. This is a natural generalization of the Subset Sum problem (where m=+∞) with ties to additive combinatorics and cryptography. Recently, in [Bringmann, SODA'17] and [Koiliaris and Xu, SODA'17], efficient algorithms have been developed for the non-modular case, running in near-linear pseudo-polynomial time. For the modular case, however, the best known algorithm by Koiliaris and Xu [Koiliaris and Xu, SODA'17] runs in time $\tilde{O}(m^{5/4})$. In this paper, we present an algorithm running in time $\tilde{O}(m)$, which matches a recent conditional lower bound of [Abboud et al.'17] based on the Strong Exponential Time Hypothesis. Interestingly, in contrast to most previous results on Subset Sum, our algorithm does not use the Fast Fourier Transform. Instead, it is able to simulate the "textbook" Dynamic Programming algorithm much faster, using ideas from linear sketching. This is one of the first applications of sketching-based techniques to obtain fast algorithms for combinatorial problems in an offline setting.


## 1 Introduction

In the Subset Sum problem, one is given a multiset of $n$ integers and an integer target $t$ and is asked to decide if there exists a subset of the integers that sums to the target $t$. Subset Sum is a classic problem known to be NP-complete, originally included as one of Karp’s 21 NP-complete problems [Kar72]. Despite its NP-completeness, it is possible to obtain algorithms that are pseudo-polynomial in the target $t$. In particular, the “textbook” Dynamic Programming algorithm of Bellman [Bel57] solves the problem in $O(nt)$ time.

Due to its importance and applications in various areas, there have been a lot of works improving the runtime [Pis99, Pfe99, Pis03, KX17, Bri17], obtaining polynomial space [LN10, Bri17], or achieving polynomial decision tree complexity [MadH84, CIO16, ES16, KLM18]. In addition to Subset Sum, there has recently been a lot of effort in obtaining faster algorithms for the more general problem of Knapsack [EW18, BHSS18, AT18].

The most recent result by Bringmann [Bri17] brings down the runtime for Subset Sum to $\tilde{O}(n + t)$, which is known to be optimal assuming the Strong Exponential Time Hypothesis [ABHS17]. The fastest known deterministic algorithm by Koiliaris and Xu has a runtime of $\tilde{O}(\sqrt{n}\, t)$ [KX17].

An important generalization of Subset Sum is the Modular Subset Sum problem, in which sums are taken over the finite cyclic group $\mathbb{Z}_m$ for some given integer $m$. This problem and its structural properties have been studied extensively in Additive Combinatorics [EGZ61, Ols68, Sze70, Ols75, Alo87, HLS08, Vu08]. The trivial algorithm for deciding whether a given target is achievable modulo $m$ runs in time $O(nm)$. Interestingly, even the fastest algorithm for (non-Modular) Subset Sum of [Bri17] does not give any nontrivial improvement over this runtime. Koiliaris and Xu [KX17] were able to obtain an algorithm running in time $\tilde{O}(m^{5/4})$ by exploiting structural properties of the Modular Subset Sum problem implied by Additive Combinatorics [HLS08].

The main contribution of our work is an optimal algorithm for the Modular Subset Sum problem.

### 1.1 Our Contributions

In this paper, we present an algorithm for the Modular Subset Sum problem running in time $\tilde{O}(m)$.¹

¹ If $n \gg m$, our algorithm would need to spend $\Omega(n)$ time just to read the input. However, if the input is represented succinctly, our algorithm runs in time $\tilde{O}(m)$ even if $n \gg m$. An $O(m)$-size succinct representation for a multiset of a universe with $m$ elements is always possible by listing the elements and their multiplicities. For this reason, we omit the dependence on $n$. See discussion in Section 6.

###### Theorem 1.

There is an $\tilde{O}(m)$-time algorithm that with high probability returns all subset sums that are attainable, i.e. it solves the Modular Subset Sum problem for all targets $t \in \mathbb{Z}_m$.

Our algorithm works by simulating the “textbook” Dynamic Programming algorithm of Bellman [Bel57] much faster, using ideas from linear sketching to avoid recomputing target sums that are already known to be attainable. We present a summary of these techniques in Section 1.2.

An interesting feature of our algorithm is that, in contrast to most previous results on Subset Sum, it does not rely on the Fast Fourier Transform (FFT). In particular, by setting $m$ larger than $\sigma$, where $\sigma$ is the sum of all input numbers, our algorithm implies an $\tilde{O}(\sigma)$ algorithm for the non-Modular Subset Sum problem. It matches the runtime achieved by Koiliaris and Xu [KX17], but without using FFT.

Another important property of our algorithm is that it does not need to know all the input numbers in advance. Instead, it works in an online fashion, by outputting all the newly attainable subset sums for every new number that is provided.

Finally, the runtime of our algorithm is optimal, in the sense that there is no $O(m^{1-\epsilon})$-time algorithm for any $\epsilon > 0$, assuming the Strong Exponential Time Hypothesis or other natural assumptions. This is implied by recent results in Fine-Grained Complexity [ALW14, CDL16, ABHS17] for the Subset Sum problem. Note that our algorithm matches this conditional lower bound for a single target $t$, while also outputting all attainable subset sums.

We expect that our techniques will be applicable to other settings. In particular, an interesting open problem is the following:

###### Open question.

Is there an $\tilde{O}(n + w_{\max})$ time algorithm for the non-Modular Subset Sum problem, where $w_{\max}$ is the largest of the given integers?

Such a runtime would improve the best known algorithm for Subset Sum [Bri17] without contradicting any known conditional lower bounds. In Section 6 we discuss how our techniques based on linear sketching could be helpful in resolving this question.

### 1.2 Overview and techniques

#### Certificate Complexity

To illustrate our ideas, it is helpful to first consider the certificate complexity of the Modular Subset Sum problem. It is easy to provide a certificate that target $t$ is attainable by just providing a list of elements that sum up to $t$. But how can we certify that there is no such subset?

An idea is to efficiently certify correctness of every step of Bellman’s algorithm. Let $S_0 = \{0\}$ and $S_i$ be the set of attainable sums using the first $i$ integers. Bellman’s algorithm computes $S_i$ as $S_{i-1} \cup (S_{i-1} + w_i)$, where $w_i$ is the $i$-th integer. The running time of Bellman’s algorithm stems from the fact that every step costs $O(m)$ time, and there are $n$ steps.
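As a point of reference, the "textbook" recurrence described above can be sketched in a few lines of Python. This is only a minimal illustration of the slow baseline that the rest of the paper speeds up; the function name `modular_subset_sums` and the convention that the empty sum 0 is always attainable are assumptions made here, not notation from the paper.

```python
def modular_subset_sums(weights, m):
    """Bellman-style DP over Z_m: S_i = S_{i-1} union (S_{i-1} + w_i)."""
    attainable = {0}  # S_0: only the empty sum (assumed convention)
    for w in weights:
        # Shifting the whole set touches every known sum, so one step can
        # cost time proportional to m; n such steps give the O(nm) baseline.
        shifted = {(s + w) % m for s in attainable}
        attainable |= shifted
    return attainable


# Every residue modulo 7 is attainable from the multiset {3, 5, 6}.
print(sorted(modular_subset_sums([3, 5, 6], 7)))
```

Each iteration recomputes sums that are already known to be attainable, which is exactly the waste the certificate and sketching ideas aim to avoid.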

To certify it more efficiently, the runtime of our algorithm shouldn’t depend on the whole $S_{i-1}$, but rather spend time proportional to $|S_i \setminus S_{i-1}|$, i.e. the number of newly created sums. The certificate provides a set which is supposed to be the set of newly added elements. While it is easy to certify that all elements from the provided set are indeed attainable, the harder part is to certify that no elements are missing from the provided set. To do that, we perform Polynomial Identity Testing via inner products with random vectors to check if the characteristic vectors of two sets are the same. To implement this efficiently, we show that it suffices to use pseudorandom vectors obtained by linear hash functions. Linear hash functions allow one to compute very efficiently the hash value of sets under shifts. This is important as operations of the form $S + w$ appear throughout the execution of Bellman’s algorithm. We defer further details to Section 3.

The ideas above suffice to obtain a non-deterministic algorithm running in time that guesses the newly created elements at every step and certifies whether these guesses are correct. Removing the non-determinism and obtaining an actual algorithm is more challenging. For many problems such as matrix multiplication [Fre77], Orthogonal Vectors [Wil16], 3-SUM, Linear Programming, and All-Pairs-Shortest-Paths [CGI16] the runtimes of the best algorithms are significantly worse than those of their non-deterministic counterparts. More generally, polynomial-sized certificates don’t necessarily imply polynomial-time algorithms, as this is equivalent to the question $\mathrm{P} \stackrel{?}{=} \mathrm{NP}$.

#### Sketching

For the problem of Modular Subset Sum, however, we show that using ideas from linear sketching it is possible to remove the non-determinism by incurring only a poly-logarithmic overhead in the runtime.

Specifically, besides just checking whether the characteristic vectors of two sets are the same, we can use poly-logarithmic size linear sketches of the vectors to identify a position in which they differ. This works by carefully isolating elements by randomly subsampling subsets of entries of different sizes. Naively computing all positions in which two sets differ would require computing new sketches many times with fresh randomness. A contribution of our work is showing that only limited randomness is sufficient, which allows us to maintain only a few data structures for evaluating the sketches.

This technique allows us to recover all elements from the symmetric difference between $S_{i-1}$ and $S_{i-1} + w_i$, in poly-logarithmic amortized time per element. Observing that half of the elements in this symmetric difference are the newly attainable sums, we discover all of them by spending time which is near-linear in the number of newly attainable sums. More details can be found in Section 4.

Linear sketching was originally developed with applications to streaming algorithms and dimensionality reduction. Recently, it has also emerged as a powerful tool for linear algebra [Woo14], dynamic graph algorithms [AGM12, KKM13], and approximation algorithms [ANOY14]. However, to the best of our knowledge, our algorithm is one of the first applications of linear sketching to obtain fast algorithms for combinatorial problems in an offline setting.

## 2 Preliminaries

We first define the problems of Subset Sum and Modular Subset Sum formally.

###### Definition 2 (Subset Sum).

Given integers $w_1, \dots, w_n$ and a target $t$, decide whether there exists an $S \subseteq [n]$ such that $\sum_{i \in S} w_i = t$.

###### Definition 3 (Modular Subset Sum).

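Given integers $w_1, \dots, w_n \in \mathbb{Z}_m$ and a target $t \in \mathbb{Z}_m$, decide whether there exists an $S \subseteq [n]$ such that $\sum_{i \in S} w_i \equiv t \pmod{m}$.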

We will also need the following notation.

###### Definition 4.

Given $S \subseteq \mathbb{Z}_m$ and $x \in \mathbb{Z}_m$, we denote by $S + x = \{(s + x) \bmod m : s \in S\}$ the operation of shifting set $S$ by $x$.

###### Definition 5.

We denote by $\mathbb{Z}_m^*$ the set of integers in $\mathbb{Z}_m$ that are coprime with $m$. In other words,

$$\mathbb{Z}_m^* = \{x \in \mathbb{Z}_m \mid \gcd(x, m) = 1\}$$

###### Definition 6.

Given a random variable $X$, $x \sim X$ denotes that $x$ is sampled from $X$. Given a random variable $x$ and a set of outcomes $S$, $x \sim_U S$ denotes that $x$ is sampled uniformly at random from $S$.

###### Definition 7.

Given , we denote by the absolute value of .

## 3 A Certificate for Bellman’s algorithm

To illustrate our ideas, we will provide an efficiently verifiable certificate for checking whether a given subset sum is attainable. While certifying attainable subset sums is straightforward, certifying that no subset sums to is more challenging. To achieve this, we provide a certificate that certifies the execution of Bellman’s algorithm. Even though Bellman’s algorithm runs in , we will show that it is possible to certify it in time.

Let and be the set of attainable sums using the first integers. Bellman’s algorithm computes as , where is the -th integer. To certify it more efficiently, the runtime of our algorithm shouldn’t depend on the whole , but rather spend time proportional to . To do this we certify all sets of newly attainable subset sums after the -th number is processed.

Given a collection of sets which are claimed to be , it is straightforward to certify that by checking for all that and . However, certifying that contains all elements in is significantly harder. To perform this verification, suppose we are also given sets , which are claimed to be equal to . Note that requiring knowledge of only doubles the valid certificate size, which directly follows from the claim below:

###### Claim 8.

It is again simple to certify that , by checking for all that and . To check that there are no elements missing from or we use the following claim, which follows from the above discussion.

###### Claim 9.

Given that and , the following statements are equivalent:

• and

By Claim 9, we just need to certify that the vector is the zero vector. A natural way to do this would be to use randomized identity testing.

###### Claim 10.

Given any non-zero vector , we have .

In order to verify that for all the vector is the zero vector the idea is to sample a random vector and compute its inner products with for every . If any is non-zero, this process will detect the inconsistency with constant probability. By repeating, we can amplify the probability of success.
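The identity-testing idea can be checked exhaustively on small examples. The sketch below assumes the standard setting in which the random vector is drawn uniformly from $\{0,1\}^m$ and the tested vector has entries in $\{-1, 0, 1\}$ (the precise statement of Claim 10 is not recoverable from the text above, so this setting is an assumption). The helper name `detection_probability` is hypothetical.

```python
from itertools import product


def detection_probability(v):
    """Exact fraction of r in {0,1}^m for which the inner product <v, r> is non-zero."""
    m = len(v)
    hits = sum(
        1
        for r in product((0, 1), repeat=m)
        if sum(vi * ri for vi, ri in zip(v, r)) != 0
    )
    return hits / 2 ** m


# A difference of two characteristic vectors that disagree in two positions.
print(detection_probability([1, -1, 0, 0]))
# The zero vector is never flagged.
print(detection_probability([0, 0, 0]))
```

Because a single trial only detects a non-zero vector with constant probability, independent repetitions are needed to drive the failure probability down, as the paragraph above notes.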

Even though this would suffice, it would not be efficient. The reason for this is that we can’t afford to explicitly keep the characteristic vectors. Instead, we will directly maintain the inner products by using an appropriate data structure. Our data structure should allow us to compute efficiently for any given . However, for a random it seems necessary that such a data structure would need to spend significant time to re-compute the inner product for any such , when we move from to . To alleviate this issue, instead of using uniformly random vectors, we will choose a pseudo-random family that is more amenable to shifting. In particular, we will use the vector for random .

###### Definition 11 (Pseudo-random distribution D).

We define the distribution of vectors
where
,
,
.

We show that despite its limited randomness, this distribution can still be used for identity testing with a slightly smaller success probability, which can again be amplified through repetition.

###### Lemma 12.

Given any non-zero vector , we have .

The simplicity of the family of random vectors that we use allows us to efficiently update and compute inner products with characteristic vectors of the form . Notice that defines a permutation of the indices since . The computation below shows the effect of shifting by .

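$$
\begin{aligned}
\langle 1_{S+w}, r \rangle
&= \sum_{j=0}^{m-1} (1_{S+w})_j \, r_j
 = \sum_{j=0}^{m-1} 1_{j \in S+w} \, 1_{aj+b \in [0,c]}
 = \sum_{j=0}^{m-1} 1_{j \in S} \, 1_{a(j+w)+b \in [0,c]} \\
&= \sum_{j=0}^{m-1} 1_{j \in S} \, 1_{aj \in [-b-aw,\, -b+c-aw]}
 = \sum_{j=0}^{m-1} 1_{j \in aS} \, 1_{j \in [-b-aw,\, -b+c-aw]} \\
&= \sum_{j \in [-b-aw,\, -b+c-aw]} 1_{j \in aS}
\end{aligned}
$$

(Note that all the operations are in $\mathbb{Z}_m$.)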

From the above it becomes clear that the required inner product is equivalent to computing the number of elements of the set that lie in a specified interval. To be able to efficiently compute these interval sums we use a data structure that allows insertion of elements and range queries in logarithmic time. Such a data structure can be implemented using a Binary Search Tree. In order to update this data structure, one needs to insert all the “permuted” newly attainable subset sums at step , i.e. .
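The reduction from an inner product to a range count can be illustrated as follows. This is a sketch under stated assumptions: the pseudo-random vector is taken to be $r_j = 1$ exactly when $(aj + b) \bmod m \in [0, c]$, as in the derivation above, and a plain sorted Python list with `bisect` stands in for the balanced Binary Search Tree (so insertions here are not logarithmic). All function names are hypothetical.

```python
from bisect import bisect_left, bisect_right


def cyclic_count(sorted_vals, lo, hi, m):
    """Count values in the cyclic interval [lo, hi] of Z_m (requires hi - lo < m)."""
    lo_m = lo % m
    hi_m = lo_m + (hi - lo)
    if hi_m < m:  # interval does not wrap around
        return bisect_right(sorted_vals, hi_m) - bisect_left(sorted_vals, lo_m)
    # interval wraps: [lo_m, m-1] together with [0, hi_m - m]
    return (len(sorted_vals) - bisect_left(sorted_vals, lo_m)) + bisect_right(
        sorted_vals, hi_m - m
    )


def inner_product_direct(S, w, a, b, c, m):
    """<1_{S+w}, r> computed entry by entry from the definition of r."""
    return sum(1 for s in S if (a * ((s + w) % m) + b) % m <= c)


def inner_product_via_ranges(S, w, a, b, c, m):
    """Same value, computed as a range count over the permuted set aS."""
    permuted = sorted((a * s) % m for s in S)  # the set aS, kept in sorted order
    lo = -b - a * w
    return cyclic_count(permuted, lo, lo + c, m)


S, m = {0, 1, 5, 8}, 12
print(inner_product_direct(S, 3, 5, 2, 4, m), inner_product_via_ranges(S, 3, 5, 2, 4, m))
```

The point of the second form is that the sorted structure over $aS$ does not depend on the shift $w$: a new shift only moves the query interval.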

We will perform queries (one for and one for , for every step ), each taking time. We will also perform at most insertions (one for every distinct subset sum), each taking time. Thus the overall runtime is and succeeds with probability at least , which can easily be amplified by repetition.

## 4 From Certificate to Algorithm via Linear Sketching

So far we have seen how to efficiently check if when provided a certificate listing the elements of and . If the set only lists a strict subset of the elements of , our method would efficiently detect that. In order to make the process constructive, we want to be able to identify a missing element in such cases. This way we can start from and continue growing the set until we recover all elements. We would work similarly for recovering . The main property that enables us to do so is summarized in the following claim which is an extension of Claim 9.

###### Claim 13.

Given that and , the indices of the positive non-zero entries of the vector correspond to the elements missing from , i.e. , while the indices of the negative non-zero entries correspond to the elements missing from .

Linear sketching allows us to go beyond testing whether a vector is zero and identify the index of a non-zero element through inner products with carefully constructed random vectors. Consider a vector $v$ that only has a single non-zero entry at position $i$. One way to find its index is by multiplying by the all-ones vector to obtain $v_i$, and multiplying with the vector $(0, 1, \dots, m-1)$ to obtain $i \cdot v_i$. Then, $i$ can be found by dividing the two values, $i = (i \cdot v_i) / v_i$. For an arbitrary vector containing $k$ non-zero entries, the same idea can be applied after randomly subsampling entries with probability $1/k$ to isolate a single non-zero entry. For a randomly sampled set $R$, this corresponds to inner products with the indicator vector of $R$ and with the vector $(0, 1, \dots, m-1)$ where all entries outside $R$ are zeroed out. Linear sketching does not require knowledge of the sparsity parameter $k$ but constructs these random sets with different subsampling probabilities.
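The two-measurement recovery step can be sketched directly. The code below is a minimal illustration, assuming the vector is stored as a Python list and the sampled index set is given explicitly as a set; the names `sketch` and `recover_index` are hypothetical, and the final check (evaluating the vector at the candidate index) is included because a set that contains several non-zeros can produce a meaningless quotient.

```python
def sketch(v, R):
    """Return (sum of v_i, sum of i * v_i) over the indices i in R."""
    s0 = sum(v[i] for i in R)
    s1 = sum(i * v[i] for i in R)
    return s0, s1


def recover_index(v, R):
    """Index of a non-zero entry of v inside R, or None if the sketch is unusable."""
    s0, s1 = sketch(v, R)
    if s0 == 0 or s1 % s0 != 0:
        return None
    i = s1 // s0  # candidate index: (i * v_i) / v_i
    # Validate the candidate by looking at the vector itself.
    if i in R and v[i] != 0:
        return i
    return None


v = [0, 0, 0, -1, 0, 0, 1, 0]
print(recover_index(v, {1, 2, 3, 4}))  # R isolates the entry at index 3
print(recover_index(v, {3, 6}))        # two non-zeros cancel, nothing is recovered
```

When the sampled set holds more than one non-zero entry the sketch may fail, which is why several sets with different sampling rates are tried.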

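For our purposes, we show that the constructed sets need not be perfectly random; it suffices for them to be pseudo-random. We will use the distribution of sets given by Definition 11. Such a set has the form $\{i : ai \in [l, r]\}$ for some interval $[l, r]$. We define our sketch for some vector $v$ to be

$$\mathrm{sketch}_a(v, [l, r]) = \sum_{i \,:\, ai \in [l, r]} \begin{pmatrix} v_i \\ i \, v_i \end{pmatrix}$$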

We show that this sketch recovers a non-zero entry of with non-trivial probability.

###### Lemma 14.

Given any non-zero vector , we have .

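Notice that again the sketch of a vector $v$ shifted by $w$, denoted as $v + w$, can still be written in terms of a sketch of the original vector.

$$
\begin{aligned}
\mathrm{sketch}_a(v + w, [l, r])
&= \sum_{i \,:\, ai \in [l, r]} \begin{pmatrix} v_{i-w} \\ i \, v_{i-w} \end{pmatrix}
 = \sum_{i \,:\, ai \in [l-aw,\, r-aw]} \begin{pmatrix} v_i \\ (i+w) \, v_i \end{pmatrix}
 = \sum_{i \,:\, ai \in [l-aw,\, r-aw]} \begin{pmatrix} 1 & 0 \\ w & 1 \end{pmatrix} \begin{pmatrix} v_i \\ i \, v_i \end{pmatrix} \\
&= \begin{pmatrix} 1 & 0 \\ w & 1 \end{pmatrix} \mathrm{sketch}_a(v, [l-aw,\, r-aw])
\end{aligned}
$$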
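The shift identity can be verified numerically on a small instance. The sketch below assumes that the sketch of a vector sums the pairs $(v_i, i \cdot v_i)$ over indices $i$ with $ai$ in a cyclic interval of $\mathbb{Z}_m$, and that the index-weighted component is reduced modulo $m$; both conventions are assumptions made for this illustration, as are the function names.

```python
def in_cyclic_interval(x, l, r, m):
    """True iff x lies in the cyclic interval [l, r] of Z_m (requires r - l < m)."""
    return (x - l) % m <= (r - l)


def sketch_a(v, a, l, r):
    """Pair (sum of v_i, sum of i * v_i mod m) over indices i with a*i in [l, r]."""
    m = len(v)
    s0 = s1 = 0
    for i, vi in enumerate(v):
        if vi != 0 and in_cyclic_interval(a * i, l, r, m):
            s0 += vi
            s1 = (s1 + i * vi) % m
    return s0, s1


def shift(v, w):
    """The vector v + w: its entry at index i equals v[i - w], indices taken mod m."""
    m = len(v)
    return [v[(i - w) % m] for i in range(m)]


def sketch_of_shift(v, a, w, l, r):
    """Sketch of v + w obtained only from a sketch of v on the shifted interval."""
    m = len(v)
    s0, s1 = sketch_a(v, a, l - a * w, r - a * w)
    return s0, (w * s0 + s1) % m  # apply the matrix [[1, 0], [w, 1]]


v = [0, 1, 0, 0, -1, 0, 1, 0, 0, 0, 0]
print(sketch_a(shift(v, 4), 3, 2, 6), sketch_of_shift(v, 3, 4, 2, 6))
```

In an actual data structure only the query on the unshifted vector would be answered from stored state; the shifted vector is built explicitly here purely to check the identity.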

Similar to Section 3, we can build a data structure for a given parameter $a$ to efficiently compute the sketch $\mathrm{sketch}_a(v, [l, r])$ for any range $[l, r]$. See further details in Appendix B.

#### Recovering multiple non-zeros

Our discussion so far has focused on identifying a single non-zero entry of the vector $v$. However, once a sketch is used to find a single non-zero, it won’t give any additional non-zero entries. To recover more entries, we need a new sketch with fresh randomness which can be expensive to compute and maintain. Computing the sketch for different parameters $a$ would require rebuilding a new data structure from scratch. On the other hand, computing the sketch for a different interval $[l, r]$ but the same parameter $a$ can be efficiently performed using a single data structure.

We show that this is sufficient to recover a constant fraction of the non-zeros. For a given parameter $a$, we can compute sketches for disjoint windows, each yielding a different index of a non-zero entry if the corresponding sketch is valid.

###### Definition 15 (Valid sketch).

is valid if it identifies an index for a non-zero element in the corresponding set where .

We show that for a vector with non-zeros, computing sketches for all windows and so on, with a window size , yields at least half of the non-zero elements of in expectation.

###### Lemma 16.

Given any non-zero vector with non-zero entries. For any , it holds that

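$$\mathbb{E}_{a \sim_U \mathbb{Z}_m^*}\Big[\#\big\{j \in \{1, \dots, \lceil m/\ell \rceil\} : \mathrm{sketch}_a(v, [(j-1)\ell,\, j\ell - 1]) \text{ is valid}\big\}\Big] \ge \frac{k}{2}$$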

By Markov’s inequality, this means that a single data structure is sufficient to recover a constant fraction of the non-zero entries with constant probability. Thus, with data structures to compute for different values of , we can find all non-zeros with high probability.
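The quantity bounded in Lemma 16 can be explored on small instances. The sketch below counts windows that contain exactly one permuted non-zero position, which is used here as a simple sufficient condition for a window's sketch to be valid (an assumption of this illustration, not the paper's definition). It then averages that count over all multipliers in $\mathbb{Z}_m^*$; the function names are hypothetical.

```python
from math import gcd


def isolated_windows(nonzeros, a, ell, m):
    """Number of windows [(j-1)*ell, j*ell - 1] holding exactly one value a*i mod m."""
    counts = {}
    for i in nonzeros:
        j = ((a * i) % m) // ell  # index of the window that position a*i falls into
        counts[j] = counts.get(j, 0) + 1
    return sum(1 for c in counts.values() if c == 1)


def average_isolated(nonzeros, ell, m):
    """Exact average of isolated_windows over a drawn uniformly from Z_m^*."""
    units = [a for a in range(1, m) if gcd(a, m) == 1]
    return sum(isolated_windows(nonzeros, a, ell, m) for a in units) / len(units)


nonzeros, m = {2, 5, 9, 14, 30, 41}, 101
for ell in (1, 4, 16, 64):
    print(ell, average_isolated(nonzeros, ell, m))
```

With window size 1 every non-zero is isolated but there are $m$ windows to scan; larger windows reduce the scanning cost while isolating fewer entries, which is the trade-off behind the choice of window size.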

Even though by Lemma 16 we can take any window size , the time needed to iterate over all windows is and so we need to pick an , so that we only spend time proportional to the sparsity of .

#### Estimating the window size

As mentioned before, we need to identify an appropriate window size $\ell$, in order to efficiently recover a constant fraction of non-zeros. If we pick a window size that is too large, we will recover few or none. On the other hand, if the window size is too small, the time spent iterating over windows will be much larger than $k$. To identify an appropriate window size, we start with and keep doubling until we find an $\ell$ for which a significant fraction of its windows yield valid sketches. We can estimate this fraction of windows that yield valid sketches through sampling.

## 5 Main Algorithm

To describe our algorithm, we will denote by an initially empty data structure that can efficiently maintain a vector , and allows changing entries of and computing for any given interval . Given such a data structure , we denote by this sketch for the corresponding vector . Both of these operations take time.

We will use different parameters for . For each such , we keep a data structure which is initialized as and maintains vector for every iteration .

Furthermore, we will maintain a set , which keeps the elements of and with their corresponding sign. For each , and are initialized as and grow as more elements are discovered, until and respectively. To be able to efficiently compute sketches of , we keep a data structure which is initialized as and maintains vector for every iteration and parameter with .

### 5.1 Finding Non-Zero Elements

In this section we describe the procedure that finds a non-zero element of the vector for a given window while is being considered. The first line of Algorithm 2 efficiently computes the value of as

 sketcha(1Si−1+wi,[l−aw,r−aw])−sketcha(1Si,[l,r])−sketcha(1Ni+−1Ni−,[l,r])

using the available data structures. Line 3 finds the index of the non-zero entry, assuming that the sketch is valid. Line 4 evaluates the vector at the found index and Line 5 checks that the sketch was indeed valid.

### 5.2 Estimating the Window size

To identify an appropriate window size , we start with and keep doubling until we find an for which a significant fraction of its windows yield valid sketches. At every step, we estimate the fraction of windows that yield valid sketches by randomly sampling a window and checking whether the corresponding sketch is valid using Algorithm 2.

### 5.3 Analysis

The goal of this section is to analyze the correctness and running time of Algorithm 1, as summarized in the following theorem:

###### Theorem 17.

Algorithm 1 runs in time and returns the set of attainable subset sums with high probability.

Recall that our goal is to recover all non-zero entries of . Given the current sets and we do that by identifying non-zero entries of the vector

$$v_i = 1_{N_i^-} + 1_{S_{i-1} + w_i} - 1_{N_i^+} - 1_{S_{i-1}}$$

By Lemma 16 we know that for any it holds that

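$$\mathbb{E}_{a \sim_U \mathbb{Z}_m^*}\Big[\#\big\{j \in \{1, \dots, \lceil m/\ell \rceil\} : \mathrm{sketch}_a(v_i, [(j-1)\ell,\, j\ell - 1]) \text{ is valid}\big\}\Big] \ge \frac{k}{2}$$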

In particular, there exists an with for which the above holds and is a power of . By applying Markov’s inequality we get that

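$$\Pr_{a \sim_U \mathbb{Z}_m^*}\Big[\#\big\{j \in \{1, \dots, \lceil m/\ell^* \rceil\} : \mathrm{sketch}_a(v_i, [(j-1)\ell^*,\, j\ell^* - 1]) \text{ is valid}\big\} \ge \frac{k}{4}\Big] \ge \frac{1}{100 \log\log m}$$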

Whenever this happens for a chosen $a$, we call the parameter $a$ helpful for vector $v_i$.

We argue that if a helpful is chosen, EstimateWindowSize will return an that recovers at least of the non-zero entries of . This follows from the two lemmas below:

###### Lemma 18.

If EstimateWindowSize does not return , it will return an with at least valid sketches with high probability.

###### Lemma 19.

If the chosen parameter is helpful for , then EstimateWindowSize will return with high probability.

The two lemmas imply that if the chosen parameter is helpful for vector , EstimateWindowSize will return an with at least valid sketches with high probability.

Since is helpful with probability at least , among the data structures that we create with different parameters , there will be at least helpful data structures with high probability. This means that with probability , all elements and will be recovered. The correctness follows by taking union bound over all ’s.

The runtime of the algorithm is . Calls to EstimateWindowSize and FindNonZero only take time. The number of calls to FindNonZero depends on the window size returned by EstimateWindowSize at every iteration. Lemma 18, implies that whenever EstimateWindowSize does not return , it returns an has at least valid sketches. This means that even though calls to FindNonZero are made, the number of such calls can be upper-bounded by times valid sketches. There will be at most valid sketches throughout the execution of the algorithm which yields the claimed bound on the runtime.

## 6 Discussion

#### On the optimality of our algorithm

The work of [ABHS17] shows that the non-Modular Subset Sum problem cannot be solved in time for any constant assuming the Strong Exponential Time Hypothesis (SETH). This lower bound also implies that the Modular Subset Sum problem cannot be solved in time . Indeed, suppose that there exists an time algorithm for the Modular Subset Sum problem. Given an instance of the non-modular version of the problem, we set and run the algorithm for the modular version. This solves the non-modular problem since we can assume that all given integers satisfy without loss of generality. We get time algorithm for the non-modular version of the problem, which contradicts SETH (by [ABHS17]). Thus our algorithm is essentially optimal.

#### Runtime independent of n

If the elements are succinctly described by listing the multiplicities of all elements in $\mathbb{Z}_m$, the runtime of the algorithm can be made $\tilde{O}(m)$, independent of $n$. This is because once a given weight does not produce any new subset sums, we can ignore all remaining elements with the same weight. Thus, the number of elements we will consider is $O(m)$: the number of elements that produce a new subset sum is at most $m$, and there will be at most $m$ times where a new element does not produce any new subset sums.
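The early-stopping argument can be illustrated on top of the plain dynamic program (not the sketching algorithm itself). The sketch below assumes the succinct input is a dictionary mapping each weight to its multiplicity; the function name and the returned step counter are hypothetical additions for illustration.

```python
def subset_sums_with_multiplicities(multiset, m):
    """multiset maps weight -> multiplicity; returns (attainable sums mod m, copies processed)."""
    attainable = {0}
    steps = 0
    for w, count in multiset.items():
        for _ in range(count):
            steps += 1
            new = {(s + w) % m for s in attainable} - attainable
            if not new:
                # S + w is contained in S, so S is closed under adding w:
                # every further copy of w would also add nothing.
                break
            attainable |= new
    return attainable, steps


sums, steps = subset_sums_with_multiplicities({3: 10**6, 5: 10**6}, 7)
print(sorted(sums), steps)
```

Although the input describes two million elements, only a handful of copies are processed: each processed copy either adds a new sum (at most $m$ times overall) or ends the loop for its weight (at most once per distinct weight).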

#### Extension to higher dimensions

Our algorithm can be easily extended to solve higher dimensional generalizations of Modular Subset Sum:

Given a sequence of vectors and a vector , does there exist a subset of such that ?

Our algorithm can list all attainable subset sums in time . Instead of using a data structure to compute interval sums and evaluate sketches, it needs to use a data structure supporting high dimensional range queries.

#### Applications to (non-Modular) Subset Sum

The (non-Modular) Subset Sum problem can be seen as a special case of Modular Subset Sum, by setting $m$ larger than $\sigma$, where $\sigma$ is the sum of all input numbers. Our algorithm implies an $\tilde{O}(\sigma)$ algorithm for this problem which matches the runtime achieved by Koiliaris and Xu [KX17], but interestingly does so without using FFT.
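The reduction itself is independent of how the modular problem is solved, so it can be shown with the textbook dynamic program standing in for the sketching algorithm. In the sketch below the modulus is chosen as the total sum plus one, which is one convenient choice (an assumption of this illustration) that guarantees no subset sum wraps around.

```python
def subset_sums_via_modular(weights):
    """All (non-modular) subset sums, obtained by working modulo a large enough m."""
    m = sum(weights) + 1  # assumed choice: any modulus exceeding the total sum works
    attainable = {0}
    for w in weights:
        # No sum can reach m, so reduction mod m never changes a value.
        attainable |= {(s + w) % m for s in attainable}
    return attainable


print(sorted(subset_sums_via_modular([1, 4, 6])))
```

Because the modulus is proportional to the sum of the inputs, any algorithm that is near-linear in $m$ becomes near-linear in that sum under this reduction.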

An interesting open question is what the best possible runtime for Subset Sum is. Even though the best known algorithm for Subset Sum [Bri17] running in $\tilde{O}(n + t)$ is optimal in its dependence on the target sum $t$ assuming the Strong Exponential Time Hypothesis, it is possible that an $\tilde{O}(n + w_{\max})$ algorithm exists, where $w_{\max}$ is the largest of the given integers. Such an algorithm would not contradict any of the known conditional lower bounds.

###### Open question.

Is there an $\tilde{O}(n + w_{\max})$ time algorithm for the non-Modular Subset Sum problem, where $w_{\max}$ is the largest of the given integers?

It is known that when elements are bounded by , any instance of Subset Sum can be reduced to an instance where the target is in and all elements are possibly negative and lie in the range . This follows from [EW18] using Steinitz Lemma or from [Pis99] by the process of Balanced Fillings. In addition, for some ordering of the elements, any prefix of the optimal subset also lies within the same window . This is quite similar to Modular Subset Sum as one only needs to keep track of elements within an interval of size . It is an interesting question whether the ideas from linear sketching can be used to efficiently simulate Bellman’s algorithm in such a setting.

## References

• [ABHS17] Amir Abboud, Karl Bringmann, Danny Hermelin, and Dvir Shabtay. SETH-based lower bounds for subset sum and bicriteria path. arXiv preprint arXiv:1704.04546, 2017.
• [AGM12] Kook Jin Ahn, Sudipto Guha, and Andrew McGregor. Analyzing graph structure via linear measurements. In Proceedings of the twenty-third annual ACM-SIAM symposium on Discrete Algorithms, pages 459–467. Society for Industrial and Applied Mathematics, 2012.
• [Alo87] Noga Alon. Subset sums. Journal of Number Theory, 27(2):196–205, 1987.
• [ALW14] Amir Abboud, Kevin Lewi, and Ryan Williams. Losing weight by gaining edges. In European Symposium on Algorithms, pages 1–12. Springer, 2014.
• [ANOY14] Alexandr Andoni, Aleksandar Nikolov, Krzysztof Onak, and Grigory Yaroslavtsev. Parallel algorithms for geometric graph problems. In Proceedings of the forty-sixth annual ACM symposium on Theory of computing, pages 574–583. ACM, 2014.
• [AT18] Kyriakos Axiotis and Christos Tzamos. Capacitated dynamic programming: Faster knapsack and graph algorithms. arXiv preprint arXiv:1802.06440, 2018.
• [Bel57] Richard Bellman. Dynamic programming (dp). 1957.
• [BHSS18] MohammadHossein Bateni, MohammadTaghi HajiAghayi, Saeed Seddighin, and Clifford Stein. Fast algorithms for knapsack via convolution and prediction. STOC, 2018.
• [Bri17] Karl Bringmann. A near-linear pseudopolynomial time algorithm for subset sum. In Proceedings of the Twenty-Eighth Annual ACM-SIAM Symposium on Discrete Algorithms, pages 1073–1084. Society for Industrial and Applied Mathematics, 2017.
• [CDL16] Marek Cygan, Holger Dell, Daniel Lokshtanov, Dániel Marx, Jesper Nederlof, Yoshio Okamoto, Ramamohan Paturi, Saket Saurabh, and Magnus Wahlström. On problems as hard as CNF-SAT. ACM Transactions on Algorithms (TALG), 12(3):41, 2016.
• [CGI16] Marco L Carmosino, Jiawei Gao, Russell Impagliazzo, Ivan Mihajlin, Ramamohan Paturi, and Stefan Schneider. Nondeterministic extensions of the strong exponential time hypothesis and consequences for non-reducibility. In Proceedings of the 2016 ACM Conference on Innovations in Theoretical Computer Science, pages 261–270. ACM, 2016.
• [CIO16] Jean Cardinal, John Iacono, and Aurélien Ooms. Solving k-sum using few linear queries. In 24th Annual European Symposium on Algorithms, ESA 2016. Schloss Dagstuhl-Leibniz-Zentrum fur Informatik GmbH, Dagstuhl Publishing, 2016.
• [EGZ61] Paul Erdős, Abraham Ginzburg, and Abraham Ziv. Theorem in the additive number theory. Bull. Res. Council Israel F, 10:41–43, 1961.
• [ES16] Esther Ezra and Micha Sharir. The decision tree complexity for k-sum is at most nearly quadratic. arXiv preprint arXiv:1607.04336, 2016.
• [EW18] Friedrich Eisenbrand and Robert Weismantel. Proximity results and faster algorithms for integer programming using the steinitz lemma. In Proceedings of the Twenty-Ninth Annual ACM-SIAM Symposium on Discrete Algorithms, pages 808–816. SIAM, 2018.
• [Fre77] Rusins Freivalds. Probabilistic machines can use less running time. In IFIP congress, volume 839, page 842, 1977.
• [HLS08] YO Hamidoune, AS Lladó, and O Serra. On complete subsets of the cyclic group. Journal of Combinatorial Theory, Series A, 115:1279–1285, 2008.
• [Kar72] Richard M Karp. Reducibility among combinatorial problems. In Complexity of computer computations, pages 85–103. Springer, 1972.
• [KKM13] Bruce M Kapron, Valerie King, and Ben Mountjoy. Dynamic graph connectivity in polylogarithmic worst case time. In Proceedings of the twenty-fourth annual ACM-SIAM symposium on Discrete algorithms, pages 1131–1142. Society for Industrial and Applied Mathematics, 2013.
• [KLM18] Daniel M Kane, Shachar Lovett, and Shay Moran. Near-optimal linear decision trees for k-sum and related problems. In Proceedings of the 50th Annual ACM SIGACT Symposium on Theory of Computing, pages 554–563. ACM, 2018.
• [KX17] Konstantinos Koiliaris and Chao Xu. A faster pseudopolynomial time algorithm for subset sum. In Proceedings of the Twenty-Eighth Annual ACM-SIAM Symposium on Discrete Algorithms, pages 1062–1072. SIAM, 2017.
• [LN10] Daniel Lokshtanov and Jesper Nederlof. Saving space by algebraization. In Proceedings of the forty-second ACM symposium on Theory of computing, pages 321–330. ACM, 2010.
• [MadH84] Friedhelm Meyer auf der Heide. A polynomial linear search algorithm for the n-dimensional knapsack problem. Journal of the ACM (JACM), 31(3):668–676, 1984.
• [Ols68] John E Olson. An addition theorem modulo p. Journal of Combinatorial Theory, 5(1):45–52, 1968.
• [Ols75] John Olson. Sums of sets of group elements. Acta Arithmetica, 2(28):147–156, 1975.
• [Pfe99] Ulrich Pferschy. Dynamic programming revisited: Improving knapsack algorithms. Computing, 63(4):419–430, 1999.
• [Pis99] David Pisinger. Linear time algorithms for knapsack problems with bounded weights. Journal of Algorithms, 33(1):1–14, 1999.
• [Pis03] David Pisinger. Dynamic programming on the word ram. Algorithmica, 35(2):128–145, 2003.
• [Sze70] Endre Szemerédi. On a conjecture of Erdős and Heilbronn. Acta Arithmetica, 17(3):227–229, 1970.
• [Vu08] Van Vu. A structural approach to subset-sum problems. In Building Bridges, pages 525–545. Springer, 2008.
• [Wil16] Richard Ryan Williams. Strong ETH breaks with Merlin and Arthur: Short non-interactive proofs of batch evaluation. In 31st Conference on Computational Complexity, 2016.
• [Woo14] David P Woodruff. Sketching as a tool for numerical linear algebra. Foundations and Trends® in Theoretical Computer Science, 10(1–2):1–157, 2014.

## Appendix A Missing proofs

### A.1 Pseudo-random family

The goal of this section is to prove Lemmas 12, 14, and 16. We will first prove the following Lemma, which bounds the probability that a specific non-zero element falls within the same window with some other non-zero.

###### Lemma 20.

Let be a vector in with exactly non-zero elements. Then ,

 Pra∼UZ∗m[∃ j≠i :vj≠0,∣∣(ai−aj) mod m∣∣
###### Proof.

First of all, note that

 Pra∼UZ∗m[∃ j≠i :vj≠0,∣∣(ai−aj) mod m∣∣

for a fixed . Now, if , we can write and , where . But then we get

 Pra∼UZ∗m[|at mod m|

and so we can assume wlog that . Furthermore, note that in this case , and so wlog we can assume that . So it is enough to bound

 Pra∼UZ∗m[|a|

Finally, we get

 Pra∼UZ∗m[∃ j≠i :vj≠0,∣∣(ai−aj) mod m∣∣

Proof of Lemma 12.

Let $S$ be the set of coordinates of non-zero entries of $v$ and let $k$ be its size. We argue that the set $R$ for parameters drawn according to the distribution of Definition 11 will contain a single element from $S$ with non-trivial probability.

$$\Pr_{R \sim D}\big[|S \cap R| = 1\big] = \sum_{i \in S} \Pr[i \in R] \cdot \Pr\big[j \notin R \text{ for all } j \in S \setminus \{i\} \,\big|\, i \in R\big]$$

With probability the parameter is selected so that . In that case, we have that .

Moreover, by Lemma 20, we have that

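$$\Pr\big[j \notin R \text{ for all } j \in S \setminus \{i\} \,\big|\, i \in R, c\big] \ge \Pr_{a \sim_U \mathbb{Z}_m^*}\Big[\forall\, j \ne i : v_j \ne 0,\ \big|(ai - aj) \bmod m\big| > \frac{m}{10 k \log\log m}\Big] > \frac{1}{2}$$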

Thus overall,

$$\Pr_{R \sim D}\big[|S \cap R| = 1\big] \ge \frac{1}{O(\log m)} \cdot \frac{1}{40 \log\log m} \ge \frac{1}{\tilde{O}(\log m)}$$

Proof of Lemma 14. The result follows by the proof of Lemma 12 which shows that with probability at least a random vector drawn from distribution will isolate a single non-zero entry of . In that case the sketch will give its index correctly.

Proof of Lemma 16.

To prove that

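$$\mathbb{E}_{a \sim_U \mathbb{Z}_m^*}\Big[\#\big\{j \in \{1, \dots, \lceil m/\ell \rceil\} : \mathrm{sketch}_a(v, [(j-1)\ell,\, j\ell - 1]) \text{ is valid}\big\}\Big] \ge \frac{k}{2}$$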

it suffices to lower bound the probability that a given element is unique in its window of size .

 Ea∼UZ∗m[#{j∈{1,…,⌈m/ℓ⌉} : sketcha(v,[(j−1)ℓ,jℓ−1]) is valid}] ≥∑i∈Zm,vi≠0Pra∼UZ∗m[∄ j≠i :vj≠0,∣∣(ai−aj) mod m∣∣<ℓ] ≥∑i∈Zm,vi≠0Pra∼UZ∗m[∄ j≠i :vj≠0,∣∣(ai−aj) mod m∣∣

where the last inequality follows by Lemma 20.

### A.2 Window size estimation

Given parameter and a window size , let