Subset Sum Made Simple

07/22/2018 ∙ by Konstantinos Koiliaris, et al. ∙ University of Illinois at Urbana-Champaign ∙ Oath Inc.

Subset Sum is a classical optimization problem taught to undergraduates as an example of an NP-hard problem that is amenable to dynamic programming, yielding polynomial running time if the input numbers are relatively small. Formally, given a set S of n positive integers and a target integer t, the Subset Sum problem is to decide if there is a subset of S that sums up to t. Dynamic programming yields an algorithm with running time O(nt). Recently, the authors [SODA '17] improved the running time to Õ(√(n)t), and it was further improved to Õ(n+t) by a somewhat involved randomized algorithm by Bringmann [SODA '17], where Õ hides polylogarithmic factors. Here, we present a new and significantly simpler algorithm with running time Õ(√(n)t). While not the fastest, we believe the new algorithm and analysis are simple enough to be presented in an algorithms class, as a striking example of a divide-and-conquer algorithm that applies FFT to a problem that seems (at first) unrelated. In particular, the algorithm and its analysis can be described in full detail in two pages (see pages 3-5).

1 Introduction

Given a (multi) set $S$ of $n$ positive integers and an integer target value $t$, the SubsetSum problem is to decide if there is a (multi) subset of $S$ that sums up to $t$. SubsetSum is a classical problem with a relatively long history. It is one of Karp's original NP-complete problems [14], closely related to other fundamental NP-complete problems such as Knapsack [7], Constrained Shortest Path [2], and various other graph problems with cardinality constraints [9, 12, 16]. Furthermore, it is one of the initial weakly NP-complete problems, that is, problems that admit pseudopolynomial time algorithms, a classification identified by Garey and Johnson in [11]. The first such algorithm was given in 1957 by Bellman, who showed how to solve the problem in $O(nt)$ time using dynamic programming [3]. (Note that Bellman wrote this paper before the definition of pseudopolynomial time algorithms was provided by Garey and Johnson in 1977.)
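As a point of reference for the $O(nt)$ bound, the following is a minimal sketch of the textbook dynamic program. The function name `bellman_subset_sum` and the boolean-table layout are illustrative choices, not taken from [3].

```python
def bellman_subset_sum(S, t):
    """Decide whether some subset of S sums to exactly t, in O(n * t) time."""
    # reachable[s] is True when some subset of the items seen so far sums to s.
    reachable = [False] * (t + 1)
    reachable[0] = True  # the empty subset sums to 0
    for x in S:
        # Scan sums downwards so that each item is used at most once.
        for s in range(t, x - 1, -1):
            if reachable[s - x]:
                reachable[s] = True
    return reachable[t]


print(bellman_subset_sum([3, 34, 4, 12, 5, 2], 9))   # 4 + 5 reaches 9
print(bellman_subset_sum([3, 34, 4, 12, 5, 2], 30))  # no subset reaches 30
```

The table has $t + 1$ entries and is scanned once per input element, which is where the $O(nt)$ running time comes from.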

The importance of the SubsetSum problem in computer science is further highlighted by its role in teaching. Both the problem and its algorithm have been included in undergraduate algorithms curricula and textbooks for several decades [6, Chapter 34.5.5], and are used as archetypal examples for introducing the notions of weak NP-completeness and pseudopolynomial time algorithms to college students [15, Chapter 8.8]. In addition, the conceptually simple problem statement makes this problem a great candidate in the study of NP-completeness [8, Chapter 8.1], and Bellman's algorithm is also often introduced in the context of teaching dynamic programming [10, Chapter 5.6].

Extensive work has been done on finding better and faster pseudopolynomial time algorithms for SubsetSum (for a collection of previous results see [17, Table 1.1]). The first improvement on the running time was an $O(nt/\log t)$ time algorithm by Pisinger [18], almost two decades ago. Recently, the state of the art was improved significantly to $\tilde{O}(\sqrt{n}\,t)$ time by the authors [17]. Shortly after, in a follow-up work, the running time was further improved to $\tilde{O}(n + t)$ time by Bringmann [5]; that algorithm is randomized and somewhat involved. Abboud et al. [1] showed that it is unlikely that any SubsetSum algorithm runs in time $O(t^{1-\varepsilon} 2^{o(n)})$, for any constant $\varepsilon > 0$ and target number $t$, as such an algorithm would imply that the Strong Exponential Time Hypothesis (SETH) of Impagliazzo and Paturi [13] is false.

In this paper, we present a new simple algorithm for the SubsetSum problem. The algorithm follows the divide-and-conquer paradigm and uses the Fast Fourier Transform (FFT), matching the best deterministic running time $\tilde{O}(\sqrt{n}\,t)$ of [17] with a cleaner and more straightforward analysis. The algorithm partitions the input by congruence into classes, computes the subset sums of each class recursively, and combines the results. We believe this new simple algorithm, although not improving upon the state of the art, reduces the conceptual complexity of the problem and improves our understanding of it. We believe the new algorithm can be used in teaching as an example of a pseudopolynomial time algorithm for the SubsetSum problem, as well as a striking example of applying FFT to a seemingly unrelated problem.

Comparison to previous work

Our previous algorithm [17] used a more complicated divide-and-conquer strategy that resulted in forming sets of two different types, which had to be handled separately. Bringmann's algorithm [5] uses randomization and a two-stage color-coding process. Both algorithms are significantly more complicated than the one presented here.

2 Preliminaries

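Let $[x..y] = \{x, x+1, \ldots, y\}$ denote the set of integers in the interval $[x, y]$. Given a set $X \subseteq \mathbb{N}$, let $\Sigma_X = \sum_{x \in X} x$, and denote the set of all subset sums of $X$ up to $u$ by

\[ \mathcal{S}_u(X) = \{ \Sigma_Y \mid Y \subseteq X \} \cap [0..u], \]

and the set of all subset sums of $X$ up to $u$ with cardinality information by

\[ \mathcal{S}^{\#}_u(X) = \{ (\Sigma_Y, |Y|) \mid Y \subseteq X \} \cap ([0..u] \times \mathbb{N}). \]

Let $X$, $Y$ be two sets. The set of pairwise sums of $X$ and $Y$ up to $u$ is denoted by

\[ X \oplus_u Y = \{ x + y \mid x \in X \text{ and } y \in Y \} \cap [0..u]. \]

If $X, Y \subseteq [0..u] \times [0..v]$ are sets of points in the plane, then

\[ X \oplus_{u,v} Y = \{ (x_1 + y_1, x_2 + y_2) \mid (x_1, x_2) \in X \text{ and } (y_1, y_2) \in Y \} \cap ([0..u] \times [0..v]). \]

Observe that if $X$ and $Y$ are two disjoint sets, then $\mathcal{S}_u(X \cup Y) = \mathcal{S}_u(X) \oplus_u \mathcal{S}_u(Y)$.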
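To make the notation concrete, here is a small brute-force sketch of the sets defined in this section. It enumerates all subsets, so it is exponential and meant only for checking tiny cases by hand; the function names are illustrative and do not come from the paper.

```python
from itertools import combinations


def subset_sums(X, u):
    """All subset sums of X that are at most u (brute force)."""
    X = list(X)
    return {sum(Y) for r in range(len(X) + 1)
            for Y in combinations(X, r) if sum(Y) <= u}


def subset_sums_card(X, u):
    """All (sum, size) pairs over subsets of X whose sum is at most u."""
    X = list(X)
    return {(sum(Y), len(Y)) for r in range(len(X) + 1)
            for Y in combinations(X, r) if sum(Y) <= u}


def pairwise_sums(A, B, u):
    """All sums a + b with a in A and b in B that are at most u."""
    return {a + b for a in A for b in B if a + b <= u}


# The observation on disjoint sets, checked on one small instance.
X, Y, u = {1, 3}, {4}, 6
lhs = subset_sums(X | Y, u)
rhs = pairwise_sums(subset_sums(X, u), subset_sums(Y, u), u)
print(lhs == rhs, sorted(lhs))
```

The final comparison is a sanity check of the disjoint-union observation on a single input, not a proof of it.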

Next, we define two generalizations of the SubsetSum problem. Both can be solved by the new algorithm.

AllSubsetSums
  INPUT: A set $S$ of $n$ positive integers and an upper bound integer $u$.
  OUTPUT: The set of all realizable subset sums of $S$ up to $u$.

AllSubsetSums#
  INPUT: A set $S$ of $n$ positive integers and an upper bound integer $u$.
  OUTPUT: The set of all realizable subset sums of $S$ up to $u$, along with the size of the subset that realizes each sum.

Figure 1: Two generalizations of the SubsetSum problem.

Note that the case where the input is a multiset can be reduced to the case of a set with little loss in generality and running time (see [17, Section 2.2]), hence for simplicity of exposition we assume the input is a set throughout the paper.

3 The algorithm

Here, we show how to solve AllSubsetSums in $\tilde{O}(\sqrt{n}\,u)$ time. Clearly, computing all subset sums up to $u$ also decides SubsetSum with any target value $t \le u$.

3.1 Building blocks

The following well-known lemma describes how to compute pairwise sums between sets in almost linear time, in the size of their ranges, using FFT.

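Lemma 3.1 (Computing pairwise sums $\oplus$). The following are true:

  (A) Given two sets $S, T \subseteq [0..u]$, one can compute $S \oplus_u T$ in $O(u \log u)$ time.

  (B) Given $k$ sets $S_1, \ldots, S_k \subseteq [0..u]$, one can compute $S_1 \oplus_u \cdots \oplus_u S_k$ in $O(k u \log u)$ time.

  (C) Given two sets of points $S, T \subseteq [0..u] \times [0..v]$, one can compute $S \oplus_{u,v} T$ in $O(uv \log(uv))$ time.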

Proof.

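(A) Let $f_S(x) = \sum_{i \in S} x^i$ be the characteristic polynomial of $S$. Construct, in a similar fashion, the polynomial $f_T$ (for the set $T$) and let $g = f_S \cdot f_T$. Observe that for $i \le u$, the coefficient of $x^i$ in $g$ is nonzero if and only if $i \in S \oplus_u T$. Using FFT, one can compute the polynomial $g$ in $O(u \log u)$ time, and extract $S \oplus_u T$ from it.

(B) Let $T_1 = S_1$, and let $T_i = T_{i-1} \oplus_u S_i$, for $i \in [2..k]$. Compute each $T_i$, from $T_{i-1}$ and $S_i$, in $O(u \log u)$ time using part (A). The total running time is $O(k u \log u)$.

(C) As in (A), let $f_S(x, y) = \sum_{(i,j) \in S} x^i y^j$ and $f_T(x, y) = \sum_{(i,j) \in T} x^i y^j$ be the characteristic polynomials of $S$ and $T$, respectively, and let $g = f_S \cdot f_T$. For $i \le u$ and $j \le v$, the coefficient of $x^i y^j$ in $g$ is nonzero if and only if $(i, j) \in S \oplus_{u,v} T$. One can compute the polynomial $g$ by a straightforward reduction to regular FFT (see multidimensional FFT [4, Chapter 12.8]), in $O(uv \log(uv))$ time, and extract $S \oplus_{u,v} T$ from it. ∎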
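Part (A) of the proof can be sketched in a few lines. The version below assumes a plain recursive radix-2 FFT written from scratch with the standard library only; the names `fft` and `pairwise_sums` are illustrative, and a production implementation would more likely call an optimized FFT routine.

```python
import cmath


def fft(a, invert=False):
    """Recursive radix-2 FFT; len(a) must be a power of two.

    With invert=True the conjugate transform is computed; the caller
    is responsible for dividing the result by len(a).
    """
    n = len(a)
    if n == 1:
        return list(a)
    even = fft(a[0::2], invert)
    odd = fft(a[1::2], invert)
    sign = -1 if invert else 1
    out = [0j] * n
    for k in range(n // 2):
        w = cmath.exp(sign * 2j * cmath.pi * k / n)
        out[k] = even[k] + w * odd[k]
        out[k + n // 2] = even[k] - w * odd[k]
    return out


def pairwise_sums(S, T, u):
    """Compute the pairwise sums of S and T up to u, for S, T within [0..u]."""
    size = 1
    while size < 2 * u + 1:  # the product polynomial has degree at most 2u
        size *= 2
    f_s = [0.0] * size
    f_t = [0.0] * size
    for x in S:
        f_s[x] = 1.0  # coefficient of x^i is 1 exactly when i is in S
    for y in T:
        f_t[y] = 1.0
    prod = fft([p * q for p, q in zip(fft(f_s), fft(f_t))], invert=True)
    # Coefficients are non-negative integers up to rounding error,
    # so anything above 0.5 after normalization is treated as nonzero.
    return {i for i in range(u + 1) if prod[i].real / size > 0.5}


print(sorted(pairwise_sums({1, 3}, {2, 5}, 6)))
```

Because the coefficients of $g$ count the number of ways to write each index as a sum, they are integers, which is what makes the rounding threshold safe for inputs of moderate size.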

The next lemma, originally shown by the authors in [17], shows how to solve AllSubsetSums# quickly; the proof is included for completeness.

Lemma 3.2 (AllSubsetSums# [17]). Let $S \subseteq [0..u]$ be a given set of $n$ elements. One can compute, in $O(nu \log n \log u)$ time, the set $\mathcal{S}^{\#}_u(S)$, which includes all subset sums of $S$ up to $u$ with cardinality information.

Proof.

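Partition $S$ into two sets $S_1$ and $S_2$ of roughly the same size. Compute $\mathcal{S}^{\#}_u(S_1)$ and $\mathcal{S}^{\#}_u(S_2)$ recursively, and observe that $\mathcal{S}^{\#}_u(S_1), \mathcal{S}^{\#}_u(S_2) \subseteq [0..u] \times [0..\lceil n/2 \rceil]$. Finally, note that $\mathcal{S}^{\#}_u(S_1) \oplus_{u,n} \mathcal{S}^{\#}_u(S_2) = \mathcal{S}^{\#}_u(S)$. Applying Lemma 3.1.C yields $\mathcal{S}^{\#}_u(S)$.

The running time follows the recursive formula $T(n) = 2T(n/2) + O(nu \log u)$, since each merge costs $O(nu \log(nu))$ by Lemma 3.1.C and $\log(nu) = O(\log u)$ because $n \le u + 1$. This solves to $O(nu \log n \log u)$, proving the claim. ∎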
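The recursion in this proof can be sketched as follows. To keep the example short, the merge step uses a brute-force set comprehension as a stand-in for the FFT-based Lemma 3.1.C, so the code illustrates the structure of the divide-and-conquer but not its running time. The function names are illustrative.

```python
def pairwise_sums_2d(A, B, u, v):
    """Brute-force stand-in for the 2D pairwise sums of Lemma 3.1.C."""
    return {(a0 + b0, a1 + b1)
            for (a0, a1) in A for (b0, b1) in B
            if a0 + b0 <= u and a1 + b1 <= v}


def all_subset_sums_card(S, u):
    """All (sum, size) pairs over subsets of S whose sum is at most u."""
    S = sorted(S)
    n = len(S)
    if n == 0:
        return {(0, 0)}
    if n == 1:
        x = S[0]
        # Either take nothing, or take the single element if it fits.
        return {(0, 0)} | ({(x, 1)} if x <= u else set())
    half = n // 2
    left = all_subset_sums_card(S[:half], u)
    right = all_subset_sums_card(S[half:], u)
    # The second coordinate (subset size) never exceeds n.
    return pairwise_sums_2d(left, right, u, n)


print(sorted(all_subset_sums_card({1, 2, 4}, 5)))
```

Swapping `pairwise_sums_2d` for a two-dimensional FFT convolution is what gives the bound stated in the lemma.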

Next, we show how to compute the subset sums of elements in a congruence class quickly.

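Lemma 3.3. Let $\ell, b \in \mathbb{N}$, with $\ell < b$. Given a set $S \subseteq \{x \in \mathbb{N} \mid x \equiv \ell \pmod{b}\} \cap [0..u]$ of size $n$, one can compute $\mathcal{S}_u(S)$ in $O((u/b)\, n \log n \log u)$ time.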

Proof.

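An element $x \in S$ can be written as $x = yb + \ell$. Let $Q = \{(x - \ell)/b \mid x \in S\}$. As such, for any subset $X = \{y_1 b + \ell, \ldots, y_j b + \ell\} \subseteq S$ of size $j$, we have that

\[ \Sigma_X = \sum_{i=1}^{j} (y_i b + \ell) = \Bigl( \sum_{i=1}^{j} y_i \Bigr) b + j\ell. \]

In particular, a pair $(z, j) \in \mathcal{S}^{\#}_{u/b}(Q)$ corresponds to a set $Y = \{y_1, \ldots, y_j\} \subseteq Q$ of size $j$, such that $\Sigma_Y = z$. The set $Y$ in turn corresponds to the set $X = \{y_1 b + \ell, \ldots, y_j b + \ell\} \subseteq S$. By the above, the sum of the elements of $X$ is $zb + j\ell$. As such, compute $\mathcal{S}^{\#}_{u/b}(Q)$, using the algorithm of Lemma 3.2 (the AllSubsetSums# lemma above), and return $\{zb + j\ell \mid (z, j) \in \mathcal{S}^{\#}_{u/b}(Q)\} \cap [0..u]$ as the desired result. ∎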
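The reduction in this proof is easy to check on a small input. The sketch below divides out the modulus, computes (sum, size) pairs on the quotients, and maps them back. For brevity, the pairs are computed with a simple incremental dynamic program, which is an assumption made for this example and is not the divide-and-conquer routine with FFT that the lemma relies on; all names are illustrative.

```python
def subset_sums_with_card(Q, u):
    """All (sum, size) pairs over subsets of Q with sum at most u (simple DP)."""
    pairs = {(0, 0)}
    for y in Q:
        # Extend every known pair by the new element, once.
        pairs |= {(z + y, j + 1) for (z, j) in pairs if z + y <= u}
    return pairs


def congruence_class_sums(S, b, ell, u):
    """Subset sums up to u of a set S whose elements are all ell modulo b."""
    assert all(x % b == ell for x in S)
    quotients = [(x - ell) // b for x in S]       # x = y*b + ell
    pairs = subset_sums_with_card(quotients, u // b)
    # A pair (z, j) on the quotients maps back to the sum z*b + j*ell.
    return {z * b + j * ell for (z, j) in pairs if z * b + j * ell <= u}


print(sorted(congruence_class_sums({2, 5, 8}, 3, 2, 12)))
```

The gain comes from the first coordinate: the quotients live in a range of size about $u/b$ instead of $u$, at the cost of tracking the subset size in the second coordinate.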

3.2 Algorithm

The new algorithm partitions the input into sets by congruence. Next, it computes all subset sums of each such set, using AllSubsetSums# on the quotients, and combines the results. The algorithm is depicted in Figures 2 and 3.

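AllSubsetSums#(S, u):
  INPUT: A set $S$ of $n$ positive integers and an upper bound integer $u$.
  OUTPUT: The set of all subset sums with cardinality information of $S$ up to $u$.
  if $S = \{x\}$
      return $\{(0, 0), (x, 1)\}$
  $T :=$ an arbitrary subset of $S$ of size $\lfloor n/2 \rfloor$
  return AllSubsetSums#$(T, u) \oplus_{u,n}$ AllSubsetSums#$(S \setminus T, u)$
Figure 2: The algorithm for the AllSubsetSums# problem, used as a subroutine in Figure 3.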

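AllSubsetSums(S, u):
  INPUT: A set $S$ of $n$ positive integers and an upper bound integer $u$.
  OUTPUT: The set of all realizable subset sums of $S$ up to $u$.
  $b := \lfloor \sqrt{n \log n} \rfloor$
  for $\ell \in [0..b-1]$ do
      $S_\ell := S \cap \{x \in \mathbb{N} \mid x \equiv \ell \pmod{b}\}$
      $Q_\ell := \{(x - \ell)/b \mid x \in S_\ell\}$
      $\mathcal{S}^{\#}_{u/b}(Q_\ell) :=$ AllSubsetSums#$(Q_\ell, \lfloor u/b \rfloor)$
      $R_\ell := \{zb + j\ell \mid (z, j) \in \mathcal{S}^{\#}_{u/b}(Q_\ell)\}$
  return $R_0 \oplus_u \cdots \oplus_u R_{b-1}$
Figure 3: The algorithm for AllSubsetSums.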

3.3 Result

Theorem 3.4 (AllSubsetSums). Let $S \subseteq [0..u]$ be a given set of $n$ elements. One can compute, in $O(\sqrt{n \log n}\, u \log u)$ time, the set $\mathcal{S}_u(S)$, which contains all subset sums of $S$ up to $u$.

Proof.

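Partition $S$ into $b = \lfloor \sqrt{n \log n} \rfloor$ sets $S_\ell = S \cap \{x \in \mathbb{N} \mid x \equiv \ell \pmod{b}\}$, $\ell \in [0..b-1]$, each of $n_\ell$ elements. For each $S_\ell$, compute the set of all subset sums $\mathcal{S}_u(S_\ell)$ in $O((u/b)\, n_\ell \log n_\ell \log u)$ time by Lemma 3.3 (the congruence class lemma above). The time spent to compute all $\mathcal{S}_u(S_\ell)$ is $\sum_\ell O((u/b)\, n_\ell \log n_\ell \log u) = O((u/b)\, n \log n \log u)$. Combining $\mathcal{S}_u(S_0) \oplus_u \cdots \oplus_u \mathcal{S}_u(S_{b-1})$ using Lemma 3.1.B takes $O(bu \log u)$ time. Hence, the total running time is $O((u/b)\, n \log n \log u + bu \log u) = O(\sqrt{n \log n}\, u \log u)$. ∎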
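Putting the pieces together, the overall structure of the algorithm can be sketched as below. Two simplifications are assumed for the sake of a short, dependency-free example: the pairwise sums are computed by brute force rather than by FFT, and the per-class (sum, size) pairs come from a simple dynamic program rather than the recursive routine of Figure 2. The base-2 logarithm and the `max(1, ...)` guard on the number of classes are also choices made for this example. The code therefore shows the congruence partitioning and recombination, not the stated running time.

```python
import math


def pairwise_sums(A, B, u):
    """Brute-force stand-in for the FFT-based pairwise sums up to u."""
    return {a + b for a in A for b in B if a + b <= u}


def subset_sums_with_card(Q, u):
    """All (sum, size) pairs over subsets of Q with sum at most u (simple DP)."""
    pairs = {(0, 0)}
    for y in Q:
        pairs |= {(z + y, j + 1) for (z, j) in pairs if z + y <= u}
    return pairs


def all_subset_sums(S, u):
    """All subset sums of S up to u, via partitioning by congruence class."""
    S = {x for x in S if x <= u}  # larger elements can never be used
    n = len(S)
    # Number of classes, roughly sqrt(n log n); guarded so that b >= 1.
    b = max(1, math.isqrt(int(n * math.log2(n)))) if n > 1 else 1
    result = {0}
    for ell in range(b):
        quotients = [(x - ell) // b for x in S if x % b == ell]
        pairs = subset_sums_with_card(quotients, u // b)
        class_sums = {z * b + j * ell for (z, j) in pairs
                      if z * b + j * ell <= u}
        # Fold the classes together one at a time, as in Lemma 3.1.B.
        result = pairwise_sums(result, class_sums, u)
    return result


print(sorted(all_subset_sums({3, 5, 6}, 11)))
```

The choice of $b$ balances the two terms in the running time: more classes make each class cheaper but make the final combination step more expensive.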

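Since AllSubsetSums is a generalization of SubsetSum, the algorithm of Section 3.2 also decides SubsetSum: run it with $u = t$ and check whether $t$ appears in the output, which takes $\tilde{O}(\sqrt{n}\,t)$ time.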

References