A Faster FPTAS for the Subset-Sums Ratio Problem

03/27/2018 ∙ by Nikolaos Melissinos, et al. ∙ National Technical University of Athens

The Subset-Sums Ratio problem (SSR) is an optimization problem in which, given a set of integers, the goal is to find two subsets such that the ratio of their sums is as close to 1 as possible. In this paper we develop a new FPTAS for the SSR problem which builds on techniques proposed in [D. Nanongkai, Simple FPTAS for the subset-sums ratio problem, Inf. Proc. Lett. 113 (2013)]. One of the key improvements of our scheme is the use of a dynamic programming table in which one dimension represents the difference of the sums of the two subsets. This idea, together with a careful choice of a scaling parameter, yields an FPTAS that is several orders of magnitude faster than the best currently known scheme of [C. Bazgan, M. Santha, Z. Tuza, Efficient approximation algorithms for the Subset-Sums Equality problem, J. Comp. System Sci. 64 (2) (2002)].


1 Introduction

We study the optimization version of the following NP-hard decision problem which, given a set of integers, asks for two subsets of equal sum (in contrast to the Partition problem, the two subsets need not form a partition of the given set):

Equal Sum Subsets problem (ESS).

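Given a set $A = \{a_1, \dots, a_n\}$ of $n$ positive integers, are there two nonempty and disjoint sets $S_1, S_2 \subseteq \{1, \dots, n\}$ such that $\sum_{i \in S_1} a_i = \sum_{j \in S_2} a_j$?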

Our motivation to study the ESS problem and its optimization version comes from the fact that it is a fundamental problem closely related to problems appearing in many scientific areas. Some examples are the Partial Digest problem, which comes from molecular biology (see [2, 3]), the problem of allocating individual goods (see [8]), tournament construction (see [7]), and a variation of the Subset Sum problem, namely the Multiple Integrated Sets SSP, which finds applications in the field of cryptography (see [10]).

The ESS problem has been proven NP-hard by Woeginger and Yu in [11] and several of its variations have been proven NP-hard by Cieliebak et al. in [4, 5, 6]. The corresponding optimization problem is:

Subset-Sums Ratio problem (SSR).

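Given a set $A = \{a_1, \dots, a_n\}$ of $n$ positive integers, find two nonempty and disjoint sets $S_1, S_2 \subseteq \{1, \dots, n\}$ that minimize the ratio $\max\{\sum_{i \in S_1} a_i, \sum_{j \in S_2} a_j\} / \min\{\sum_{i \in S_1} a_i, \sum_{j \in S_2} a_j\}$.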
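To make the objective concrete, the following is a minimal exhaustive sketch, not the scheme developed in this paper. It assumes the ratio is the larger of the two subset sums divided by the smaller one; the function name `brute_force_ssr` and the 0-based indexing are illustrative choices. It runs in exponential time and is only meant for checking tiny instances.

```python
from fractions import Fraction
from itertools import product


def brute_force_ssr(a):
    """Return (ratio, S1, S2) minimizing max(sum)/min(sum) over nonempty disjoint index sets."""
    best = None
    # Label every element 0 (unused), 1 (placed in S1) or 2 (placed in S2).
    for labels in product((0, 1, 2), repeat=len(a)):
        s1 = [i for i, lab in enumerate(labels) if lab == 1]
        s2 = [i for i, lab in enumerate(labels) if lab == 2]
        if not s1 or not s2:
            continue  # both sets must be nonempty
        x = sum(a[i] for i in s1)
        y = sum(a[i] for i in s2)
        ratio = Fraction(max(x, y), min(x, y))
        if best is None or ratio < best[0]:
            best = (ratio, s1, s2)
    return best


ratio, s1, s2 = brute_force_ssr([1, 2, 4, 8])
print(ratio, s1, s2)
```

For the input [1, 2, 4, 8] no two disjoint subsets have equal sums, and the best ratio is 8/7, obtained by the elements 1, 2, 4 against the element 8.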

The SSR problem was introduced by Woeginger and Yu [11]. In the same work they present an approximation algorithm which runs in time. The SSR problem received its first FPTAS by Bazgan et al. in [1], which approximates the optimal solution in time no less than ; to the best of our knowledge this is still the fastest scheme proposed for SSR. A second, simpler but slower, FPTAS was proposed by Nanongkai in [9].

The FPTAS we present in this paper makes use of some ideas proposed in [9], strengthened by certain key improvements that lead to a considerable acceleration: our algorithm approximates the optimal solution in time, several orders of magnitude faster than the best currently known scheme of [1].

2 Preliminaries

We will first define two functions that will allow us to simplify several of the expressions that we will need throughout the paper.

Definition 1 (Ratio of two subsets).

Given a set of positive integers and two sets we define as follows:

Definition 2 (Max ratio of two subsets).

Given a set of positive integers and two sets we define as follows:

Note that, in cases where at least one of the sets is empty, the Max Ratio function will return $+\infty$. Using these functions, the SSR problem can be rephrased as shown below.

Subset-Sums Ratio problem (SSR) (equivalent definition).

Given a set of positive integers, find two disjoint sets , such that the value is minimized.
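The two helper functions can be sketched in code as follows. This is an illustration under stated assumptions: the ratio is taken to be the sum over the first set divided by the sum over the second, the max ratio is the larger of the two directed ratios, and an empty set is mapped to infinity so that it can never be selected by a minimization. The names `ratio` and `max_ratio` and the 0-based index sets are hypothetical.

```python
import math
from fractions import Fraction


def ratio(s1, s2, a):
    """Sum of a over s1 divided by the sum of a over s2 (both sets assumed nonempty)."""
    return Fraction(sum(a[i] for i in s1), sum(a[j] for j in s2))


def max_ratio(s1, s2, a):
    """Larger of the two directed ratios; infinity if either set is empty (assumption)."""
    if not s1 or not s2:
        return math.inf
    return max(ratio(s1, s2, a), ratio(s2, s1, a))


a = [4, 1, 2]
print(max_ratio({0}, {1, 2}, a))   # sums 4 and 3
print(max_ratio(set(), {0}, a))    # one set is empty
```

With this convention the nonemptiness requirement no longer has to be stated separately, since any pair containing an empty set has the worst possible value.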

In addition, from now on, whenever we have a set we will assume that (clearly, if the input contains two equal numbers then the problem has a trivial solution).

The FPTAS proposed by Nanongkai [9] approximates the SSR problem by solving a restricted version.

Restricted Subset-Sums Ratio problem.

Given a set of positive integers and two integers , find two disjoint sets , such that and the value is minimized.

Inspired by this idea, we define a less restricted version. The new problem requires only one additional input integer instead of two; it represents the smaller of the two maximum elements of the sought optimal solution.

Semi-Restricted Subset-Sums Ratio problem.

Given a set of positive integers and an integer , find two disjoint sets , such that and the value is minimized.

Let be a set of positive integers and . Observe that, if , is the optimal solution of SSR problem of instance and , the optimal solution of Semi-Restricted SSR problem of instance , then:

Thus, we can find the optimal solution of the SSR problem by solving the Semi-Restricted SSR problem for all .
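The reduction can be illustrated with a small sketch. It assumes that the semi-restricted problem fixes the index p of the largest element of the first set and requires the largest element of the second set to have a larger index; the exhaustive solver `semi_restricted_brute` is a stand-in used only for illustration, and all names and the 0-based indexing are hypothetical.

```python
from fractions import Fraction
from itertools import product


def semi_restricted_brute(a, p):
    """Best (ratio, S1, S2) with max(S1) == p < max(S2); a is sorted, indices are 0-based."""
    best = None
    others = [i for i in range(len(a)) if i != p]
    for labels in product((0, 1, 2), repeat=len(others)):
        s1 = [p] + [i for i, lab in zip(others, labels) if lab == 1]
        s2 = [i for i, lab in zip(others, labels) if lab == 2]
        if max(s1) != p or not s2 or max(s2) < p:
            continue  # violates the restriction on the maximum elements
        x, y = sum(a[i] for i in s1), sum(a[i] for i in s2)
        r = Fraction(max(x, y), min(x, y))
        if best is None or r < best[0]:
            best = (r, sorted(s1), s2)
    return best


def ssr_via_semi_restricted(a):
    """Minimize over every admissible p (p cannot be the last index)."""
    a = sorted(a)
    return min((semi_restricted_brute(a, p) for p in range(len(a) - 1)),
               key=lambda t: t[0])


print(ssr_via_semi_restricted([8, 1, 4, 2]))
```

Because the max ratio is symmetric in its two sets, one may always name the sets so that the first one has the smaller maximum element, which is why taking the minimum over all p loses nothing.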

3 Pseudo-polynomial time algorithm for Semi-Restricted SSR problem

Let , be an instance of the Semi-Restricted SSR problem where and . To solve the problem we have to check two cases for the maximum element of the optimal solution. Let , be the optimal solution of this instance and . We define and from which we have that either or . Note that .

Case 1 (). It is easy to see that if , then and the optimal solution will be . We describe below a function that returns this pair of sets, thus computing the optimal solution if Case 1 holds.

Definition 3 (Case 1 solution).

Given a set of positive integers and an integer we define the function as follows:

where .
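One plausible reading of Case 1 can be sketched as follows. The sketch assumes that Case 1 covers the situation where the largest element of the second set is at least the sum of the first p elements; under that assumption the second set alone already outweighs any admissible first set, so the best choice is to take all of the first p elements against the smallest such large element. The name `case1_solution`, the 0-based indexing, and the `None` return value for "no candidate" are illustrative assumptions.

```python
def case1_solution(a, p):
    """a: strictly increasing list of positive integers; p: 0-based index.

    Returns (S1, S2) as index lists, or None when no element after position p
    is at least as large as the sum a[0] + ... + a[p].
    """
    prefix = sum(a[: p + 1])
    for q in range(p + 1, len(a)):
        if a[q] >= prefix:
            # Smallest large element against the heaviest admissible first set.
            return list(range(p + 1)), [q]
    return None


print(case1_solution([1, 2, 4, 8], 2))
print(case1_solution([3, 5, 9, 14], 2))
```

On [1, 2, 4, 8] with p = 2 the prefix sum is 7, so the element 8 qualifies; on [3, 5, 9, 14] with p = 2 the prefix sum is 17 and no later element qualifies.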

Case 2 (). This second case is not trivial. Here, we define an integer and a matrix , where , , is a quadruple to be defined below. A cell is nonempty if there exist two disjoint sets , with sums , such that , , and ; if , we require in addition that . In such a case, cell consists of the two sets , , and two integers and . A crucial point in our algorithm is that if more than one pair of sets meets the required conditions, we keep the one that maximizes the value ; for convenience, we make use of a function to check this property and select the appropriate sets. The algorithm for this case (Algorithm 1) finally returns the pair , which, among those that appear in some , has the smallest ratio .

Definition 4 (Larger total sum tuple selection).

Given two tuples and we define the function as follows:

1:a strictly sorted set , , and an integer , .
2:the sets of an optimal solution for Case 2.
3:,
4:,
5:if  then
6:     for all ,  do
7:         
8:     end for
9:      by problem definition
10:     for  to  do
11:         if  then
12:              for all  do
13:
14:
15:
16:
17:              end for
18:         else if  then is already placed in
19:              for all  do
20:
21:              end for
22:         else
23:              for all  do
24:
25:                  if  then
26:
27:                  end if
28:                  if  then
29:
30:                  end if
31:              end for
32:              for all  do
33:
34:                  if  then
35:
36:                  end if
37:              end for
38:         end if
39:     end for
40:     for  to  do
41:
42:         if  then
43:              ,
44:         end if
45:     end for
46:end if
47:return ,
Algorithm 1 Case 2 solution [ function]
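The central idea of indexing a table by the difference of the two subset sums, and keeping for each difference the pair with the largest total sum, can be illustrated in a simplified setting. The sketch below is not Algorithm 1: it ignores the restriction on the maximum elements and the bound on the table size, and solves plain SSR exactly in time proportional to n times the sum of the input. The name `ssr_difference_dp` is hypothetical.

```python
from fractions import Fraction


def ssr_difference_dp(a):
    """Exact SSR optimum using a table indexed by sum(S1) - sum(S2)."""
    # best[d] = (total, S1, S2): the pair with difference d and largest total sum.
    best = {0: (0, (), ())}
    for i, x in enumerate(a):
        new = dict(best)  # option: leave element i unused
        for d, (total, s1, s2) in best.items():
            for nd, cand in ((d + x, (total + x, s1 + (i,), s2)),
                             (d - x, (total + x, s1, s2 + (i,)))):
                if nd not in new or cand[0] > new[nd][0]:
                    new[nd] = cand
        best = new
    answer = None
    for d, (total, s1, s2) in best.items():
        if total > abs(d):  # smaller sum is positive, so both sets are nonempty
            # larger sum = (total + |d|) / 2, smaller sum = (total - |d|) / 2
            r = Fraction(total + abs(d), total - abs(d))
            if answer is None or r < answer[0]:
                answer = (r, s1, s2)
    return answer


print(ssr_difference_dp([1, 2, 4, 8]))
```

Keeping only the largest total per difference is safe because, for a fixed difference d, the ratio (total + |d|) / (total - |d|) decreases as the total grows.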

We next present the complete algorithm for Semi-Restricted SSR (Algorithm 2) which simply returns the better of the two solutions obtained by solving the two cases. Algorithm 2 runs in time polynomial in and (where ), therefore it is a pseudo-polynomial time algorithm. More precisely, by using appropriate data structures we can store the sets in the matrix cells in time (and space) per cell, which implies that the time complexity of the algorithm is .

1:a strictly sorted set , , and an integer , .
2:the sets of an optimal solution of Semi-Restricted SSR.
3:
4:
5:if  then
6:     return ,
7:else
8:     return ,
9:end if
Algorithm 2 Exact solution for Semi-Restricted SSR [ function]

4 Correctness of the Semi-Restricted SSR algorithm

In this section we will prove that Algorithm 2 solves exactly the Semi-Restricted SSR problem. Let , be the sets of an optimal solution for input .

Starting with Case 1 (where ), it is easy to see that:

Observation 1.

The sets , give the optimal ratio.

Those are the sets which the function returns.

For Case 2 (where ) we have to show that the cell (where ) contains two sets , with ratio equal to the optimum. Before that we will show a lemma for the sums of the sets of the optimal solution.

Lemma 1.

Let then we have and .

Proof.

Observe that . This gives us so it remains to prove . Suppose that . We can define the set as . Note that, for all , we have that the . Because of that,

which means that the pair is a feasible solution with smaller max ratio than the optimal, which is a contradiction. ∎

The next two lemmas describe some conditions which guarantee that the cells of are nonempty. Furthermore, they ensure that we will store the appropriate sets to return an optimal solution.

Lemma 2.

If there exist two disjoint sets such that

then for all . Furthermore for the sets which are stored in it holds that

Proof.

Note that, for all pairs which meet the conditions, their sums are smaller than because so for the value we have

The same clearly holds for every pair of subsets of , .

We will prove the lemma by induction on . For convenience if we let .
(base case).
The only pair which meets the conditions for is the . Observe that cell is nonempty by the construction of the table and the same holds for , (by line 14). In this case the pair of sets which meets the conditions and the pair which is stored are exactly the same, so the lemma statement is obviously true.
Assume that the lemma statement holds for ; we will prove it for as well.
Let be a pair of sets which meets the conditions. Either or ; therefore either or (respectively) meets the conditions. By the inductive hypothesis, we know that

  • either or (resp.) is nonempty

  • in any of the above cases for the stored pair it holds that:

In particular, if meets the conditions then is nonempty. In line 15 is added to the first set and therefore is nonempty and the stored pair is (or some other with larger total sum). Hence, the total sum of the pair in is at least

If on the other hand is the pair that meets the conditions then is nonempty. In line 16 is added to the second set and therefore is nonempty and the stored pair is (or some other with larger total sum). Hence, the total sum of the pair in is at least

The same holds for cells with (due to line 14).
This concludes the proof. ∎

A similar lemma can be proved for sets with maximum element index greater than .

Lemma 3.

If there exist two disjoint sets such that

  • ,

then for all . Furthermore for the sets which are stored in it holds that

Proof.

Note that, for all pairs which meet the conditions, for the value it holds that

The same clearly holds for every pair of subsets of , .

We will prove the lemma by induction. Let meet the conditions and .
(base case)
Clearly so the sets meet the conditions of Lemma 2, which gives us that

  • is nonempty

  • for the stored pair it holds that:

Having the the algorithm uses it in lines 33-36 and adds to the second (stored) set so, we have that is nonempty and the stored sets have total sum (at least):

Furthermore, because is nonempty the above also holds for all , (because the condition at line 25 is met, the algorithm fills those cells). This concludes the base case.
Assuming that the lemma statement holds for , we will prove it for .
Here we have to check two cases. Either or not.

Case 1 (). The pair of sets meets the conditions; by the inductive hypothesis, we have

  • is nonempty

  • for the stored pair it holds that:

Having the the algorithm uses it in line 29 and adds to the second (stored) set so we have that is nonempty and the stored sets have total sum (at least):

As before, the same holds for the cells with because the condition at line 25 is met.

Case 2 (). The sets meet the conditions of Lemma 2 (because ), which gives that

  • is nonempty

  • for the stored pair it holds that:

Having the algorithm uses it in lines 33-36 and adds to the second (stored) set so we have that is nonempty and the stored sets have total sum (at least):

Furthermore, because is nonempty the same holds for all , (because the condition at line 25 is met). ∎

Now we can prove that, in the second case, the pair of sets which the algorithm returns and the pair of sets of an optimal solution have the same ratio.

Lemma 4.

If is the pair of sets that Algorithm 1 returns, then:

Proof.

Let be the size of the first dimension of the matrix . Observe that for all , , the sets , of the nonempty cells are constructed (lines 23-37 of Algorithm 1) such that and . Therefore the pair returned by the algorithm is a feasible solution. We can see that the sets , meet the conditions of Lemma 3 (the conditions for the sums are met because of Lemma 1), which gives us that the cell (where ) is nonempty and contains two sets with total sum no less than . Let , be the sets which are stored in the cell . Then we have

(1)

where the second inequality is because

and

By Eq. (1) and because , have the smallest Max Ratio we have

We can now state the next theorem, which follows from the previous cases.

Theorem 4.1.

Algorithm 2 returns an optimal solution for Semi-Restricted SSR.

5 FPTAS for Semi-Restricted SSR and SSR

Algorithm 2, which we presented in Section 3, is an exact pseudo-polynomial time algorithm for the Semi-Restricted SSR problem. In order to derive a -approximation algorithm we will define a scaling parameter which we will use to make a new set with . The approximation algorithm solves the problem optimally on input and returns the sets of this exact solution. The ratio of those sets is a -approximation of the optimal ratio of the original input.

1:a strictly sorted set , , an integer , , and an error parameter .
2:the sets of a -approximation solution for Semi-Restricted SSR.
3:
4:
5:for  to  do
6:     
7:     
8:end for
9:
10:return ,
Algorithm 3 FPTAS for Semi-Restricted SSR [ function]
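The scaling step can be sketched as follows. Two things in this sketch are assumptions and should be read as such: the scaling parameter is taken to be delta = eps * a_p / (3n), and each scaled value is the floor of a_i / delta. In place of Algorithm 2, the sketch calls a hypothetical exhaustive solver `exact_semi_restricted`, so it illustrates only the "scale, solve exactly, evaluate on the original input" pattern and is not itself polynomial-time.

```python
from fractions import Fraction
from itertools import product


def exact_semi_restricted(a, p):
    """Exhaustive solver: minimize max/min of sums with max(S1) == p < max(S2)."""
    best = None
    rest = [i for i in range(len(a)) if i != p]
    for labels in product((0, 1, 2), repeat=len(rest)):
        s1 = [p] + [i for i, lab in zip(rest, labels) if lab == 1]
        s2 = [i for i, lab in zip(rest, labels) if lab == 2]
        if max(s1) != p or not s2 or max(s2) < p:
            continue
        x, y = sum(a[i] for i in s1), sum(a[i] for i in s2)
        r = Fraction(max(x, y), min(x, y))
        if best is None or r < best[0]:
            best = (r, s1, s2)
    return best


def scaled_semi_restricted(a, p, eps):
    """Solve exactly on the scaled input and report the ratio on the original input."""
    n = len(a)
    delta = Fraction(eps) * a[p] / (3 * n)  # assumed scaling parameter
    scaled = [int(x / delta) for x in a]    # floor, since all values are positive
    _, s1, s2 = exact_semi_restricted(scaled, p)
    x, y = sum(a[i] for i in s1), sum(a[i] for i in s2)
    return Fraction(max(x, y), min(x, y)), s1, s2


print(scaled_semi_restricted([1, 2, 4, 8], 2, Fraction(1, 2)))
```

The sets found on the scaled instance are always feasible for the original instance, so the reported ratio can never be better than the exact optimum; how close it is depends on the choice of delta.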

Now, we will prove that the algorithm approximates the optimal solution by factor . Our proof follows closely the proof of Theorem 2 in [9].

Let , be the pair of sets returned by Algorithm 3 on input , and and be an optimal solution to the problem.

Lemma 5.

For any

(2)
(3)
Proof.

For Eq. (2) notice that for all we define . This gives us

In addition, for any we have , which means that

For Eq. (3) observe that for any . By this observation, we can show the second inequality

Lemma 6.

Proof.

In the same way, we have

thus the lemma holds. ∎

Lemma 7.

For any , .

Proof.

If , let , otherwise <