An Optimal Algorithm for Online Multiple Knapsack

02/11/2020 ∙ by Marcin Bienkowski, et al. ∙ Akademia Sztuk Pięknych we Wrocławiu

In the online multiple knapsack problem, an algorithm faces a stream of items, and each item has to be either rejected or stored irrevocably in one of n bins (knapsacks) of equal size. The gain of an algorithm is equal to the sum of sizes of accepted items and the goal is to maximize the total gain. So far, for this natural problem, the best solution was the 0.5-competitive algorithm First Fit (the result holds for any n ≥ 2). We present the first algorithm that beats this ratio, achieving the competitive ratio of 1/(1+ln(2))-O(1/n) ≈ 0.5906 - O(1/n). Our algorithm is deterministic and optimal up to lower-order terms, as the upper bound of 1/(1+ln(2)) for randomized solutions was given previously by Cygan et al. [TOCS 2016]. Furthermore, we show that the lower-order term is inevitable for deterministic algorithms, by improving their upper bound to 1/(1+ln(2))-O(1/n).


1 Introduction

Knapsack problems have been studied in theoretical computer science for decades [13, 14]. In particular, in the multiple knapsack problem [2, 5, 6, 7, 10, 12, 18], items of given sizes and profits have to be stored in n bins (knapsacks) of equal capacity. The goal is to find a subset of all items that maximizes the total profit and can be feasibly packed into the bins without exceeding their capacities. We consider an online scenario, where an online algorithm is given a sequence of items of unknown length. When an item is presented to the algorithm, it has to either irrevocably reject the item or accept it into a chosen bin (which cannot be changed in the future). The decisions of an online algorithm have to be made without knowledge of future items.

Proportional case.

In this paper, we focus on the most natural, proportional variant (sometimes called uniform), where item profits are equal to item sizes and the goal is to maximize the sum of profits of all accepted items.

The single-bin case (n = 1) has been fully resolved: no deterministic online algorithm can be competitive [16], and the best randomized algorithm, ROne by Böckenhauer et al. [4], achieves the optimal competitive ratio. (Footnote 1: We call an online algorithm competitive with a given ratio if, for any input instance, its total profit is at least that fraction of the optimal (offline) solution. While many papers use the reciprocal of this fraction as the competitive ratio, the current definition is more suited for accounting arguments in our proofs.)

Less is known for the multiple-bin case (n ≥ 2). Cygan et al. [7] showed that the FirstFit algorithm is 0.5-competitive and proved that no algorithm (even a randomized one) can achieve a competitive ratio greater than 1/(1+ln(2)) ≈ 0.5906.

Other variants.

Some authors focused on the variant where the goal is to maximize the maximum profit over all bins, instead of the sum of the profits. For this objective, the optimal competitive ratio is already known: a deterministic algorithm was given by Böckenhauer et al. [4], and a matching upper bound, holding even for randomized solutions, was presented by Cygan et al. [7].

The multiple knapsack problem can be generalized in another direction: profits and sizes may be unrelated. However, even the unit variant, where the profit of each item is equal to 1, does not admit any competitive solutions (even randomized ones) [5].

Taken together, these results mean that the proportional case studied in this paper is the only variant whose online complexity has not yet been fully resolved.

1.1 Our results

The main result of this paper is a deterministic online algorithm for the proportional variant of the multiple knapsack problem that is (1/(1+ln(2)) - O(1/n))-competitive. We give insights for our construction in Section 1.3 below and the definition of our algorithm later in Section 2. Given the upper bound of 1/(1+ln(2)) for randomized solutions [7], our result is optimal up to lower-order terms also for the class of randomized solutions.

For deterministic algorithms, we show that the O(1/n) term in the competitive ratio is inevitable: in Appendix C, we show how the upper bound construction given in [7] can be tweaked and extended to show that the competitive ratio of any deterministic algorithm is at most 1/(1+ln(2)) - O(1/n).

1.2 Related work

Some previous papers focused on a removable scenario, where an accepted item can be removed afterwards from its bin [2, 7, 8, 10, 11]. Achievable competitive ratios are better than their non-removable counterparts; in particular, the proportional variant admits constant-competitive deterministic algorithms even for a single bin [10].

The online knapsack problem has also been considered in relaxed variants: with resource augmentation, where the bin capacities of an online algorithm are larger than those of the optimal offline one [11, 17], with a resource buffer [9], or in the variant where an algorithm may accept fractions of items [17].

The hardness of the variants with arbitrary profits and sizes as well as applications to online auctions motivated another strand of research focused on the so-called random-order model [1, 3, 15, 19]. There, the set of items is chosen adversarially, but the items are presented to an online algorithm in random order.

1.3 Algorithmic challenges and ideas

Our algorithm splits items into three categories: large (of size greater than ), medium (of size from the interval ) and small (of size smaller than ). We defer the actual definition of .

First, we explain what an online algorithm should do when it faces a stream of large items. Note that no two large items can fit together in a single bin. If an algorithm greedily collects all large items, then the adversary may give items of size (accepted by an online algorithm) followed by items of size (accepted by an optimal offline algorithm Opt), and the resulting competitive ratio is then . On the other hand, if an algorithm stops after accepting some number of large items, Opt may collect all of them.
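The weakness of the greedy strategy can be illustrated with a small simulation. This is only a sketch under stated assumptions: bins have unit capacity, an item is large when its size exceeds 1/2, and the concrete sizes in the stream (0.5 + eps, then 1.0) are illustrative choices rather than values taken from the text. The helper names are hypothetical.

```python
def greedy_large(items, n):
    """Accept every large item while an empty bin remains (one large item per bin)."""
    accepted = []
    for size in items:
        if len(accepted) < n:
            accepted.append(size)
    return accepted


def offline_best_large(items, n):
    """Offline optimum when all items are large: keep the n largest ones."""
    return sorted(items, reverse=True)[:n]


n, eps = 100, 1e-3
# Hypothetical adversarial stream: n barely-large items, then n full-size items.
stream = [0.5 + eps] * n + [1.0] * n

alg = sum(greedy_large(stream, n))
opt = sum(offline_best_large(stream, n))
ratio = alg / opt
print(round(ratio, 4))
```

Under these assumptions the greedy gain is about half of the offline gain, and the gap closes only as eps grows, which is the behavior the rising threshold is designed to avoid.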

Our Rising Threshold Algorithm (Rta) balances these two strategies. It chooses a non-decreasing threshold function and ensures that the size of the -th accepted large item is at least . While an actual definition of f is given later, to grasp a general idea, it is worth looking at its plot in Figure 1 (left). A natural adversarial strategy is to give large items meeting these thresholds and once Rta fills bins, present items of sizes slightly smaller than the next threshold . These items will be rejected by Rta but can be accepted by Opt. Analyzing this strategy and ensuring that the ratio is at least  for any choice of yields boundary conditions. Analyzing these conditions for  tending to infinity, we obtain a differential equation, whose solution is the function f used in our algorithm.

The actual difficulty, however, is posed by medium items. Rta never proactively rejects them. It keeps a subset of marked medium items in their own bins (one item per bin), while it stacks the remaining, non-marked ones (places them together in the same bin, possibly combining items of similar sizes). This strategy allows Rta to combine a large item with marked medium items later. However, the number of marked items has to be carefully managed, as on their own they do not contribute a large gain. A typical approach would be to partition medium items into discrete subclasses, control their cardinalities, and analyze the gain on the basis of the minimum-size item in a particular subclass. To achieve the optimal competitive ratio, however, we need a more fine-grained approach: we use a carefully crafted continuous function to control the number of marked items larger than a given value. Analyzing all possible adversarial strategies gives boundary conditions for this function. In particular, the value that separates medium items from small ones was chosen as the minimum value that ensures the existence of a function satisfying all boundary conditions.

Finally, we note that simply stacking small items in their own bins would not lead to the desired competitive ratio. Instead, Rta tries to stack them in a single bin, but whenever its load exceeds , Rta tries to merge them into a single medium item and verify whether such an item could be marked. This allows for combining them in critical cases with large items.

1.4 Preliminaries

We have bins of capacity , numbered from to . An input is a stream of items from , defined by their sizes. Upon seeing an item, an online algorithm has to either reject it or place it in an arbitrary bin without violating the bin’s capacity. The load of a bin , denoted , is the sum of item sizes stored in bin . We define the load of a set of items as the sum of their sizes and the total load as the load of all items collected by an algorithm. Additionally, for any , we define . Note that if we fill a bin with items of size at least , then .

To simplify calculations, for any set of items, we define the gain of , denoted , as their load divided by ; similarly, the total gain is the total load divided by . Furthermore, we use to denote the minimum size of an item in set . If is accepted by our online algorithm, denotes the number of bins our algorithm uses to accommodate these items, divided by . For any value , is the set of all items from of size greater or equal . Whenever we use terms , or for a set that varies during runtime, we mean these values for the set  after an online algorithm terminates its execution.

For any input sequence and an algorithm , we use to denote the total gain of on sequence . We denote the optimal offline algorithm by Opt.
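To make the model concrete, the following sketch implements the FirstFit baseline mentioned in the abstract and reports its normalized gain. It assumes unit bin capacity and that the total gain is the total load divided by the number of bins n; both are assumptions made for illustration, and the function names are hypothetical.

```python
def first_fit(items, n, capacity=1.0):
    """Place each item in the first bin where it fits; reject it otherwise.

    Returns the final bin loads and the list of rejected item sizes.
    """
    loads = [0.0] * n
    rejected = []
    for size in items:
        for j in range(n):
            # Small tolerance guards against floating-point rounding.
            if loads[j] + size <= capacity + 1e-12:
                loads[j] += size
                break
        else:
            rejected.append(size)
    return loads, rejected


def total_gain(loads, n):
    """Total load divided by the number of bins (assumed normalization)."""
    return sum(loads) / n


loads, rejected = first_fit([0.6, 0.6, 0.5, 0.4, 0.3], n=2)
print(loads, rejected, total_gain(loads, 2))
```

In this run the item of size 0.5 is rejected because both bins already hold an item of size 0.6, while the later, smaller items still fit.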

1.5 Neglecting lower-order terms

As our goal is to show the competitive ratio 1/(1+ln(2)) - O(1/n), we introduce notation that allows us to neglect terms of order O(1/n). We say that is approximately equal to (we write ) if . Furthermore, we say that is approximately greater than (we write ) if or ; we define relation analogously. Each of these relations is transitive when composed a constant number of times.

In our analysis, we are dealing with Lipschitz continuous functions (their derivative is bounded by a universal constant). For such function , (i) the relation is preserved after application of , and (ii) an integral of can be approximated by a sum, as stated in the following facts, used extensively in the paper.

Fact .

Fix any Lipschitz continuous function and values from its domain. Then, . Furthermore, if is non-decreasing, then implies and implies .

Fact .

For any Lipschitz continuous function and integers , satisfying , it holds that .
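The message of this fact, that a scaled sum of a Lipschitz continuous function approximates its integral up to a lower-order term, can be checked numerically. The sketch below is illustrative only: the right-endpoint index range i = a+1, ..., b, the test function x^2, and the error bound L/n are assumptions, since the exact statement is not reproduced here.

```python
def scaled_sum(g, a, b, n):
    """(1/n) * sum of g(i/n) over i = a+1, ..., b."""
    return sum(g(i / n) for i in range(a + 1, b + 1)) / n


def g(x):
    """Test function; Lipschitz on [0, 1] with constant L = 2."""
    return x * x


def G(x):
    """Antiderivative of g, used to evaluate the integral exactly."""
    return x ** 3 / 3


LIPSCHITZ = 2.0
gaps = {}
for n in (10, 100, 1000):
    a, b = 0, n
    integral = G(b / n) - G(a / n)
    gaps[n] = abs(scaled_sum(g, a, b, n) - integral)
    print(n, gaps[n], gaps[n] <= LIPSCHITZ / n)
```

The gap shrinks roughly in proportion to 1/n, which is why such sums and integrals can be interchanged once lower-order terms are neglected.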

1.6 Roadmap of the proof

We present our algorithm in Section 2. Its analysis consists of three main parts.

  • In Section 3, we investigate the gain of Rta on large items and explain the choice of the threshold function f.

  • In Section 4, we study properties of medium items, marking routine, function and show how the marked items influence the gain on other non-large items.

  • In Section 5, we study the impact of marked items on bins containing large items.

Each of these parts is concluded with a statement that, under certain conditions, Rta is -competitive (cf. Section 3.1, Section 4.4, Section 5.1 and Section 5.1). In Section 6, we argue that these lemmas cover all possible outcomes. Due to space limitations, some proofs are presented in Appendix B.

2 Rising Threshold Algorithm

We arrange items into three categories: small, medium and large. We say that an item is large if its size is in range , medium if it is in range , and otherwise it is small, where we define

(1)
(2)

We further arrange medium items into subcategories , and : a medium item belongs to if its size is from range . As we partition only medium items this way,  contains items of sizes from . Note that at most items of category  fit in a single bin.

At some times (defined precisely later) a group of small items of a total load from stored in a single bin may become merged, and from that point is treated as a single medium item. We ensure that such merging action does not violate invariants of our algorithm.

Our algorithm Rta applies labels to bins; the possible labels are , , , , , , and . Each bin starts as an -bin, and Rta can relabel it later. The label determines the content of a given bin:

  • an -bin is empty,

  • an -bin (an auxiliary bin) contains small items of a total load smaller than and at most one -bin exists at any time,

  • an -bin contains one or multiple small items,

  • an -bin contains a single medium item,

  • an -bin contains one or more medium items of category ,

  • an -bin contains a single large item and possibly some other non-large ones.

For any label , we define a corresponding set, also denoted , containing all items stored in bins of label . For instance, is a set containing all items stored in -bins. Furthermore, we define as the set of all large items (clearly and ) and the set .

Rta processes a stream of items, and it operates until the stream ends or there are no more empty bins (even if an incoming item could fit in some partially filled bin). Upon the arrival of an item, Rta classifies it by its size and proceeds as described below. We also give the pseudocode of Rta in Appendix A.

Large items.

Whenever a large item arrives, Rta compares its size with the threshold , and if the item is smaller, Rta rejects it. The function is defined as

(3)

and depicted in Figure 1 (left). If the item meets the threshold, Rta attempts to put it in an -bin with sufficient space left (relabeling it to ), and if no such bin exists, Rta puts the item in any empty bin.
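The accept-or-reject test for large items can be sketched as follows. This is a simplified illustration: it models only the threshold comparison, not the choice of bin. Evaluating the threshold at i/n for the i-th accepted item is an assumption, and `placeholder_threshold` is a made-up non-decreasing function used only to make the example runnable; it is not the function defined in (3).

```python
def accept_large(size, accepted_so_far, n, threshold):
    """Return True if the next large item meets the current threshold.

    `threshold` maps a fraction of bins in [0, 1] to a minimum item size.
    """
    i = accepted_so_far + 1  # index of the item if it were accepted
    return size >= threshold(i / n)


def run_thresholds(stream, n, threshold):
    """Feed a stream of large items through the threshold rule."""
    accepted = []
    for size in stream:
        if len(accepted) == n:  # no empty bin left for another large item
            break
        if accept_large(size, len(accepted), n, threshold):
            accepted.append(size)
    return accepted


def placeholder_threshold(x):
    """Hypothetical stand-in: flat at 1/2, then rising linearly to 1."""
    return max(0.5, x)


accepted = run_thresholds([0.9, 0.55, 0.6, 0.95, 0.7, 1.0], 4, placeholder_threshold)
print(accepted)
```

With this placeholder, early items are accepted at any large size, while later items such as 0.6 and 0.7 are rejected because the threshold has already risen above them.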

Figure 1: Left: function f and its integral F. The value of F roughly corresponds to our lower bound on the gain of Rta when it collects large items. Right: functions P and Q used in estimating the gain in Section 4 and Section 5; note that their arguments are marked on the Y axis.

Medium items.

We fix a continuous and decreasing function that maps medium item sizes to :

(4)

We say that the subset of medium items is -dominated if for any . Intuitively, it means that if we sort items of from largest to smallest, then all points  are under or at the plot of , see Figure 2 (left).
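The sorted-points intuition suggests a direct way to test domination. The sketch below is one possible reading, assuming each item of the set occupies its own bin, counts are normalized by the number of bins n, and the controlling function is non-increasing. The names `bound` and `placeholder_bound` are hypothetical, and the placeholder is not the function defined in (4).

```python
def is_dominated(sizes, n, bound):
    """Check that every sorted point (size, rank / n) lies on or under `bound`.

    Items are ranked from largest to smallest, so rank / n is the normalized
    number of items that are at least as large as the current one.
    """
    ordered = sorted(sizes, reverse=True)
    for rank, size in enumerate(ordered, start=1):
        if rank / n > bound(size):
            return False
    return True


def can_mark(marked, new_size, n, bound):
    """Marking is allowed only if the enlarged set stays dominated."""
    return is_dominated(marked + [new_size], n, bound)


def placeholder_bound(x):
    """Hypothetical decreasing stand-in for the controlling function."""
    return 1.0 - x


print(is_dominated([0.4, 0.3, 0.3], 10, placeholder_bound))
print(can_mark([0.45], 0.45, 4, placeholder_bound))
print(can_mark([0.45, 0.45], 0.45, 4, placeholder_bound))
```

Checking the condition only at item sizes suffices when the bound is non-increasing, because between two consecutive sizes the count stays constant while the bound can only be larger.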

Rta never proactively rejects medium items, i.e., it always accepts them if it has an empty bin. Some medium items become marked upon arrival; we denote the set of medium marked items by . Large or small items are never marked. Marked medium items are never combined in a single bin with other marked medium items. At all times, Rta ensures that the set is -dominated. As no two items from are stored in a single bin, this corresponds to the condition for any . Each marked item is stored either in an -bin (alone) or in an -bin (together with a large item and possibly some other non-marked items). That is, .

Whenever a medium item arrives, Rta attempts to put it in an -bin. If it does not fit there, Rta verifies whether marking it (including it in the set ) preserves -domination of . If so, Rta marks it and stores it in a separate -bin. Otherwise, Rta fails to mark the item and the item is stored in an -bin (where  depends on the item size): it is added to an existing bin whenever possible and a new -bin is open only when necessary.
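The stacking step for items that could not be marked can be sketched as a per-category first fit. This is an illustration under assumptions: bins have unit capacity, and the category of an item is taken to be the number of items of that size that fit into one bin, which is a hypothetical stand-in for the subcategories defined above.

```python
import math
from collections import defaultdict

EPS = 1e-9  # tolerance for floating-point comparisons


def category(size, capacity=1.0):
    """Hypothetical category key: how many items of this size fit in a bin."""
    return math.floor(capacity / size + EPS)


def stack_item(bins, size, capacity=1.0):
    """Add the item to an existing bin of its category if possible.

    A new bin of that category is opened only when no existing one has room.
    """
    loads = bins[category(size, capacity)]
    for j, load in enumerate(loads):
        if load + size <= capacity + EPS:
            loads[j] += size
            return
    loads.append(size)


bins = defaultdict(list)  # category key -> list of bin loads
for s in [0.3, 0.3, 0.45, 0.3, 0.3, 0.45, 0.28]:
    stack_item(bins, s)
print(dict(bins))
```

In this run the first bin of category 3 receives three items before a second one is opened, so at most one bin per category is left partially filled.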

We emphasize that if Rta puts a large item in an -bin later (and relabel it to ), the sole medium item from this bin remains marked (i.e., in the set ). However, if a medium item fits in an -bin at the time of its arrival, it avoids being marked, even though its inclusion might not violate -dominance of the set . Note also that contains medium items Rta failed to mark.

Small items.

Rta never proactively rejects any small item. Whenever a small item arrives, Rta attempts to put this item in an -bin, in an -bin, and in the -bin, in this exact order. If the item does not fit in any of them (this is possible only if the -bin does not exist), Rta places it in an empty bin and relabels this bin to .

If Rta places the small item in an already existing -bin and in effect its load reaches or exceeds , Rta attempts to merge all its items into a single medium marked item. If the resulting medium item can be marked and included in  without violating its -dominance, Rta relabels the -bin to and treats its contents as a single marked medium item from now on. Otherwise, it simply changes the label of the -bin to .

3 Gain on large items

In this section, we analyze the gain of Rta on large items. To this end, we first calculate the integral of function f, denoted F (see Figure 1, left) and list its properties that can be verified by routine calculations.

(5)

The following properties hold for function F.

  1. and for any .

  2. for any .

  3. for any .

Using Section 3, we may bound the gain of Rta on large items and use this bound to estimate its competitive ratio when it terminates with empty bins.

It holds that . Moreover, for any .

Proof.

For the first part of the lemma, we sort large items from in the order they were accepted by Rta. The size of the -th large item is at least the threshold . Hence, by Section 3, .

To show the second part, we fix any and for each large item of size greater than  we reduce its size to . The total gain of the removed parts is exactly . The resulting large item sizes still satisfy acceptance thresholds, and thus the gain on the remaining part of  is approximately greater than . Summing up yields . ∎

3.1 When RTA terminates with some empty bins

If Rta terminates with some empty bins, then it is -competitive.

Proof.

Fix an input sequence . As Rta terminates with empty bins, it manages to accept all medium and small items from . Furthermore, it accepts large items from according to the thresholds given by function f. Recall that f is non-decreasing: at the beginning it is equal to (Rta accepts any large item) and the acceptance threshold grows as Rta accepts more large items. Let be the value of the acceptance threshold for large items when Rta terminates. We consider two cases.

  • . The threshold used for each large item is at most , i.e., Rta accepts all large items. Then, Rta accepts all items and is -competitive.

  • . Let be the set of all non-large items accepted by Rta. By Section 3,

    where for the last relation we used (by Section 3).

    As Rta takes all non-large items and all large items that are at least , the input sequence  contains items taken by Rta and possibly some large items smaller than . Thus, the gain of Opt on large items is maximized when it takes and fills the remaining bins with large items from smaller than . The total gain of Opt is thus at most

    Comparing the bounds on gains of Rta and Opt and observing that the term is non-negative yields . As , we obtain . ∎

As an immediate corollary, we observe that if contains large items only, then Rta is -competitive: If it terminates with empty bins, then its competitive ratio follows by Section 3.1. Otherwise, it terminates with large items, and hence, by Section 3, . On the other hand, , and therefore the competitive ratio is at most also in this case.

4 Gain on medium items

In the remaining part of the analysis, we make use of the following functions. For any , let and . Both functions are increasing and depicted in Figure 1 (right).

Fix any and any . It holds that and .

4.1 Boundary conditions on function

We start with a shorthand notation. Let , where (so that the values of and are well defined).

Our choice of function satisfies the conditions below. In fact, for our analysis to hold, function could be replaced by any Lipschitz continuous and non-increasing function mapping to satisfying these properties.

The following properties hold for function :

  1. is a non-increasing function of ,

  2. for ,

  3. ,

  4. for ,

  5. for and ,

  6. for and ,

  7. for .

4.2 Marked and tight items

We start with a simple bound on the gain of Rta on -bins. Recall that these bins store single marked items.

If Rta terminates with at least one -bin, then , and .

Proof.

The first condition follows trivially as each -bin contains a single medium item of size at least . For the second condition, , and thus also . As is -dominated, . ∎

We now take a closer look at the marked items and their influence on the gain on other sets of items. We say that a medium marked item is tight if it is on the verge of violating -domination invariant.

An item is tight if .

If an item is tight, then another item of size or greater cannot be included in without violating -domination invariant. Figure 2 (left) illustrates this concept. As can only grow, once an item becomes tight, it remains tight till the end. We emphasize that items smaller than are not relevant for determining whether is tight. If contains a tight item, then denotes the size of the minimum tight item in . This important parameter influences the gain both on set and also on stacking bins and .

If contains a tight item, then .

Proof.

Fix a tight item of size . By Section 4.2, , and thus . ∎

4.3 Impact of tight items on stacking bins

By Property 1 of Section 4.1, is a non-increasing function of . Therefore, the smaller is, the larger is the lower bound on guaranteed by Section 4.2. Now we argue that the larger is, the better is the gain on stacking bins and .

Assume Rta failed to mark a medium item . Then, a tight item exists and .

Proof.

Let . By the lemma assumption, is not -dominated, i.e., there exists an item such that . Note that , as otherwise we would have , and thus , which would contradict -domination of .

Let be the minimum size of an item from . Then, , and thus

(6)

where in the last inequality we used monotonicity of . By (6), is tight. On the other hand, -domination of implies that . This, combined with (6), yields , and thus . Note that remains tight till the end of the execution. This concludes the lemma, as the minimum tight item, , can be only smaller than . ∎

If Rta finishes

  • with at least one -bin, then is defined and ;

  • with at least one -bin, then is defined and .

Proof.

For the first part of the lemma, fix a medium item from of size . By the definition of Rta, it failed to mark this item. Hence, by Section 4.3, is defined and .

Assume now that Rta finishes with at least one -bin. When the first such -bin was created, Rta placed a small item in the already existing -bin of load , and the merge action failed, because Rta failed to mark the resulting item of size . Thus, again by Section 4.3, is defined and . ∎

Figure 2: Left: set of marked items with a tight (gray) item of size . As is tight, insertion of another item of size  (with dashed border) would violate -domination of . Right: items collected by Rta, when it terminates without empty bins and with -bins. Gain on -sets is split into three parts, where the first part corresponds to the gain of (cf. Section 5). The minimum guaranteed load in -bins is given by Section 4.2 and in the bins of by Section 4.3.

To estimate the gain on -bins and -bins, we define

(7)

It holds that .

Proof.

If does not contain a tight item, then, by Section 4.3, both and are empty, and the lemma follows trivially. Thus, in the following we assume that contains a tight item and we take a closer look at the contents of -bins and -bins.

Assume that is non-empty. Rta creates a new -bin (for ) if the incoming medium item of category (of size ) does not fit in any of the existing bins. Hence, each -bin (except at most one) has exactly items, and therefore its load is greater than and is also at least . Thus,

(8)

where the second inequality follows by Section 4.3 and by monotonicity of function pile. Note that (8) holds trivially also when there are no -bins.

If there are no -bins, then, , and the lemma follows.

If there are some -bins, recall that Rta creates a new -bin only if the considered small item does not fit in any existing -bin. Thus, the load of each -bin (except at most one) is at least , and therefore . Combining this with (8) implies . ∎

4.4 When RTA terminates without empty bins and without -bins

Using tight items, we may analyze the case when Rta terminates without empty bins and without -bins, and show that in such case its gain is approximately greater than . As the gain of Opt is at most , this yields the desired competitive ratio.

If Rta terminates without empty bins, then .

Proof.

There is at most one -bin. The remaining bins (at least many) are of classes , , or , and thus . ∎

If on input , Rta terminates without empty bins and without -bins, then .

Proof.

We analyze the gain of Rta on three disjoint sets: , and .

(by L.3, L.4.3 and L.4.4)
(by L.4 and L.4.2)

We consider three cases.

  • does not contain a tight item. Then, .

  • contains a tight item and . Then, , where the last inequality follows by Property 2 of Section 4.1.

  • contains a tight item and . By the definition of function level, this is possible only if -bins exist and . By Section 4.3, the existence of -bins implies . As the function is non-increasing (cf. Property 1 of Section 4.1), . The last inequality follows by Property 3 of Section 4.1. ∎

5 Gain on large items revisited

In this section, we assume that Rta terminates without empty bins and with at least one -bin. Recall that Section 3 allows us to estimate by calculating the gain on large items alone. Now we show how to improve this bound by taking into account non-large items in . First, we leverage the fact that if a (marked) medium item is in , then Rta must have failed to combine it with a large item, and we obtain a better lower bound on the size of each large item. Second, we show that in some cases marked medium items must be in which increases its load. If an -bin exists, we define

Note that is always non-negative. In particular, because for and .

Assume Rta terminates with at least one -bin. Then, the load of any -bin is at least .

Assume that Rta terminates with at least one -bin. Then, .

Proof.

We sort accepted large items by their arrival time and denote the bin containing the -th large item by . The bin contains a large item of size at least because of the acceptance threshold, and its load is at least by Section 5, i.e., .

We now show how to decrease the load in -bins, so that the remaining load in bin  remains at least and the change in the total gain is approximately equal to . This claim is trivial for , so we assume . This is possible only if a tight item exists, and . As , every marked medium item of size from the interval is in (a separate) -bin; let be the set of these bins. As , . Using the tightness of and Section 4.2, . From each bin of we remove a load of . The induced change in the total gain is then approximately equal to .

We now analyze the load of bin after the removal. The original load of bin was at least , and after removal it is at least . This amount is at least (as ) and at least (as ). Hence, the remaining load of is at least .

Thus, . Particular subsets of are depicted in Figure 2 (right); the removed part of gain is depicted as . ∎

Assume that Rta terminates with at least one -bin. Then, for any .

Proof.

We fix any and define . By Section 5, . We denote this lower bound by and we analyze it as a function of . When , then using we obtain . Therefore, we need to lower-bound the value of only for . In such case,

5.1 When RTA terminates without empty bins and with some -bins

The following lemma combines our bounds on gains on , , and .

Assume that Rta run on input terminates without empty bins and with at least one -bin. Then, for any .

Proof.

By the lemma assumptions,

(by L.4.3 and L.4.2)
(by L.4.4)

Applying the guarantee of Section 5 to concludes the proof. ∎

Assume that Rta run on input terminates without empty bins and with at least one -bin. If , then .

Proof.

As , we may apply Section 5.1 with . Note that for . Then,

(by L.4)

The last inequality follows by Property 4 of Section 4.1. ∎

Assume that Rta run on input terminates without empty bins and with at least one -bin. If , then .

Proof.

As , . Section 5.1 applied with any yields