A Competitive Analysis of Online Knapsack Problems with Unit Density

07/20/2019
by   Will Ma, et al.
MIT
Columbia University

We study an online knapsack problem where the items arrive sequentially and must be either immediately packed into the knapsack or irrevocably discarded. Each item has a different size and the objective is to maximize the total size of items packed. While the competitive ratio of deterministic algorithms for this problem is known to be 0, the competitive ratio of randomized algorithms has, surprisingly, not been considered until now. We derive a random-threshold algorithm which is 0.432-competitive, and show that our threshold distribution is optimal. We also consider the generalization to multiple knapsacks, where an arriving item has a different size in each knapsack and must be placed in at most one. This is equivalent to the Adwords problem where item truncation is not allowed. We derive a randomized algorithm for this problem which is 0.214-competitive.

1 Introduction

Consider the following problem. There is a knapsack of size 1 and an unknown sequence of items with sizes at most 1. The items arrive one-by-one, and each item must be irrevocably either packed into the knapsack or discarded upon arrival. An item can be packed only if its size does not exceed the remaining knapsack capacity. The goal is to maximize the sum of sizes of packed items, i.e. maximize the total capacity filled.

The decision of whether to accept each item into the knapsack is made by an online algorithm, which does not know the sizes of future items. Meanwhile, for any sequence of items, one could consider its optimal offline packing knowing the entire sequence in advance. For $c \in [0, 1]$, a fixed (but possibly randomized) online algorithm is said to be $c$-competitive if on any sequence, its (expected) capacity packed is at least $c$ times the optimal offline packing. We are interested in the highest-possible value of $c$, which is called the competitive ratio.

A problem closely related to ours is the Adwords budgeted allocation problem, where there are multiple knapsacks (“advertiser budgets”) of different sizes, and each item (“impression”) has a different size (“bid”) in each knapsack, but is allocated to at most one. The key difference in Adwords, however, is that an item can be allocated to a knapsack even if it doesn’t fit, in which case it is truncated to fit. With this truncation allowed, a deterministic greedy algorithm, which allocates each item to the knapsack where it has the greatest size, is 1/2-competitive in general, and 1-competitive (trivially optimal, accept every item until the knapsack is full) for a single knapsack. (The 1/2 guarantee is without the small bids assumption; see Mehta et al. (2005). With the small bids assumption, a (non-greedy) deterministic algorithm is $(1 - 1/e)$-competitive.)

In stark contrast, for our problem without truncation, the competitiveness of any deterministic online algorithm cannot be greater than 0, even when there is a single knapsack. This is easy to see: let the first item have size $\epsilon$, for a small $\epsilon > 0$. If the deterministic algorithm would accept this item, then let the second item have size 1, which must get rejected. Otherwise, if the deterministic algorithm would reject this item, then let the sequence end right there. In either case, there exists a sequence for which the deterministic algorithm packs an arbitrarily small fraction of the optimum.
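The two-case argument above can be replayed mechanically. The following is a minimal sketch, not part of the original analysis: the names `run`, `adversary`, and `ratio_on_bad_instance` are hypothetical, and a deterministic algorithm is modeled, as a simplifying assumption, by a function of the arriving item's size and the remaining capacity.

```python
from typing import Callable, List

# Assumed interface: (item size, remaining capacity) -> accept?
Rule = Callable[[float, float], bool]

def run(decide: Rule, sizes: List[float]) -> float:
    """Total size packed by a deterministic rule in a knapsack of size 1."""
    remaining, packed = 1.0, 0.0
    for s in sizes:
        if s <= remaining and decide(s, remaining):
            remaining -= s
            packed += s
    return packed

def adversary(decide: Rule, eps: float = 1e-3) -> List[float]:
    """Build the bad sequence by probing the rule on a first item of size eps."""
    if decide(eps, 1.0):
        return [eps, 1.0]  # the size-1 item can no longer fit
    return [eps]           # the rule rejected the only item it will ever see

def ratio_on_bad_instance(decide: Rule, eps: float = 1e-3) -> float:
    seq = adversary(decide, eps)
    # On both bad sequences the offline optimum is the single largest item.
    return run(decide, seq) / max(seq)

greedy_rule: Rule = lambda size, remaining: True
picky_rule: Rule = lambda size, remaining: size >= 0.5

if __name__ == "__main__":
    for name, rule in (("greedy", greedy_rule), ("picky", picky_rule)):
        print(name, ratio_on_bad_instance(rule))
```

Shrinking `eps` drives the ratio of the accepting rule toward 0, while a rule that rejects the first item already packs nothing.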

Using randomized online algorithms, we establish the first constant-factor competitive ratio guarantees for our problem without truncation. Our results are summarized below.

  1. We establish a competitive ratio guarantee of 0.432 for a single knapsack using a random-threshold algorithm, which draws a threshold from a distribution at the very start, and then accepts all items with size at least that threshold which fit in the knapsack. We show that our competitive ratio is best-possible within the class of random-threshold algorithms.

  2. We establish a competitive ratio guarantee of 3/14 (approximately 0.214) for the Adwords problem with multiple knapsacks and no truncation.

1.1 Related Work and Motivation

To the best of our knowledge, we are the first to consider the competitive ratio of randomized algorithms for this foundational unit-density online knapsack problem. (In our problem, since the objective is capacity packed, the reward from packing each item is equal to its size, hence the term “unit-density”.) Without the unit-density assumption, the non-existence of any constant competitive ratio guarantee, even for randomized algorithms on a single knapsack, was first established in Marchetti-Spaccamela and Vercellis (1995). Tight instance-dependent competitive ratios (where the guarantee can depend on parameters based on the sequence of items) have also been established in Zhou et al. (2008).

Our unit-density knapsack problem (with multiple knapsacks) has been studied in Stein et al. (2018) when the sequence of items is stochastic, i.e. drawn from a known distribution, instead of adversarial, i.e. completely unknown. In their case, the number and order of items is known, and the size of each item (in each of the knapsacks) is realized independently from known but heterogeneous distributions. They derive an online algorithm which is 0.321-competitive (with multiple knapsacks), where under stochastic arrivals the definition of competitiveness takes an expectation over the arrival sequence when evaluating both the online algorithm and the offline optimum.

Any competitive ratio guarantee for adversarial arrivals also holds under stochastic arrivals, and hence our paper implies an improved guarantee of 0.432 for their problem with a single knapsack (their guarantee of 0.321 does not improve with a single knapsack). However, our guarantee of 0.214 with multiple knapsacks is worse under adversarial arrivals.

We now discuss some other streams of work, whose technical results are not directly related to ours, but whose similar models provide practical motivation for our online knapsack problem with unit-density, no truncation, and adversarial arrivals.

  1. Online Advertising: Many papers (Mehta et al. 2005, Goel and Mehta 2008, Devanur and Hayes 2009) have studied a problem in online advertising, where “a search engine company decide what advertisements to display with each query so as to maximize its revenue” (Mehta et al. 2005). Consumers arrive in an online fashion, type keywords, and reveal their preferences. A search engine then displays personalized advertisements to the consumer, and earns monetary transfers from companies who bid for consumer keywords within their daily budgets. The objective is to maximize the total revenue earned from the companies.

  2. Crowdsourcing: Ho and Vaughan (2012) have studied a problem in online crowdsourcing, where a requester asks workers who arrive online to finish his or her tasks, and cannot split a task in two. Each worker spends some time to finish the assigned work. The objective is to maximize the total benefit that the requester obtains from the completed work, given time constraints. In a variant (Assadi et al. 2015), each worker picks a subset of tasks, along with task-specific bids. The requester has to assign no more than one task to each worker, paying the worker based on the bid. The objective of the requester is to maximize the number of tasks assigned to workers, while not violating the budget constraint.

  3. Healthcare: Stein et al. (2018) have studied a closely related problem in healthcare operations, where patients are assigned to doctors at the moment a patient makes an advance reservation, and appointments cannot be split in two. The objective is to maximize the overall utilization of doctors. For other variants, Truong (2015) has studied a two-class advance-scheduling model and computed the optimal scheduling policy. Wang et al. (2018) have studied a simpler problem where each patient only consumes one unit of resource.

  4. Supply Chain Ordering: In our industry partner’s supply chain, a manufacturing plant, having some initial stock capacity, receives a sequence of orders which request some fraction of that stock. The plant must decide whether to fulfill the order using its stock, or reject it (redirecting it to a different plant). Orders cannot be split, because splitting a customer’s demand is often physically impossible or managerially undesirable due to customer service or accounting considerations. The plant is trying to maximize its final utilization, i.e. the total amount of stock used by the end of the time horizon.

1.2 Our Techniques

For the single-knapsack problem, we illustrate our techniques by first considering two simpler algorithms, which build up to our 0.432-competitive algorithm that is optimal among random-threshold algorithms.

We start with the following randomized algorithm. Initially, it flips a coin. With probability 2/3, the algorithm greedily accepts any item which fits in the knapsack. With probability 1/3, the algorithm accepts only the first item to have size at least 1/2 (if such an item exists).

We claim that this simple algorithm yields a constant competitive ratio guarantee of 1/3. To see why, first note that if the greedy policy can fit all the items, then it is optimal, and since the algorithm is greedy with probability 2/3, it would be at least 2/3-competitive. Therefore, suppose that the greedy policy cannot fit some items, and consider two cases. If the sequence contains no items of size at least 1/2, then the greedy policy must have packed size greater than 1/2 by the time it could not fit an item, and hence the algorithm packs expected size at least $\frac{2}{3} \cdot \frac{1}{2} = \frac{1}{3}$. In the other case, let $s$ denote the size of the first item to have size at least 1/2. When the algorithm is not greedy, it packs size $s$, and when it is, it packs size at least $\min\{s, 1-s\}$, which equals $1-s$ because $s \ge 1/2$. In expectation, the algorithm packs size at least

$$\frac{1}{3} s + \frac{2}{3} (1 - s) = \frac{2}{3} - \frac{s}{3} \ge \frac{1}{3}.$$

Since the algorithm in both cases packs size at least 1/3, and the optimal offline packing cannot exceed 1, this completes the claim that the algorithm is 1/3-competitive.
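The coin-flip algorithm is simple enough that its expected packing can be computed exactly by averaging its two branches. The sketch below is illustrative only; the function names are hypothetical and the knapsack size is fixed at 1 as in the text.

```python
from typing import List

def greedy_pack(sizes: List[float]) -> float:
    """Branch taken with probability 2/3: accept every item that fits."""
    packed = 0.0
    for s in sizes:
        if packed + s <= 1.0:
            packed += s
    return packed

def first_large_pack(sizes: List[float], cutoff: float = 0.5) -> float:
    """Branch taken with probability 1/3: accept only the first item of size >= cutoff."""
    for s in sizes:
        if s >= cutoff:
            return s
    return 0.0

def expected_pack(sizes: List[float]) -> float:
    """Expected size packed by the coin-flip algorithm on a given sequence."""
    return (2.0 / 3.0) * greedy_pack(sizes) + (1.0 / 3.0) * first_large_pack(sizes)

if __name__ == "__main__":
    for seq in ([0.3, 0.3, 0.3, 0.3], [0.01, 1.0], [0.3, 0.3, 0.6]):
        print(seq, expected_pack(seq))
```

The sequence `[0.01, 1.0]` is close to the worst case for this algorithm: the greedy branch is stuck with the tiny item, so almost all of the expected packing comes from the 1/3 branch.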

Now, note that the previous algorithm effectively sets a random threshold whose distribution is 0 with probability 2/3, and 1/2 with probability 1/3. To improve upon it, we consider an arbitrary distribution for the threshold given by the CDF , and generalize the above analysis. We now let denote the size of the smallest item which the greedy policy does not fit. In the case where , we use similar arguments as above to deduce that the algorithm packs expected size at least

(1)

However, the other case where is more challenging, because when both terms in (1) equal 0. To refine the analysis, we define to be the maximum number such that, at the time of arrival of the item of size , it would not fit even if we could “magically discard” every accepted item of size less than . By the maximality of , there must exist an item of size . After carefully analyzing the cases (including the one where ), we show that the algorithm’s expected packing size is minimized in the case where it equals when the threshold is at most , and (for an arbitrarily small ) when the threshold is greater than . Therefore, it is lower-bounded by

(2)

Finally, we solve for the maximum $c$ for which there exists a threshold distribution such that both expressions (1) and (2) are at least $c$ for all values of the parameters appearing in them. This turns out to be $c = 3/7 \approx 0.428$, and since the optimal offline packing cannot exceed 1, the corresponding random-threshold algorithm is 0.428-competitive, as shown in Theorem 3.1.

In Theorem 3.2, we further improve the competitiveness to 0.432. The previous analysis with expressions (1) and (2) was not tight, despite optimizing the distribution, because it merely lower-bounded the algorithm’s expected packing size without considering the consequences on the optimal packing size, which was assumed to be 1. To improve upon the previous distribution, we perturb it to have a positive mass on all of [0,1] (instead of never setting a threshold above 3/7, as in Theorem 3.1). This prevents the adversary from making the optimal packing size 1 by appending a size-1 item to the end of any sequence, because there will always be a positive probability that the algorithm sets a threshold high enough to get the size-1 item. In fact, we show in Section 3.2.2 that this perturbed threshold distribution, which yields a 0.432-competitive algorithm, is best-possible for random-threshold algorithms.

Finally, in Theorem 4.1, our 3/14-competitiveness guarantee for multiple knapsacks is based on analyzing the execution of a “virtual algorithm” which is allowed to truncate items like in the Adwords problem. Then, using the fact that the greedy algorithm is 1/2-competitive for Adwords, we can independently realize a random threshold for the admission control of each knapsack, according to the threshold distribution which is 3/7-competitive for a single knapsack, to get a guarantee of $\frac{1}{2} \cdot \frac{3}{7} = \frac{3}{14}$ for multiple knapsacks. We cannot employ the improved threshold distribution which is 0.432-competitive, because we are comparing against an optimum that is allowed to truncate (recall that the 3/7-competitiveness guarantee was relative to a truncating optimum that is always 1).

1.3 Roadmap

In Section 2 we introduce the model and notation. In Section 3 we present our results on a single knapsack, and in Section 4 our results on multiple knapsacks.

2 Definition of Problems, Notations

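Let the entire set of items be $[T] = \{1, 2, \dots, T\}$, where $T \in \mathbb{N}$. Each item is indexed by $t$, the order of its arrival. The entire sequence of arrivals is then $(1, 2, \dots, T)$. Note that $t$ refers to an item, not an item size. For any $t \in [T]$, let $s_t$ be the size of item $t$. For any $A \subseteq [T]$, let $s(A) = \sum_{t \in A} s_t$ be the total size of items in $A$.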

Suppose there is a clairvoyant decision maker who knows the entire sequence in advance. This decision maker takes the optimal actions (accept / reject) over the process. Let this policy be $\mathsf{OPT}$. Note that $\mathsf{OPT}$ does not necessarily fill the entire capacity of the knapsack.

For any specific instance $I$ of the arrival sequence, let $\mathsf{ALG}(I)$ denote the total amount filled by an online algorithm $\mathsf{ALG}$ on this instance, in expectation over the randomness of the algorithm. Let $\mathsf{OPT}(I)$ denote the total amount filled by $\mathsf{OPT}$ on this instance. We also write $\mathsf{ALG}$ and $\mathsf{OPT}$ for $\mathsf{ALG}(I)$ and $\mathsf{OPT}(I)$, respectively, if the sequence is clear from the context.
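For small instances, the clairvoyant benchmark can be computed by brute force over all subsets of items. The sketch below is only an illustration under that assumption (exponential time, hypothetical function name `offline_opt`); it is not how the analysis in this paper bounds the optimum.

```python
from itertools import combinations
from typing import Sequence

def offline_opt(sizes: Sequence[float], capacity: float = 1.0) -> float:
    """Largest total size of any subset of items that fits in the knapsack."""
    best = 0.0
    for r in range(1, len(sizes) + 1):
        for subset in combinations(sizes, r):
            total = sum(subset)
            if best < total <= capacity:
                best = total
    return best

if __name__ == "__main__":
    print(offline_opt([0.2, 0.9, 0.3]))  # best subset is the single item 0.9
    print(offline_opt([0.6, 0.6]))       # the optimum need not fill the knapsack
```

The second call shows a case where even the clairvoyant packing leaves capacity unused, since the two items cannot be packed together.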

Under any policy, we say that an item is “rejected” if it fails to meet the admission criterion of this policy, e.g. its size falls below the threshold of a threshold policy. If an item is rejected under a policy, we say that this policy rejects this item.

Under any policy, we say that an item is “blocked” at the moment it arrives if the remaining capacity of the knapsack is not enough for it to fit. An item is said to be blocked regardless of whether it would have been rejected by the policy. If an item is blocked under a policy, we say that this policy blocks this item.
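The distinction between “rejected” and “blocked” can be made concrete for a threshold policy. The following sketch is an illustration with hypothetical names (`classify`, `tau`); it labels every arriving item with three flags, and an item may be both rejected and blocked at once.

```python
from typing import List, Tuple

def classify(sizes: List[float], tau: float) -> List[Tuple[bool, bool, bool]]:
    """Return one (accepted, rejected, blocked) triple per item for threshold tau."""
    remaining = 1.0
    labels = []
    for s in sizes:
        rejected = s < tau        # fails the admission criterion
        blocked = s > remaining   # does not fit at the moment it arrives
        accepted = not rejected and not blocked
        if accepted:
            remaining -= s
        labels.append((accepted, rejected, blocked))
    return labels

if __name__ == "__main__":
    for size, flags in zip([0.7, 0.4, 0.2, 0.6], classify([0.7, 0.4, 0.2, 0.6], tau=0.5)):
        print(size, flags)
```

In this example the item of size 0.4 is both rejected (below the threshold) and blocked (it no longer fits), the item of size 0.2 is rejected only, and the item of size 0.6 is blocked only.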

3 A Single Knapsack

Before we start to introduce the tight competitive ratio, we first introduce an easier proof of a (slightly) less competitive algorithm, to build some intuition. In Section 3.1, we prove the bound by only lower bounding the performance of our proposed algorithm. In Section 3.2, we strengthen the bound by lower bounding the performance of our proposed algorithm, and upper bounding the performance of the clairvoyant decision maker, at the same time.

3.1 Warm-Up: A Competitive Algorithm

Let $\mathsf{Thr}(\tau)$ be a threshold policy that accepts any item whose size is greater than or equal to $\tau$, as long as it can fit into the knapsack.

A policy $\mathsf{Thr}(0)$ is also referred to as a greedy policy, $\mathsf{Greedy}$: accept any item regardless of its size, as long as it can fit into the knapsack. We will interchangeably use $\mathsf{Thr}(0)$ and $\mathsf{Greedy}$ for the same policy.
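A threshold policy, with the greedy policy as its threshold-0 special case, can be sketched in a few lines. The names `threshold_policy` and `greedy` below are hypothetical and chosen only for this illustration.

```python
from typing import List

def threshold_policy(sizes: List[float], tau: float, capacity: float = 1.0) -> float:
    """Total size packed when accepting every item of size >= tau that still fits."""
    packed = 0.0
    for s in sizes:
        if s >= tau and packed + s <= capacity:
            packed += s
    return packed

def greedy(sizes: List[float], capacity: float = 1.0) -> float:
    """The greedy policy is the threshold policy with threshold 0."""
    return threshold_policy(sizes, 0.0, capacity)

if __name__ == "__main__":
    seq = [0.2, 0.9, 0.3]
    print(greedy(seq))                 # packs 0.2 and 0.3; 0.9 does not fit
    print(threshold_policy(seq, 0.5))  # skips the small items and packs 0.9
```

On this sequence a higher threshold packs more than the greedy policy, which is the tension that randomizing over thresholds is meant to balance.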

We propose a randomized threshold policy, , to prove a competitive ratio. Let be a randomized threshold policy that runs as follows,

  1. At the beginning of the entire process, randomly draw a threshold from a distribution whose cumulative distribution function (CDF) is given by

    (3)
  2. We apply the threshold policy with the drawn threshold throughout the process.
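The two steps above can be sketched as follows. This is a structural illustration only: the sampler `two_point` below is an assumption that uses the simple two-point distribution from Section 1.2 (threshold 0 with probability 2/3, threshold 1/2 with probability 1/3), not the CDF in (3), and all function names are hypothetical.

```python
import random
from typing import Callable, List

def threshold_policy(sizes: List[float], tau: float) -> float:
    """Accept every item of size >= tau that still fits in a knapsack of size 1."""
    packed = 0.0
    for s in sizes:
        if s >= tau and packed + s <= 1.0:
            packed += s
    return packed

def randomized_threshold(
    sizes: List[float],
    draw_tau: Callable[[random.Random], float],
    rng: random.Random,
) -> float:
    tau = draw_tau(rng)                   # step 1: drawn once, before any item arrives
    return threshold_policy(sizes, tau)   # step 2: the same threshold for every item

def two_point(rng: random.Random) -> float:
    """Stand-in sampler (assumption): 0 w.p. 2/3 and 1/2 w.p. 1/3."""
    return 0.0 if rng.random() < 2.0 / 3.0 else 0.5

if __name__ == "__main__":
    rng = random.Random(0)
    print([randomized_threshold([0.2, 0.9, 0.3], two_point, rng) for _ in range(5)])
```

Any other sampler, for instance one obtained by inverting a CDF, can be passed in place of `two_point` without changing the rest of the sketch.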

Notice that . This is the point mass we put on . This means that with probability , we will perform .

It is easy to check that our desired algorithm does not know how many items there are in total, nor does it know the sizes of future items.

Now we state and prove our first result.

3.1.1 Proof of the Competitive Algorithm

Proof of Theorem 3.1. We are going to show that, for any instance of arrival sequence , we have , thus finishing the proof. For any we start our analysis of as follows.

First of all, always accepts something. Denote the set of items accepted by as . Denote . If then is optimal. In this case

If , let denote the set of items blocked by . Since always accepts an item as long as it fits, any item blocked by must exceed the remaining space of the knapsack at the moment it is blocked. We also know that , .

Let be the smallest size in , i.e. . Define index for the smallest item.

(4)

Let be such an item; if there are multiple smallest items, pick the first one. Denote as the set of items accepted by at the moment is blocked. may already have blocked some other items from before it blocks , but those items are larger than . Let . See Figure 1. A straightforward but useful fact about is:

(5)

because is blocked by .
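The objects used in this step (the first smallest item blocked by the greedy policy, and the items greedy had accepted at that moment) can be extracted with one pass over the sequence. The sketch below uses hypothetical names and only illustrates the definitions; the final check mirrors the fact that a blocked item, together with what was already accepted, overflows the knapsack.

```python
from typing import List, Optional, Tuple

def smallest_blocked(sizes: List[float]) -> Optional[Tuple[int, List[int]]]:
    """Run greedy and return (index of the first smallest blocked item,
    indices accepted by greedy when that item arrived), or None if nothing is blocked."""
    packed = 0.0
    accepted: List[int] = []
    best: Optional[Tuple[int, List[int]]] = None
    for i, s in enumerate(sizes):
        if packed + s <= 1.0:
            accepted.append(i)
            packed += s
        elif best is None or s < sizes[best[0]]:
            # Strict comparison keeps the earliest item among ties.
            best = (i, list(accepted))
    return best

if __name__ == "__main__":
    seq = [0.5, 0.8, 0.3, 0.4, 0.6]
    idx, before = smallest_blocked(seq)
    print(idx, before)
    # The blocked item does not fit next to what was accepted before it.
    print(sum(seq[i] for i in before) + seq[idx] > 1.0)
```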

Figure 1: Illustration of the items that accepts, and blocks

We wish to understand when we can admit an item of size at least , by selecting a proper threshold .

We distinguish two cases: and .

Case 1: .
Let be the set of items that have size at least , i.e. . Now we pick to be the largest threshold, such that , i.e.

(6)

This means that if we adopt a policy, then the size item must be blocked (possibly it will also be rejected, due to , which leads to the discussion in Case 1.1).
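The largest threshold with this blocking property can be found by scanning the distinct sizes of the accepted items, because the total size of the items at or above a threshold only changes at those values. The sketch below is an illustration under that observation; the function name and argument names are hypothetical, with `accepted_sizes` standing for the sizes greedy had accepted and `s` for the size of the smallest blocked item.

```python
from typing import List, Optional

def largest_blocking_threshold(
    accepted_sizes: List[float], s: float, capacity: float = 1.0
) -> Optional[float]:
    """Largest threshold a such that the accepted items of size >= a,
    together with an item of size s, still exceed the capacity."""
    best = None
    for a in sorted(set(accepted_sizes)):
        kept = sum(x for x in accepted_sizes if x >= a)
        if kept + s > capacity:
            best = a  # candidates are scanned in increasing order
    return best

if __name__ == "__main__":
    m = largest_blocking_threshold([0.5, 0.3], s=0.4)
    print(m)
    # Items strictly larger than m leave enough room for the size-s item.
    print(sum(x for x in [0.5, 0.3] if x > m) + 0.4 <= 1.0)
```

By construction the returned value is the size of some accepted item, and discarding every accepted item of size at most that value would let the size-`s` item fit.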

Figure 2: Illustration of Case 1 (and specifically, Case 1.2)

Now consider the items in . These items have sizes at least . We count how many size items are there, and let be the number of size items. Denote the total size of the remaining items be . We know that . See Figure 2.

We make the following observations:

  1. There must exist some item from that is of size , i.e.

    (7)

    This is because we pick otherwise we can select the smallest item size in that is larger than . This smallest item size in satisfies (6), and violates the maximum property of .

  2. Size items can not fit in together with all the items in , i.e.

    (8)

    This is because . This inequality is stronger than (5).

  3. A size item can fit in together with items , i.e.

    (9)

    This is because otherwise we could further increase to so that , which violates the maximum property of .

We further distinguish two cases: , and .

Case 1.1: .
In this case, if we adopt then we can get as much as . This is because is defined this way.

If we adopt then we can get no less than . This is because due to (7) there must exist some item of size . We either accept it, in which case we immediately earn , or we have blocked it because we admitted some item from and consumed too much space. But blocks earlier than it accepts , which means that . So in either case we earn .

We have the following:

where the second inequality is because (due to (5)) and (Case 1.1: ); second equality is because and the way we defined in (3) so ; last inequality is because .

Since , we have .

Case 1.2: .
In this case, if we adopt then we can get as much as . This is the definition of .

If we adopt then we get no less than . This is because due to (7) there must exist some item of size . We either accept it, in which case we immediately earn , or we have blocked it because we admitted some item from and consumed too much space. But blocks earlier than it accepts , which means that . So in either case we earn .

If we adopt then we get no less than . This is because due to (9), any item in will not block (from expression (4)); and so we will not reject . We either accept , in which case we immediately earn , or we have blocked it because we admitted some item from M and consumed too much space. But is smallest item size in , which means that . So in either case we earn .

We have the following:

where the second inequality is because and (due to (8)); second equality is because and the way we defined in (3) so ; the last inequality is because , so the coefficient in front of is positive.

Now we plug in the expression of as defined in (3). If then If then So in either case we have shown .

Since , we have .

Case 2: .
In this case, we only hope to get , and a crude analysis is enough. See Figure 3.

Figure 3: Illustration of Case 2

If we adopt then we can get as much as . This is because is defined this way.

If we adopt then we either get , or is blocked, in which case we must have already earned at least to block .

We have the following:

where the second inequality is because ; the last inequality is because (due to (5)).

Now we plug in the expression of as defined in (3). If then , because ; If then

So in either case we have .

Since , we have .

In all, we have enumerated all the possible cases, to find always holds. ∎

3.2 A Competitive Algorithm

In this section we are going to introduce a threshold policy that achieves the best-possible competitive ratio. In Section 3.2.1 we will prove the competitive ratio; In Section 3.2.2 we will show it is best-possible.

We first define some parameters that are going to be useful in the following analysis. Let be a bivariate real function defined as follows:

Now fix to be any number between . Define to be the only local minimizer on the second coordinate of , between – it can be implicitly given as the only solution between , such that

or, approximately,

Define to be the only solution between , such that

(10)

or, approximately,

We can check the following inequality: ,

(11)

We use the definition of a policy as in Definition 3.1. But instead of the algorithm as proposed in Definition 3.1, we now propose another randomized threshold policy, , to prove a competitive ratio. Let be a randomized threshold policy that runs as follows,

  1. At the beginning of the entire process, randomly draw from a distribution whose cumulative distribution function (CDF) is given by

    (12)
  2. We apply policy throughout the process.

Notice that . This is the point mass we put on . This means that with probability , we will perform .

We state our main result here.

3.2.1 Proof of the Competitive Algorithm

The proof idea is the same as in Theorem 3.1, but in order to improve it, we are more careful in upper bounding the performance of . To compare to the proof of Theorem 3.1, Case 1.2 will be different.

Proof of Theorem 3.2. We are going to show that, for any instance of arrival sequence , we have , thus finishing the proof. For any we start our analysis of as follows.

First of all, always accepts something. Denote the set of items accepted by as . Denote . If then is optimal. In this case

If , let denote the set of items blocked by . Since always accepts an item as long as it fits, any item blocked by must exceed the remaining space of the knapsack at the moment it is blocked. We also know that , .

Let be the smallest size in , i.e. . Define index for the smallest item.

(13)

Let be such an item; if there are multiple smallest items, pick the first one. Denote as the set of items accepted by at the moment is blocked. may already have blocked some other items from before it blocks , but those items are larger than . Let . See Figure 1. A straightforward but useful fact about is:

(14)

because is blocked by . We wish to understand when we can admit an item of size at least , by selecting a proper threshold .

We distinguish two cases: and .

Case 1: .
Let be the set of items that have size at least , i.e. . Now we pick to be the largest threshold, such that , i.e.

(15)

This means that if we adopt a policy, then the size item must be blocked (possibly it will also be rejected, due to ).

We make the following observations:

  1. There must exist some item from that is of size , i.e.

    (16)

    This is because we pick otherwise we can select the smallest item size in that is larger than . This smallest item size in satisfies (15), and violates the maximum property of .

  2. Now consider the items in . See Figure 2. These items have sizes at least . We count how many size items are there, and let be the number of size items. Denote the total size of the remaining items be . We know that

    (17)
  3. Size items can not fit in together with items , i.e.

    (18)

    This is because . This inequality is stronger than (14).

  4. A size item can fit in together with items , i.e.

    (19)

    This is because otherwise we could further increase to so that , which violates the maximum property of .

We further distinguish two cases: , and .

Case 1.1: .
In this case, if we adopt then we can get as much as . This is because is defined this way.

If we adopt then we can get no less than . This is because due to (16) there must exist some item of size . We either accept it, in which case we immediately earn , or we have blocked it because we admitted some item from and consumed too much space. But blocks earlier than it accepts , which means that . So in either case we earn .

We have the following:

where the second inequality is because (due to (14)) and (Case 1.1: ); second equality is because , so we plug in as defined in (12).

Since , we have .

Case 1.2: .
First we wish to upper bound . selects some items from , where . Notice that so there is at most one item from that can select. If selects no item from , then . With probability , adopts and earns . So we have

If selects one item from , let be such an item, whose size is . See Figure 4.

Figure 4: Illustration of the items accepted by

We can partition all the items in into three sets:

Let . Since and form a partition of , we have . From (18) we know that . This means that even cannot pack and together. must block at least one item from – and the smallest item from this union is of size (because ). So we upper bound by:

(20)

Then we analyze . If we adopt then we can get as much as . This is because is defined this way.

If we adopt then we get no less than . This is because due to (17) there must exist some items in , which are of size . For any subset of items , we either accept it, in which case we immediately earn the size of this subset , or we have blocked it because we admitted some item from and consumed too much space. But blocks earlier than it accepts , which means that . So in either case we earn . Since is chosen arbitrarily, we know that we will always get no less than .

If we adopt then we get no less than . This is because due to (19), any item in will not block (from expression (13)); and so we will not reject . We either accept , in which case we immediately earn , or we have blocked it because we admitted some item from M and consumed too much space. But is smallest item size in , which means that . So in either case we earn .

If we adopt then we get no less than . This is because does exist, and must accept at least one item. The least that can get is .

We have the following:

where the second equality is due to integration by parts (our definition of in (12) is a continuous function); the last inequality is because , (because , and is an increasing function), so that is increasing in . Hence, achieves its minimum when is the smallest, and from (18).

Observe that

If we focus on the dependence of , we find that

where the first inequality is because the subgradient of the subtracted term is either or . Since is an increasing function of , it achieves its minimum when .

We have further

Now let , and we plug in as we defined in (18).

Case 1.2.1: When , we have:

If we focus on the dependence of , we will see that has only one local minimum: when we have

because . So is decreasing on when . When we have

so is increasing on . Hence,