I Introduction
The knapsack problem [1] is a versatile combinatorial object that models a large variety of resource allocation paradigms. The knapsack has a given capacity, and the objective is to choose a set of items with the largest sum of values such that the sum of their weights (sizes) does not exceed the knapsack capacity. There are several important real-life applications of the knapsack problem, such as allocating an advertising budget across the promotions of individual products, allocating limited preparation time across final exams in different subjects, job scheduling in clouds under an overall machine-time constraint, and sensor networks with energy constraints. Applications of the knapsack problem also include questions in auction design, such as choosing agents with private values and publicly known weights that fit into a knapsack [2].
The knapsack problem is known to be NP-hard even in the offline setting, where an algorithm can select items by considering all items together. It is, however, possible in the offline setting to approximate the optimal solution within a factor of $(1-\epsilon)$ for any $\epsilon > 0$ in polynomial time [1].
The online version of the knapsack problem models resource allocation under future uncertainty: items arrive sequentially, and an algorithm has to accept or reject each item irrevocably without access to future arrivals. The online scenario is relevant for applications such as cloud servers, where jobs have to be accepted or rejected without knowledge of the profitability of future jobs; hiring, where a candidate has to be hired without knowing whether a stronger candidate might apply at a later stage; generalized adwords [3]; load balancing [4]; cognitive radio; and admission control [5, 6, 7, 8, 9]. The performance of an online algorithm is typically quantified by its competitive ratio, the ratio of the profit of the online algorithm to that of the optimal offline algorithm (which has access to non-causal information). The online version of the knapsack problem has also received considerable attention in the literature [10, 11, 12, 13], with the best known competitive ratio of $1/(10e)$ in [12].
In this paper, we consider an important variation of the online knapsack problem, of both practical and theoretical interest, where we enforce the capacity constraint in expectation. Classically, for the knapsack problem (in both the offline and online cases), a hard capacity constraint is enforced for selecting the items, i.e., the sum of the weights of all the selected items is below a fixed capacity. In modern applications, there are many scenarios, such as cloud computing, where it is sufficient to enforce the capacity constraint in expectation, i.e., for specific instances of input the algorithm might decide to use a larger capacity, but in expectation it satisfies the given constraint. Our expected capacity constraint generalizes the overdraft approach of [14] for similar capacitated problems, where performance within of the optimal revenue can be achieved when resources of the order of over the specified budget/capacity constraint are allowed to be used.
One specific application (among the other online knapsack applications mentioned above) that motivates the study of the knapsack problem under the expected capacity constraint is job scheduling in clouds, where a large number of jobs are submitted with heterogeneous resource requirements, and it is reasonable to expect the cloud to execute these jobs by using larger memory and resources for some instances of input while maintaining the resource constraint on average so as to maximize its utility function. A similar case can be made for other applications of the knapsack problem, such as generalized adwords, load balancing, and sensor networks, where it is easy to envisage an expected resource capacity constraint.
An important special case of the online knapsack problem is the secretary problem [15], where secretaries are interviewed sequentially, and as soon as one secretary is hired, the process terminates, and no more secretaries are interviewed. The secretary problem is equivalent to an online knapsack problem where the weight of each item is $1$ and the hard capacity constraint is also $1$. Thus, in the secretary problem, the objective is to select only one item with the largest value in an online fashion. The secretary problem can be cast as a Markov decision process and has attracted attention from different research communities because of its universality. One limitation of the secretary problem, however, is that if the input, i.e., the order of arrival of items, is controlled by an adversary, then the performance of an optimal algorithm is arbitrarily bad. To keep the problem nondegenerate, a universal assumption is that the input arrival sequence is a uniformly random permutation over the set of items, which is also called the secretarial model of input.
Under the secretarial model of input, the classical secretary problem has been solved via multiple approaches as reviewed in [16], and the optimal probability (competitive ratio) of selecting the best item is known to be $1/e$. Over the years, multiple variants of the secretary problem have been studied, which are well documented in the survey [16]. Some important variations of the secretary problem include multiple choice [17, 18], infinitely many items [19], unknown number of items [16, 15], maximizing the expected value [16, 15], matroid constraint [20], etc. A simple extension of the secretary problem, called the $k$-secretary problem, is when the objective is to select $k$ items with the largest sum of their values, when all the weights are unity. Similar to the secretary problem, the optimal algorithm has competitive ratio $1/e$ [12] for the $k$-secretary problem as well.
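As a numerical illustration of the classical $1/e$ result under the hard capacity constraint (not part of the original analysis), the following Python sketch evaluates the success probability of the standard rule that rejects the first r items and then hires the first item better than all earlier ones. The function name and the choice n = 1000 are hypothetical; the formula used is the standard expression $\frac{r}{n}\sum_{i=r+1}^{n}\frac{1}{i-1}$ for this rule.

```python
from math import e


def classical_success_probability(n: int, r: int) -> float:
    """P(hire the best of n items) when the first r items are rejected and the
    first later item that beats all earlier items is hired (hard capacity 1)."""
    if r == 0:
        return 1.0 / n  # the very first item is always hired
    # Best item sits at location i w.p. 1/n and is reached iff the best of the
    # first i - 1 items lies among the first r, which has probability r / (i - 1).
    return (r / n) * sum(1.0 / (i - 1) for i in range(r + 1, n + 1))


n = 1000  # hypothetical number of items
best_r = max(range(n), key=lambda r: classical_success_probability(n, r))
print(best_r, classical_success_probability(n, best_r), 1 / e)
```

The maximizing r should sit close to n/e and the corresponding success probability close to 1/e.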
To the best of our knowledge, however, the expected capacity constraint, which in the context of the $k$-secretary problem implies that an algorithm can select at most $k$ items in expectation, where the randomness is over the uniformly random input sequence, has not been studied before. The fundamental difference between the hard capacity constraint and the expected capacity constraint in the context of the secretary or online knapsack problem is that with the expected capacity constraint, an algorithm can scan all the items, and need not terminate as soon as the sum of the weights of the selected items reaches the fixed capacity. For example, let the values of items in order of their arrival be . Then with a hard capacity constraint of , if the algorithm decides to choose the second item with value , then the algorithm terminates. With an expected capacity constraint of , however, even if an algorithm selects item , it can still consider the two items arriving thereafter and select item with value . Note that an algorithm is not allowed to remove already selected items, e.g. in this case item . Thus, with the expected capacity constraint, the algorithm will have to appropriately modulate the probability of selecting item and subsequently item . Thus, the expected capacity constraint allows the algorithm significantly more flexibility.
The general online knapsack problem has been studied widely [10, 11, 12] under the hard capacity constraint, with the best known competitive ratio of $1/(10e)$ reported in [12] for a randomized algorithm under the secretarial input. Under a large market assumption, which requires that the value of any item is ‘small’ compared to the value of the optimal solution, an online algorithm for the knapsack problem with competitive ratio of is proposed in [21]. The stochastic version of the knapsack problem has been studied in [10], while, restricting the ratio of the value and the weight of any item to lie within , competitive algorithms have been proposed in [22, 23, 24].
Designing online algorithms for the knapsack problem is significantly more challenging than for the secretary problem, as is evident in [10, 11, 12], primarily because there is no ‘simple’ offline algorithm that can approximate the optimal solution. Indeed, it is possible to approximate the optimal solution of the knapsack problem in the offline setting within a factor of $(1-\epsilon)$ for any $\epsilon > 0$ in polynomial time; however, that algorithm is not amenable to being made online.
As discussed before, for the case of the hard capacity constraint, it is easy to see that if the input (values and weights of items) is chosen adversarially, no deterministic online algorithm can have a bounded competitive ratio, and no randomized algorithm can have a competitive ratio better than $1/n$ for the secretary problem, where $n$ is the total number of items. We show in this paper that even under the expected capacity constraint no randomized algorithm can have competitive ratio better than for the secretary problem under the adversarial input. Thus, following the long line of work on the secretary and online knapsack problems [12, 25], we consider a secretarial input model even when considering the expected capacity constraint, where the order of arrival of items is uniformly random, but their values and weights are allowed to be arbitrary.
Our contributions are as follows:

For the secretary problem, under the expected capacity constraint of $1$, we propose an algorithm whose competitive ratio is $1 - 1/e$. To complement the result, we also show that no online algorithm can achieve a competitive ratio better than $1 - 1/e$ under the expected capacity constraint. Compared to the hard capacity constraint of $1$, where the optimal competitive ratio is $1/e$, this is a nearly twofold improvement (a factor of $e - 1 \approx 1.72$) in the competitive ratio.

For the $k$-secretary problem, where the objective is to select the best $k$ items and all items have weight $1$, under the expected capacity constraint of $k$, a simple modification of the algorithm proposed for the secretary problem is shown to achieve a competitive ratio of $1 - 1/e$, which is also the best possible.

We propose a competitive algorithm for the online knapsack problem under the expected capacity constraint, which significantly improves on the performance of the best known algorithm, which has a competitive ratio of $1/(10e)$ under the hard capacity constraint [12]. The main idea of the proposed algorithm is to first consider a ‘simple’ offline algorithm that is allowed to use extra capacity , where , that can be shown to provide a  approximation to the optimal solution of the offline knapsack problem with hard capacity constraint . The ‘simple’ offline algorithm also provides a threshold for selection of items in terms of the ratio of their weight and the value.
Using this threshold, we then make the ‘simple’ offline algorithm online, using ideas from the sample-and-price class of algorithms, where the algorithm only observes (but does not select any of) an initial set of items and builds a threshold, which is then used to select the forthcoming items. This online algorithm is shown to be competitive with respect to the simple offline algorithm, which itself is approximate with respect to the optimal offline algorithm for the knapsack problem with hard capacity of . The online algorithm that uses capacity is then used with probability to ensure the expected capacity constraint of and results in overall competitive ratio of , which is for . This algorithm’s performance comes close to that of the competitive algorithm of [21], which, however, is valid only under the large market assumption.
II Online Knapsack Problem
Let the value and weight of item $i \in \mathcal{I}$ be $v(i)$ and $w(i)$, respectively, and the corresponding weight-to-value ratio (called the buck-per-bang in the paper) be $b(i) = w(i)/v(i)$. The usual knapsack problem is to select the subset of items of $\mathcal{I}$ that maximizes the sum of their values, subject to a hard constraint on the sum of the weights of the items in the selected set. Without loss of generality, let by rescaling weights and .
In this paper, we consider the knapsack problem with a slightly weaker constraint on capacity. Specifically, we assume that the capacity constraint is in expectation, i.e., an algorithm is allowed to violate the hard capacity constraint of on specific instances of input or its own randomization, but in expectation should meet the capacity constraint of . This generalization is motivated by several practical cases of importance such as job scheduling in clouds, where typically the resource guarantees are easier to adhere to in expectation.
We consider the online version of the knapsack problem, where each item has to be accepted or rejected irrevocably on its arrival. In the online setting, the performance metric is the competitive ratio, which measures the ratio of the profit made by an online algorithm to that of the optimal offline algorithm that is allowed to know the future sequence (values and weights) of items, minimized over all possible input sequences $\sigma$ that specify the order of arrival of the items in the item set $\mathcal{I}$. Thus, for an algorithm $\mathsf{A}$, with $v(i)$ denoting the value of item $i$, its competitive ratio is
$$\rho_{\mathsf{A}} = \min_{\sigma} \frac{\sum_{i \in S_{\mathsf{A}}(\sigma)} v(i)}{\sum_{i \in S^{\star}} v(i)},$$
where $S^{\star}$ is the optimal offline set of selected items and $S_{\mathsf{A}}(\sigma)$ is the set of items selected by $\mathsf{A}$. Hence the objective is to design an online algorithm with maximum competitive ratio.
For the case of the hard capacity constraint, it is easy to see that if the input (values and weights of items) is chosen adversarially, no deterministic online algorithm can have a bounded competitive ratio, and no randomized algorithm can have a competitive ratio better than $1/n$, where $n$ is the total number of items. With the capacity constraint in expectation, it still turns out that no randomized algorithm can have competitive ratio better than .
Theorem 1.
Under the expected capacity constraint of , the competitive ratio of any online algorithm with the adversarial input is at most .
Proof.
See Appendix A.∎
Following prior work, to keep the problem nondegenerate in terms of competitive ratio, we assume that the order of arrival of items is uniformly random (secretary model), i.e., each permutation over the arriving items in $\mathcal{I}$ is equally likely. Let $\pi$ be a uniformly random permutation over $\mathcal{I}$. Then the $i$-th item that arrives has value $v(\pi(i))$, weight $w(\pi(i))$, and buck-per-bang $b(\pi(i))$.
For a set $S \subseteq \mathcal{I}$, we let $v(S) = \sum_{i \in S} v(i)$. Under the secretary model of input, the competitive ratio of an online algorithm $\mathsf{A}$ for solving the knapsack problem is defined as
$$\rho_{\mathsf{A}} = \min_{\mathcal{I}} \frac{\mathbb{E}\left[v(S_{\mathsf{A}})\right]}{v(S^{\star})},$$
where $\mathbb{E}$ is the expectation operator, $\mathcal{I}$ is the complete set of items, $S^{\star}$ is the optimal offline set of selected items and $S_{\mathsf{A}}$ is the set of items selected by $\mathsf{A}$. The online knapsack problem is to find the best algorithm that maximizes the competitive ratio $\rho_{\mathsf{A}}$. $\mathsf{A}$ is said to be $c$-competitive if $\rho_{\mathsf{A}} \ge c$. We first consider the two popular special cases of the knapsack problem, called the secretary and the $k$-secretary problems, where the weight of each item is $1$, before studying the general knapsack problem in Section III.
II-A Secretary Problem
In the secretary problem, under the secretary input model, the problem is to maximize the probability of selecting the best secretary (the item with the largest value in our setting). Letting each item’s weight be $1$, the classical secretary problem is a special case of the knapsack problem with a hard capacity constraint of $1$, since at most one item can be selected, and once the item is selected, the algorithm terminates.
With the expected capacity constraint of $1$, as considered in this paper, the fundamental difference compared to the hard capacity constraint is that any online algorithm can actually access the whole input sequence sequentially and does not have to terminate as soon as one item is selected. However, an item is selected using only causal information, and once an item is selected, it cannot be removed subsequently.
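Let $i^{\star}$ be the best item in the item set $\mathcal{I}$. Then, under the secretarial input, the competitive ratio of an algorithm $\mathsf{A}$ for the secretary problem is defined as
$$\rho_{\mathsf{A}} = \mathbb{P}\left(i^{\star} \in S_{\mathsf{A}}\right),$$
where $S_{\mathsf{A}}$ is the set of items selected by $\mathsf{A}$, with the expected number of selected items $\mathbb{E}[|S_{\mathsf{A}}|]$ being at most $1$. We next propose a simple modification to the classical solution to the secretary problem under the hard capacity constraint, and show that the competitive ratio can be improved significantly under the expected capacity constraint compared to the hard capacity constraint.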
II-A1 Upper Bound on Competitive Ratio
Consider a class of algorithms, which we call the Threshold Algorithm, that rejects the first $r$ items and thereafter selects any item that is better than the best seen so far. Recall that for the hard capacity constraint of $1$, the optimal choice is $r = n/e$, where $n$ is the total number of items, and the optimal algorithm terminates as soon as it encounters the first item that is better than the best of the first $r$ items. With the expected capacity constraint, the Threshold Algorithm does not terminate before considering the whole input sequence; however, once an item is selected, it cannot be rejected, so the algorithm has to choose judiciously to ensure the expected capacity constraint.
Theorem 2.
The Threshold Algorithm with $r = n/e$ has a competitive ratio of $1 - 1/e$ and satisfies the expected capacity constraint of $1$. (Throughout, for ease of exposition, we assume that $n/e$ is an integer; otherwise, a floor operator will be needed.)
Proof.
For the Threshold Algorithm with $r = n/e$, it is easy to see that the globally best item is not selected only if it appears in the offline phase, i.e., it is among the first $r$ arrivals, which happens with probability $r/n = 1/e$. Thus, the probability of selecting the globally best item is $1 - 1/e$.
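Next, we check that the algorithm satisfies the expected capacity constraint. Let $\mathbf{1}_i$ be the indicator function that the item appearing at location $i$ is selected by the algorithm. Then the number of items selected by the algorithm is $\sum_{i=r+1}^{n} \mathbf{1}_i$, where $r = n/e$ is the length of the offline phase. By the definition of the algorithm, an item arriving at location $i$ is selected only if it is the best item seen so far, which happens with probability $1/i$. Thus, $\mathbf{1}_i = 1$ with probability $1/i$. Using linearity of expectation, the expected number of selected items is
$$\mathbb{E}\left[\sum_{i=r+1}^{n} \mathbf{1}_i\right] = \sum_{i=r+1}^{n} \frac{1}{i} \le \int_{r}^{n} \frac{dx}{x} = \ln\frac{n}{r} = 1. \tag{1}$$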
∎
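The two claims in the proof above can be checked empirically. The following Python sketch is a Monte Carlo illustration under stated assumptions, not the authors' implementation: the function names, n = 200, the trial count, and the seed are hypothetical, and n/e is rounded up (rather than floored) so that the bound $\sum_{i=r+1}^{n} 1/i \le \ln(n/r) \le 1$ continues to hold for non-integer n/e.

```python
import random
from math import ceil, e


def threshold_algorithm(values, r):
    """Reject the first r items; afterwards select every item that beats all earlier ones."""
    best_so_far = max(values[:r]) if r > 0 else float("-inf")
    selected = []
    for v in values[r:]:
        if v > best_so_far:  # best seen so far, so it is selected
            selected.append(v)
            best_so_far = v
    return selected


def simulate(n=200, trials=20000, seed=0):
    """Estimate P(best item selected) and E[number of selected items]."""
    rng = random.Random(seed)
    r = ceil(n / e)  # assumption: round up so the expected count stays <= 1
    hits = total = 0
    for _ in range(trials):
        values = list(range(n))  # distinct values; n - 1 is the best item
        rng.shuffle(values)      # uniformly random arrival order
        selected = threshold_algorithm(values, r)
        hits += (n - 1) in selected
        total += len(selected)
    return hits / trials, total / trials


p_best, mean_selected = simulate()
print(p_best, mean_selected)
```

With these parameters the exact values are $1 - r/n$ for the first quantity and $\sum_{i=r+1}^{n} 1/i$ for the second, so the estimates should land near $1 - 1/e$ and just below $1$.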
In addition to having the expected number of selected items at most $1$, it is useful to know the distribution of the number of items selected by the Threshold Algorithm. For that purpose, in Fig. 1 with items, we plot the histogram of the number of items selected by the Threshold Algorithm with $r = n/e$ to illustrate that not only is the expected number of selected items at most $1$, but the frequency also falls off rapidly with the number of selected items, which almost never exceeds items.
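The distribution discussed above can also be computed exactly rather than simulated. The sketch below is an illustration, not the procedure used for Fig. 1: it relies on the standard fact that, in a uniformly random permutation, the events "the item at location i is the best so far" are independent with probability 1/i, so the number of selected items is a sum of independent Bernoulli variables. The function name and n = 1000 are hypothetical, and n/e is rounded up.

```python
from math import ceil, e


def selection_count_pmf(n, r):
    """Exact PMF of the number of items selected by the Threshold Algorithm.

    Location i > r yields a selection with probability 1/i, independently
    across locations, so the PMF is built by a simple convolution.
    """
    pmf = [1.0]  # before the online phase, zero items are selected
    for i in range(r + 1, n + 1):
        p = 1.0 / i
        new = [0.0] * (len(pmf) + 1)
        for count, q in enumerate(pmf):
            new[count] += q * (1 - p)   # item at location i not selected
            new[count + 1] += q * p     # item at location i selected
        pmf = new
    return pmf


n = 1000                      # hypothetical number of items
r = ceil(n / e)               # length of the offline phase, rounded up
pmf = selection_count_pmf(n, r)
mean = sum(count * q for count, q in enumerate(pmf))
print([round(q, 4) for q in pmf[:6]], round(mean, 4))
```

The first entry of the PMF equals r/n (the probability that the best item falls in the offline phase, in which case nothing is selected), and the mass beyond a handful of selections is small.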
The Threshold Algorithm is a simple extension of the optimal algorithm for the secretary problem under the hard capacity constraint, where an item in the online phase that is better than the best seen in the offline phase is selected and the algorithm terminates. By choosing the length of the offline phase to be $n/e$, the classical result is that the best competitive ratio for the secretary problem under the hard capacity constraint of $1$ is $1/e$. What we show in Theorem 2 is that when the capacity constraint is in expectation, one can expect a nearly twofold increase in competitive ratio, from $1/e$ to $1 - 1/e$, by selecting every item that is better than the best seen so far, starting from the item that arrives at location $n/e + 1$. Thus, the relaxation in the capacity constraint allows a significant improvement in terms of selecting the best candidate.
We next show that no online algorithm can achieve a competitive ratio better than $1 - 1/e$ under the expected capacity constraint of $1$.
II-A2 Lower Bound on Competitive Ratio
We now argue that the competitive ratio of any online algorithm for solving the secretary problem under the expected capacity constraint of $1$ cannot be more than $1 - 1/e$. The following observation is immediate, since we are trying to select only the best item and to maximize the probability of its selection.
Observation 1.
An optimal algorithm will not select an item arriving at location $i$ if it is not the best seen so far. Moreover, if an optimal algorithm for solving the secretary problem under the expected capacity constraint of $1$ selects an item arriving at the $i$-th location, then it always selects any item that arrives after the $i$-th location with the largest value so far.
Theorem 3.
No online algorithm for solving the secretary problem under the expected capacity constraint of $1$ can have a competitive ratio better than $1 - 1/e$.
Proof.
Consider an optimal algorithm for solving the secretary problem under expected capacity constraint of . Let be the probability that it selects items at the end of the input sequence . Let be the probability that items selected by contain the best item.
Thus, the lower bound on the competitive ratio under the expected capacity constraint of is
(2) 
Then in light of Observation 1, we have for , since will select exactly items only when the globally best item is the item to be selected, otherwise no item is selected. Thus, we get the lower bound (2) as
(3) 
Thus, equivalently we have to solve for
(4) 
where $p_{\mathrm{miss}}$ denotes the probability that the optimal algorithm misses out on selecting the globally best item. To minimize $p_{\mathrm{miss}}$, the optimal algorithm needs to start selecting items at the earliest location possible, but in light of Observation 1, if it selects an item at location $i$, then it will always select better items arriving after location $i$, increasing the number of selected items. An item arriving at location $i$ is selected only if it is the best item seen so far, which happens with probability $1/i$. Thus, if $r$ is the first location at which the optimal algorithm decides to select an item if that item is the best seen so far, then $p_{\mathrm{miss}} = (r-1)/n$, and the expected number of items selected by the algorithm is $\sum_{i=r}^{n} 1/i$. Thus, (4) is equivalent to
$$\min_{r} \ \frac{r-1}{n} \quad \text{subject to} \quad \sum_{i=r}^{n} \frac{1}{i} \le 1. \tag{5}$$
Clearly, $\sum_{i=r}^{n} 1/i$ can be well approximated by $\ln(n/r)$. Hence, the constraint implies that $r \ge n/e$, which implies that $p_{\mathrm{miss}} \ge 1/e$, up to terms that vanish as $n$ grows. ∎
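The trade-off in the proof above can be checked numerically. The sketch below is illustrative only: it uses the convention of Theorem 2 (the first r items are rejected, and every best-so-far item from location r + 1 onward is selected), finds the smallest r whose expected number of selections is at most 1, and reports the resulting probability 1 - r/n of selecting the best item. The function name and n = 10000 are hypothetical.

```python
from math import e


def smallest_feasible_offline_length(n):
    """Smallest r such that sum_{i=r+1}^{n} 1/i <= 1.

    r counts the rejected items; lowering r by one adds the term 1/r
    (a possible selection at location r) to the expected count.
    """
    expected_selected, r = 0.0, n
    while r > 0 and expected_selected + 1.0 / r <= 1.0:
        expected_selected += 1.0 / r
        r -= 1
    return r


n = 10000  # hypothetical number of items
r_min = smallest_feasible_offline_length(n)
best_ratio = 1 - r_min / n  # probability that the best item arrives after location r
print(r_min, n / e, best_ratio, 1 - 1 / e)
```

Starting any earlier than the reported r would violate the expected capacity constraint, which is why the achievable probability stays close to $1 - 1/e$.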
Theorems 2 and 3 together characterize the optimal competitive ratio for the secretary problem under the expected capacity constraint. The classical secretary problem with a hard capacity constraint is a richly studied object, of which multiple variants have been studied. To the best of our knowledge, enforcing the capacity constraint in expectation is novel, and the more interesting upshot is that with a relaxation in the capacity constraint, there is a significant improvement in the competitive ratio, and the optimal competitive ratio can be exactly characterized.
Next, we consider the natural generalization of the secretary problem, where more than one secretary can be selected, called the $k$-secretary problem.
II-B k-Secretary Problem
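In the classical $k$-secretary problem with a hard capacity constraint, each item has weight $1$ and an online algorithm $\mathsf{A}$ can select at most $k$ items so as to maximize
$$\rho_{\mathsf{A}} = \frac{\mathbb{E}\left[v(S_{\mathsf{A}})\right]}{v(S^{\star}_{k})}, \tag{6}$$
where $S_{\mathsf{A}}$ is the set of items selected by $\mathsf{A}$ with $|S_{\mathsf{A}}| \le k$, $v(S) = \sum_{i \in S} v(i)$ for any subset $S \subseteq \mathcal{I}$ of the item set $\mathcal{I}$, and $S^{\star}_{k}$ is the best $k$-sized subset of $\mathcal{I}$ in terms of the sum of the values.
With the expected capacity constraint of $k$, the objective function remains the same as in (6), except that now the constraint is that the set of items selected by an online algorithm should satisfy $\mathbb{E}[|S_{\mathsf{A}}|] \le k$.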
Using linearity of expectation, to find a lower bound on the competitive ratio (6), it is sufficient to focus on minimum probability of selecting any item that belongs to the optimal subset . Towards that end, we propose a simple modification to the Threshold Algorithm as follows.
Theorem 4.
The KSec Threshold Algorithm with $r = n/e$ is an optimal online algorithm for the $k$-secretary problem, with competitive ratio $1 - 1/e$, and satisfies the expected capacity constraint of $k$.
Proof.
With $r = n/e$, it is easy to see that any item belonging to the optimal set of $k$ items is not selected only if it appears in the offline phase, i.e., among the first $r$ arrivals, which happens with probability $1/e$. So any item in the optimal set is selected with probability $1 - 1/e$. Therefore, the competitive ratio of the KSec Threshold Algorithm, following (6), is $1 - 1/e$.
So we only need to check that the algorithm satisfies the expected capacity constraint. Let $\mathbf{1}_i$ be the indicator function that the item appearing at location $i$ is selected by the algorithm. Then the number of items selected by the algorithm is $\sum_{i=r+1}^{n} \mathbf{1}_i$, where $r = n/e$ is the length of the offline phase. By the definition of the algorithm, the item arriving at location $i$ is selected only if it is among the best $k$ items seen so far, which happens with probability $k/i$. Thus, $\mathbf{1}_i = 1$ with probability $k/i$. Using linearity of expectation, the expected number of selected items is
$$\mathbb{E}\left[\sum_{i=r+1}^{n} \mathbf{1}_i\right] = \sum_{i=r+1}^{n} \frac{k}{i} \le k \ln\frac{n}{r} = k. \tag{7}$$
The optimality of the algorithm follows from Theorem 3, since the competitive ratio cannot exceed $1 - 1/e$ even for the secretary problem. ∎
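The argument above can be illustrated with a small simulation. The Python sketch below is one plausible reading of the KSec Threshold Algorithm as described in the proof (reject the first r items, then select every item that ranks among the best k seen so far); the function names, n = 200, k = 5, the trial count, and the seed are hypothetical, and n/e is rounded up.

```python
import heapq
import random
from math import ceil, e


def ksec_threshold(values, r, k):
    """Reject the first r items; then select each item among the k best seen so far."""
    top_k = heapq.nlargest(k, values[:r])  # k best values of the offline phase
    heapq.heapify(top_k)                   # min-heap: top_k[0] is the k-th best so far
    selected = []
    for v in values[r:]:
        if len(top_k) < k:                 # fewer than k items seen (only if r < k)
            heapq.heappush(top_k, v)
            selected.append(v)
        elif v > top_k[0]:                 # v ranks among the k best seen so far
            heapq.heapreplace(top_k, v)
            selected.append(v)
    return selected


def simulate_ksec(n=200, k=5, trials=5000, seed=1):
    """Estimate the fraction of optimal items selected and the mean number selected."""
    rng = random.Random(seed)
    r = ceil(n / e)
    optimal = set(range(n - k, n))  # the k largest of the values 0..n-1
    top_hits = total_selected = 0
    for _ in range(trials):
        values = list(range(n))
        rng.shuffle(values)
        selected = ksec_threshold(values, r, k)
        top_hits += len(optimal.intersection(selected))
        total_selected += len(selected)
    return top_hits / (k * trials), total_selected / trials


hit_fraction, mean_selected = simulate_ksec()
print(hit_fraction, mean_selected)
```

Each optimal item is selected exactly when it arrives after the offline phase, so the hit fraction should be near $1 - r/n$, while the mean number of selections should be near $\sum_{i=r+1}^{n} k/i$, just below $k$.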
Thus, exploiting the linearity of expectation, we get the same competitive ratio of $1 - 1/e$ for the $k$-secretary problem with $k > 1$ as in the $k = 1$ case. This behaviour is identical to the case of the hard capacity constraint, where the optimal competitive ratio is also $1/e$ for all values of $k$. Thus, relaxing the hard capacity constraint to an expected capacity constraint has an identical performance advantage for the $k$-secretary problem independent of the value of $k$. Now, we are ready to consider the general online knapsack problem, where the weights of items are arbitrary, under the expected capacity constraint.
III Knapsack Problem
In this section, we consider the general knapsack problem under the expected capacity constraint of . We will take a different approach for solving this general case compared to the special case of secretary problem studied in last subsection, where the weights of all items were identical.
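Before dealing with the online version of the knapsack problem, it is instructive to discuss the linear programming (LP) relaxation of its offline version, where each item can be selected fractionally. The LP formulation for the fractional offline knapsack problem with knapsack size $C$ is given by
$$\max_{x} \ \sum_{i \in \mathcal{I}} v(i)\, x(i) \quad \text{subject to} \quad \sum_{i \in \mathcal{I}} w(i)\, x(i) \le C, \quad 0 \le x(i) \le 1 \ \ \forall\, i \in \mathcal{I}, \tag{8}$$
where $v(i)$ and $w(i)$ are the value and weight of item $i$ in the item set $\mathcal{I}$, and we have relaxed the condition that $x(i) \in \{0, 1\}$ to $x(i) \in [0, 1]$.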
The following two facts are well-known [12] for the fractional knapsack problem.
Lemma 1.
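To prove this, arrange the items in increasing order of their buck-per-bang (weight-to-value ratio), and let $x(i)$ denote the fraction of item $i$ selected in (8). Suppose there is an index $j$ such that $x(j) < 1$ but $x(j+1) > 0$. Then shifting capacity from item $j+1$ to item $j$ increases the value of the objective function, while still being capacity feasible.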
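The exchange argument above corresponds to the familiar greedy rule for the fractional knapsack problem: fill the knapsack in nondecreasing order of buck-per-bang until the capacity is exhausted. A minimal Python sketch follows; the function name and the example instance are hypothetical and are not taken from the paper.

```python
def fractional_knapsack(items, capacity):
    """Greedy solution of the fractional knapsack LP.

    items: list of (value, weight) pairs with positive entries.
    Returns (x, total_value), where x[i] in [0, 1] is the fraction of item i taken.
    """
    # Sort indices by buck-per-bang = weight / value (smallest first).
    order = sorted(range(len(items)), key=lambda i: items[i][1] / items[i][0])
    x = [0.0] * len(items)
    remaining, total_value = float(capacity), 0.0
    for i in order:
        value, weight = items[i]
        take = min(1.0, remaining / weight)  # whole item if it fits, else a fraction
        x[i] = take
        total_value += take * value
        remaining -= take * weight
        if remaining <= 0:
            break  # capacity exhausted; at most one item is fractional
    return x, total_value


items = [(60, 10), (100, 20), (120, 30)]  # hypothetical (value, weight) pairs
x, total_value = fractional_knapsack(items, capacity=50)
print(x, total_value)
```

In this instance the first two items are taken whole and the third is taken fractionally, which is the structure the exchange argument predicts.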
Let be the optimal fractional solution (8) with capacity and the corresponding optimal value of (8) be .
Lemma 2.
For , we have that
(9) 
Note that property (9) may not be true for an integral optimal solution.
We will first define an offline knapsack algorithm that is allowed to use a larger capacity, similar to [13]. We then make it online with the help of the sample-and-price class of strategies, e.g., the Threshold Algorithm used in typical secretary or $k$-secretary problems, where the algorithm only observes (but does not select any of) an initial set of items and builds a threshold, which is then used to select the forthcoming items.
Algorithm : Order all the items in in nondecreasing order of their buckperbang . Select as many items in the indexed order starting from the first, subject to the augmented capacity constraint of . Thus, selects the first indexed items if , and . Let be the threshold on the buckperbang of all items selected by the algorithm, i.e., for all .
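The offline algorithm just described can be sketched as follows. This is a minimal illustration under stated assumptions: the function name `augmented_greedy`, the example instance, and the augmented capacity of 60 are hypothetical (the exact augmented capacity used in the paper is not reproduced here). Items are sorted by buck-per-bang and the longest prefix that fits within the augmented capacity is selected; the threshold is the largest buck-per-bang among the selected items.

```python
def augmented_greedy(items, augmented_capacity):
    """Select the longest prefix, in nondecreasing buck-per-bang order, that fits.

    items: list of (value, weight) pairs with positive entries.
    Returns (selected_items, threshold), where threshold is the largest
    buck-per-bang (weight / value) among the selected items, or None if empty.
    """
    ordered = sorted(items, key=lambda item: item[1] / item[0])
    selected, used = [], 0.0
    for value, weight in ordered:
        if used + weight > augmented_capacity:
            break  # prefix rule: stop at the first item that does not fit
        selected.append((value, weight))
        used += weight
    threshold = max((w / v for v, w in selected), default=None)
    return selected, threshold


items = [(60, 10), (100, 20), (120, 30), (30, 15)]  # hypothetical (value, weight) pairs
selected, threshold = augmented_greedy(items, augmented_capacity=60)
print(selected, threshold)
```

Because the rule stops at the first item that does not fit, every selected item has buck-per-bang no larger than the threshold, which is the property the online phase relies on.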
Lemma 3.
Algorithm with is approximate to the optimal solution of the offline knapsack problem (8) with hard capacity constraint , where an offline algorithm is approximate if the profit of the algorithm is at least times the optimal offline algorithm’s profit.
Proof.
Consider the set and of items selected by the optimal fractional offline algorithm with capacity (8), and the algorithm with capacity from the full set of items , respectively. By definition, the value obtained by with capacity is , and the value of is .
By definition, the set is the set of items ordered in nondecreasing order of their buckperbang, where the first of them satisfy , and where items are indexed in nondecreasing order of their buckperbang.
Moreover, since for each item , we have
(10) 
Therefore, (10) implies that the first items of indexed in nondecreasing order of buckperbang require capacity more than . Since the fractional optimal solution (8) also selects items in nondecreasing order of their buckperbang (Lemma 1), we get that if the knapsack capacity was , then the optimal fractional solution (8) would not have selected any item that is not selected by , i.e., in (8) with capacity . Thus, we get
(11) 
Moreover, from Lemma 2, with ,
which combining with (11), we get
proving the claim.
∎
III-A Competitive Online Algorithm with Capacity
Before prescribing an online algorithm for the knapsack problem under the expected capacity constraint of , we first take a detour by proposing an online version of the algorithm that uses augmented capacity . In the online setting, we aim to select as many items as possible among the set of items selected by with capacity that have buck-per-bang greater than or equal to .
To achieve this objective, following prior work [21] we need to make an extra assumption (Assumption 1) that is reasonable for most practical purposes.
Assumption 1.
We assume that given two items arriving at locations and , if , then which is reasonable for most applications.
Consider the following online algorithm . Divide the input into two phases; offline (first items) followed by online/decision (last items). The algorithm observes the first items and does not select any of them. The offline algorithm with capacity is run at the end of the offline phase (over the first items). Let be the set of items that are selected by the algorithm in the offline phase, and the buckperbang threshold be .
In the online/decision phase, we will use a modified Virtual algorithm for the secretary problem [12], starting from the arrival of item. At the beginning of the decision phase, we initialize the reference set as , and . Thus, the algorithm aims to select items as the did in the sampling phase. Moreover, in the decision phase only items with are eligible for selection, where the eventual selection is made if both the buckperbang and the weight of the newly arrived item is smaller than the buckperbang of the best item seen so far by the algorithm and if it was sampled in the offline phase.
Algorithm is a modification of the algorithm [12], with the most important change being on line that is essential to ensure that the capacity constraint of is satisfied in the online/decision phase. We illustrate this first with an example as follows.
Example 1.
Let the output of the offline phase contain four items with buckperbang with weight , respectively, where . Let . Then on the arrival of a new item in the decision phase, with buckperbang , it is included in by ejecting item with buckperbang . Thus, the updated reference set is after rearranging the items in nondecreasing order of their buckperbangs. Moreover, the new item is selected as long as its weight is less than . Thereafter, if an item with buckperbang arrives then it is neither selected nor included in . Consider, one more item arriving with buckperbang , then it is included in (selected only if its weight is less than ) and the updated set . Hereafter, no more new items can be accepted since the item with the worst buckperbang is sampled in the decision phase. Important point to note in this example (that is a property of the algorithm) is that an item in the decision phase is accepted only if its weight is less than a distinct item of , and once the weight of an item in is compared with an item that arrives in the decision phase, then item is not available for future comparisons irrespective of whether item was accepted or not. This leads us to following Lemma that the Algorithm satisfies the capacity constraint of .
Lemma 4.
Algorithm satisfies the capacity constraint of , i.e., the sum of the weight of the items accepted in the decision phase is less than .
Proof.
Note that whether an item arriving in the decision phase is selected or not, as long as it is included in the set , the item that is ejected from to make room for item which was sampled in the offline phase is never available thereafter for weight comparison for selection of new items. If the item with the worst buckperbang in is sampled in the decision phase, then no more items are selected anyway. Thus, importantly, the weight of any item that belonged to set (output of ) is compared at most once with any item arriving in decision phase. Thus, the weight of any item selected in the decision phase is less than the weight of any one distinct item of . Hence the sum of the weight of the items accepted in the decision phase is less than the sum of the weight of the items in , which is necessarily less than or equal to following the definition of .
∎
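The update rule in the example and the pairing argument of Lemma 4 can be illustrated with a minimal Python sketch. It is a reading of the prose above, not the paper's exact pseudocode: the names `Item`, `offline`, `ratio`, and `decision_phase` are hypothetical, buck-per-bang is assumed to mean value divided by weight, and ties with the worst reference item are assumed to be ignored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Item:
    value: float
    weight: float
    offline: bool = False  # True if the item was sampled in the offline phase

    @property
    def ratio(self) -> float:
        # Assumed definition of buck-per-bang: value per unit weight.
        return self.value / self.weight


def decision_phase(reference, arrivals):
    """Return (selected, final_reference) after processing arrivals in order."""
    ref = sorted(reference, key=lambda it: it.ratio)  # worst item first
    selected = []
    for item in arrivals:
        if not ref or item.ratio <= ref[0].ratio:
            continue  # not better than the worst reference item: ignore it
        ejected = ref.pop(0)  # the ejected item is never compared again
        # Select only if the ejected item came from the offline phase
        # and the new item is strictly lighter than it.
        if ejected.offline and item.weight < ejected.weight:
            selected.append(item)
        ref.append(item)
        ref.sort(key=lambda it: it.ratio)
    return selected, ref
```

Because every selected item is matched to a distinct, strictly heavier item of the offline-phase set, the total selected weight stays below the total weight of that set, which is the inequality used in the proof of Lemma 4.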
Let be an item selected by the algorithm when run on the full set of items . Then we will show that the probability of selecting item by this online algorithm is at least . Thus, we have the following result.
Lemma 5.
The competitive ratio of the algorithm with respect to is at least .
Proof.
Let be the set selected by the offline algorithm when run on the full set of items , with as the buck-per-bang threshold. Then the buck-per-bang threshold output by running with capacity on with satisfies . Therefore, all items belonging to are eligible for selection in the algorithm if they appear in the decision phase.
With the Algorithm, a new item that appears at location is selected if and only if at location , the item with the largest buck-per-bang in the reference set is sampled at or before location , and .
Since the permutations are uniformly random, the probability that at location , the item with the smallest value in the reference set is sampled at or before time is . Moreover, the probability of any item arriving at the location is independent of .
Hence the probability of selecting an item when it arrives at position , without considering the weight acceptance constraint that on line , is
Since we choose , we get that
Hence, by linearity of expectation, the expected value of the items selected by the Virtual algorithm is at least
(12) 
Now we reinstate the condition that on line of the algorithm. As noted in Lemma 4, the weight of each of the items of selected in the offline phase is compared at most once while selecting the new items in the online phase. Since for any two items , given , by Assumption 1, each item that is selected without enforcing for some item that is part of is selected with probability even when the condition is enforced, because each item selected by the AUGON algorithm has for some distinct item that is part of the offline selected set .
∎
Now we are ready to describe an online algorithm for the knapsack problem with the expected capacity constraint of . We take recourse to the algorithm for that purpose.
III-B Online Algorithm with Expected Capacity Constraint
Theorem 5.
The competitive ratio of algorithm is and it satisfies the expected capacity constraint of .
Proof.
The online algorithm uses algorithm with with probability , and does not choose any item with probability . Therefore, clearly, the expected capacity constraint of is satisfied. From Lemma 3, it follows that has an approximation ratio of with respect to the optimal offline algorithm for the knapsack problem with a hard capacity constraint of . Moreover, Lemma 5 ensures that has a competitive ratio of with respect to the offline algorithm run on the full set of items with capacity . With the choice of , is run with probability ; hence the overall competitive ratio of is with respect to the optimal offline algorithm for the knapsack problem with a hard capacity constraint of .
∎
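The randomization step in this proof can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `run_with_probability` and `expected_weight_bound` are hypothetical names, `inner` stands for any algorithm that respects an augmented hard capacity, and the probability `p` and the augmented capacity are left as parameters because their values are not legible in this copy of the text.

```python
import random


def run_with_probability(inner, items, p, rng):
    """With probability p return inner(items); otherwise select nothing."""
    if rng.random() < p:
        return list(inner(items))
    return []


def expected_weight_bound(augmented_capacity, p):
    """Upper bound on the expected total weight of the randomized wrapper.

    The inner algorithm never uses more than augmented_capacity, and it
    is run only with probability p, so the expectation is at most the product.
    """
    return p * augmented_capacity
```

For instance, if the inner algorithm is allowed twice the capacity and is run half of the time, `expected_weight_bound(2 * C, 0.5)` equals `C`, so the expected capacity constraint holds, while the expected value collected is `p` times that of the inner algorithm.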
Thus, with an expected capacity constraint, one can get a far better competitive ratio guarantee than the best known guarantee [12] for the online knapsack problem under the hard capacity constraint. The basic idea of the proposed online algorithm is to use twice the capacity with probability , so that the expected capacity constraint can be met, and to take advantage of the fact that with increased capacity, there is a simple threshold-based algorithm () that can closely approximate the optimal knapsack solution. The advantage of a threshold-based offline policy is that it can be made online with a reasonable competitive ratio using the basic ideas developed for the secretary problem, where the objective is to choose each of the top items with large enough probability. One major challenge in the knapsack problem is that, in contrast to the secretary problem, we do not know the exact number of items to be selected. Our algorithm sets the number of items to be selected equal to the number of items chosen by the threshold-based offline algorithm in the offline phase when run over a subset of items. Thus, helps in finding a good threshold for selecting the items in the online phase, as well as in determining how many items to select.
IV Conclusions
In this paper, we have considered a new paradigm for some important online problems, namely the secretary and the knapsack problem, by relaxing the hard capacity constraint to an expected capacity constraint. This relaxation allows more flexibility for online algorithms, is well motivated by modern applications such as job scheduling in cloud servers, and is also an object of theoretical interest given the attention that both the secretary and the knapsack problem have received in the literature. Under the expected capacity constraint, we show that there is a twofold increase in the competitive ratio for the secretary problem compared to the hard capacity constraint, which is significant. Moreover, for the knapsack problem, we are able to improve the competitive ratio by a factor of compared to the best online algorithm known under the hard capacity constraint. We believe that the expected capacity constraint is an exciting new direction for online problems that are usually studied with hard capacity constraints, and that it can allow fundamental improvements in their competitive ratios.
Appendix A Competitive Ratio For Adversarial Input
In this appendix, we show that under the adversarial input model, the competitive ratio of any online algorithm is at best , even under the expected capacity constraint, just as under the hard capacity constraint.
Consider items with item being the best item. Let any online algorithm select items with probability . Index all the sized subsets of in lexicographic order coming before for . Then an online algorithm selects items arriving at locations defined by with probability from the input sequence . Suppose , then the algorithm picks some subsets of locations with probability higher than ; consequently, some other subsets of locations will be picked with probability less than . An adversary using this knowledge can place the best item in any such subset of locations, in which case the probability of selecting the best candidate will be less than . Thus, with adversarial input, given that the algorithm is selecting items, the best strategy is to choose each of the location subsets with equal probability.
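The uniform strategy described above can be checked numerically with a small Python sketch. The symbols `n` (number of items) and `k` (number of items selected) are assumptions introduced here, since the original notation is not legible in this copy, and `coverage_probabilities` is a hypothetical helper name. The sketch enumerates all `k`-sized subsets of locations in lexicographic order, assigns them equal probability, and computes how likely each location is to be covered.

```python
from fractions import Fraction
from itertools import combinations


def coverage_probabilities(n, k):
    """Probability that each location 1..n lies in a uniformly random k-subset."""
    # itertools.combinations yields subsets of a sorted input in lexicographic order.
    subsets = list(combinations(range(1, n + 1), k))
    weight = Fraction(1, len(subsets))  # equal probability for every subset
    return [
        sum(weight for s in subsets if loc in s)
        for loc in range(1, n + 1)
    ]
```

Under the uniform choice every location is covered with the same probability `k / n`, so the adversary gains nothing by moving the best item. Any non-uniform choice lowers the coverage of some location, which the adversary can then exploit.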
Thus, the linear program to maximize the success probability for any online algorithm under the expected capacity constraint is
(14) 
which is equivalent to
(15) 
where Thus, the maximum probability of success is at most with the adversarial input, even when the capacity constraint is in expectation. Hence, given the expected capacity constraint of , there is no advantage in choosing a nontrivial probability distribution over the number of items to be selected when an adversary can choose the arrival sequence.
References
 [1] V. V. Vazirani, Approximation algorithms. Springer Science & Business Media, 2001.
 [2] G. Aggarwal and J. D. Hartline, “Knapsack auctions,” in Proceedings of the seventeenth annual ACM-SIAM symposium on Discrete algorithm. Society for Industrial and Applied Mathematics, 2006, pp. 1083–1092.
 [3] N. Buchbinder, J. S. Naor et al., “The design of competitive online algorithms via a primal–dual approach,” Foundations and Trends® in Theoretical Computer Science, vol. 3, no. 2–3, pp. 93–263, 2009.
 [4] A. Goel and P. Indyk, “Stochastic load balancing and related problems,” in Foundations of Computer Science, 1999. 40th Annual Symposium on. IEEE, 1999, pp. 579–586.
 [5] W. Shi, L. Zhang, C. Wu, Z. Li, and F. Lau, “An online auction framework for dynamic resource provisioning in cloud computing,” ACM SIGMETRICS Performance Evaluation Review, vol. 42, no. 1, pp. 71–83, 2014.
 [6] L. Zhang, Z. Li, and C. Wu, “Dynamic resource provisioning in cloud computing: A randomized auction approach,” in IEEE INFOCOM 2014 - IEEE Conference on Computer Communications. IEEE, 2014, pp. 433–441.
 [7] Z. Zheng, M. Li, X. Xiao, and J. Wang, “Coordinated resource provisioning and maintenance scheduling in cloud data centers,” in INFOCOM, 2013 Proceedings IEEE. IEEE, 2013, pp. 345–349.
 [8] M. Cello, G. Gnecco, M. Marchese, and M. Sanguineti, “A generalized stochastic knapsack problem with application in call admission control,” in 10th Cologne-Twente Workshop on Graphs and Combinatorial Optimization CTW 2011, p. 105.
 [9] Y. Zhang and C. Leung, “Resource allocation in an OFDM-based cognitive radio system,” IEEE Transactions on Communications, vol. 57, no. 7, pp. 1928–1931, 2009.
 [10] A. Marchetti-Spaccamela and C. Vercellis, “Stochastic online knapsack problems,” Mathematical Programming, vol. 68, no. 1–3, pp. 73–104, 1995.
 [11] X. Han, Y. Kawase, and K. Makino, “Randomized algorithms for online knapsack problems,” Theoretical Computer Science, vol. 562, pp. 395–405, 2015.
 [12] M. Babaioff, N. Immorlica, D. Kempe, and R. Kleinberg, “A knapsack secretary problem with applications,” in Approximation, randomization, and combinatorial optimization. Algorithms and techniques. Springer, 2007, pp. 16–28.
 [13] K. Iwama and G. Zhang, “Online knapsack with resource augmentation,” Information Processing Letters, vol. 110, no. 22, pp. 1016–1020, 2010.
 [14] B. Tan and R. Srikant, “Online advertisement, optimization and stochastic networks,” IEEE Transactions on Automatic Control, vol. 57, no. 11, pp. 2854–2868, Nov 2012.
 [15] T. S. Ferguson, “Who solved the secretary problem?” Statistical science, pp. 282–289, 1989.
 [16] P. Freeman, “The secretary problem and its extensions: A review,” International Statistical Review/Revue Internationale de Statistique, pp. 189–206, 1983.
 [17] J. Preater, “On multiple choice secretary problems,” Mathematics of Operations Research, vol. 19, no. 3, pp. 597–602, 1994.
 [18] R. Kleinberg, “A multiple-choice secretary algorithm with applications to online auctions,” in Proceedings of the sixteenth annual ACM-SIAM symposium on Discrete algorithms. Society for Industrial and Applied Mathematics, 2005, pp. 630–631.
 [19] J. Gianini and S. M. Samuels, “The infinite secretary problem,” The Annals of Probability, pp. 418–432, 1976.
 [20] M. Babaioff, N. Immorlica, and R. Kleinberg, “Matroids, secretary problems, and online mechanisms,” in Proceedings of the eighteenth annual ACM-SIAM symposium on Discrete algorithms. Society for Industrial and Applied Mathematics, 2007, pp. 434–443.
 [21] R. Vaze, “Online knapsack problem and budgeted truthful bipartite matching,” in IEEE INFOCOM. IEEE, 2017.
 [22] N. Buchbinder and J. Naor, “Online primal-dual algorithms for covering and packing problems,” in ESA, vol. 3669. Springer, 2005, pp. 689–701.
 [23] N. Buchbinder and J. Naor, “Improved bounds for online routing and packing via a primal-dual approach,” in Foundations of Computer Science, 2006. FOCS’06. 47th Annual IEEE Symposium on. IEEE, 2006, pp. 293–304.
 [24] Y. Zhou, D. Chakrabarty, and R. Lukose, “Budget constrained bidding in keyword auctions and online knapsack problems,” in International Workshop on Internet and Network Economics. Springer, 2008, pp. 566–576.
 [25] N. Korula and M. Pál, “Algorithms for secretary problems on graphs and hypergraphs,” in Automata, Languages and Programming. Springer, 2009, pp. 508–520.