1 Introduction
Uncertainty plays a critical role in many real-world applications where the decision maker is faced with multiple alternatives with different costs. Such decisions arise in our daily lives, such as whether to rent an apartment or buy a house, and cannot be answered reliably without knowledge of the future. In a more general setting with multiple alternatives, such as a large number of files with different execution times in a distributed computing system, it is hard to decide which file should be executed next without knowing which files will arrive in the future. These decision-making problems are usually modeled as online rent-or-buy problems, such as the classical ski rental problem [7, 11, 8].
Two paradigms have been widely studied to deal with such uncertainty. On the one hand, online algorithms are designed without prior knowledge of the problem inputs, and the competitive ratio (CR) is used to characterize the goodness of an algorithm in the absence of future information. On the other hand, machine learning addresses uncertainty by making future predictions via robust models built on prior data. Recently, there has been a popular trend of incorporating machine-learned (ML) predictions into the design of online algorithms to improve their performance [14, 12, 17, 16, 4, 9, 2, 3, 10]. Two properties are desired in online algorithm design with ML predictions: (i) if the predictor is good, the online algorithm should perform close to the best offline algorithm (a design goal called consistency); and (ii) if the predictor is bad, the online algorithm should not degrade significantly, i.e., its performance should be close to that of the online algorithm without predictions (a design goal called robustness). Importantly, these properties are achieved under the assumption that the online algorithm has no knowledge about the quality of the predictor or the prediction error types.
While previous studies focused on using ML predictions for a single skier to buy or rent the skis in a single shop, we study the more general setting where the skier has multiple shops from which to buy or rent the skis, with different buying and renting prices. We call this the multi-shop ski rental (MSSR) problem. This is often the case in practice, where the skier not only needs to decide when to buy, but also where to buy, whereas only the decision of when to buy is needed in the classical single-shop ski rental problem. Furthermore, we consider not only the case of using a single ML prediction, which is inspired by recent work [12, 17, 9, 2], but also the case of getting predictions from multiple ML models. Closest to ours is the work by [4], which considered the case where multiple experts provide advice in a single shop, and which can be considered a special case of our problem. However, we incorporate multiple predictions into decision making by comparing the number of predictions to a threshold, which is much easier to implement in real-world systems.
1.1 The ski rental problem
In the ski rental problem, the skier is going to ski for an unknown number of days, and on each day has to decide between renting skis at a unit cost and buying skis at a higher price $b$. It is easy to see that the skier should buy skis on the first day if she is going to ski for more than $b$ days, and otherwise rent every day. The number of skiing days is unknown in advance, and is only revealed at the end of the skiing season. It is well-known that the best deterministic algorithm is to rent for the first $b-1$ days and then buy on day $b$, which achieves a competitive ratio of $2 - 1/b$. On the other hand, the best randomized algorithm [7] achieves a competitive ratio of $e/(e-1)$. The ski rental problem and many of its generalizations, such as dynamic TCP acknowledgement [5], the parking permit problem [15], snoopy caching [6], renting cloud servers [8] and others, are canonical examples of online rent-or-buy problems, which play a central role in decision making in many different settings and have been extensively studied in different domains.
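As a concrete check of these classical facts, the break-even strategy can be simulated directly. This is a minimal sketch; the buying price $b$ below is an arbitrary assumption, and renting is normalized to one unit per day.

```python
def break_even_cost(b: int, days: int) -> int:
    """Cost of the break-even strategy: rent (one unit per day) for
    the first b - 1 days, then buy on day b at price b."""
    if days < b:
        return days          # rented every day, never bought
    return (b - 1) + b       # b - 1 days of rent plus the purchase

def opt_cost(b: int, days: int) -> int:
    """Offline optimum: buy on day 1 if skiing at least b days, else rent."""
    return min(days, b)

# The worst case is exactly days == b, giving (2b - 1) / b = 2 - 1/b.
b = 10
worst = max(break_even_cost(b, x) / opt_cost(b, x) for x in range(1, 5 * b))
print(worst)  # 1.9, i.e., 2 - 1/10
```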
1.2 The multishop ski rental problem
We consider the multi-shop ski rental problem, in which the skier has multiple shops from which to buy or rent the skis, with different buying and renting prices. In such an MSSR, the skier has to make a twofold decision, i.e., when and where to buy. Specifically, we consider the case where the skier must choose one shop at the beginning of the skiing season, and must buy or rent the skis at that particular shop thereafter. In other words, once a shop is chosen by the skier, the only decision variable is when she should buy the skis. The MSSR not only naturally extends the classical ski rental problem, where a single skier rents or buys the skis in a single shop, but also allows heterogeneity in the skier's options. This desirable feature makes the ski rental problem a more general modeling framework for online algorithm design. Here we give a few real-world applications that can be modeled with MSSR.
Example 1: Cost in Cloud CDN Service. With the advent of cloud computing, the content service provided by content distribution networks (CDNs) has been offered as managed platforms with a novel pay-as-you-go model for cloud CDNs. For example, cloud providers such as Microsoft Azure and Amazon AWS now provide different price options to users based on their demand, which is usually unknown in advance. Table 1 lists the price options provided by Microsoft Azure. Each price option can be considered as a shop in the MSSR problem, and the hourly price is the renting price.
Options           Hourly price ($)
Pay-as-you-go
1-year reserved
3-year reserved
Example 2: Caching in Wireless Sensor Networks. A content item can be replicated and stored in multiple base stations to serve requests from users. Upon a user request, if the requested content is stored in a base station, the service latency is short; otherwise, it incurs a longer latency to fetch the requested content from remote servers. On the other hand, content can be prefetched and stored in base stations at the expense of wasted space if the content is never requested by users. In this application, each base station is considered a shop, renting corresponds to serving requests on demand, and buying refers to prefetching content in advance.
1.3 Consistency and Robustness
The competitive ratio of an online algorithm is defined as the worst-case ratio of the algorithm's cost (ALG) to that of the offline optimum (OPT). Inspired by [12, 17, 4, 2], we also use the notions of consistency and robustness to evaluate our algorithms. We denote by $\eta$ the prediction error, which is the absolute difference between the prediction and the actual outcome. We say that an online algorithm is $\alpha$-consistent if its competitive ratio is at most $\alpha$ when the prediction is accurate, i.e., $\eta = 0$, and $\beta$-robust if its competitive ratio is at most $\beta$ for all $\eta$ and all feasible outcomes of the problem. We call $\alpha$ and $\beta$ the consistency factor and the robustness factor, respectively. Thus, consistency characterizes how well the algorithm does in the case of perfect predictions, and robustness characterizes how well it does under worst-case predictions.
This novel analytical framework can bridge the gap between two radically different online algorithm design methodologies. On the one hand, the worst-case analysis framework always assumes that the future is unpredictable, and tries to design online algorithms with a bounded competitive ratio. On the other hand, historical data are usually used to make predictions for decision making in real-world systems. However, this approach results in poor performance if the future inputs look different from the past ones. In this framework, a hyperparameter $\lambda \in [0, 1]$ is leveraged to determine the trust placed in ML predictions, where $\lambda = 0$ indicates full trust in ML predictions and $\lambda = 1$ indicates no trust. To that end, online algorithm design with ML predictions can provide a full-spectrum coverage from pure worst-case to fully prediction-based decision making. In this paper, our goal is to design online algorithms for MSSR that improve the consistency factor without significantly degrading the robustness factor, compared to algorithms for MSSR without predictions.

1.4 Main Results
Our main contribution is to develop online algorithms for MSSR with consistency and robustness properties in the presence of ML predictions. We develop new analysis techniques for online algorithms with ML predictions via the hyperparameter $\lambda$. We first define a few notions before presenting our main results. We assume there are $n$ shops with buying prices $b_1, \ldots, b_n$ and renting prices $r_1, \ldots, r_n$. We develop several online algorithms for MSSR with a single ML prediction or multiple predictions, as highlighted below:
We first present the best deterministic algorithm (i.e., the one achieving the minimal competitive ratio) for MSSR without ML predictions. It turns out that this algorithm commits to exactly one shop, the one minimizing a simple function of the buying and renting prices, and buys on a fixed day at that shop.
Next, we consider MSSR with a single ML prediction. We show that if this ML prediction is naively used in algorithm design, the resulting algorithm cannot ensure robustness (Section 3.1). We then incorporate the ML prediction in a judicious manner by first proposing a deterministic online algorithm that is both consistent and robust (Section 3.2). We further propose a randomized algorithm with consistency and robustness guarantees (Section 3.3). We numerically evaluate the performance of our online algorithms (Section 3.4). We show that with a natural prediction error model, our algorithms are practical and achieve better performance than the ones without ML predictions. We also investigate the impact of several parameters and provide insights into the benefits of using ML predictions. It turns out that the predictions need to be carefully incorporated into online algorithm design.
We then study a more general setting where we get ML predictions from multiple ML models. We redefine the prediction error to incorporate the aggregate impact of the ML predictions into our algorithms. We slightly modify the algorithms and show that similar techniques lead to tight results for online algorithms of MSSR with multiple ML predictions. In particular, we propose both a deterministic algorithm (Section 4.1) and a randomized algorithm (Section 4.2) with consistency and robustness guarantees. Finally, numerical results are given to demonstrate the impact of multiple ML predictions.
1.5 Related Work
Our work is inspired by the aforementioned recent trend of incorporating ML predictions into online algorithm design [14, 12, 17, 16, 4, 9, 2, 3, 10]. In particular, we use the concepts of consistency and robustness from [12]. For example, [12] incorporates ML predictions into the classical Marker algorithm, ensuring both robustness and consistency for caching. [17] and [4] extend the models for a comprehensive understanding of the classical single-shop ski rental problem with a single piece of advice and multiple pieces of advice, respectively. [2] further quantifies the impact of advice quality and proposes a Pareto-optimal algorithm for the ski rental problem. While we operate in the same framework, none of the previous results can be directly applied to our setting, as our work significantly differs from previous studies: we consider a multi-shop ski rental problem with multiple ML predictions, where the skier has to make a twofold decision on when and where to buy. This makes the problem considerably more challenging but also more practical.
Closest to our model are settings with multiple options in one shop [11] or multiple shops [1]; however, no ML prediction is incorporated in their online algorithm design. On the other hand, there is an extensive body of work on online optimization with advice; in particular, multiple predictions have been studied in the context of online learning. However, existing techniques are not applicable to our multi-shop setting. We refer interested readers to the surveys [3, 13] for a comprehensive discussion.
2 Preliminaries
We consider the multi-shop ski rental (MSSR) problem, where a skier goes to ski for an unknown number of days. The skier can buy or rent skis from multiple shops with different buying and renting prices. The skier must choose one shop as soon as she starts skiing.
More precisely, we assume that there are in total $n$ shops, and denote the set of shops as $\{1, 2, \ldots, n\}$. Each shop $i$ offers a renting price of $r_i$ dollars per day and a buying price of $b_i$ dollars. In particular, our model reduces to the classical ski rental problem when $n = 1$. In an MSSR problem, it is obvious that if one shop has higher prices for both renting and buying than another shop, it is suboptimal to choose this shop. To that end, we assume that the shops can be ordered so that the buying prices are increasing while the renting prices are decreasing. For ease of exposition, we normalize the prices so that the unit renting cost matches the one used in the classical ski rental problem. Let $x$ be the actual number of skiing days, which is unknown to the algorithm.
We first consider the offline optimal algorithm, where $x$ is known. It is easy to see that the skier should rent every day at the shop with the lowest renting price if renting throughout is cheaper, and buy on day one at the shop with the lowest buying price otherwise.
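In code, this offline rule is a one-line comparison. The two shops below are illustrative assumptions, not the paper's parameters.

```python
def mssr_opt(rents, buys, days):
    """Offline optimum for MSSR with the horizon known: either rent
    every day at the cheapest renting shop, or buy on day 1 at the
    cheapest buying shop, whichever is cheaper."""
    return min(min(rents) * days, min(buys))

rents, buys = [1.0, 0.8], [10.0, 16.0]
print(mssr_opt(rents, buys, 5))   # short season: rent at the cheap-rent shop
print(mssr_opt(rents, buys, 40))  # long season: buy at the cheap-buy shop
```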
2.1 Best Deterministic Online Algorithm for MSSR
It is well-known that the best deterministic algorithm for the classical ski rental problem is the break-even algorithm: rent until day $b-1$ and buy on day $b$. The corresponding competitive ratio is $2 - 1/b$, and no other deterministic algorithm can do better. Now we consider the best deterministic online algorithm (BDOA) that obtains the minimal competitive ratio for MSSR without any ML prediction. We make the following assumption.
Assumption 1.
The skier cannot change the shop once she chooses it, but she can decide to buy or to continue renting the skis at that particular shop at any time.
Lemma 1.
The best deterministic algorithm for MSSR is that the skier rents for the first days and buys on day at shop where . The corresponding competitive ratio is
Proof sketch of Lemma 1: It is obvious that Under Assumption 1, we can consider the competitive ratio of shop Let be the buying day. Then if otherwise It is easy to argue that the worst case happens when We have . Inspired by the classical ski rental problem, we can show that the best is achieved when Thus, we have The proof details are available in Appendix A.
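Lemma 1 can also be sanity-checked numerically: under Assumption 1, each deterministic strategy is a (shop, buying day) pair, and a brute-force search recovers the best one. The prices below are illustrative assumptions, and `horizon` caps the adversary's number of days.

```python
def worst_cr(rent, buy, t, rents, buys, horizon=1000):
    """Worst-case competitive ratio of the strategy: commit to one
    shop, rent there until day t - 1, and buy on day t."""
    worst = 0.0
    for x in range(1, horizon):
        alg = rent * x if x < t else rent * (t - 1) + buy
        opt = min(min(rents) * x, min(buys))  # offline optimum
        worst = max(worst, alg / opt)
    return worst

rents, buys = [1.0, 0.8], [10.0, 16.0]
cr, shop, day = min(
    (worst_cr(r, b, t, rents, buys), i, t)
    for i, (r, b) in enumerate(zip(rents, buys))
    for t in range(1, 200)
)
print(cr, shop, day)  # best achievable CR, committed shop, buying day
```

For these example prices the search commits to the first shop (unit rent, cheap purchase) even though the second shop rents more cheaply, matching the lemma's message that the choice depends on both prices jointly.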
3 Online Algorithms for MSSR with a single ML prediction
In this section, we consider MSSR with a single ML prediction. Let $y$ be the predicted number of skiing days; then $\eta = |y - x|$ is the prediction error. For ease of exposition, we use the two-shop ski rental problem as a motivating example, and then generalize the results to the general MSSR with $n$ shops.
3.1 A Simple Algorithm with ML prediction
Lemma 2.
The cost of Algorithm 1 satisfies
Proof.
Since there is only one break-even point, we consider four cases based on how and compare with it:
(i) , : , i.e., ;
(ii) , : , i.e., ;
(iii) , : , i.e., ;
(iv) , : , i.e.,
Combining (i)-(iv), the competitive ratio is unbounded.
Furthermore, we can rewrite (ii),
Similarly, by rewriting (iv), we also have ∎
We now generalize Algorithm 1 and Lemma 2 to the general MSSR with shops. Inspired by Lemma 1, it is easy to check that it is suboptimal to buy at shop with and rent at shop with
Corollary 1.
The simple algorithm with an ML prediction for the general MSSR with $n$ shops is as follows: the skier buys on day at shop if , and otherwise rents at shop . The corresponding cost satisfies
We note that by simply following the ML prediction, the competitive ratio of Algorithm 1 is unbounded even when the prediction error is small (due to case (iii)). Furthermore, Algorithm 1 has no robustness guarantee. In the following, we show how to properly integrate the ML prediction into online algorithm design to achieve both consistency and robustness.
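This failure mode is easy to reproduce in a single-shop sketch (renting at one unit per day and a buying price of 10, both assumed): a prediction that says "rent" while the season runs long makes the ratio grow linearly with the horizon.

```python
def naive_cost(y, x, rent=1.0, buy=10.0):
    """Follow the prediction blindly: buy on day 1 if the predicted
    number of days y reaches the buying price, otherwise rent forever."""
    return buy if y >= buy else rent * x

# Prediction says 2 days, reality is 1000 days.
x, y = 1000, 2
print(naive_cost(y, x) / min(x, 10.0))  # 100.0, and unbounded as x grows
```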
3.2 A Deterministic Algorithm with Consistency and Robustness Guarantee
We develop a new deterministic algorithm by introducing a hyperparameter $\lambda$, which gives us a smooth tradeoff between the consistency and the robustness of the algorithm.
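Algorithm 2 itself is defined by shop-dependent thresholds that are not reproduced here; as a reference point, the single-shop analogue of such a $\lambda$-tuned rule, in the spirit of [17], can be sketched as follows (unit rent and an integer buying price $b$ are assumptions):

```python
import math

def lam_threshold_cost(y, x, b, lam):
    """Single-shop lambda-tuned deterministic rule: if the prediction
    y says 'buy' (y >= b), buy early, on day ceil(lam * b); otherwise
    buy late, on day ceil(b / lam). Smaller lam means more trust."""
    t = math.ceil(lam * b) if y >= b else math.ceil(b / lam)
    return x if x < t else (t - 1) + b
```

With an accurate "buy" prediction ($y = x = 20$, $b = 10$, $\lambda = 0.5$) the cost is 14 against an optimum of 10, within the $1 + \lambda$ consistency bound; with the prediction maximally wrong the cost is 29, within the $1 + 1/\lambda$ robustness bound.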
Theorem 1.
Proof.
We first prove the first bound. When we consider two cases.
First, if , then , i.e., rent at shop since Hence we have
i.e.,
Second, if , we have
When we have , i.e., buy at shop on day as then When we have then thus, Combining these two cases, we have .
Similarly, when we consider the following three cases.
First, if we have It is clear that , i.e.,
Second, if , we have , i.e., buy at shop on day , and
where (a) is obtained by following Algorithm 2, i.e., rent at shop with and (b) holds true due to the predictor error definition. Therefore, we have .
Finally, if , we have and
where (c) follows , i.e., then . Thus
Combining and we get the first bound.
Now we prove the second bound. According to Algorithm 2, the skier rents the skis at shop until day and then buys on day at shop , when the predicted day satisfies , we have
if . It is easy to see that the worst CR is obtained when , for which . Therefore,
Similarly, the skier rents the skis at shop until day and then buys on day at shop , when the worst CR is obtained when for which , and
∎
Similarly, we can generalize the above results to the general MSSR with $n$ shops.
Corollary 2.
The deterministic algorithm with a single ML prediction for the general MSSR with $n$ shops is as follows: the skier buys on day at shop if , and otherwise buys on day at shop . The corresponding competitive ratio is at most , where $\lambda$ is a hyperparameter. In particular, the deterministic algorithm is consistent and robust.
Remark 1.
The competitive ratio is a function of the hyperparameter $\lambda$ and the prediction error $\eta$, which differs from the conventional competitive-ratio design. By tuning the value of $\lambda$, one can achieve different values of the competitive ratio. The competitive ratio might even be worse than that of the BDOA in some cases (e.g., when the prediction error is large); we will show this in Section 3.4. This demonstrates that decision making based on ML predictions comes at the cost of a weaker worst-case performance guarantee. Finally, it is possible to find the optimal $\lambda$ that minimizes the worst-case competitive ratio if the prediction error is known (e.g., from historically observed error values).
3.3 A Randomized Algorithm with Consistency and Robustness Guarantee
We consider a class of randomized algorithms for MSSR in this section. Similarly, we consider a hyperparameter $\lambda$. First, we emphasize that a randomized algorithm that naively modifies the distribution used in randomized algorithm design for the classical ski rental problem, with or without predictions, fails to achieve better consistency and robustness at the same time. We instead customize the distribution functions carefully by incorporating the different renting and buying prices of the different shops into the distributions, as summarized in Algorithm 3.
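The exact distributions of Algorithm 3 are not reproduced here; the single-shop analogue from [17] conveys the idea: the buying day is drawn from a truncated geometric-like distribution whose support shrinks when the prediction says "buy". This is a sketch under that assumption, not the paper's Algorithm 3.

```python
import math
import random

def sample_buy_day(y, b, lam, rng=random):
    """Draw a buying day for the lambda-tuned randomized rule: the
    support is {1, ..., ceil(lam * b)} when the prediction y says
    'buy' (y >= b) and {1, ..., ceil(b / lam)} otherwise, with day i
    weighted proportionally to ((b - 1) / b) ** (k - i)."""
    k = math.ceil(lam * b) if y >= b else math.ceil(b / lam)
    weights = [((b - 1) / b) ** (k - i) for i in range(1, k + 1)]
    return rng.choices(range(1, k + 1), weights=weights)[0]
```

Note that later days get larger weights, so the rule still tends to wait, but never past the support's end.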
Theorem 2.
Proof.
We compute the competitive ratio of Algorithm 3 under four cases.
Case . and It is clear that According to Algorithm 3, the skier should rent at shop until day and buy on day
This happens with probability
, for , and incurs a cost . Therefore, we have where (a) holds since , for , and (b) follows from , i.e., and increases in
Case . and Since we have If the skier buys the skis on day then it incurs a cost , otherwise, the cost is . Therefore, we obtain the robustness through the following
where (c) holds true since i.e., and To get the consistency, we can rewrite the above inequality
where (d) follows , (e) holds true since , and , and (f) follows that
Case . and It is clear that Similar to Case , we have
where (g) follows that , i.e., , (h) follows from two cases i) when , we have ; and ii) when we have as thus Hence, . (i) holds since and increases in as mentioned earlier.
Case . and As we have . Similar to Case , we have the robustness as
where (j) follows that and i.e., Again, we rewrite the above inequality to get the consistency
where (k) follows that and . ∎
Again, we can generalize Algorithm 3 to the general MSSR problem with $n$ shops. As it is suboptimal to rent at any shop besides shop and to buy at any shop besides shop , the randomized algorithm for the general MSSR simply replaces shop by shop with the corresponding and in Algorithm 3. Similarly, the corresponding competitive ratio can be obtained by replacing and in Theorem 2 with and of shop . The hyperparameter should satisfy
3.4 Model Validation and Insights
In this section, we numerically evaluate the performance of our algorithms. For all our experiments, we fix the number of shops, the buying costs, and the renting costs (in dollars). Note that the actual values of the buying and renting costs are not important, as we can scale all of them by constant factors. The actual number of skiing days $x$ is a random variable drawn uniformly from a bounded support. The predicted number of skiing days is set to $y = x + \epsilon$, where $\epsilon$ is drawn from a normal distribution with mean $\mu$ and standard deviation $\sigma$. We vary the value of $\sigma$ over a range to verify the consistency and robustness of our algorithms. To characterize the impact of the hyperparameter $\lambda$ on the performance of the deterministic and randomized algorithms, we consider several values of $\lambda$; note that $\lambda = 1$ means that our algorithms ignore the ML prediction and reduce to the algorithms without predictions. For each parameter setting, we plot the average competitive ratio obtained by running the corresponding algorithm over independent trials. We consider both unbiased and biased prediction errors in our experiments.

3.4.1 Unbiased Prediction Errors
We first consider unbiased prediction errors, i.e., $\mu = 0$, to characterize the impact of $\sigma$ and $\lambda$.
The impact of the support of $x$. As $x$ is uniformly drawn from its support, the size of the support is an important parameter that can impact the competitive ratio, and we consider two possible values for it. When the support is large, it is highly likely that the actual number of skiing days exceeds the break-even point; thus, according to Algorithm 2, buying as early as possible is the better choice, i.e., a small $\lambda$ results in a better competitive ratio, as shown in Figure 1. On the other hand, when the support is small, it is highly likely that $x$ is below the break-even point. Therefore, if the prediction is accurate (small $\sigma$), a smaller $\lambda$ (i.e., more trust in ML predictions) achieves a smaller competitive ratio, while if the prediction is inaccurate (large $\sigma$), a larger $\lambda$ achieves a smaller competitive ratio. This can also be observed from Figure 1. In particular, with the values of the buying and renting prices in our setting, $\lambda = 1$, i.e., not trusting the prediction, achieves the best competitive ratio when the prediction error is large. We observe a similar trend for the randomized algorithm (Algorithm 3), as shown in Figure 2.
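These trends can be reproduced with a small Monte-Carlo harness. For brevity the sketch below uses the single-shop $\lambda$-rule as a stand-in for Algorithm 2, with an assumed buying price and an assumed support for $x$; the qualitative effect of $\lambda$ is the same.

```python
import math
import random

def avg_cr(lam, sigma, b=10, x_max=40, trials=10000, seed=0):
    """Monte-Carlo estimate of the average competitive ratio of the
    lambda-tuned deterministic rule under noisy predictions y = x + eps."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        x = rng.randint(1, x_max)                # actual skiing days
        y = max(1.0, x + rng.gauss(0.0, sigma))  # noisy prediction
        t = math.ceil(lam * b) if y >= b else math.ceil(b / lam)
        alg = x if x < t else (t - 1) + b
        total += alg / min(x, b)
    return total / trials

# Accurate predictions reward trusting them (small lambda).
print(avg_cr(lam=0.3, sigma=0.5) < avg_cr(lam=1.0, sigma=0.5))  # True
```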
We further compare the performance of the deterministic algorithm (Algorithm 2) and the randomized algorithm (Algorithm 3), as shown in Figure 4. We make the following observations: (i) with the same prediction errors, the randomized algorithm always performs better than the deterministic algorithm; similar trends are observed for other parameter values and hence are omitted due to space constraints. (ii) Our deterministic algorithm with an ML prediction can beat the performance of the classical randomized algorithm without ML predictions when the standard deviation of the prediction error is small.

The impact of hyperparameter $\lambda$. The hyperparameter $\lambda$ incorporates the trust placed in ML predictions into online algorithm design. In particular, $\lambda$ close to 0 means more trust in predictions, while $\lambda$ close to 1 means less trust. We investigate the impact of $\lambda$ on the deterministic algorithm (Algorithm 2) by considering a perfect prediction and an extremely erroneous prediction. From Figure 4, we observe: (i) with an extremely erroneous prediction, blindly trusting the prediction (smaller $\lambda$) leads to even worse performance than the BDOA without ML predictions; (ii) by properly choosing $\lambda$, our algorithm achieves better performance than the BDOA even with an extremely erroneous prediction. This demonstrates the importance of the hyperparameter $\lambda$.
3.4.2 Biased Prediction Errors
Next we consider the impact of biases in prediction errors. We consider three possible values of the mean $\mu$. The performance of the deterministic algorithm (Algorithm 2) and the randomized algorithm (Algorithm 3) is shown in Figure 5 and Figure 6, respectively. Given the above analysis and the same trust in ML predictions (the same $\lambda$), a smaller bias benefits the competitive ratio when the variance is small; when the variance is large, however, the impact of bias is negligible.
4 Online Algorithms for MSSR with Multiple ML Predictions
Now we consider a more general case in which there are $k$ ML predictions, denoted as $y_1, \ldots, y_k$. Without loss of generality, we assume $y_1 \le y_2 \le \cdots \le y_k$. We define an indicator function to represent the relation between each prediction and the break-even point, satisfying
Let $m$ denote the number of predictions that are greater than the break-even point. We redefine the prediction error in the multiple-predictions case as
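The counting step is simple to implement; the sketch below uses `>=` against an assumed break-even threshold, and the variable names are ours.

```python
def count_buy_votes(predictions, threshold):
    """m: how many of the k predictions exceed the break-even
    threshold. The algorithms in this section compare m against a
    cutoff on k instead of inspecting each prediction separately."""
    return sum(1 for y in predictions if y >= threshold)

predictions = [3, 8, 14, 25, 30]
m = count_buy_votes(predictions, threshold=10)
print(m, m > len(predictions) / 2)  # 3 True: a majority votes 'buy'
```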
In this section, we design deterministic and randomized algorithms for MSSR with multiple ML predictions. We slightly modify the algorithms proposed in Section 3, and show that similar techniques lead to tight results for online algorithms of MSSR with multiple ML predictions. Again, for ease of exposition, we take the two-shop ski rental problem as a motivating example; the results can be easily generalized to the general $n$-shop MSSR.
4.1 A Deterministic Algorithm with Multiple ML Predictions
We first design a deterministic algorithm tuned by a hyperparameter $\lambda$ to achieve a tradeoff between consistency and robustness.
Theorem 3.
Proof sketch of Theorem 3: Denote and For the first bound, when we consider two cases and ; and when we consider three cases, and . We can compute the corresponding ALG and OPT to achieve the results. Due to space constraints, we omit the details here. Similar to the proof of Theorem 1, we can achieve the worst case CR when if , and when if The proof details are available in Appendix B.
4.2 A Randomized Algorithm with Multiple ML Predictions
In this section, we propose a randomized algorithm with multiple ML predictions that achieves a better tradeoff between consistency and robustness than the deterministic algorithm.
Theorem 4.
4.3 Model Validation and Insights
We consider the same setting as in Section 3.4. We vary the number of ML predictions $k$, and set the associated predictions to $y_i = x + \epsilon_i$, where each $\epsilon_i$ is drawn from a normal distribution with mean $\mu$ and standard deviation $\sigma$. We investigate the impact of $k$, $\lambda$, and the error parameters on the performance and make the following observations: (i) for unbiased prediction errors and fixed $\lambda$, if the predictions are accurate (small $\sigma$), increasing $k$ improves the competitive ratio; however, more predictions hurt the competitive ratio when the prediction error is large, see Figure 8. (ii) For fixed $k$, if the predictions are accurate, more trust (small $\lambda$) benefits the algorithm; on the other hand, less trust achieves a better competitive ratio when the prediction error is large, see Figure 8. (iii) For fixed $k$ and $\lambda$, a smaller bias benefits the competitive ratio when the variance is small, while a larger bias achieves a smaller competitive ratio when the variance is large, see Figure 10. We also characterize the impact of the correction term in Algorithm 4, and compare the algorithms with and without it in the break-even points, see Figure 10. We observe that this term can improve the competitive ratio, as it suggests that the skier buy earlier when more predictions are above the break-even point and rent longer when more predictions are below it, i.e., it makes decisions more cautious.
5 Conclusions
In this paper, we investigate how to improve the worst-case performance of online algorithms with (multiple) ML predictions. In particular, we consider the general multi-shop ski rental problem, and develop both deterministic and randomized algorithms for the cases of a single ML prediction and multiple ML predictions. Our online algorithms achieve a smooth tradeoff between consistency and robustness, and can significantly outperform their counterparts without ML predictions. Going further, we will study extensions of MSSR, e.g., allowing the skier to switch shops, in which case she can simultaneously decide where to buy or rent the skis. We will also consider integrating prediction costs into the online algorithm design.
References
 [1] (2014) The multi-shop ski rental problem. ACM SIGMETRICS Performance Evaluation Review 42 (1), pp. 463–475. Cited by: §1.5.
 [2] (2019) Online computation with untrusted advice. arXiv preprint arXiv:1905.05655. Cited by: §1.3, §1.5, §1, §1.
 [3] (2016) Online algorithms with advice: a survey. ACM SIGACT News 47 (3), pp. 93–129. Cited by: §1.5, §1.5, §1.
 [4] (2019) Online algorithms for rent-or-buy with expert advice. In International Conference on Machine Learning, pp. 2319–2327. Cited by: §1.3, §1.5, §1, §1.

 [5] (2001) Dynamic TCP acknowledgement and other stories about e/(e-1). In Proceedings of the Thirty-Third Annual ACM Symposium on Theory of Computing, pp. 502–509. Cited by: §1.1.
 [6] (1988) Competitive snoopy caching. Algorithmica 3 (1-4), pp. 79–119. Cited by: §1.1.
 [7] (1994) Competitive randomized algorithms for nonuniform problems. Algorithmica 11 (6), pp. 542–571. Cited by: §1.1, §1.
 [8] (2013) The constrained ski-rental problem and its application to online cloud cost optimization. In 2013 Proceedings IEEE INFOCOM, pp. 1492–1500. Cited by: §1.1, §1.
 [9] (2019) Optimal algorithms for ski rental with soft machinelearned predictions. arXiv preprint arXiv:1903.00092. Cited by: §1.5, §1, §1.
 [10] (2019) Learningassisted competitive algorithms for peakaware energy scheduling. arXiv preprint arXiv:1911.07972. Cited by: §1.5, §1.
 [11] (2008) Rent, lease or buy: randomized algorithms for multislope ski rental. In Proceedings of the 25th Annual Symposium on Theoretical Aspects of Computer Science (STACS 2008), Bordeaux, France. Cited by: §1.5, §1.
 [12] (2018) Competitive caching with machine learned advice. In International Conference on Machine Learning, pp. 3302–3311. Cited by: §1.3, §1.5, §1, §1.
 [13] (2014) Mixture of experts: a literature survey. Artificial Intelligence Review 42 (2), pp. 275–293. Cited by: §1.5.
 [14] (2017) Revenue optimization with approximate bid predictions. In Proceedings of the 31st International Conference on Neural Information Processing Systems, pp. 1856–1864. Cited by: §1.5, §1.
 [15] (2005) The parking permit problem. In 46th Annual IEEE Symposium on Foundations of Computer Science (FOCS’05), pp. 274–282. Cited by: §1.1.
 [16] (2018) A model for learned bloom filters and optimizing by sandwiching. In Advances in Neural Information Processing Systems, pp. 464–473. Cited by: §1.5, §1.
 [17] (2018) Improving online algorithms via ML predictions. In Advances in Neural Information Processing Systems, pp. 9661–9670. Cited by: §1.3, §1.5, §1, §1.
Appendix A Detailed Proof of Lemma 1
It is obvious that Under Assumption 1, we can consider the competitive ratio of shop Let be the buying day. Then if otherwise It is easy to argue that the worst case happens when We have
Hence, the competitive ratio is minimized when i.e., the best competitive ratio satisfies . Thus, we have