Two-Stage Robust Edge Service Placement and Sizing under Demand Uncertainty

Edge computing has emerged as a key technology to reduce network traffic, improve user experience, and enable various Internet of Things applications. From the perspective of a service provider (SP), how to jointly optimize the service placement, sizing, and workload allocation decisions is an important and challenging problem, which becomes even more complicated when considering demand uncertainty. To this end, we propose a novel two-stage adaptive robust optimization framework to help the SP optimally determine the locations for installing their service (i.e., placement) and the amount of computing resource to purchase from each location (i.e., sizing). The service placement and sizing solution of the proposed model can hedge against any possible realization within the uncertainty set of traffic demand. Given the first-stage robust solution, the optimal resource and workload allocation decisions are computed in the second-stage after the uncertainty is revealed. To solve the two-stage model, in this paper, we present an iterative solution by employing the column-and-constraint generation method that decomposes the underlying problem into a master problem and a max-min subproblem associated with the second stage. Extensive numerical results are shown to illustrate the effectiveness of the proposed two-stage robust optimization model.


I Introduction

Edge computing (EC) has been proposed to augment the traditional cloud computing model to meet the soaring traffic demand and accommodate diverse requirements of various services and systems in future networks, such as embedded artificial intelligence (AI), 5G wireless systems, virtual/augmented reality (VR/AR), and tactile Internet [1, 2]. By distributing storage, computing, control, and networking resources closer to the network edge, EC offers remarkable advantages and capabilities, including local data processing and analytics, localized services, edge caching, edge resource pooling and sharing, and improved privacy and security [3, 4]. Also, EC is a key enabler for ultra-reliable low-latency applications.

To enhance user experience and reduce bandwidth usage, content/application/service providers (e.g., AR/VR companies, Google, Netflix, Facebook, Uber, Apple, and other OTT providers) can proactively install their applications, especially latency-sensitive and/or data-intensive ones such as AR/VR, cloud gaming, and video analytics, onto selected edge nodes (ENs) in proximity to their users. Therefore, in addition to local execution on end-devices and remote processing in public clouds or their private data centers (DCs), the SPs can offload their tasks to edge servers. Besides SPs, virtual network operators, vertical industries, enterprises, and other third parties (e.g., schools, hospitals, malls, sensor networks) can also outsource their data and computation to EC systems.

Fig. 1 depicts the new network architecture with an EC layer lying between the cloud and the aggregation layer. In particular, the aggregation layer consists of numerous Points of Aggregation (POAs), such as base stations (BSs) and network routers/switches, which aggregate data and requests from users, things, and sensors. In practice, various sources (e.g., Telco edge clouds, telecom central offices, servers at BSs, PCs in research labs, micro DCs in campus buildings, malls, and enterprises) can act as ENs [1, 2, 3, 4]. Indeed, an EN can be co-located with a POA. For example, edge servers can be placed at cell sites or deployed near routers in enterprise DCs.

Fig. 1: Edge Network Architecture

Typically, a service request first arrives at a POA, and it is then routed to an EN or the remote cloud for processing. For instance, with EC, when a user submits a Google Maps request or an Uber ride request, the request can now be handled by an EN instead of traveling all the way to the remote servers of Google or Uber. Clearly, EC can not only help the SP drastically improve service quality but also significantly lower network bandwidth consumption.

Despite its tremendous potential, EC is still in its infancy and many interesting open problems remain to be solved. In this work, we focus on optimal edge service placement and workload allocation from the perspective of a SP (e.g., AR/VR, Pokémon Go, real-time translation, Uber, Apple Siri, Amazon Alexa, Google Assistant, Google Maps). Specifically, the SP needs to serve a large number of users/subscribers located in different areas. The goal of the provider is to minimize its total operating cost while maximizing the service quality. Here, we measure the quality of service (QoS) in terms of network delay.

In order to reduce the delay between the users and the computing nodes, the SP can provision the service on various distributed ENs. Then, each user request can be processed by the closest ENs that have the service installed. It is easy to see that placing the service on more ENs can lower the overall delay, but it also increases the SP’s cost. Specifically, when the service is available on more ENs, the network delay decreases because requests in each service area can be routed to ENs closer to them. On the other hand, the service placement cost increases when the service is installed on more ENs. Hence, there is an inherent trade-off between the operating cost of the SP and the overall network delay of the requests.

Furthermore, unlike the traditional cloud with virtually infinite capacity, ENs often have limited computational power [1, 2, 3]. Additionally, in contrast to a small number of cloud DCs, there are numerous heterogeneous distributed ENs coming with different sizes and configurations. The resource prices of the ENs can also be different due to various factors such as different hardware specifications, electricity prices, location, reliability, reputation, and ownership. Thus, some ENs close to the users may not be chosen because of their higher prices. As a result, selecting suitable ENs for service placement is a challenging task due to the heterogeneity of the ENs.

Besides the placement decisions, the SP also needs to decide the amount of resource to buy from each selected EN. Given the service placement and sizing decisions, the provider will then decide how to allocate the traffic demand in different areas to different ENs to minimize the overall network delay. To this end, when the traffic demand is known, we formulate the joint service placement, resource sizing, and workload allocation problem as a mixed integer linear program (MILP), which can be solved efficiently by leveraging the state-of-the-art MILP solvers. The formulated problem aims to minimize the weighted sum of the operating cost of the SP and the total network delay of the user requests, while taking into account practical system design criteria such as resource capacity limits, budget constraint, and delay preference.

This problem becomes more sophisticated when considering demand uncertainty. For example, in practice, the SP normally solves the deterministic MILP formulation using the forecast demand to find the optimal resource provisioning solution (i.e., placement and sizing). However, since the actual demand is unknown to the SP at the time of making decisions, over-provisioning or under-provisioning may occur frequently, which is undesirable. Specifically, if the procured resources exceed the actual traffic demand most of the time, the result is over-provisioning and an unnecessarily high provisioning cost. On the other hand, if the procured resources are not sufficient to serve the actual demand most of the time, the result is under-provisioning, which may severely degrade the quality of service (e.g., high latency, dropped requests).

A popular technique to deal with uncertainty is stochastic optimization, which has been successfully applied to many engineering problems, including cloud resource provisioning [5, 6, 7]. However, the stochastic optimization approach requires knowledge of the probability distribution of the uncertain data, which is often difficult to obtain. Also, to ensure solution quality, stochastic programming requires generating a large set of scenarios based on this probability distribution and associating each scenario with a certain probability. Hence, even if the distribution is known, the stochastic model can still be computationally prohibitive, or even intractable.

Recently, robust optimization (RO) [8] has emerged as an alternative methodology to handle data uncertainty. Since the RO approach does not require knowledge of the probability distribution of the uncertainty, it can avoid some of the difficulties arising from the stochastic programming approach. Indeed, RO has also been applied in the context of cloud resource management [9, 10, 11]. Unlike stochastic programming, where uncertainty is captured by a large number of scenarios, uncertainty in RO is described by parametric sets, called uncertainty sets. Since uncertainty sets can be constructed simply from information such as lower and upper bounds of the uncertain parameters (i.e., random variables), they are much easier to derive than exact probability distributions.

The goal of RO is to find a robust solution that not only optimizes system performance but can also hedge against any perturbation in the input data within the uncertainty sets. Thus, the solution to a RO model tends to be conservative. However, the conservativeness of robust solutions can be controlled by adjusting the uncertainty sets whose forms significantly affect the tractability and computational complexity of a robust model [8]. Furthermore, while a larger uncertainty set strengthens the robustness of a solution, it also increases the conservativeness. In practice, RO models usually scale well with the increasing dimension of data and are computationally tractable for large-scale systems. Additionally, uncertainty sets are often constructed based on the desired level of robustness, historical data and experience of the decision maker.

In RO models, all decisions are made before the uncertainty is revealed, which can be overly conservative. To tackle this issue, adaptive robust optimization (ARO) [12], also known as two-stage RO [13], has recently been introduced, where the second-stage problem models recourse decisions made after observing the first-stage decisions and the realization of the uncertainty. The first-stage decisions are often referred to as “here-and-now” decisions that cannot be adjusted after the uncertainty is disclosed, while the second-stage decisions are known as “wait-and-see” decisions that can be adjusted and adapted to the actual realization of the uncertain data. Thus, ARO is still robust while being less conservative than RO.

In this paper, to address the challenge caused by the demand uncertainty, we propose a novel two-stage RO model to help the SP identify optimal ENs for placing the service and optimal amount of resource procured from each node before knowing the actual demand. The service placement and sizing are the first-stage decisions that are robust against any realization of traffic demand within a predefined uncertainty set. Given the first-stage solution, the optimal workload allocation decision is made in the second-stage after the uncertainty is disclosed.

The rationale behind this design is that service placement and sizing typically happen on a larger time scale (e.g., on the order of hours or days) to ensure system stability [14], while the workload allocation decisions can be adjusted on a shorter time scale (e.g., every few minutes) based on the actual demand. Furthermore, in practice, the SP may not be able to change the resource procurement decision frequently on short time scales. Hence, two-stage RO is a reasonable modeling choice for our problem. Note that the first-stage decision is robust against all scenarios contained in the uncertainty set, including the worst-case one. If the realized demand is not the worst case, the workload allocation can be updated based on the actual demand in the operation stage.

To the best of our knowledge, this is the first two-stage robust model for the edge service placement and sizing problem. Our main contributions are summarized below:

  • We first introduce a deterministic MILP model for joint edge service placement, resource procurement, and workload allocation, which is then extended to a new two-stage RO model to deal with demand uncertainty. In particular, the first-stage decision variables include service placement and resource sizing, while resource allocation and request scheduling are the second-stage decisions.

  • The formulated model is a trilevel optimization problem that is decomposed into a master problem and a max-min subproblem. The bilevel subproblem is reformulated into a single-level problem with complementary constraints, which is then transformed into an MILP. We develop an iterative algorithm based on the column and constraint generation (CCG) method [13] to solve the problem in a master-subproblem framework, which is guaranteed to converge in a finite, typically small, number of iterations.

  • Finally, extensive numerical results are presented to demonstrate the efficacy of the proposed ARO model compared to the deterministic and RO models. We also perform sensitivity analysis to evaluate the impacts of different system parameters on the optimal solution.

The rest of the paper is organized as follows. In Section II, we describe the system model. In Section III, we first formulate the deterministic optimization model, which is then extended to a static robust model and an adaptive robust model. The CCG-based iterative solution approach is introduced in Section IV. Simulation results are shown in Section V, followed by a discussion of related work in Section VI. Finally, we present conclusions and future research directions in Section VII.

II System Model

Fig. 2: System model
EN, AP: edge node, access point
Set of APs and number of APs (M)
Set of ENs and number of ENs (N)
AP index and EN index
Resource capacity of an EN available for purchase
Price of one computing unit at the cloud
Price of one computing unit at an EN
Cost of downloading and installing the service at an EN
Cost of storing the service at an EN
Delay threshold
Network delay between an AP and the cloud
Network delay between an AP and an EN
Budget of the service provider
Weighting factor for the delay cost
Computing resource demand of one request
Forecast traffic demand at an AP
Actual traffic demand at an AP
Total network delay cost
Cloud resource procurement cost
Service placement cost incurred at an EN
Edge resource procurement cost at an EN
Storage cost incurred at an EN
Workload at an AP assigned to the cloud
Workload at an AP assigned to an EN
Amount of computing resource purchased from the cloud
Amount of computing resource purchased from an EN
Binary indicator: 1 if the service is already placed at an EN at the beginning
Binary decision variable: 1 if the service is placed at an EN
Uncertainty budget
Demand uncertainty set
TABLE I: NOTATIONS

In this paper, we study the service placement and sizing from the perspective of a SP (e.g., Google Maps, AR/VR). The SP has subscribers located in different areas, where user requests in each area are aggregated at an access point (AP). Without EC, the requests are typically sent to remote servers for processing. However, with EC, the SP can serve the requests at ENs closer to the users. We envision the emergence of an EC market managed by a Telco, a cloud provider (e.g., Amazon), or a third-party [3, 4, 15]. Indeed, numerous EC markets are currently being constructed by big companies and startups. The SP is assumed to have access to a portal listing various types of ENs in the market. Based on different factors such as price and location, the SP will decide where to place the service and how much resource to buy from each EN.

We consider an EC system that consists of a set of ENs and a set of APs. Let and be the EN index and AP index, respectively. Also, denote by the computing resource capacity of EN (e.g., number of servers, number of virtual machines, number of vCPUs, or number of CPU cycles per second). Define and as the price of one computing unit at the cloud and at EN , respectively. In practice, the ENs can be geographically distributed in different locations that have different electricity prices. Furthermore, the ENs may come with different sizes, ownership, server types, and configurations. Thus, the computing resource prices can vary among the ENs. Similar to the previous literature [3, 4, 15, 27, 25, 26, 28, 29, 14, 19, 16, 17, 18, 30, 31, 32, 33, 34, 35, 20, 21, 22, 23, 24], we only consider the network from the APs to the ENs and the cloud. We also define the network delay between each AP and the cloud, and between each AP and each EN, respectively. The system model is depicted in Fig. 2.

The SP is assumed to be a price-taker with a budget for operating the service at the edge and in the cloud. The SP uses the budget to buy cloud and edge resources to serve the user requests during a fixed period of time (e.g., 30 minutes, a few hours, or a day), with the goal of minimizing not only the resource procurement cost but also the total network delay of the requests. As mentioned in the introduction, to lower the network delay and reduce the data transmission, the SP can place the service on different ENs. Let be a binary variable that equals 1 if the SP decides to place the service onto EN and 0 otherwise. Also, let be a binary indicator that equals 1 if the service is already available at EN at the beginning.

If the SP wants to place the service onto an EN that does not have the service installed at the beginning of the scheduling horizon, the software and data of the service must be downloaded from a remote server or from nearby ENs and installed onto that EN. This incurs a cost that can be calculated as follows. First, denote by the set of ENs that have the service at the beginning, i.e., . Let be the cost of installing and downloading the service from the cloud to EN . Similarly, is the cost of installing and downloading the service from EN to EN . Typically, for all and because the distance between two ENs is much shorter than the distance between the cloud and an EN. We have:

(1)
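One natural way to express this download-and-install cost, using illustrative symbols of our own rather than the paper's original notation (phi_i^0 for the cloud-to-EN cost, phi_{j,i} for the EN-to-EN cost, and I_0 for the set of ENs that already host the service), is:

\[
\phi_i \;=\; \min\Big\{ \phi^{0}_{i},\; \min_{j \in \mathcal{I}_0} \phi_{j,i} \Big\},
\]

i.e., the service is fetched from the cheapest available source (the cloud or an EN that already hosts it); the placement cost in (3) is then this quantity when the service is newly placed at EN i, and zero otherwise.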

Let be the cost of storing the service at EN . Also, define and as the amount of computing resource purchased from the cloud and from EN , respectively. The request arrival rate (i.e., traffic demand, workload) of the service at AP is denoted by . Given the set of procured edge and cloud resources, the SP decides how to optimally divide the workload in different areas among the ENs and the cloud for processing so as to minimize the total network delay. Define as the portion of the workload at AP to be routed to the cloud, and as the workload at AP assigned to EN . Obviously, the SP prefers to have the requests processed by ENs close to the users rather than by the remote cloud. To ensure the service quality, the SP may impose an upper bound on the average network delay. A lower average delay implies that more requests are processed at the edge. The main notations are summarized in Table I.

III Problem Formulation

In this section, we first present a deterministic formulation of the service placement, resource procurement, and request scheduling problem, which is then extended to a two-stage robust formulation to deal with demand uncertainty.

III-A Deterministic Formulation

In the deterministic model, the SP jointly optimizes the service placement, sizing, and workload allocation decisions when the traffic demand is assumed to be known exactly.

III-A1 Objective Function

In this paper, we are interested in minimizing the total cost of the SP as follows:

(2)

where is the placement cost, is the storage cost, is the cost of purchasing edge computing resources, is the cloud resource procurement cost, and is the network delay cost. Also, we have , , , , and .

As explained in the system model section, the service placement cost is the total cost of downloading and installing the service at EN , which can be expressed as:

(3)

Clearly, if the service is available at EN at the beginning (i.e., ), the service placement cost becomes zero. When the service is placed at EN but it is not available at the node at the beginning (i.e., and ), is equal to . Additionally, if the SP decides to run the service at EN (), the cost of storing the service data at EN is:

(4)

The edge resource procurement cost at EN is equal to the amount of purchased resource multiplied by the resource price at the EN, i.e., we have:

(5)

Similarly, the cloud resource procurement cost is . In addition, the total network delay of all the service requests is:

(6)

Thus, the total network delay cost is . Overall, the goal of the SP is to minimize the following objective function:

(7)
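For concreteness, the weighted objective in (7) can be sketched as follows, using illustrative symbols of our own rather than the paper's original notation: x_i for placement at EN i, c_i^p and c_i^s for placement and storage costs, p_i and p_0 for the resource prices at EN i and the cloud, y_i and y_0 for the purchased computing resources, lambda_{m,i} and lambda_m^0 for the workload of AP m routed to EN i and to the cloud, d_{m,i} and d_m^0 for the corresponding network delays, and rho for the delay weight:

\[
\min_{x,\,y,\,\lambda} \;\; \sum_{i=1}^{N}\Big( c^{\mathrm p}_i + c^{\mathrm s}_i x_i + p_i y_i \Big) + p_0 y_0
\;+\; \rho \sum_{m=1}^{M}\Big( d^{0}_m \lambda^{0}_m + \sum_{i=1}^{N} d_{m,i}\,\lambda_{m,i} \Big).
\]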

The SP can vary the delay cost parameter to control the tradeoff between the total expenses of the SP and the total network delay of the requests (i.e., between cost and service quality). The weight reflects the SP’s attitude towards the network delay. Clearly, a larger indicates that the SP is more delay-sensitive and willing to spend more to lower the delay. Note that the delay of each request includes the transmission delay, propagation delay (network delay), and processing delay at the cloud or an EN. In this work, we assume that each request is assigned a fixed amount of computing resource and transmission bandwidth. Thus, the processing delay and the transmission delay (i.e., the request size divided by the bandwidth) are fixed. Hence, for simplicity, we consider the network delay only. Additionally, it is straightforward to consider other costs such as bandwidth cost, which would discourage sending workload to the cloud, with minor modifications. We are now ready to describe all the constraints of the underlying optimization problem.

III-A2 Budget Constraint

The total expense should not exceed the budget of the SP. Thus, we have the following constraint:

(8)

III-A3 Reliability Constraint

To enhance the service reliability, the SP may want to place the service on at least a minimum number of ENs since link/node failures can occur unexpectedly. Hence, we can impose:

(9)

III-A4 Workload Allocation Constraints

The service requests arriving at each AP must be allocated to either the remote cloud or the ENs. Hence, we have:

(10)

III-A5 Capacity Constraints

The amount of computing resource and purchased at the cloud and each EN should be sufficient to serve the resource demand of all requests assigned to the cloud and the EN. Also, the SP buys resource only from the ENs that have the service installed (i.e., ). Furthermore, the amount of resource purchased from each EN cannot exceed the capacity of the EN. Thus:

(11)
(12)
(13)

III-A6 Delay Constraint

In order to ensure a certain service quality for its users, the SP may require that the average network delay () of the requests not exceed a certain maximum delay threshold . The average network delay is:

(14)

Hence, can be rewritten as:

(15)
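With the same illustrative symbols used above (ours, not the paper's), the average-delay requirement in (14) and its linearized form (15) can be sketched as:

\[
\frac{\sum_{m}\big( d^{0}_m \lambda^{0}_m + \sum_{i} d_{m,i}\lambda_{m,i} \big)}{\sum_{m} \lambda_m} \le D^{\max}
\quad\Longleftrightarrow\quad
\sum_{m}\Big( d^{0}_m \lambda^{0}_m + \sum_{i} d_{m,i}\lambda_{m,i} \Big) \le D^{\max} \sum_{m} \lambda_m,
\]

where D^max is the delay threshold and lambda_m is the total demand at AP m.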

III-A7 Constraints on Variables

The service placement indicator is a binary variable. Also, the workload allocation and resource procurement must be non-negative. Thus:

(16)
(17)

Overall, the SP aims to solve the following deterministic optimization problem:

This is an MILP that can be solved efficiently using existing MILP solvers. Note that our model can be easily extended to capture other system and design constraints. For instance, we may consider multiple resource types (e.g., RAM, CPU, bandwidth) instead of only computing resources. The SP may also be required to buy an integer quantity of computing units from each EN rather than a continuous amount , which can be easily handled using a unary or binary expansion.
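To make the structure of this deterministic MILP concrete, the following is a minimal sketch in Python using the PuLP modeling library. It is not the authors' implementation: the instance data, variable names, and the per-request resource scaling are invented for illustration, and the constraints mirror the budget, reliability, allocation, capacity, and average-delay constraints described above under our own notation.

```python
# Minimal illustrative sketch of the deterministic placement/sizing/allocation MILP.
# Assumes the PuLP package (pip install pulp); all data values below are made up.
import pulp

APS, ENS = range(3), range(2)            # 3 access points, 2 edge nodes (toy instance)
demand = {0: 1200, 1: 2500, 2: 1800}     # forecast requests per AP
w = 0.001                                # computing units per request (assumed scale)
cap = {0: 4.0, 1: 3.0}                   # EN capacities (computing units)
p_edge = {0: 0.05, 1: 0.04}              # price per computing unit at each EN
p_cloud = 0.03                           # price per computing unit at the cloud
c_place = {0: 0.22, 1: 0.24}             # download/install cost (service not pre-installed)
c_store = {0: 0.11, 1: 0.10}             # storage cost
d_cloud = {m: 80.0 for m in APS}         # AP-to-cloud network delay (ms)
d_edge = {(0, 0): 4, (0, 1): 9, (1, 0): 6, (1, 1): 3, (2, 0): 10, (2, 1): 5}
rho, budget, k_min, D_max = 0.001, 100.0, 1, 30.0

prob = pulp.LpProblem("edge_placement_sizing", pulp.LpMinimize)
x = pulp.LpVariable.dicts("place", ENS, cat="Binary")          # service placement
y = pulp.LpVariable.dicts("edge_res", ENS, lowBound=0)         # edge sizing
y0 = pulp.LpVariable("cloud_res", lowBound=0)                  # cloud sizing
lam = pulp.LpVariable.dicts("load_edge", [(m, i) for m in APS for i in ENS], lowBound=0)
lam0 = pulp.LpVariable.dicts("load_cloud", APS, lowBound=0)

payment = (pulp.lpSum((c_place[i] + c_store[i]) * x[i] + p_edge[i] * y[i] for i in ENS)
           + p_cloud * y0)
delay = (pulp.lpSum(d_cloud[m] * lam0[m] for m in APS)
         + pulp.lpSum(d_edge[m, i] * lam[m, i] for m in APS for i in ENS))
prob += payment + rho * delay                                  # weighted objective

for m in APS:                                                  # every request is served
    prob += lam0[m] + pulp.lpSum(lam[m, i] for i in ENS) == demand[m]
for i in ENS:                                                  # buy only where placed, within capacity
    prob += w * pulp.lpSum(lam[m, i] for m in APS) <= y[i]
    prob += y[i] <= cap[i] * x[i]
prob += w * pulp.lpSum(lam0[m] for m in APS) <= y0             # cloud sizing covers cloud load
prob += payment <= budget                                      # budget constraint
prob += pulp.lpSum(x[i] for i in ENS) >= k_min                 # reliability: minimum number of ENs
prob += delay <= D_max * sum(demand.values())                  # average-delay bound

prob.solve(pulp.PULP_CBC_CMD(msg=False))
print("placement:", {i: int(x[i].value()) for i in ENS}, "cloud units:", y0.value())
```

Solving this toy instance with CBC (bundled with PuLP) returns the placement indicators and the purchased capacities; any MILP solver such as Gurobi could be substituted.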

III-B Uncertainty Modeling

In the deterministic model, the traffic demand is assumed to be known exactly at the time of making the service placement and sizing decision. In other words, the SP assumes that the actual demand is the same as the forecast one, which is then used as input to the deterministic problem. However, the exact demand in each area typically cannot be accurately predicted at the time of making the strategic decision. Thus, how to properly capture the uncertainties in the decision making process is a crucial task.

In RO, uncertain parameters are modeled through uncertainty sets, which express an infinite number of scenarios. A well-constructed uncertainty set should be computationally tractable and should balance the robustness and conservativeness of the robust solution [8]. In practice, the polyhedral uncertainty set is a natural and popular choice for representing uncertainties in the RO literature [8, 12, 13]. In particular, to construct a polyhedral uncertainty set, the SP would need to specify intervals expressing the uncertain demand at every AP, and a parameter to control the degree of conservativeness.

Define and as the actual demand vector and the forecast demand vector, respectively. Note that is also called the nominal value of the uncertain demand . Additionally, let , where is the maximum demand deviation, which can be understood as the maximum forecasting error of the uncertain demand at AP . Thus, represents the uncertain demand at AP . The polyhedral uncertainty set can be defined as follows:

(18)

where is the budget of uncertainty, which can vary in the continuous interval [0, M]. The form of this uncertainty set is widely used in the RO literature [8, 12, 13]. This set contains the lower and upper bounds of the uncertain parameters and a bound on the linear combination of the uncertain parameters. This information can be extracted by learning from historical data. Indeed, a more general polyhedron can also be accommodated by the RO approach.

The actual demand can take any value in the range of . However, in practice, it is unlikely that all the actual demands are simultaneously close to their corresponding lower bounds or upper bounds. This observation is captured by the uncertainty budget . A larger value of implies a larger uncertainty set and a more robust solution. However, the solution is also more conservative in order to protect the system against a higher degree of uncertainty. Therefore, can be used to adjust the trade-off between the robustness and the conservativeness of the solution. When , the actual demand is equal to the forecast demand, and the robust model reduces to the deterministic model that does not consider demand uncertainty. When , we simply consider all possible realizations of the uncertain demand in the interval of , and becomes a box uncertainty set. Hence, the polyhedral uncertainty set is less conservative than the box uncertainty set.
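A common way to write such a budgeted polyhedral set, shown here with illustrative symbols of our own (lambda_bar_m for the forecast demand, lambda_hat_m for the maximum deviation at AP m, and Gamma for the uncertainty budget), is:

\[
\mathcal{D} \;=\; \Big\{ \boldsymbol{\lambda} \;:\; \lambda_m = \bar{\lambda}_m + g_m \hat{\lambda}_m, \;\; |g_m| \le 1 \;\; \forall m, \;\; \sum_{m=1}^{M} |g_m| \le \Gamma \Big\}.
\]

Setting Gamma = 0 recovers the forecast demand, while Gamma = M yields the box uncertainty set discussed above.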

III-C Robust Optimization Formulation

In the static (single-stage) robust optimization model, the service placement, sizing, and workload allocation decisions are made simultaneously before observing the actual realization of the demand. Based on the RO theory [8], the robust service placement and sizing problem can be formulated as:

(19)

subject to

(20)
(21)

where is the set of constraints on the placement, sizing, and workload allocation variables. These constraints are given in the deterministic model. The RO model aims to minimize the total cost of the SP under the worst-case demand scenario. Also, all the constraints related to uncertain parameters should be satisfied for any potential realization of these parameters within the uncertainty set.

Note that in the deterministic formulation, the workload allocation constraints (10) can be written as either equalities or inequalities, since the inequalities become tight at optimality for the cost minimization objective. Also, equalities involving uncertain parameters are meaningless in the RO approach [8], since such equality constraints cannot be satisfied for every demand scenario in the uncertainty set. Hence, the workload allocation constraints are written in the form of inequalities as in (20). Due to space limitation, we do not present the solution approach here and refer to Appendix -B for more details.

III-D Two-Stage Adaptive Robust Formulation

Unlike the RO model, where all decisions are made simultaneously in a single stage, the ARO model includes two stages. In particular, the service placement and the resource sizing are the first-stage decision variables. Given the first-stage decisions, the workload allocation decisions (i.e., operation decisions) are determined as an optimal solution to the second-stage problem (i.e., the recourse problem). Note that the first-stage decisions are made without knowing the uncertain parameters, while the recourse decisions are made based on the revealed uncertainty. The two-stage adaptive robust model can be formulated as:

(22)

where:

(25)

Note that represents the constraints on the service placement and sizing variables, and is the set of feasible workload allocation solutions for a fixed resource procurement decision and demand realization . The optimal workload allocation aims to minimize the total delay cost in the worst-case demand scenario. The two-stage robust service placement and sizing problem above is inherently a trilevel (min-max-min) optimization model, which is difficult to solve.
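In compact form, and again with illustrative symbols of our own (x for placement, y for sizing, lambda for the uncertain demand, and z for the workload allocation), the structure of this trilevel problem can be sketched as:

\[
\min_{(x,\,y) \in \mathcal{X}} \;\Big\{\, c_{\mathrm{prov}}(x, y) \;+\; \max_{\lambda \in \mathcal{D}} \;\min_{z \in \Omega(x,\,y,\,\lambda)} \; \rho\, c_{\mathrm{delay}}(z) \,\Big\},
\]

where \mathcal{X} collects the first-stage (placement and sizing) constraints, \mathcal{D} is the demand uncertainty set, and \Omega(x, y, lambda) is the feasible set of workload allocations for a fixed sizing decision and realized demand.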

IV Solution Approach

In this section, we develop an iterative algorithm based on the column-and-constraint generation (CCG) procedure [13] to solve the formulated two-stage robust service placement and sizing problem (22). In particular, the developed algorithm is implemented in a master-subproblem framework that decomposes the problem into a master problem and a bilevel max-min subproblem. The optimal value of the master problem in each iteration provides a lower bound, while the optimal solution to the subproblem helps us compute an upper bound of the original two-stage robust problem (22). The algorithm alternately solves an updated master problem and an updated subproblem until convergence.

First, it can be shown that the two-stage robust model (22) can be transformed into an equivalent mixed-integer program built on the collection of extreme points of the uncertainty set . Intuitively, this can be explained as follows. Assume that the problem (22), as well as the innermost minimization problem in (22), is feasible. Then, we can rewrite the second-stage bilevel problem as a max-max problem (i.e., simply a maximization problem) over and the set of dual variables associated with the constraints of the innermost minimization problem. Since this maximization problem is optimized over two disjoint polyhedra, it always has an optimal solution that combines extreme points of these two polyhedra [36]. Therefore, the optimal solution to the second-stage problem always occurs at an extreme point of the polyhedron .

Define as the number of extreme points of the uncertainty set . Also, let be the set of extreme points (i.e., extreme demand scenarios) of , where is the -th extreme point. Therefore, the ARO model (22) is equivalent to:

(26)

Clearly, this problem can be transformed into the following equivalent MILP by enumerating all the extreme points in :

(27)

subject to

(28)
(29)

For a large polyhedral uncertainty set as in (18), obtaining the optimal solution to the reformulated large-scale MILP above, which requires enumerating all possible extreme demand scenarios, may not be practically feasible. This motivates the iterative solution based on the CCG method [13]. Specifically, instead of solving the full problem (27)-(29) for all extreme points in the uncertainty set, we only solve this problem for a subset of , which obviously provides a valid relaxation of this problem and gives us a lower bound (LB). Therefore, we can obtain stronger LBs by gradually adding non-trivial demand scenarios to the relaxed problem. This is indeed the core idea behind the CCG method, which gradually expands a subset of and adds additional variables and constraints in each iteration . Furthermore, an optimal solution to the second-stage problem for a fixed (, ) clearly provides an upper bound (UB) of the two-stage robust problem (22).

In the following, we first describe the master problem that is a relaxation of the problem (27)-(29). Then, we elaborate how to solve the subproblem (i.e., the second-stage problem) given the first-stage decision. Finally, we present the iterative algorithm for solving the two-stage robust service placement and sizing problem in a master-subproblem framework.

IV-A Master Problem

The master problem (MP) at iteration is given as:

(30)

subject to

(31)
(32)
(33)
(34)
(35)

where is the set of optimal solutions to the subproblem in all previous iterations up to iteration . Also, . The optimal solution to this master problem includes the optimal placement (), sizing (, ), delay cost (), and workload allocation . Then, the optimal sizing decisions and will serve as input to the subproblem in Section IV-B. Indeed, the master problem at iteration corresponds to extreme points of the uncertainty set . Therefore, because each MP contains only a subset of constraints of the original two-stage RO formulation (27)-(29), the optimal solution to an MP provides a LB of the original problem. We also achieve a stronger LB in every iteration since each new iteration adds more constraints to the MP. Thus:

(36)
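To illustrate the structure of the MP, a sketch in our own notation (lambda^l denotes the l-th worst-case demand scenario identified so far, z^l its dedicated copy of the second-stage allocation variables, and eta the worst-case delay cost) is:

\[
\begin{aligned}
\min_{x,\, y,\, \eta,\, \{z^{l}\}} \quad & c_{\mathrm{prov}}(x, y) + \eta \\
\text{s.t.} \quad & \eta \,\ge\, \rho\, c_{\mathrm{delay}}(z^{l}), \qquad l = 1, \dots, k, \\
& z^{l} \in \Omega(x, y, \lambda^{l}), \qquad l = 1, \dots, k, \\
& (x, y) \in \mathcal{X},
\end{aligned}
\]

so each new scenario returned by the subproblem adds one copy of the allocation variables (columns) and its associated constraints, which is exactly the column-and-constraint generation step.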

IV-B Reformulation of the Subproblem

The max-min subproblem is indeed a bilevel optimization problem, which is difficult to solve. To this end, we show how to reformulate the subproblem as an MILP that can be solved globally using MILP solvers. Specifically, given , the subproblem (SP) is:

(37)

From (III-D), the inner minimization problem can be written as:

(38)

subject to

(39)
(40)
(41)
(42)
(43)
(44)

where are the dual variables associated with constraints (39)-(44), respectively. Also, denote the optimal sizing solution obtained from the latest MP, i.e., at iteration , we have and .

Based on the Karush–Kuhn–Tucker (KKT) conditions [45], we can infer that given , the optimal solution to the innermost problem (38)-(44) is any of the feasible solutions to the set of constraints (46)-(51). Please refer to Appendix -C for more details. Hence, the SP (37) is equivalent to the following problem with complementary constraints:

(45)

subject to

(46)
(47)
(48)
(49)
(50)
(51)
(52)
(53)

where the last two constraints represent the uncertainty set . Note that a complementarity constraint means and . Thus, it is a nonlinear constraint. Fortunately, this nonlinear complementarity constraint can be transformed into an equivalent exact set of linear constraints by using the Fortuny-Amat transformation [46]. Specifically, the complementarity condition is equivalent to the following set of mixed-integer linear constraints:

(54)

where M is a sufficiently large constant. By applying this transformation to all the complementary constraints (46)-(50), we obtain an MILP that is equivalent to the subproblem (37). The explicit form of this MILP is given in Appendix -D. Thus, the subproblem can be solved using an MILP solver.
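For reference, the standard Fortuny-Amat (big-M) linearization of a generic complementarity condition 0 \le \mu \perp h \ge 0, written with generic symbols rather than the paper's own, is:

\[
0 \le \mu \le M u, \qquad 0 \le h \le M (1 - u), \qquad u \in \{0, 1\},
\]

where the binary variable u selects which side of the complementarity is active. In practice, the constant M should be chosen as small as safely possible, since an excessively large value can cause numerical difficulties for the MILP solver.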

Denote by (, ) the optimal solution to the SP at iteration . The solution to each SP helps us determine an upper bound (UB) for the original two-stage RO problem. Specifically, we have:

(55)
(56)

Also, is used as input to the MP in the next iteration.

IV-C Algorithm

Based on the description of the master problem and the subproblem, we are now ready to present the CCG-based iterative algorithm for solving the formulated two-stage robust service placement and sizing problem (22) as shown in Algorithm 1.

1:  Initialization: set , , and .
2:  repeat
3:     Solve the following MP.
(57)
Obtain an optimal solution and update LB according to (36).
4:     Solve SP (37) with to obtain the worst-case demand given and update UB following (55).
5:     Update , and .
6:  until 
7:  Output: optimal placement and sizing decisions .
Algorithm 1 Two-Stage Adaptive Robust Algorithm

Different from [13], we consider the extreme scenario with the maximum total demand in the first iteration. In particular, without loss of generality, let be the extreme demand scenario in with the maximum total demand. Specifically, can be computed as an optimal solution to the following optimization problem:

Indeed, can be found analytically by sorting the demand deviations in descending order and setting associated with the largest demand deviations to 1 and that of the next largest to . Note that the ARO model (22) is equivalent to the problem (27)-(29), whose constraints enumerate over all extreme points of , including . By considering in the first iteration, we have . Hence, . As a result, the total resource purchased from the cloud and the ENs is always sufficient to serve every realization of the demand.
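The following is a small sketch of that analytical computation in Python, under the assumption that the uncertainty set has the budgeted form sketched in Section III-B (forecast lam_bar, maximum deviation lam_hat, budget gamma); the function and variable names are ours, not the paper's.

```python
import math

def max_total_demand_scenario(lam_bar, lam_hat, gamma):
    """Extreme point of the budgeted uncertainty set with maximum total demand.

    Assumes demands deviate upward by g_m * lam_hat[m] with 0 <= g_m <= 1 and
    sum(g_m) <= gamma (illustrative form, not necessarily the paper's exact set).
    """
    g = [0.0] * len(lam_bar)
    # Spend the uncertainty budget on the largest deviations first.
    order = sorted(range(len(lam_hat)), key=lambda m: lam_hat[m], reverse=True)
    whole, frac = int(math.floor(gamma)), gamma - math.floor(gamma)
    for m in order[:whole]:
        g[m] = 1.0
    if whole < len(order) and frac > 0:
        g[order[whole]] = frac
    return [lam_bar[m] + g[m] * lam_hat[m] for m in range(len(lam_bar))]

# Example: 4 APs, 20% maximum deviation, uncertainty budget 2.5
lam_bar = [1000, 4000, 2500, 1800]
lam_hat = [0.2 * d for d in lam_bar]
print(max_total_demand_scenario(lam_bar, lam_hat, 2.5))
```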

Since new constraints related to are added to the MP (57) at every iteration , the LB is improved (weakly increasing) at every iteration. Also, by definition as in (55), the UB is non-increasing. Furthermore, as explained before, the worst-case demand in Step 4 is always an extreme point of the polyhedral uncertainty set . The set of extreme points is a finite set with elements. Hence, we can prove that Algorithm 1 converges to the optimal value of the original two-stage robust problem (22) in a finite number of iterations.

Indeed, this can be shown by contradiction by proving that any repeated worst-case scenario implies . Specifically, assume is the optimal solution to the MP (57), is the optimal solution to the SP in iteration , and appears in a previous iteration. From step 4 of Algorithm 1, we have: Now, since appears in a previous iteration, the MP in iteration is identical to the MP in iteration . Hence, is also the optimal solution to the MP in iteration . We have: Since the extreme scenario has already been identified and the related constraints were added to the MP before iteration , we have: . Thus, , which implies . As a result, Algorithm 1 converges in a finite number of iterations. Typically, the algorithm converges within a few iterations, as shown in the numerical results.

V Numerical Results

V-A Simulation Setting

Since EC is still in its early stage and we are not aware of any public data on edge network topologies, similar to previous works [17, 39], we adopt the Barabasi-Albert model [41] to generate a random scale-free edge network topology with 100 nodes and an attachment rate of 2 [17]. Also, link delays are randomly generated between 2 ms and 5 ms [42]. The network transmission delay between any two APs is the delay of the shortest path between them. Based on the generated topology, 80 nodes are chosen as APs and 20 nodes are chosen as ENs. The delay between each AP and the remote cloud is set to 80 ms. Additionally, the forecast traffic arrival rate (i.e., demand) at each AP is randomly drawn from 1000 to 4000 requests per time unit [42], and the resource demand of each service request is set to 1 MHz [43].
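A minimal sketch of this topology setup in Python, assuming the networkx package; the random seed, the AP/EN selection rule, and the variable names below are our own illustrative choices, not taken from the paper.

```python
import random
import networkx as nx

random.seed(0)
# Barabasi-Albert scale-free topology: 100 nodes, attachment rate 2.
G = nx.barabasi_albert_graph(n=100, m=2, seed=0)
for u, v in G.edges:
    G.edges[u, v]["delay"] = random.uniform(2.0, 5.0)   # link delay in ms

# Network delay between two nodes = delay of the shortest path between them.
delay = dict(nx.all_pairs_dijkstra_path_length(G, weight="delay"))

nodes = list(G.nodes)
random.shuffle(nodes)
aps, ens = nodes[:80], nodes[80:100]     # 80 APs and 20 ENs (illustrative split)
d_cloud = {m: 80.0 for m in aps}         # AP-to-cloud delay fixed at 80 ms
lam_bar = {m: random.randint(1000, 4000) for m in aps}   # forecast demand per AP
print(delay[aps[0]][ens[0]], lam_bar[aps[0]])
```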

Each EN is chosen randomly from the set of Amazon EC2 M5 instances [44]. Using the hourly price of a general purpose m5d.xlarge Amazon EC2 instance [44] as a reference, the resource price ($ per vCPU per time unit) at the cloud is set to 0.03, while the resource prices at the ENs are randomly generated from 0.04 to 0.06. Additionally, the service placement costs () and storage costs () at the ENs are randomly generated in the ranges of [0.2, 0.25] and [0.1, 0.12], respectively. The budget of the SP is set to 100. We also assume that the service is not available on any EN at the beginning (i.e., ).

For the sake of clarity in the figures and analysis, in the base case we consider a small system with 20 APs and 5 ENs (i.e., M = 20, N = 5), which are selected randomly from the corresponding original sets of 80 APs and 20 ENs. Note that we also study the impacts of varying the number of APs and ENs later. In the base case, the maximum average delay is set to 30 ms, and the minimum number of edge servers is 2. Let be the ratio between the maximum demand deviation and the forecast demand . Also, define . In the base case, we set , the uncertainty budget , and the delay penalty parameter . This default setting is used in most of the simulations unless mentioned otherwise. We implement all the algorithms in the Matlab environment using CVX (http://cvxr.com/cvx/) and Gurobi (https://www.gurobi.com/).

V-B Performance Evaluation

V-B1 Comparison between RO, ARO, and deterministic models

(a) Cost
(b) Payment
Fig. 3: Comparison between RO and ARO

First, we compare the performance of the traditional RO approach and the ARO approach to verify that the ARO solution is less conservative than the RO solution. Specifically, Figs. 3(a) and 3(b) present the costs and payments, respectively, of the RO and ARO schemes under different values of the delay penalty . The payment is the total money spent on placing the service and buying resources, while the total cost is the value of the objective function, which equals the sum of the payment and the total network delay cost. As we can observe, the total cost as well as the payment produced by ARO are lower than those of RO. Hence, these results indicate that the ARO solution is less conservative than the RO solution.

In addition, the total cost increases as the delay penalty cost increases (i.e., the SP is more delay-sensitive and willing to pay more to reduce the delay). Note that is the ratio between the maximum demand deviation and the forecast demand, which can be understood as the maximum forecast error (e.g., = 0.2 implies an error of 20% of the forecast value). It can be seen that the costs and payments in the robust approaches increase as the forecast error increases.

When (i.e., no uncertainty), the ARO and RO models become deterministic, and the robust solutions are the same as the deterministic solution. The cost and payment in the deterministic model (i.e., ) are lower than those in the robust models because the deterministic model assumes that the SP has perfect knowledge of the demand.

(a) Demand Scale = 1
(b) Demand Scale = 2
Fig. 4: Resource procurement comparison

Figs. 4(a) and 4(b) show the amount of resources procured from the ENs and the cloud (i.e., EN 0) in the deterministic scheme, RO scheme, and ARO scheme for a specific problem instance. The demand in each figure is equal to the demand in the base case multiplied by the demand scale factor. Clearly, in the deterministic model, the SP buys the lowest amount of resources since the demand is assumed to be exactly known. It can also be observed that the amount of procured resources in the ARO model is less than that in the RO model.

V-B2 Comparison between RO, ARO, and deterministic approaches in the operation stage

It is worth emphasizing that the main goal of the RO, ARO, and deterministic models presented throughout the paper is to identify the placement and sizing decisions before knowing the actual demand. The robust approaches aim to hedge against any realization of the demand (i.e., they are robust against worst-case scenarios), while the deterministic approach uses the forecast demand to determine the optimal placement and sizing solution. However, the actual demand does not necessarily coincide with either the forecast demand or the worst-case demand scenario. Thus, it is important to evaluate the performance of these approaches in the actual operation stage when the demand is disclosed.

Indeed, the cost of the deterministic model should be obtained by first solving it with the forecast demand and then re-evaluating the resulting solution under demand uncertainty. Specifically, we first solve the deterministic model using the forecast demand to obtain the optimal decision . When the actual demand is revealed, the SP then solves the workload allocation problem using as an input. Since may not be sufficient to serve the actual demand, we allow dropping requests in the operation stage at a penalty cost per percentage of unserved requests (see Appendix -A for more details).

Thus, the actual cost of the deterministic model is the sum of the service placement and resource procurement costs before knowing the demand (i.e., first-stage cost) and the actual delay cost in the operation stage (i.e., second-stage cost). We apply similar procedures to the robust approaches since the actual demand is rarely the same as the worst-case demand scenario. In particular, the service placement and resource procurement costs are the costs in the first-stage before the demand is revealed. Given these decisions, the delay cost in the second-stage is re-computed when the actual demand is disclosed.

(a) Average
(b) Worst-case
Fig. 5: Cost comparison under actual demand

We now compare the performance of the deterministic and robust approaches in terms of average cost and worst-case cost. First, we generate 100 demand scenarios within the uncertainty set. The average cost is the sum of the second-stage cost, which is averaged over 100 scenarios, and the first-stage provisioning cost. For the worst-case cost, the second-stage cost is the cost in the worst-case demand scenario (i.e., the scenario giving the maximum second-stage cost).

Fig. 6: Robust cost and actual cost

The average cost and the worst-case cost comparisons among these approaches are shown in Figs. 5(a) and 5(b), respectively. Here, we set (e.g., the penalty will be if 1% of requests are unserved). Since the deterministic method does not consider demand uncertainty, its cost is significantly higher than those of the robust schemes, especially in the worst-case scenario. Also, because the realized demand in the operation stage is usually not the worst-case demand scenario, the actual costs of the robust solutions are considerably lower than the optimal values of the objective functions (III-C) and (22) in the robust models, as can be seen in Fig. 6. This figure further confirms that ARO is less conservative than RO.

(a) Varying and
(b) Varying and
Fig. 7: Impact of the uncertainty set on system performance

V-B3 Sensitivity Analysis

We now study the impacts of different design parameters on the system performance. First, by definition, the uncertainty set is characterized by the maximum forecast error and the uncertainty budget . Figs. 7(a) and 7(b) show the impact of the uncertainty set on the optimal solution. As expected, the total cost increases as the uncertainty set enlarges (i.e., increases and/or increases). Note that the maximum value of is 20 since we have 20 APs in the base case. Also, Fig. 7(b) suggests that the SP can lower the total cost by reducing the delay penalty parameter .

(a) M = 20 , varying N and
(b) N = 20, varying M and
Fig. 8: Impact of M and N on system performance

Figs. 8(a) and 8(b) illustrate the impact of the number of ENs and the number of APs, respectively, on the system performance. It is easy to see that when there are more ENs, the SP has more options to buy edge resources and allocate workload to closer ENs. Hence, the total cost of the SP decreases as N increases. Similarly, the total cost increases when there are more APs due to the increasing total workload. Finally, the convergence property of the proposed algorithm is presented in Figs. 9(a) and 9(b) for certain problem instances. It can be seen that the algorithm converges very quickly towards the optimal solutions. Indeed, we conducted extensive numerical experiments which show that the algorithm typically converges in a few iterations (even just one or two iterations in some cases).

(a)
(b)
Fig. 9: Convergence property

VI Related Work

Various aspects of EC have been studied over the last few years. A majority of the previous work has focused on the joint allocation of communication and computational resources for task offloading in wireless networks [37]. In [38], Stackelberg game and matching theory are combined to tackle the fog resource allocation problem. Reference [15] presents a primal-dual method for online matching edge resources to different service providers to maximize system efficiency. A cloudlet load balancing problem is formulated in [39] to minimize the maximum response time of offloaded tasks. R. Deng et al. [40] propose a novel workload allocation model in a hybrid cloud-fog system to minimize energy cost under latency constraints. In [3, 4], a market equilibrium approach is employed to fairly and efficiently allocate edge resources to competing services.

A growing literature has focused on the optimal placement and activation of ENs. In [16], the authors jointly optimize cloudlet placement and workload allocation to minimize the average network delay between mobile users and the cloudlets by placing a given number of cloudlets to some strategic locations. This model is extended in [17] to capture both network delay and computing delay at the cloudlets using queuing models. In [18], L. Ma et al. formulate a cloudlet placement and resource allocation problem, with the goal of minimizing the number of cloudlets while respecting the access delay requirements of users. The service entity placement problem for social virtual reality applications is examined in [19] to minimize the total system cost, including the cloudlet activation cost, the placement cost, the proximity cost, and the colocation cost, for deploying these applications at the edge.

A ranking-based near optimal algorithm is presented in [20] for efficiently deploying cloudlets among numerous APs in an IoT network. Reference [21] aims to minimize the overall energy consumption subject to delay constraints by intelligently placing cloudlets on the network and allocating tasks to cloudlets and the public cloud. In [22], A. Ceselli et al. introduce a mobile edge cloud network planning framework that simultaneously considers cloudlet placement, assignment of APs to cloudlets, and traffic routing from and to the cloudlets, with the goal of minimizing the total installation costs of all network nodes. A MINLP formulation is proposed in [23] to determine optimal locations for cloudlet placement with minimum installation cost considering the capacity and latency constraints. Reference [24] presents a multi-objective optimization framework to minimize the service delay and the cloud’s load by simultaneously identifying the optimal location, capacity, and number of ENs, as well as the optimal links between the ENs and the cloud.

Recently, service placement in EC has attracted a lot of attention in the literature. In [25], a unified service placement and request dispatching framework is proposed to optimize the tradeoffs between the average latency of users’ requests and the cost of service providers. In [26], the authors present an application image placement and task scheduling problem in a fog network with dedicated storage and computing servers to minimize the makespan. A constant-factor approximation algorithm is introduced in [28] to find a feasible service placement that maximizes the total user utility considering the heterogeneity of ENs and user locations. In [27], R. Yu et al. consider an IoT application provisioning problem that jointly optimizes application placement and data routing to support all data streams with both bandwidth and delay guarantees.

In [29], a joint application placement and workload allocation scheme is presented to minimize the response time of IoT application requests. The work [30] introduces a fog service provisioning framework that dynamically deploys and releases applications on fog nodes to satisfy low latency and QoS requirements of the applications while minimizing the total system cost. Joint optimization of access point selection and service placement is addressed in [31] to enhance user QoS by balancing the access delay, communication delay, and switching delay. In [33], the authors jointly optimize service placement and request routing in MEC-enabled multi-cell networks to reduce the load of the centralized cloud considering the limited storage capacities of ENs and asymmetric bandwidth requirements of the services. In [14], a two-time-scale optimization framework is proposed to optimize service placement and request scheduling under multi-dimensional resource and budget constraints, in which request scheduling occurs at a smaller time scale and service placement occurs at a larger scale to reduce system instability.

The resource allocation and provisioning problem under uncertainty in cloud/edge computing has also been studied in the recent literature. In [34], T. Ouyang et al. formulate the dynamic service placement problem as a contextual multi-armed bandit problem and propose a Thompson-sampling based online learning algorithm to assist users in selecting an EN for offloading, considering the tradeoff between latency and service migration cost. In the same line of research, due to the unknown benefit of placing an edge service at a specific site, a combinatorial contextual bandit learning problem is presented in [35] to help an application provider decide the optimal set of ENs to host its service under a budget constraint.

In [5, 6], the authors employ the scenario-based stochastic programming approach to tackle different cloud resource provisioning problems that aim to minimize the total resource provisioning cost. Similarly, reference [7] formulates an energy-aware edge service placement problem as a multi-stage stochastic program with the objective of maximizing the QoS of the system under the limited energy budget of edge servers. In [9, 10], the authors propose different robust cloud resource provisioning formulations using the standard RO method. Also, RO is utilized in [11] to jointly optimize radio and virtual machine resources in mobile edge computing.

Different from the existing work, we propose a new two-stage RO model for joint optimization of service placement, sizing, and workload allocation from the perspective of a SP.

VII Conclusion and Future Work

In this paper, we introduced a novel two-stage RO model to help a SP decide an optimal service placement and sizing solution that can hedge against all possible realizations of the uncertain traffic demand within an uncertainty set. Given the placement and sizing decision in the first-stage, the workload allocation decisions are made in the second-stage after the uncertainty is revealed. The proposed robust model enables the SP to balance the tradeoff between the total operating cost and the service quality while taking demand uncertainty into account. Extensive numerical results were presented to demonstrate the advantages of the proposed scheme, which is less conservative compared to the RO approach and more robust compared to the deterministic approach.

There are several interesting directions for future work. First, we would like to study more efficient methods to speed up the computation of both the master problem and the subproblem. Second, instead of polyhedral sets, we would like to explore techniques to tackle more sophisticated and/or data-driven uncertainty sets (e.g., ones that can better capture highly dynamic and correlated uncertain parameters). We also plan to extend the proposed model to the network slicing problem, where a network operator optimizes the planning and operation of its edge network to serve multiple services with uncertain demand and different characteristics. Finally, we are interested in studying the impact of the placement and sizing decisions of the SP on the resource prices when there are multiple SPs in the system.

References

  • [1] M. Chiang and T. Zhang, “Fog and IoT: an overview of research opportunities,” IEEE Internet Things J., vol. 3, no. 6, pp. 854–864, Dec. 2016.
  • [2] W. Shi, J. Cao, Q. Zhang, Y. Li, and L. Xu, “Edge computing: vision and challenges,” IEEE Internet Things J., vol. 3, no. 5, pp. 637–646, Oct. 2016.
  • [3] D.T. Nguyen, L.B. Le, and V.K. Bhargava, “Price-based resource allocation for edge computing: a market equilibrium approach,” IEEE Trans. Cloud Comput., to be published.
  • [4] D.T. Nguyen, L.B. Le, and V.K. Bhargava, “A market-based framework for multi-resource allocation in fog computing,” IEEE/ACM Trans. Netw., vol. 27, no. 3, pp. 1151–1164, June 2019.
  • [5] S. Chaisiri, B. Lee, and D. Niyato, “Optimization of resource provisioning cost in cloud computing,” IEEE Trans. Serv. Comput., vol. 5, no. 2, pp. 164–177, Apr.–Jun. 2012.
  • [6] S. Mireslami, L. Rakai, M. Wang, and B. H. Far, “Dynamic cloud resource allocation considering demand uncertainty,” IEEE Trans. Cloud Comput., to be published.
  • [7] H. Badri, T. Bahreini, D. Grosu, and K. Yang, “Energy-aware application placement in mobile edge computing: a stochastic optimization approach,” IEEE Trans. Parallel Distr. Syst., vol. 31, no. 4, pp. 909–922, Apr. 2020.
  • [8] A. Ben-Tal, L. El Ghaoui, and A. Nemirovski, “Robust Optimization,” Princeton, NJ, USA: Princeton Univ. Press, 2009.
  • [9] S. Chaisiri, B. Lee, and D. Niyato, “Robust cloud resource provisioning for cloud computing environments,” in Proc. IEEE SOCA, Perth, WA, USA, 2010.
  • [10] R. Kaewpuang, D. Niyato, P. Wang, and E. Hossain, “A framework for cooperative resource management in mobile cloud computing,” IEEE J. Sel. Areas Commun., vol. 31, no. 12, pp. 2685–2700, Dec. 2013.
  • [11] Y. Li, J. Liu, B. Cao, and C. Wang, “Joint optimization of radio and virtual machine resources with uncertain user demands in mobile cloud computing,” IEEE Trans. Multimedia, vol. 20, no. 9, pp. 2427–2438, Sept. 2018.
  • [12] D. Bertsimas, E. Litvinov, X. A. Sun, J. Zhao, and T. Zheng, “Adaptive robust optimization for the security constrained unit commitment problem,” IEEE Trans. Power Syst., vol. 28, no. 1, pp. 52–63, Feb. 2013.
  • [13] B. Zeng and L. Zhao, “Solving two-stage robust optimization problems using a column-and-constraint generation method,” Operations Research Letters, vol. 41, no. 5, pp. 457–461, 2013.
  • [14] V. Farhadi, F. Mehmeti, T. He, T.L. Porta, H. Khamfroush, S. Wang, and K.S. Chan, “Service placement and request scheduling for data-intensive applications in edge clouds,” in Proc. IEEE INFOCOM, Apr. 2019.
  • [15] D.T. Nguyen, L.B. Le, and V.K. Bhargava, “Edge computing resource procurement: an online optimization approach,” in Proc. IEEE WF-IoT, pp. 807–812, Singapore, 2018.
  • [16] Z. Xu, W. Liang, W. Xu, M. Jia, and S. Guo, “Efficient algorithms for capacitated cloudlet placements,” IEEE Trans. Parallel Distrib. Syst., vol. 27, no. 10, pp. 2866–2880, Oct. 2016.
  • [17] M. Jia, J. Cao, and W. Liang, “Optimal cloudlet placement and user to cloudlet allocation in wireless metropolitan area networks,” IEEE Trans. Cloud Comput., vol. 5, no. 4, pp. 725–737, Oct.–Dec. 2017.
  • [18] L. Ma, J. Wu, and L. Chen, “DOTA: delay bounded optimal cloudlet deployment and user association in WMANs,” in Proc. IEEE/ACM CCGRID, Madrid, Spain, 2017.
  • [19] L. Wang, L. Jiao, T. He, J. Li, and M. Muhlhauser, “Service entity placement for social virtual reality applications in edge computing,” in Proc. IEEE INFOCOM, Honolulu, HI, USA, 2018.
  • [20] L. Zhao, W. Sun, Y. Shi, and J. Liu, “Optimal placement of cloudlets for access delay minimization in SDN-based internet of things networks,” IEEE Internet Things J., vol. 5, no. 2, pp. 1334–1344, April 2018.
  • [21] S. Yang, F. Li, M. Shen, X. Chen, X. Fu, and Y. Wang, “Cloudlet placement and task allocation in mobile edge computing,” IEEE Internet Things J., vol. 6, no. 3, pp. 5853–5863, Jun. 2019.
  • [22] A. Ceselli, M. Premoli and S. Secci, “Mobile edge cloud network design optimization,” IEEE/ACM Trans. Netw., vol. 25, no. 3, pp. 1818–1831, Jun. 2017.
  • [23] S. Mondal, G. Das, and E. Wong, “CCOMPASSION: a hybrid cloudlet placement framework over passive optical access networks,” in Proc. IEEE INFOCOM, Honolulu, HI, USA, 2018.
  • [24] F. Haider, D. Zhang, M. St-Hilaire, and C. Makaya, “On the planning and design problem of fog computing networks,” IEEE Trans. Cloud Comput., to be published.
  • [25] L. Yang, J. Cao, G. Liang, and X. Han, “Cost aware service placement and load dispatching in mobile cloud systems,” IEEE Trans. Comput., vol. 65, no. 5, pp. 1440–1452, May 2016.
  • [26] D. Zeng, L. Gu, S. Guo, Z. Cheng, and S. Yu, “Joint optimization of task scheduling and image placement in fog computing supported software-defined embedded system,” IEEE Trans. Comput., vol. 65, no. 12, pp. 3702–3712, Dec. 2016.
  • [27] R. Yu, G. Xue, and X. Zhang, “Provisioning QoS-aware and robust applications in internet of things: a network perspective,” IEEE/ACM Trans. Netw., vol. 27, no. 5, pp. 1931–1944, Oct. 2019.
  • [28] S. Pasteris, S. Wang, M. Herbster, and T. He, “Service placement with provable guarantees in heterogeneous edge computing systems,” in Proc. IEEE INFOCOM, Paris, France, Apr. 2019.
  • [29] Q. Fan and N. Ansari, “Application aware workload allocation for edge computing-based IoT,” IEEE Internet Things J., vol. 5, no. 3, pp. 2146–2153, Jun. 2018.
  • [30] A. Yousefpour et al., “FogPlan: a lightweight QoS-aware dynamic fog service provisioning framework,” IEEE Internet Things J., vol. 6, no. 3, pp. 5080–5096, Jun. 2019.
  • [31] B. Gao, Z. Zhou, F. Liu, and F. Xu, “Winning at the starting line: joint network selection and service placement for mobile edge computing,” in Proc. IEEE INFOCOM, pp. 1459–1467, Paris, France, 2019.
  • [32] N. Kherraf, H.A. Alameddine, S. Sharafeddine, C. Assi, and A. Ghrayeb, “Optimized provisioning of edge computing resources with heterogeneous workload in IoT networks,” IEEE Trans. Netw. Serv. Manag., to be published.
  • [33] K. Poularakis, J. Llorca, A. Tulino, I. Taylor, and L. Tassiulas, “Joint service placement and request routing in multi-cell mobile edge computing networks,” in Proc. IEEE INFOCOM, Paris, France, Apr. 2019.
  • [34] T. Ouyang, R. Li, X. Chen, Z. Zhou and X. Tang, “Adaptive user-managed service placement for mobile edge computing: an online learning approach,” in Proc. IEEE INFOCOM, pp. 1468–1476, Paris, France, 2019.
  • [35] L. Chen, J. Xu, S. Ren, and P. Zhou, “Spatio-temporal edge service placement: a bandit learning approach,” IEEE Trans. Wirel. Commun., vol. 17, no. 12, pp. 8388–8401, Dec. 2018.
  • [36] B. Zeng and L. Zhao, “Electronic companion – Solving two-stage robust optimization problems using a column-and-constraint generation method,” Operations Research Letters, 2013.
  • [37] Y. Mao, C. You, J. Zhang, K. Huang, and K. B. Letaief, “A survey on mobile edge computing: the communication perspective,” IEEE Commun. Surv. Tut., vol. 19, no. 4, pp. 2322–2358, Fourth Quarter 2017.
  • [38] H. Zhang, Y. Xiao, S. Bu, D. Niyato, F.R. Yu, and Z. Han, “Computing resource allocation in three-tier IoT fog networks: a joint optimization approach combining stackelberg game and matching,” IEEE Internet Things J., vol. 4, no. 5, pp. 1204–1215, Oct. 2017.
  • [39] M. Jia, W. Liang, Z. Xu, M. Huang, and Y. Ma, “QoS-aware cloudlet load balancing in wireless metropolitan area networks,” IEEE Trans. Cloud Comput., to be published.
  • [40] R. Deng, R. Lu, C. Lai, T.H. Luan, and H. Liang, “Optimal workload allocation in fog-cloud computing toward balanced delay and power consumption,” IEEE Internet Things J., vol. 3, no. 6, pp. 1171–1181, Dec. 2016.
  • [41] R. Albert, H. Jeong, and A.L. Barabasi, “Internet: diameter of the world-wide web,” Nature, vol. 401, no. 6749, pp. 130–131, 1999.
  • [42] Z. Xu, W. Liang, A. Galis, Y. Ma, Q. Xia, and W. Xu, “Throughput optimization for admitting NFV-enabled requests in cloud networks,” Computer Networks, vol. 143, pp. 15–29, 2018.
  • [43] N. Kherraf, S. Sharafeddine, C. Assi, and A. Ghrayeb, “Latency and reliability-aware workload assignment in IoT networks with mobile edge clouds,” IEEE Trans. Netw. Serv. Manag., vol. 16, no. 4, pp. 1435–1449, Dec. 2019.
  • [44] Amazon EC2 on-demand pricing. [Online]. Available: https://aws.amazon.com/ec2/pricing/on-demand/
  • [45] S. Boyd and L. Vandenberghe, “Convex Optimization,” Cambridge, U.K.: Cambridge Univ. Press, 2004.