I Introduction
Edge computing (EC) has emerged as a vital technology that works in tandem with the cloud to mitigate network traffic, improve user experience, and enable various IoT applications. By distributing computational and storage resources to the proximity of end-users and data sources, the new computing paradigm offers remarkable capabilities, such as local data processing and analytics, resource pooling and sharing, real-time computing and learning, enhanced security and reliability, distributed caching, and localization. Additionally, EC is the key to satisfying the stringent requirements of new systems and low-latency applications such as virtual/augmented reality (VR/AR), embedded artificial intelligence, autonomous driving, manufacturing automation, and tactile Internet. In future networks, edge resources form an intermediary layer between a multitude of diverse but constrained end-devices and the large cloud data centers (DCs) [1].
Despite the rapid growth of EC technology and the tremendous potential it holds for the coming years, it is still in its infancy and many challenges remain to be addressed. One of the most important challenges is the multi-tenancy of shared and heterogeneous edge resources, which is the main focus of this paper. In particular, we study the interaction between an EC platform and multiple services (e.g., AR/VR applications, Google Maps). The platform (e.g., a telco, a cloud provider, a third-party [2]) manages a set of edge nodes (ENs) and can monetize the edge resources by selling them to the services. By placing and running the services at the ENs, the service providers (SPs) can drastically enhance the quality of experience for their users since the user requests can be served directly at the network edge.
Our work aims to address two fundamental questions: (1) how can the platform set the edge resource prices optimally, and (2) how much resource should a service purchase from each EN. These questions are challenging due to the interdependence between the decisions of the platform and the services. Specifically, the resource procurement decisions of the services depend on the resource prices set by the platform. Conversely, the pricing decisions of the platform depend on the resource demands of the services.
Also, because of the heterogeneity of the ENs, the services may have diverse preferences towards them. Consequently, the valuations of different ENs to a service can be different. In general, a service prefers low-priced edge resources as well as ENs that have powerful hardware and are geographically close to it. To minimize the network delay between its users and computing nodes, a service tends to procure resources from its closest ENs. Hence, some ENs can be over-demanded (e.g., ENs in or near high-demand areas) while other nodes are under-demanded, which leads to low resource utilization. Intuitively, the platform can reduce the resource prices of underutilized ENs to encourage load shifting from the overloaded ENs.
To this end, we formulate a joint edge resource management and pricing problem between the platform and the services, and propose to cast it as a bilevel optimization model [3] (i.e., a Stackelberg game). The proposed model can not only assist the platform to determine the optimal edge resource prices to maximize its profit, but also help each service find an optimal resource procurement and workload allocation solution to minimize its cost while improving the user experience. In the formulated bilevel problem, the platform is the leader and each service is a follower. The leader decides the optimal resource prices to assign to different ENs, while anticipating the reaction of the followers. Given the edge resource prices computed by the leader, each service solves a follower problem to identify the optimal amount of resource to buy from each EN, considering its delay and budget constraints.
To the best of our knowledge, this is the first bilevel programming formulation for the joint edge service placement, resource procurement, and pricing problem. Note that while Stackelberg games have been used extensively to study various problems in EC, most of the existing models contain a simplified follower problem that normally has a special closed-form solution to facilitate the backward induction method. For our problem, we followed a similar procedure to obtain an analytic solution for the case with a single EN in the system. However, for the general case with multiple ENs, the follower problem becomes considerably more complex. Also, our formulation contains integer service placement variables. Hence, backward induction cannot be applied to solve our bilevel optimization problem. Our formulation makes it easier and more flexible for the services to express their objective functions and constraints.
Bilevel optimization problems are generally extremely hard to solve [3]. In this paper, we present two solutions to compute an exact optimal solution to the formulated mixed integer nonlinear bilevel program (MINBP) in the general case. The first solution relies on the Karush–Kuhn–Tucker (KKT) conditions to convert the bilevel problem into an equivalent mathematical program with equilibrium constraints (MPEC) [4], which is a mixed integer nonlinear program (MINLP). By employing the strong duality theorem and some linearization techniques, we transform this MINLP into a mixed integer linear program (MILP) that can be solved using off-the-shelf MILP solvers such as Gurobi (https://www.gurobi.com/) and Cplex (https://www.ibm.com/analytics/cplexoptimizer).
Although the first solution can solve the bilevel program optimally, the resulting MILP has a large number of constraints and auxiliary integer variables due to the complementary slackness constraints from the KKT conditions. Therefore, we propose an alternative solution that uses linear programming (LP) duality and a series of linearizations to convert the original bilevel problem into an equivalent MILP with significantly fewer constraints and integer variables than the one obtained by the KKT-based approach. Our main contributions can be summarized as follows:

Modeling: We propose a novel bilevel optimization framework for joint edge resource management and pricing, where the platform optimizes the resource pricing, EN activation, and service placement decisions in the upper level while each service optimizes its workload allocation decisions in the lower level. The service preferences are explicitly captured in the proposed framework.

Solution approach: The formulated problem is a challenging MINBP. We first present an analytic solution for the special case with a single EN. When there are multiple ENs in the system, we develop two efficient approaches based on the KKT conditions and LP duality, respectively, to optimally solve the bilevel problem.

Simulation: Extensive numerical results are shown to illustrate the effectiveness of the proposed scheme, which provides a win-win solution for both the EC platform and the services. In particular, it can help increase the profit for the platform, decrease the costs for the services, and improve the edge resource utilization.
The rest of the paper is organized as follows. The system model is described in Section II. Section III introduces the problem formulation. The solution approaches and simulation results are presented in Section IV and Section V, respectively. Section VI discusses related work followed by the conclusions and future work in Section VII.
II System Model
We consider a system that consists of an EC platform, also known as an edge infrastructure provider, and a set of K services. The platform manages a set of N ENs to provide computational resources to the services. The services can be proactively installed onto selected ENs to reduce the communication latency and improve service quality. In practice, various sources (e.g., underutilized DCs in schools/malls/enterprises, idle PCs in research labs, edge servers at base stations, telecom central offices) can serve as ENs [1]. In addition to its own ENs, the platform may also control ENs owned by other entities (e.g., telcos, malls, universities). The EN owners can offer their idle edge resources to the platform in exchange for a certain compensation. The service requests from end-devices normally arrive at a point of aggregation (e.g., switches, routers, base stations, WiFi access points) and are then forwarded to an EN or the cloud for processing.
Throughout this paper, the points of aggregation are referred to as access points (APs). We assume there is a set of M APs in the system. Each service serves users located in different areas, each of which is represented by an AP. Note that an EN can be co-located with an AP. For instance, edge servers can be placed at a base station. In enterprise data centers, edge servers can be deployed near routers/switches. Similar to the previous literature [5, 6, 7, 9, 12, 8, 13, 10, 11], we study the service placement and request scheduling problem from the APs to the ENs and the cloud only. Fig. 1 depicts the system model.
Let , and be the AP index, EN index, and service index, respectively. The network delay between AP and EN is , and the delay between AP and the cloud is . The goal of each service is to minimize not only the resource procurement cost but also the network delay for its users. Define as the resource demand (i.e., workload) of service at AP . Given the resource prices, locations, and specifications of the ENs, each service decides how to optimally divide its workload to the active ENs and the cloud for processing. In Fig. 1, each active EN is represented by a green dot while a red dot indicates an inactive EN.
Each service has a budget for resource procurement. The amount of workload of service at AP assigned to EN is denoted by . Also, is the amount of workload of service at AP routed to the cloud. Define , , and . Clearly, to enhance the user experience, a service prefers to have its requests processed by ENs closer to its users rather than the remote cloud. Let and represent the amounts of computing resources that service purchases from the cloud and EN , respectively. Also, . Define as the maximum delay threshold of service . The average delay of service at AP is . Denote by the delay penalty parameter for service . Let be the size of service . The placement cost of service at EN is , which includes the downloading, installation, and storage costs. The binary variable indicates if service is installed at EN or not. Define and .
For each EN , its storage capacity and computing capacity are denoted by and , respectively. Since the services may have different preferences towards the ENs, some ENs can be over-demanded while others are under-demanded. Hence, a natural solution is to efficiently price the edge resources to balance supply and demand. The unit price of computing resource of EN is denoted by . Define . Moreover, due to the limited storage resource, each EN can host only a subset of services. The operating cost of an active EN includes a fixed cost and a variable cost depending on its computing resource utilization. Let be a binary variable that equals one if EN is active and zero otherwise. Define . The platform needs to jointly decide which ENs to activate, which service to place at which node, and the resource prices of individual ENs to maximize its revenue while minimizing the total operation cost. The main notations are summarized in Table I.
Notation  Meaning 

, M,  Index, number, and set of APs 
, N,  Index, number, and set of ENs 
, K,  Index, number, and set of services 
Storage capacity of EN  
Computing resource capacity of EN  
Fixed cost of EN  
Variable cost of EN  
Network delay between AP and EN  
Network delay between AP and the cloud  
Delay threshold of service  
Average delay of service in area  
Budget of service  
Storage resource requirement of service  
Delay penalty parameter for service  
Resource demand of service at AP  
Placement cost of service at EN  
Binary variable, 1 if service is placed at EN  
Binary variable, 1 if EN is active  
Unit price of computing resource at EN  
Workload of service at AP assigned to EN  
Workload of service at AP assigned to the cloud  
Amount of resource of EN allocated to service  
Amount of resource of cloud allocated to service 
Our work focuses on the interaction between the platform and multiple latency-sensitive services. The platform needs to properly price resources at different ENs to maximize its profit and ensure load balancing, while considering diverse service preferences. The edge resource prices are interdependent because whether a service chooses to offload its tasks to an EN depends not only on the price at that EN but also on the prices at other ENs. Besides resource pricing, the platform is also responsible for downloading and installing the services onto different ENs. The placement decision is subject to the storage capacity constraints of the ENs.
By anticipating the reaction of the services, the platform optimizes the resource prices and service placement. Given the pricing and placement decisions announced by the platform, each service responds by computing its favorite edge resource bundle (i.e., the optimal amount of resource to purchase from each EN). Since the platform acts first and the services make their decisions based on the platform’s decisions, the process is sequential. Thus, we model the interaction between the platform and the services as a bilevel program, where the platform and services are the leader and followers, respectively.
III Problem Formulation
In this section, we formulate the interaction between the platform (i.e., the leader) and the services (i.e., the followers) as a bilevel program which consists of an upper-level optimization problem and lower-level problems, one for each service. The platform solves the upper-level problem to maximize its profit, and then announces the resource prices and service placement decisions to the services. After receiving the information from the platform, each service solves a lower-level problem to minimize its cost under the delay and budget constraints, and then sends the optimal resource procurement and workload allocation solution back to the platform.
Fig. 2 summarizes the interaction between the platform and services. In bilevel programming, the upperlevel problem is commonly referred to as the leader problem while the lowerlevel ones are the follower problems. The optimal solutions of the followers and the leader are interdependent. In particular, the decisions of the followers serve as input to the profit maximization problem of the leader. The output of the leader problem also directly affects the followers’ decisions. The follower problems are indeed constraints to the leader problem. In the following, we will describe the follower problem for each service, the leader problem for the platform, as well as the bilevel optimization model.
III-A The Follower Problem
Given the resource prices and service placement decisions announced by the platform, each service aims to minimize not only the resource procurement cost but also the total network delay by judiciously distributing its workload to the cloud and the ENs that have installed the service. The cost of service for purchasing cloud resource is , where is the unit resource price at the cloud. The total cost of service for purchasing edge resources is . Thus, the total resource procurement cost for service is . The delay cost between AP and EN is proportional to the amount of workload allocation from AP to EN and the network delay between them. Hence, the delay cost of service can be expressed as . The goal of service is to minimize the following objective function, which is the sum of its resource cost and delay cost:
(1) 
where the delay penalty parameter can be adjusted by the service to control the tradeoff between the resource procurement cost and the delay cost. A higher value of implies that the service is more delay-sensitive and willing to pay more to buy edge resources to reduce the overall delay. Note that the actual payment of each service is the resource procurement cost only. The delay penalty cost is a virtual cost, which is used to express how delay-sensitive the service is.
The constraints of the follower problem are described in the following. First, the total workload of service allocated to EN cannot exceed the amount of computing resource purchased from the EN, i.e., we have:
(2) 
Similarly, for the resource purchased from the cloud, we have:
(3) 
The resource demand from AP must be served by either the cloud or some EN, which implies:
(4) 
While the capacity of the cloud is virtually unlimited, the resource of each EN is limited. Hence, the amount of resource purchased from an EN cannot exceed the capacity of that node. Additionally, service buys resources from EN only if the service is placed on EN (i.e., ). Therefore:
(5) 
Since the total resource procurement cost of a service is limited by its budget, we have:
(6) 
The average delay of service in area can be expressed as:
(7) 
A delay-sensitive service may require that the average delay in every area not exceed a certain delay threshold, which imposes .
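As an illustration of this delay requirement, the sketch below computes the average delay of one service at one AP as the workload-weighted mean of the delays to the ENs and to the cloud, and then compares it with a threshold. The weighting by workload share is an assumption consistent with the description of (7); the function name, argument names, and numbers are hypothetical.

```python
def average_delay(edge_workload, edge_delay, cloud_workload, cloud_delay):
    """Workload-weighted mean network delay of one service at one AP.

    edge_workload[j] is the workload sent to EN j and edge_delay[j] the
    AP-to-EN delay; cloud_workload and cloud_delay play the same role for
    the cloud. All names are illustrative, not the paper's notation.
    """
    total = sum(edge_workload) + cloud_workload
    if total <= 0:
        raise ValueError("the AP must have positive demand")
    weighted = sum(x * d for x, d in zip(edge_workload, edge_delay))
    weighted += cloud_workload * cloud_delay
    return weighted / total


# Hypothetical split of 100 workload units: 60 to an EN at 5 ms,
# 20 to an EN at 10 ms, and 20 to the cloud at 50 ms.
delay = average_delay([60.0, 20.0], [5.0, 10.0], 20.0, 50.0)
meets_threshold = delay <= 20.0  # assumed delay threshold of 20 ms
```

Shifting workload from the cloud to nearby ENs lowers the weighted mean, which is why a tight threshold forces a service to buy edge resources even when they are more expensive.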
Furthermore, each service may have certain hardware and software requirements for the ENs that can host the service. For example, some services can only be deployed on ENs that support TensorFlow and Ubuntu. Additionally, if a service is delay-sensitive, its requests from any area should be handled by ENs that are not too far from that area. Thus, we use a binary indicator to indicate whether EN is eligible to serve the demand of service at AP or not. Clearly, we have:
(8) 
Overall, the follower problem for service can be written as follows:
(9) 
subject to
(10)  
(11)  
(12)  
(13)  
(14)  
(15)  
(16)  
(17)  
(18)  
(19) 
where the symbols in parentheses next to the constraints are the corresponding Lagrange multipliers. It is easy to see from the follower problem (9)–(19) that a service buys resource from an EN only if the gain from delay reduction outweighs the cost increment due to the price difference between the cloud resource and edge resource. Note that there are K follower problems, one for each service. In addition, although and are variables in the leader problem, they are parameters in the follower problems.
III-B The Leader Problem
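The trade-off just described can be made concrete with a deliberately simplified sketch. It ignores the capacity, budget, and delay-threshold constraints of the follower problem, so each unit of workload at an AP simply goes to the option with the lowest sum of unit price and delay penalty. The function and argument names are hypothetical, and this is not the full follower linear program.

```python
def best_option(edge_prices, edge_delays, eligible, cloud_price, cloud_delay, penalty):
    """Pick the cheapest place to serve one unit of workload from one AP.

    The per-unit cost of an option is its resource price plus the delay
    penalty times its network delay. Returns ("cloud", cost) or (j, cost),
    where j is the index of the chosen EN. Capacity, budget, and delay
    thresholds are ignored in this sketch.
    """
    best = ("cloud", cloud_price + penalty * cloud_delay)
    for j, (price, delay, ok) in enumerate(zip(edge_prices, edge_delays, eligible)):
        cost = price + penalty * delay
        # An EN wins only if its price premium over the current best
        # option is smaller than the penalty-weighted delay it saves.
        if ok and cost < best[1]:
            best = (j, cost)
    return best


# Hypothetical data: two eligible ENs and a cheap but distant cloud.
sensitive = best_option([3.0, 2.0], [5.0, 20.0], [True, True], 1.0, 60.0, penalty=0.05)
tolerant = best_option([3.0, 2.0], [5.0, 20.0], [True, True], 1.0, 60.0, penalty=0.01)
```

With the larger penalty the service moves to an EN, while with the smaller one the delay saving no longer justifies the edge price and the workload stays in the cloud.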
The objective of the platform is to maximize its profit which is equal to revenue minus cost. The revenue of the platform obtained from selling computing resources is , where is the total amount of computing resource from EN allocated to the services. The total cost of the platform includes the operating cost of the ENs and the service placement cost. The operating cost of an EN depends on the electricity price and power consumption of the node. For simplicity, as commonly assumed in the literature [14, 15], the operating cost of a node is approximated by a linear function. When an EN is active, its operating cost is the sum of a fixed cost and a variable cost which depends on its computing resource utilization. Thus, the operation cost of EN can be expressed as:
(20) 
The second term is actually . However, we later enforce that if (i.e., if EN is not active). Hence, we can ignore in the second term. Note that if EN is owned by a third party, we can simply set in (20), and interpret as the price of the EN offered by the third party and as a binary indicator which equals one if the platform buys EN and zero otherwise.
Besides the electricity cost, in this work, we consider the setting where the platform is also responsible for the service placement cost. The placement cost captures the downloading, installation, and storage costs of service at EN . Since a service can only operate on an active EN, the cost of running service on EN is
Overall, the profit of the platform is:
(21) 
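The profit described above (revenue minus operating cost minus placement cost) can be evaluated for a candidate decision as in the following sketch. The data layout and all names are assumptions made for illustration; in particular, the placement cost is charged per placed service, relying on the requirement that services are placed only on active ENs.

```python
def platform_profit(prices, sold, active, fixed_cost, unit_cost, placed, placement_cost):
    """Profit of the platform for given decisions (illustrative names).

    prices[j]            unit price at EN j
    sold[j]              total computing resource of EN j sold to all services
    active[j]            1 if EN j is active, 0 otherwise
    fixed_cost[j]        fixed operating cost of EN j when active
    unit_cost[j]         variable operating cost per unit of resource used
    placed[k][j]         1 if service k is placed at EN j, 0 otherwise
    placement_cost[k][j] cost of placing service k at EN j
    """
    revenue = sum(p * y for p, y in zip(prices, sold))
    # The variable term needs no activity flag because an inactive EN
    # sells nothing (sold[j] == 0 whenever active[j] == 0).
    operating = sum(f * z + c * y
                    for f, z, c, y in zip(fixed_cost, active, unit_cost, sold))
    placement = sum(phi * t
                    for phi_row, t_row in zip(placement_cost, placed)
                    for phi, t in zip(phi_row, t_row))
    return revenue - operating - placement


# Hypothetical instance: two ENs, two services, only EN 0 is active.
profit = platform_profit(
    prices=[4.0, 2.0], sold=[10.0, 0.0], active=[1, 0],
    fixed_cost=[5.0, 5.0], unit_cost=[0.5, 0.5],
    placed=[[1, 0], [0, 0]], placement_cost=[[2.0, 2.0], [3.0, 3.0]],
)
```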
Next, we describe the leader problem’s constraints. The EN activation and service placement decisions are binary. Thus:
(22) 
Since a service can only be installed on active ENs, we have:
(23) 
We can only allocate computing resource from an active EN to the services. Furthermore, the total allocated computing resource from an EN cannot exceed its computing capacity. Therefore, we have:
(24) 
which implies if = 0, then . Hence, the services cannot receive computing resource from an inactive EN. Similarly, the total storage resource of an EN allocated to the services is limited by its storage capacity, i.e., we have:
(25) 
where is the storage size required for storing service .
We assume that the unit resource price at each EN belongs to a predefined discrete set, i.e., we have
(26) 
where represent different price options . This is a natural assumption since the price options can express different levels of the price (e.g., very low price, low price, medium price, high price, very high price). Another reason that we use discrete sets to express the prices is due to the linearization procedure described later in the solution approach section. Note that if the price is continuous, we can discretize the price range into a large number of intervals. Please refer to Appendix A for more details. Let be a binary variable which equals one if the resource price at EN is . Since the price can take only one value, we have:
(27) 
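A small sketch of the discrete pricing scheme: a continuous price range is discretized into evenly spaced options, and a one-hot binary vector selects exactly one of them, so the price is a linear function of the binaries. The helper names and the even spacing are assumptions for illustration.

```python
def discretize(low, high, levels):
    """Evenly spaced price options covering [low, high]."""
    if levels < 2:
        raise ValueError("need at least two price levels")
    step = (high - low) / (levels - 1)
    return [low + v * step for v in range(levels)]


def price_from_selection(options, selection):
    """Price implied by a one-hot selection vector over the options."""
    if any(r not in (0, 1) for r in selection) or sum(selection) != 1:
        raise ValueError("exactly one price option must be selected")
    # With a one-hot vector, the sum picks out a single option.
    return sum(p * r for p, r in zip(options, selection))


options = discretize(1.0, 3.0, 5)
chosen = price_from_selection(options, [0, 0, 1, 0, 0])
```

A finer grid approximates a continuous price more closely at the expense of more binary variables per EN.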
We are now ready to present the leader problem, which is indeed a bilevel program as presented below:
(28) 
subject to
(29) 
where is the feasible set of satisfying constraints of the follower problem for service . The platform aims to maximize its profit by jointly optimizing the EN activation, service placement, and resource pricing decisions. The follower problems (i.e., the lower-level problems) serve as constraints of the leader problem, as shown in (IIIB).
IV Solution Approach
The bilevel program (28)(IIIB) is generally hard to solve due to not only the constraints (IIIB) in forms of optimization problems but also the bilinear terms in the objective function (28). For the special case with a single EN, we can solve the bilevel problem analytically since the number of possible resource prices (i.e., ) at the EN is finite and small. Please refer to Appendix B for the detailed solution.
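Because the price set is finite, the single-EN case can at least be handled by enumeration: try each price option, compute how the services respond, and keep the most profitable feasible price. The sketch below does this under strong simplifications (one AP per service, no budget or delay-threshold constraints, no placement cost) and is not the analytic derivation of Appendix B. All names and numbers are hypothetical.

```python
def service_demand(price, svc, cloud_price):
    """Edge resource one service buys at a given edge price (simplified).

    The service moves its whole demand to the EN when the edge price plus
    delay penalty is strictly below the cloud price plus delay penalty.
    """
    edge_cost = price + svc["penalty"] * svc["edge_delay"]
    cloud_cost = cloud_price + svc["penalty"] * svc["cloud_delay"]
    return svc["demand"] if edge_cost < cloud_cost else 0.0


def best_single_en_price(options, services, capacity, fixed_cost, unit_cost, cloud_price):
    """Enumerate price options and return (best_price, best_profit)."""
    best = (None, 0.0)  # keeping the EN inactive yields zero profit
    for price in options:
        sold = sum(service_demand(price, s, cloud_price) for s in services)
        if sold > capacity:
            continue  # followers' response would violate the EN capacity
        profit = (price - unit_cost) * sold - (fixed_cost if sold > 0 else 0.0)
        if profit > best[1]:
            best = (price, profit)
    return best


services = [
    {"demand": 10.0, "penalty": 0.1, "edge_delay": 5.0, "cloud_delay": 45.0},
    {"demand": 20.0, "penalty": 0.02, "edge_delay": 5.0, "cloud_delay": 55.0},
]
price, profit = best_single_en_price(
    options=[1.5, 3.0, 4.5, 6.0], services=services,
    capacity=40.0, fixed_cost=5.0, unit_cost=0.5, cloud_price=1.0,
)
```

In this instance a high price that only the delay-sensitive service accepts earns more than a low price that attracts both services, which illustrates why the leader must anticipate the followers' reactions rather than simply maximize sales.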
In the following, we tackle the general case with multiple ENs. First, we present the KKT-based approach to reformulate the bilevel problem into an equivalent single-level MILP. Specifically, the bilevel program is transformed into an MPEC by replacing each follower problem by its KKT conditions. Then, by combining several linearization approaches and the strong duality theorem [4, 16], the resulting MPEC can be recast as an MILP. Instead of relying on the KKT conditions, the second approach employs LP duality to convert the bilevel problem into an equivalent MILP with considerably fewer constraints and integer variables than the one obtained from the KKT-based approach.
IV-A KKT-based Reformulation
First, recall that the optimization variables and of the leader problem are parameters of the follower problems. Hence, for fixed values of and , the lower-level problem (9)–(19) is a linear program and therefore convex. As a result, the KKT conditions are necessary and sufficient for optimality. Consequently, we can replace each follower problem by its corresponding KKT conditions, including the stationarity, complementary slackness, primal feasibility, and dual feasibility conditions [17]. The primal feasibility conditions are (10)–(19). The Lagrangian of the follower problem (9)–(19) is:
(30)  
Thus, the KKT stationary conditions are given as follows:
(31)  
(32)  
(33)  
(34)  
(35) 
Also, the complementary slackness, dual feasibility, and the primal feasibility conditions of the follower problems render:
(36)  
(37)  
(38)  
(39)  
(40)  
(41)  
(42)  
(43) 
Note that means , and . Constraints (36)–(43) are called complementarity constraints or equilibrium constraints. By replacing constraints (IIIB) for the follower problems with the set of constraints (10)–(19) and (31)–(43), the bilevel program (28)(IIIB) becomes an MPEC problem. This MPEC problem has three sources of nonlinearity: i) the complementarity constraints (36)–(43); ii) the bilinear terms in (35); and iii) the bilinear term in the objective function (28). To convert the MPEC problem (i.e., an MINLP) into an MILP, we need to linearize these nonlinear terms.
First, the nonlinear complementarity constraints (36)–(43) can be transformed into equivalent exact linear constraints by using the Fortuny-Amat transformation [18]. Specifically, the complementarity condition is equivalent to the following set of mixed-integer linear constraints:
(44) 
where M is a sufficiently large constant, , which is often referred to as “big-M”. Therefore, the set of constraints (36)–(43) can be rewritten as:
(45)  
(46)  
(47)  
(48)  
(49)  
(50)  
(51)  
(52)  
(53)  
(54)  
(55) 
where , , , , , , and are sufficiently large numbers. For the bilinear terms , using (27), we can rewrite it as:
(56) 
where . Note that is a continuous variable and we have if and , otherwise. Hence, using (56), the bilinear term can be written as a linear function of . Additionally, the constraints can be implemented through the following linear inequalities [19]:
(57)  
(58) 
where is a sufficiently large number.
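The two big-M devices used above can be sanity-checked by brute force. The sketch assumes the standard forms of both: a complementarity condition between two nonnegative quantities a and b is encoded with a binary u as a <= (1 - u) M and b <= u M, and the product q of a binary r and a nonnegative continuous x is encoded as 0 <= q <= M r together with x - M (1 - r) <= q <= x. These forms and all names are assumptions for illustration and may differ in detail from (44), (57), and (58).

```python
from itertools import product


def complementarity_feasible(a, b, big_m):
    """True if some binary u satisfies 0 <= a <= (1-u)*M and 0 <= b <= u*M."""
    return any(
        0 <= a <= (1 - u) * big_m and 0 <= b <= u * big_m
        for u in (0, 1)
    )


def product_bounds(r, x, big_m):
    """Interval of q allowed by the big-M product constraints, for x >= 0."""
    lower = max(0.0, x - big_m * (1 - r))
    upper = min(big_m * r, x)
    return lower, upper


BIG_M = 10.0
grid = [0.0, 1.0, 3.0]

# The big-M system is feasible exactly when at least one of a, b is zero.
complementarity_ok = all(
    complementarity_feasible(a, b, BIG_M) == (a * b == 0)
    for a, b in product(grid, grid)
)

# The product constraints pin q to r * x for every binary r.
product_ok = all(
    product_bounds(r, x, BIG_M) == (r * x, r * x)
    for r, x in product((0, 1), [0.0, 2.5, 7.0])
)
```

Both checks hold only when the constant is at least as large as every value the bounded quantities can take. A constant that is too small cuts off valid solutions, while one that is far too large weakens the linear relaxation and can slow the solver.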
We assume that the bilevel problem has an optimal solution and the strong duality theorem holds for every follower problem. Then, the strong duality theorem gives us the following:
(59) 
Hence, using (IVA), the bilinear term can be written as the sum of several linear terms. Note that each bilinear term in (IVA) is a product of a continuous variable and a binary variable. Therefore, we can linearize it in the same way as the bilinear terms , using (57) and (58).
Based on the linearization steps described above, we can then express the bilevel problem (28)(IIIB) with an equivalent single-level MILP as follows:
subject to
where Rev is the revenue of the platform from selling edge resources, i.e., . Problem () is a large-scale MILP, which can be solved by MILP solvers.
IV-B Duality-based Reformulation
Instead of using the KKT conditions, we can utilize LP duality to transform the original bilevel problem into an equivalent MILP. We first write the dual maximization form of each lower-level minimization problem (9)–(19). Subsequently, we replace each lower-level problem by its primal and dual feasibility conditions together with a constraint equating the primal and dual objective functions [17]. The dual problem of the follower problem (9)–(19) of service is given below:
(60) 
subject to
(61)  
(62)  
(63)  
(64)  
(65)  
(66)  
(67) 
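For reference, the replacement of a lower-level LP by feasibility conditions plus equal objective values rests on standard LP duality. In generic form, with illustrative symbols that are not the paper's notation:

```latex
% Generic primal-dual pair (illustrative symbols)
\begin{aligned}
\text{(P)}\quad & \min_{x}\; c^{\top}x && \text{s.t. } Ax \ge b,\; x \ge 0,\\
\text{(D)}\quad & \max_{\lambda}\; b^{\top}\lambda && \text{s.t. } A^{\top}\lambda \le c,\; \lambda \ge 0.
\end{aligned}

% Weak duality: for any feasible pair,
b^{\top}\lambda \;\le\; \lambda^{\top}Ax \;\le\; c^{\top}x .

% Hence x is optimal for (P) and \lambda is optimal for (D) if and only if
Ax \ge b,\quad x \ge 0,\quad A^{\top}\lambda \le c,\quad \lambda \ge 0,\quad c^{\top}x = b^{\top}\lambda .
```

The last line needs no complementarity constraints and therefore no auxiliary binary variable per constraint, which is the source of the smaller MILP. When the cost vector or the right-hand side depends on the leader's variables, the equality contains products of variables that still have to be linearized.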
The dual feasibility constraints are (61)–(67). Thus, the complete form of the final MILP optimization problem resulting from the duality-based reformulation is given as follows: