Mobile Cloud Computing (MCC) extends the capabilities of mobile devices to improve user experience. Mobile users can offload tasks to the cloud, using abundant cloud resources to help them gather, store, and process data. However, the interaction between mobile devices and the cloud introduces key challenges in system design. For example, the decision on whether to offload tasks to the cloud needs to balance the tradeoff between energy consumption and computing performance. Furthermore, the communication delay between mobile users and the cloud needs to be taken into consideration.
With the aim of reducing the communication delay in task offloading, Mobile Edge Computing (MEC), as defined by the European Telecommunications Standards Institute (ETSI), is a distributed MCC system where computing resources are installed locally at or near the base station of a cellular network. MEC shares similarities with micro cloud centers, cloudlets, cyber-foraging, and fog computing, except that the MEC computing servers are managed by a mobile service provider, which allows more direct control and resource management.
Similar to the concept of MEC, in this work we use the general term computing access point (CAP), which refers to a wireless access point or a cellular base station with built-in computation capability. For example, CAPs may be provided by Internet service providers as a value-added service. A mobile device that wishes to offload a task first sends it to the CAP. The CAP may serve its conventional networking function and forward the task to the remote cloud server, or process the task itself. The additional option of computation at the CAP reduces the need to access the remote cloud server, and hence can potentially decrease the communication delay as well as the overall energy and computation cost. However, the availability of the CAP adds an extra dimension of variability to the offloading decision: each task may be processed locally at the mobile device, at the CAP, or at the remote cloud server. Furthermore, both computation and communication resources must be considered across the different offloading choices. This makes optimizing the mobile task offloading decision even more challenging.
In this work, we study the interaction among multiple users, the CAP, and the cloud. In a multi-user scenario, to offload tasks, we need to allocate communication and computation resources among competing users. We jointly consider both the offloading decision and resource allocation among all users, with an aim to conserve energy and maintain service quality for all of them. For this joint optimization problem, an optimal offloading decision must take into consideration the computation and communication energies, computation costs, and communication and processing delays at all local user devices, as well as the resource constraints and capabilities of the CAP and the remote cloud. The contributions of this work are summarized as follows:
We focus on jointly optimizing the offloading decisions and the computation and communication resource allocation for multiple mobile users with one CAP and one remote cloud server. We formulate the joint optimization problem to minimize a weighted sum of the costs of energy, computation, and the maximum delay among all users. This results in a mixed-integer programming problem. To solve this challenging problem, we first reformulate and transform it into a non-convex quadratically constrained quadratic program (QCQP), which is still NP-hard in general. To obtain a solution, we then propose an efficient heuristic algorithm, termed shareCAP, based on semidefinite relaxation (SDR) and a novel randomization mapping method.
We further study the scenario where there is a strict processing deadline for each user’s task. With these additional delay constraints, the proposed shareCAP method can no longer be directly applied to find a solution due to the absence of a feasibility guarantee. To solve this more complicated optimization problem, we further propose a three-step algorithm named shareCAP-D, consisting of SDR, adaptive adjustment, and sequential tuning, to iteratively find a solution. We show that shareCAP-D guarantees a locally optimal solution.
Through numerical study, by comparing with an optimal offloading policy obtained by exhaustive search, we demonstrate that the proposed shareCAP and shareCAP-D methods give nearly optimal performance under a wide range of parameter settings. Furthermore, we observe that the addition of a CAP can significantly reduce the energy and computational costs of the system, as compared with the conventional MCC where only the remote cloud server is available for task offloading.
The rest of this paper is organized as follows. Related works are reviewed in Section 2. In Section 3, we describe the system model for mobile cloud computing with a CAP and formulate the optimization problem. In Section 4, we transform our problem to a QCQP problem and solve it through the SDR approach. In Section 5, we further study the scenario with strict delay constraints. In Section 6, we extend our work to sum delays optimization. Numerical results are presented in Section 7, followed by conclusion in Section 8.
Notations: We denote by a^T and A^T the transpose of vector a and matrix A, respectively. The notation diag(a) denotes the diagonal matrix whose diagonal elements are the elements of vector a. The trace of matrix A is denoted by tr(A). We use [A]_{i,j} to denote the (i, j) entry of matrix A. We write A ⪰ 0 to indicate that A is a positive semidefinite matrix.
2 Related Work
Many existing works study task offloading from mobile users to a local (or remote) processor in two-tier cloud systems. For a single mobile user offloading its entire application to the cloud, the authors of [14, 15, 16] presented different energy models to analyze whether or not to offload the application to the cloud, and the tradeoff between energy consumption and computing performance was studied in [17, 18]. Furthermore, many studies have considered partitioning an application into multiple tasks. Among them, MAUI , Clonecloud , and Thinkair  are systems proposed to enable a mobile device to offload tasks to the cloud. These works focus on the implementation of offloading mechanisms from the mobile device to the cloud, with limited discussion on optimizing the offloading decisions. In  and , heuristic offloading policies were proposed for a mobile user with sequential tasks. In [24, 25, 26], the problem of cloud offloading for a mobile user with dependent tasks was studied. In , offloading a mobile user's tasks in an intermittently connected cloud system was considered. The impact of mobility was considered in , where the authors proposed an opportunistic offloading algorithm. All of the studies above focus on a single mobile user.
Task offloading by multiple mobile users has been considered in [29, 30, 31, 32, 33, 34, 35, 36, 37], where each user has a single application or task to be offloaded to the cloud in its entirety. In [29, 30, 31], the authors considered optimizing offloading decisions, aiming to maximize the revenue of the mobile cloud service providers under a fixed resource usage per user. The cooperation among selfish service providers to improve revenue was further studied in . The authors of [32, 33] studied the allocation of radio and computation resources in the scenario where all tasks are always offloaded. The joint optimization of the offloading decision and communication and computation resources for system utility maximization was considered in , where the number of tasks that can be offloaded is limited by the transmission bandwidth; a heuristic algorithm was proposed to obtain the resource allocation and offloading decision sequentially. Game-theoretic approaches were adopted in [35, 36, 37] to study decentralized decision control in systems where offloading decisions are made by mobile users acting as selfish players. However, these game-theoretic works focus on the offloading decisions of each user without considering the allocation of communication and computation resources. Furthermore, a multi-user scenario where each user has multiple independent tasks was considered in , where an offloading decision algorithm was proposed to minimize the weighted cost of energy consumption and worst-case offloading delay. The authors of  considered a mobile device cloud, composed purely of proximal mobile devices, and proposed a task scheduling mechanism for concurrent application management. Coordination of local mobile devices forming a mobile cloud has been studied in . All of the studies above focus on a two-tier cloud network consisting of only mobile users and another tier of local or remote processors.
The three-tier network consisting of mobile users, a local computing node (e.g., a cloudlet or CAP), and a remote cloud server has been studied in [41, 42, 43, 44, 45]. Without considering resource allocation, centralized heuristic algorithms for offloading decisions were proposed in [41, 42, 43], while a game-theoretic approach was adopted in  to obtain the offloading decision in a distributed manner. Despite these works, the joint optimization of the offloading decision and the allocation of computation and communication resources for a general three-tier system has not been investigated before. The joint optimization problem is much more complicated to solve, because the offloading decision and the resource allocation are inter-dependent.
In our recent work , a multi-user scenario where each user has multiple independent tasks was considered for the joint optimization of offloading and the allocation of communication and computation resources. The differences between this work and  are as follows: 1) the problem structures are different, leading to different problem formulations and solution approaches; 2) for the single-task-per-user case studied in this paper, we propose a low-complexity algorithm that is shown to achieve nearly optimal performance, a combined advantage in both complexity and performance that cannot be achieved by the algorithm proposed in ; 3) in this work, we further study the scenario where a strict processing deadline is imposed on each user's task, which cannot be addressed by the solution approach proposed in .
3 System Model and Problem Formulation
In this section, we first introduce the model of mobile cloud computing with a CAP, detailing the costs of processing locally, at the CAP, and at the cloud. We then explain the joint offloading decision and resource allocation optimization problem to minimize a weighted sum cost.
3.1 System Model
3.1.1 Mobile Cloud with CAP
Consider a cloud access network with mobile users, one CAP, and one remote cloud server, as shown in Fig. 1. The CAP is a wireless access point (or a cellular base station) with built-in computation capability, which may be provided by Internet service providers as a value-added service. Instead of merely relaying received tasks from the users to the cloud, the CAP can also process user tasks itself, subject to its resource constraint. We denote the set of all users by . Each mobile user has one task to be either processed locally or offloaded to the CAP, and the CAP determines whether to process each received task itself or further offload it to the cloud for processing. Since multiple tasks are offloaded to the CAP and some of them are processed there, we must further allocate the communication and computation resources available at the CAP.
We assume that all tasks are available at time zero. This is similar to many existing studies [15, 16, 17, 18, 33, 35, 36, 37, 34]. If the tasks arrive dynamically in time, we may apply our model and the proposed solution in a quasi-static manner, where the system processes the tasks in batches that are collected over time intervals .
3.1.2 Offloading Decision
Denote the offloading decisions for user by , indicating whether user ’s task is processed locally, at the CAP, or at the cloud, respectively. The offloading decisions are constrained by
Notice that only one of , , and for user can be .
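As a small illustration of this placement constraint (a sketch with hypothetical variable names, since the paper's symbols did not survive extraction), each user's decision is a binary triple with exactly one non-zero entry:

```python
from itertools import product

def is_feasible(decision):
    """A per-user offloading decision (x_local, x_cap, x_cloud) is
    feasible iff each indicator is binary and exactly one equals 1."""
    return all(x in (0, 1) for x in decision) and sum(decision) == 1

# Of the 2**3 binary triples, only three satisfy the constraint:
# process locally, at the CAP, or at the cloud.
feasible = [d for d in product((0, 1), repeat=3) if is_feasible(d)]
```

So each user contributes exactly three candidate processing locations, which is what makes the joint decision space grow as 3 to the power of the number of users.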
3.1.3 Cost of Local Processing
The input data size, output data size, and processing cycles of user 's task are denoted by , , and , respectively. (The processing cycles of user 's task depend on the input data size and the application type. For simplicity of illustration, we initially assume that a task requires the same number of processing cycles on different CPUs, so that its processing time is a function of the CPU's clock speed only; we explain later how the proposed solution can be trivially extended to the general case.) Similar to [16, 17, 18, 33, 35, 36, 37, 34], we assume that these quantities are known, which may be achieved by applying a program profiler [19, 20, 21]. We further assume that the additional instructions required for remote processing can be downloaded directly by the CAP or the cloud via their access to a high-capacity wired network. When the task is processed locally, the processing energy is denoted by and the processing time by .
3.1.4 Cost of CAP Processing
For user ’s task being offloaded to the CAP, we denote the energy consumed by wireless transmission (to the CAP) and reception (from the CAP) at user by and , respectively. We further denote the uplink and downlink transmission times between user and the CAP by and , respectively, where and are the uplink and downlink bandwidths allocated to user , and and are the spectral efficiencies of the uplink and downlink transmission between user and the CAP, respectively. (The spectral efficiency can be approximated by , where is the link quality between user and the CAP.) Furthermore, and are limited by the uplink bandwidth and downlink bandwidth as follows
Since some uplink and downlink transmissions may overlap with each other, there is also a total bandwidth constraint
If this task is processed by the CAP, denote its processing time by , where is the assigned processing rate, which is limited by the total processing rate at the CAP as
The usage cost associated with the CAP processing user ’s task is denoted by . The usage cost may depend on the data size and processing cycles of a task, as well as the hardware and energy cost to maintain the CAP. Detailed modeling of the usage cost is outside the scope of this work. Here we simply assume that is given for all .
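The delay terms above follow the standard forms: transmission time is data size divided by allocated bandwidth times spectral efficiency, and processing time is cycles divided by the assigned processing rate. A minimal sketch under these assumptions (all names are hypothetical, since the paper's symbols are elided):

```python
def cap_delays(d_in, d_out, cycles, b_up, b_down, eta_up, eta_down, f_cap):
    """Delay terms when a user's task is offloaded to the CAP:
    uplink time, downlink time, and CAP processing time.
    b_up, b_down: bandwidth (Hz) allocated to this user;
    eta_up, eta_down: spectral efficiencies (bits/s/Hz);
    f_cap: CAP processing rate (cycles/s) assigned to this user."""
    t_up = d_in / (b_up * eta_up)         # uplink transmission time
    t_down = d_out / (b_down * eta_down)  # downlink transmission time
    t_proc = cycles / f_cap               # CAP processing time
    return t_up, t_down, t_proc
```

Note that b_up, b_down, and f_cap are the decision variables coupled across users by the total bandwidth and total CAP processing rate constraints.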
3.1.5 Cost of Cloud Processing
| Notation | Definition |
| --- | --- |
| ,  | input data size and output data size of user ’s task |
|  | processing cycles of user ’s task |
|  | local processing energy of user ’s task |
| ,  | uplink transmitting energy and downlink receiving energy of user ’s task to and from the CAP |
| , ,  | local processing time, CAP processing time, and cloud processing time of user ’s task |
| ,  | uplink transmission time and downlink transmission time of user ’s task between the mobile user and the CAP |
|  | transmission time of user ’s task between the CAP and the cloud |
| ,  | uplink bandwidth and downlink bandwidth for transmission between mobile users and the CAP |
|  | total transmission bandwidth between mobile users and the CAP |
| ,  | uplink bandwidth and downlink bandwidth assigned to user |
| ,  | spectral efficiency of uplink and downlink transmission between user and the CAP |
| ,  | CAP usage cost and cloud usage cost of user ’s task |
|  | transmission rate for each user between the CAP and the cloud |
|  | cloud processing rate for each user |
|  | total CAP processing rate |
|  | CAP processing rate assigned to user |
|  | weight of the CAP usage cost |
|  | weight of the cloud usage cost |
|  | weight of the energy consumption of user ’s task |
If a task is further offloaded to the cloud from the CAP, besides all the costs mentioned above (except and , which relate to task processing at the CAP), there is an additional transmission time between the CAP and the cloud, denoted by , where is the transmission rate between the CAP and the cloud. Also, the cloud processing time is denoted by , where is the cloud processing rate for each user. The rate is assumed to be a pre-determined value regardless of the number of users, because the CAP-cloud link is likely to be a high-capacity wired connection compared with the limited wireless links between the mobile users and the CAP; thus there is no need to consider bandwidth sharing among the users. Similarly, is assumed to be a pre-determined value because of the high computational capacity and dedicated service of the remote cloud server. Thus, and depend only on the task itself. Finally, the cloud usage cost of processing user ’s task at the cloud is denoted by .
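The cloud path therefore incurs the wireless uplink/downlink times plus two fixed, per-task terms. A sketch under the assumption (hypothetical names; the exact transfer-time expression is not recoverable from the source) that the CAP-cloud transfer time is the total data moved divided by the fixed CAP-cloud rate:

```python
def cloud_path_delay(d_in, d_out, cycles, t_up, t_down, r_ac, f_cloud):
    """Total delay when the CAP forwards a task to the cloud:
    wireless uplink/downlink times plus the CAP-cloud transfer time
    and the cloud processing time. r_ac (CAP-cloud rate) and f_cloud
    (per-user cloud processing rate) are fixed, pre-determined values,
    so these two terms depend only on the task itself."""
    t_ac = (d_in + d_out) / r_ac  # CAP <-> cloud transfer time (assumed form)
    t_cloud = cycles / f_cloud    # cloud processing time
    return t_up + t_down + t_ac + t_cloud
```

Because r_ac and f_cloud are not shared among users, only t_up and t_down involve the contested wireless resources.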
The above notations are summarized in Table I.
3.2 Problem Formulation
Our goal is to reduce the mobile users’ energy consumption and maintain the service quality to their tasks. To do so, we define the total system cost as the weighted sum of total energy consumption, the costs to offload and process all tasks, and the corresponding maximum transmission and processing delays among all users. We aim to minimize the total system cost by jointly optimizing the task offloading decision vector and the communication and CAP processing resource allocation vector .
For user ’s task being offloaded to the CAP, we define as the weighted transmission energy and processing cost. Similarly, we define as the weighted transmission energy and processing cost if the task is offloaded to the cloud. They are given by
where and are the relative weights between the transmission energy and the processing cost in and , respectively. The local processing delay at user , denoted by , is given by
Also, define and as the transmission and processing delay at the CAP and the cloud, respectively. We have
The values of , , and depend on the offloading decisions and the communication and CAP processing resource allocation . The joint optimization of offloading and resource allocation is formulated as follows
where is the weight on energy consumption relative to the delay. The proposed optimization problem (6) can be solved by any controller in the network after it collects all required information. In practice, the controller could be the CAP: each user provides its information to the CAP, which solves problem (6) and then broadcasts the obtained offloading decisions (and the corresponding resource allocations) to all users.
Notice that in problem (6), the cost of delay is considered in the total system cost objective. We put different emphasis on delay by adjusting . Note that, since processing delay is in the objective instead of as a constraint in problem (6), any offloading decision and resource allocation are feasible. However, in practice, there are applications that require strict processing deadlines, and some offloading decisions may not satisfy the strict delay constraint for a task. This scenario will be further discussed in Section 5.
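The objective structure described above can be sketched as follows (hypothetical names and a dictionary-based encoding for illustration only): a sum of per-user weighted energy/usage costs plus a weight times the worst delay among all users.

```python
def total_cost(decisions, weighted_cost, delay, rho):
    """Weighted-sum objective sketch: the sum over users of the
    weighted energy/usage cost of each user's chosen processing
    location, plus rho times the maximum delay among all users.
    decisions[i] is one of 'local', 'cap', 'cloud';
    weighted_cost[i][loc] and delay[i][loc] are per-location values."""
    cost = sum(weighted_cost[i][loc] for i, loc in enumerate(decisions))
    worst = max(delay[i][loc] for i, loc in enumerate(decisions))
    return cost + rho * worst
```

Increasing rho shifts the optimal decisions toward low-delay placements, which is exactly the emphasis-tuning knob discussed above.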
4 ShareCAP Offloading Solution
For the scenario without any delay constraint, we show in this section that optimization problem (6) has an equivalent QCQP formulation that is NP-hard in general. We then present our proposed solution through the SDR and randomization mapping approach.
4.1 Overview of the Proposed Solution
Given some offloading decisions , problem (6) concerns only the resource allocation vector as
Note that only depends on , and thus can be treated as a constant. The resource allocation problem (9) is convex and can be solved optimally using standard convex optimization approaches such as the interior-point method. Since there is a finite number of offloading decisions, a globally optimal solution for problem (6) can be obtained by exhaustive search over all possible offloading decisions. However, the complexity grows exponentially with the number of users and is thus impractical.
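The exhaustive-search baseline can be sketched as follows (hypothetical names; `evaluate` stands in for solving the inner convex resource-allocation problem for a fixed decision and returning the resulting minimum system cost):

```python
from itertools import product

def exhaustive_search(num_users, evaluate):
    """Brute-force baseline: enumerate all 3**num_users offloading
    decisions and keep the one with the smallest evaluated cost."""
    best_dec, best_cost = None, float('inf')
    for dec in product(('local', 'cap', 'cloud'), repeat=num_users):
        cost = evaluate(dec)
        if cost < best_cost:
            best_dec, best_cost = dec, cost
    return best_dec, best_cost
```

The 3**num_users enumeration is what makes this approach usable only as an optimality benchmark for small user populations.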
In order to find an efficient solution to problem (6), we first transform it into a separable QCQP with a linear objective, and then propose a separable SDR approach and a novel randomization mapping method to recover the binary offloading decisions. Once we obtain the binary offloading decisions, we can easily solve problem (9) to find the corresponding optimal resource allocation. We name our method the shareCAP offloading and resource allocation solution.
4.2 QCQP Reformulation and Semidefinite Relaxation
We first replace the integer constraint (8) by
for . Then, we move the delay term from the objective to the constraints by introducing additional auxiliary variable . Optimization problem (6) is equivalent to the following problem
We now show that the optimization problem (11) can be transformed into a separable QCQP problem by the following steps.
First, we introduce additional auxiliary variables . Constraint (12) can be equivalently replaced by the following four constraints
Next, we vectorize the variables and parameters in (11). Define
which is the decision vector for user containing all decision variables. Then, we can rewrite the objective in (11) as
We then replace the offloading placement constraint (1) with
Similarly, the total bandwidth constraint (4) is as follows
where The constraint (5) on the CAP processing resource allocation can be rewritten as
where Constraint (7), which ensures that all variables are nonnegative, is replaced by
Finally, we rewrite integer constraint (10) as
where each is a standard unit vector with the th entry being . By further defining , for in , and together with the above matrix form expressions, optimization problem (11) can now be transformed into the following equivalent homogeneous separable QCQP formulation
Comparing the optimization problems (11) and (29), all constraints have one-to-one corresponding matrix representations. Specifically, constraint (30) is the overall delay constraint, constraint (31) comes from the additional auxiliary constraints (14)-(16), constraint (32) is the offloading placement constraint, constraints (33) and (34) correspond to uplink and downlink bandwidth resource constraints, constraint (35) is the total bandwidth constraint, constraint (36) is the constraint on the CAP processing resource allocation, and constraint (37) corresponds to the integer constraint (10). Therefore, optimization problem (29) is equivalent to the original problem (6).
Note that optimization problem (29) is a non-convex separable QCQP problem, which is NP-hard in general. To solve it, we apply a separable SDR approach to relax it into a separable semidefinite programming (SDP) problem. Define . We then have
with . By dropping the rank constraint , we relax problem (29) into the following separable SDP problem
The above problem can be solved efficiently in polynomial time using standard SDP software, such as SeDuMi . Denote as the optimal solution of the SDP problem (40). We need to obtain the offloading decision of the original problem (6) from . In the following, we propose a randomization method to obtain our binary offloading decisions.
4.3 Binary Offloading Decisions via Randomization
One might consider using a common approach to obtain an integer solution from the relaxed SDP problem: randomly generate vectors from the Gaussian distribution with zero mean and covariance for times, and then map them to the integer set by using the sign of each element in these vectors. Among the generated vectors, the one that yields the best objective value of the original problem would be chosen as the desired solution. However, this randomization procedure does not produce a feasible solution, due to the offloading decision placement constraint (1). Instead, using the structure of and the constraints in problem (40), we propose the following randomization method to obtain a feasible solution.
Denote the offloading solution vector as
where , for . Since we have removed the rank-1 constraint from problem (29) to arrive at the relaxed problem (40), the obtained solution for problem (40) does not directly provide a feasible binary solution for the offloading decisions. Our goal is to obtain appropriate offloading decisions from by mapping its elements to binary numbers. Note that only the first three elements in correspond to the offloading decision variables for user (see in (18)). Also, since and , we know that the last row of satisfies , for all . Hence, we can use the values of to recover the binary offloading decision , for . Before providing the details of the proposed randomization method, we first show the property of , for , in the following lemma.
Lemma 1: For the optimal solution of problem (40), , for , and .
Based on Lemma 1, we consider a probabilistic mapping method to find . We take
as the probability of, i.e., . Denote
Equivalently, this means . We reconstruct using as marginal probabilities, while satisfying constraint (1). This leads to our proposed probabilistic randomization method as follows.
To satisfy the placement constraint (1), we define the random vector , which represents the location where user ’s task will be processed, as follows:
We generate i.i.d. feasible offloading solutions using the above procedure, for , and solve the corresponding resource allocation problem (9) for each . Among these feasible solutions, we choose the one that gives the minimum objective value of optimization problem (6) to obtain the offloading solution and the corresponding optimal resource allocation . In practice, we should also compare this solution with those of the local-processing-only and cloud-processing-only methods, and select the one that gives the minimum cost as the final offloading decision and the corresponding optimal resource allocation .
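The sampling-and-selection step above can be sketched as follows. This is a simplified illustration with hypothetical names: each user's marginal probabilities (from Lemma 1) drive an independent draw of a processing location, so every sample automatically satisfies the one-hot placement constraint, and `evaluate` stands in for solving problem (9) and returning the objective value.

```python
import random

def randomized_rounding(marginals, evaluate, trials=100, seed=0):
    """Probabilistic recovery sketch: sample each user's processing
    location from its SDR-derived marginals, evaluate each sampled
    decision, and return the best candidate. The all-local and
    all-cloud decisions are included as fallback candidates."""
    rng = random.Random(seed)
    locations = ('local', 'cap', 'cloud')
    candidates = [tuple('local' for _ in marginals),
                  tuple('cloud' for _ in marginals)]
    for _ in range(trials):
        dec = tuple(rng.choices(locations, weights=p)[0]
                    for p in marginals)
        candidates.append(dec)
    return min(candidates, key=evaluate)
```

Because every candidate is feasible by construction, the selection step never needs the sign-mapping repair that the naive Gaussian rounding would require.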
The details of the overall shareCAP offloading and resource allocation algorithm are given in Algorithm 1. Notice that the SDP problem (40) can be solved within a prescribed precision by the interior-point method in at most iterations, where the amount of work per iteration is . This compares favorably with the choices in exhaustive search to find an optimal offloading decision. In addition, we observe from numerical results that a small number of randomization trials (e.g., ) is enough to give system performance very close to the optimum.
The proposed solution can be easily extended to the general case where the number of processing cycles required by a task differs across CPUs, because these quantities appear only as constants in optimization problem (6).
5 Offloading with Delay Constraints
Time-sensitive applications in practice may have strict processing deadlines, which complicates the offloading decisions and resource allocation. In this section, we further study the scenario where each task must be completed before some given deadline. That is, there is a strict delay constraint for each user’s task given by
where we note that only one of , , and is non-zero by their definitions and constraint (1). To ensure that at least one feasible offloading solution exists, we assume that the local processing time satisfies , so that each user can at least process its task locally to meet the deadline regardless of the availability of remote processing. With the above additional delay constraints, the optimization problem becomes