
## I Introduction

The explosive growth of the Internet of Things (IoT) in recent years enables cost-effective interconnections among tens of billions of wireless devices (WDs), such as sensors and wearable devices. Due to stringent size constraints and production cost concerns, an IoT device is often equipped with a limited battery and a low-performance on-chip computing unit, which are recognized as two fundamental impediments to supporting computation-intensive applications in future IoT. Mobile edge computing (MEC) [2, 3], viewed as an efficient solution, has attracted significant attention. The key idea of MEC is to offload intensive computation tasks to the edge of the radio access network, where much more powerful servers compute on behalf of the resource-limited WDs. Compared with traditional mobile cloud computing, MEC overcomes the drawbacks of high overhead and long backhaul latency.

further considered a wireless powered MEC system and maximized the probability of successful computation. The performance optimization of multi-user wireless powered MEC systems was later studied in [4, 5]. On the other hand, [8] and [9] considered a more general scenario, where the binary offloading model is applied to multiple independent tasks. Specifically, [8] considered multiple mobile users, each offloading multiple independent tasks to a single access point. In [9], a single user offloads independent tasks to multiple edge devices, minimizing the weighted sum of the WD's energy consumption and the total task execution latency.

The call graphs considered by most of the existing studies on MEC, such as [11, 12, 13, 14, 15], only take into account the dependency among tasks executed by an individual WD. In practice, tasks executed by different WDs are usually interrelated as well. For example, an IoT sensor often needs to combine the processed data from other sensors. The inter-user task dependency has a significant impact on the offloading and resource allocation decisions. For instance, a WD is likely to offload its task to the edge server even when the channel condition is poor, because another WD with time-critical applications urgently needs its computation output. Besides, the exchange of computation results for dependent tasks also consumes extra energy and time. In general, the case with inter-user dependency requires the joint optimization of the task executions of all correlated users, which is a challenging problem that has yet to be concretely studied.

In this paper, we consider a task call graph in a two-user MEC system as shown in Fig. 1, where the computation of an intermediate task at WD2 requires the output of the last task at WD1. To the authors’ best knowledge, this is the first work that exploits the task dependency across different users in an MEC system. The main contributions of this paper are as follows:

• With the inter-user task dependency in Fig. 1, we formulate a mixed integer optimization problem to minimize the weighted sum of the WDs’ energy consumption and task execution time. The task offloading decisions, local CPU frequencies and transmit power of each WD are jointly optimized. The problem is challenging due to the combinatorial nature of the offloading decisions among all tasks in such call graph and the strong coupling with resource allocation.

• Given the offloading decisions, we first derive closed-form solutions of the optimal local CPU frequencies and transmit power of each WD, respectively. We then establish an inequality condition of the completion time between the two dependent tasks, based on which an efficient bi-section search method is proposed to obtain the optimal resource allocation.

• We further show that the optimal offloading decisions follow a one-climb policy, where each WD offloads its data at most once to the edge server at the optimum. Based on the one-climb policy, we propose a reduced-complexity Gibbs sampling algorithm to obtain the optimal offloading decisions.

Simulation results show that our proposed algorithm can effectively reduce the energy consumption and computation delay compared with representative benchmarks. In particular, it significantly outperforms the scheme that neglects the task dependency and optimizes the two WDs' performance individually. Meanwhile, the proposed method has low computational complexity with respect to the size of the call graph.

The rest of the paper is organized as follows. In Section II, we describe the system model and formulate the problem. The optimal CPU frequencies and transmit power of each WD under fixed offloading decisions are derived in Section III. In Section IV, we first prove that the optimal offloading decisions follow a one-climb policy and, based on that, a reduced-complexity Gibbs sampling algorithm is proposed. In Section V, the performance of the proposed algorithms is evaluated via simulations. Finally, we conclude the paper in Section VI.

## II System Model and Problem Formulation

We consider an MEC system with two WDs and one access point (AP), all equipped with a single antenna. The AP is the gateway of the edge cloud and has a stable power supply. As shown in Fig. 1, WD1 and WD2 have M and N sequential tasks to execute, respectively. For simplicity of exposition, we introduce for each WD an auxiliary node 0 as the entry task, and auxiliary nodes M+1 and N+1 as the exit tasks for WD1 and WD2, respectively. In particular, we assume that the computations of the two WDs are related, such that the calculation of an intermediate task k of WD2, for 1 ≤ k ≤ N, requires the output of the last (M-th) task of WD1.

Each task i of WD j is characterized by a three-item tuple (L_{i,j}, I_{i,j}, O_{i,j}), where i ∈ {0, 1, …, M+1} when j = 1, and i ∈ {0, 1, …, N+1} when j = 2. Specifically, L_{i,j} denotes the computing workload in terms of the total number of CPU cycles required for accomplishing the task, and I_{i,j} and O_{i,j} denote the size of the computation input and output data in bits, respectively. As for the two auxiliary nodes of each WD, L_{i,j} = 0. For WD1, it holds that I_{i,1} = O_{i-1,1}, ∀i. As for WD2, we have

  I_{i,2} = \begin{cases} O_{i-1,2} + O_{M,1}, & i = k, \\ O_{i-1,2}, & \text{otherwise}. \end{cases}  (1)

Moreover, the input data size is zero for the entry node and the output data size is zero for the exit node of each WD.

We assume that the two series of tasks must be initiated and terminated at the respective WD. That is, the auxiliary entry and exit tasks must be executed locally, while the other actual tasks can be either executed locally or offloaded to the edge server. We denote the computation offloading decision of task i of WD j as a_{i,j} ∈ {0,1}, where a_{i,j} = 1 denotes edge execution and a_{i,j} = 0 denotes local computation.

We denote the system bandwidth by W, and the uplink and downlink channel gains associated with task i of WD j by h_{i,j} and g_{i,j}, respectively. Besides, we assume additive white Gaussian noise (AWGN) with zero mean and equal variance σ² at all receivers for each user.

In the following, we discuss the computation overhead in terms of execution time and energy consumption for local and edge computing, respectively.

### II-A Local Computing

We denote the CPU frequency of WD j for computing task i as f_{i,j}. Thus, the local computation execution time can be given by

  \tau^l_{i,j} = \frac{L_{i,j}}{f_{i,j}},  (2)

and the corresponding energy consumption is [2]

  e^l_{i,j} = \kappa L_{i,j} f_{i,j}^2 = \frac{\kappa (L_{i,j})^3}{(\tau^l_{i,j})^2},  (3)

where κ is the fixed effective switched capacitance parameter depending on the chip architecture.
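As a quick illustration of the local-computing model in (2) and (3), the following Python sketch computes the execution time and energy of one task; the function name and numbers are illustrative, not from the paper.

```python
# Sketch of the local-computing model in (2)-(3); names and values are
# illustrative, not from the paper's implementation.

def local_cost(L, f, kappa=1e-26):
    """Return (execution time, energy) of a task with workload L (CPU
    cycles) computed locally at CPU frequency f (cycles/s)."""
    tau = L / f                  # eq. (2): time = cycles / frequency
    energy = kappa * L * f ** 2  # eq. (3): energy grows quadratically in f
    return tau, energy

tau, e = local_cost(L=1e9, f=1e8)
# tau = 10.0 s; doubling f halves the time but quadruples the energy
```

This makes the time-energy trade-off behind the later CPU-frequency optimization visible: faster local computing is paid for quadratically in energy.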

### II-B Edge Computing

When task i of WD j is offloaded while task i−1 is computed locally, the achievable uplink data rate with transmit power p_{i,j} is

  R^u_{i,j} = W \log_2\Big(1 + \frac{p_{i,j} h_{i,j}}{\sigma^2}\Big).  (4)

The corresponding uplink transmission time for the input data O_{i-1,j} is

  \tau^u_{i,j} = \frac{O_{i-1,j}}{R^u_{i,j}}.  (5)

Define f(x) ≜ σ²(2^{x/W} − 1). It follows from (4) and (5) that

  p_{i,j} = \frac{1}{h_{i,j}} f\Big(\frac{O_{i-1,j}}{\tau^u_{i,j}}\Big).  (6)

Then, the transmission energy consumption is

  e^u_{i,j} = p_{i,j}\tau^u_{i,j} = \frac{\tau^u_{i,j}}{h_{i,j}} f\Big(\frac{O_{i-1,j}}{\tau^u_{i,j}}\Big).  (7)

Notice that (7) is convex in τ^u_{i,j}, since it is the perspective, with respect to τ^u_{i,j}, of a convex function [16].
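The relations (4)-(7) can be sanity-checked numerically: fixing the upload time recovers the required transmit power via the rate inversion in (6), and the achieved rate times the upload time returns the data size. The Python sketch below uses illustrative parameter values; `f_inv` plays the role of the function f(x), assumed here to be f(x) = σ²(2^{x/W} − 1).

```python
import math

def f_inv(x, W, sigma2):
    """f(x) = sigma2 * (2**(x / W) - 1): the power (per unit channel gain)
    needed to sustain a spectral load of x bits/s over bandwidth W."""
    return sigma2 * (2.0 ** (x / W) - 1.0)

def upload_cost(O, tau, W, h, sigma2):
    """Transmit power and energy to deliver O bits in time tau; eqs. (6)-(7)."""
    p = f_inv(O / tau, W, sigma2) / h
    return p, p * tau

# Illustrative numbers: 1 Mbit in 0.5 s over 1 MHz bandwidth.
O, tau, W, h, sigma2 = 1e6, 0.5, 1e6, 1e-6, 1e-10
p, e = upload_cost(O, tau, W, h, sigma2)
rate = W * math.log2(1.0 + p * h / sigma2)  # eq. (4): achieved uplink rate
# rate * tau recovers O (up to floating point)
```

Stretching τ^u reduces the required power superlinearly, which is exactly the convexity that the later resource allocation exploits.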

We assume that the edge server can compute the tasks of different WDs in parallel. The execution time of task i of WD j on the edge is given by τ^c_{i,j} = L_{i,j}/f_0, where f_0 is the constant CPU frequency of the edge server.

Furthermore, as for the downlink transmission, we denote the fixed transmit power of the AP by P_0. Thus, the downlink data rate for feeding the i-th task's input to WD j from the AP, when task i is computed locally, can be expressed as

  R^d_{i,j} = W \log_2\Big(1 + \frac{P_0 g_{i,j}}{\sigma^2}\Big).  (8)

Likewise, the time needed for the downlink transmission is given by τ^d_{i,j} = O_{i-1,j}/R^d_{i,j}.

As shown in Fig. 2, the task dependency model between the two WDs can be one of the following four cases, depending on the values of a_{M,1} and a_{k,2}.

• Case 1: When both the M-th task of WD1 and the k-th task of WD2 are executed locally, i.e., a_{M,1} = 0 and a_{k,2} = 0, the AP acts as a relay node. First, WD1 uploads the output of its M-th task to the AP. Then, the AP forwards this information to WD2. Specifically, the uplink transmission time and energy in this process are

  \tau^u_{M+1,1} = \frac{O_{M,1}}{R^u_{M+1,1}}  (9)

and

  e^u_{M+1,1} = p_{M+1,1}\,\tau^u_{M+1,1},  (10)

respectively, where R^u_{M+1,1} and p_{M+1,1} are the corresponding uplink data rate and uplink transmit power. As for the downlink transmission, the transmission time is denoted as

  \tau^d_{k',2} = \frac{O_{M,1}}{R^d_{k,2}}.  (11)
• Case 2: When the M-th task of WD1 is executed at the edge and the k-th task of WD2 is computed locally, i.e., a_{M,1} = 1 and a_{k,2} = 0, the output of the M-th task of WD1 is downloaded to WD2 after execution at the edge.

• Case 3: In this case, the M-th task of WD1 is executed locally and the k-th task of WD2 is offloaded to the edge, i.e., a_{M,1} = 0 and a_{k,2} = 1. WD1 needs to upload the result before the computation of the k-th task of WD2 at the edge.

• Case 4: In this case, both the M-th task of WD1 and the k-th task of WD2 are executed at the edge, i.e., a_{M,1} = 1 and a_{k,2} = 1. Therefore, neither uplink nor downlink transmission is needed.

### II-D Problem Formulation

From the above discussion, in order to obtain the total task execution time of WD1, we first denote the time spent on computations, both locally and at the edge server, by T^{comp}_1, which can be expressed as

  T^{comp}_1 = \sum_{i=1}^{M}\big[(1-a_{i,1})\tau^l_{i,1} + a_{i,1}\tau^c_{i,1}\big].  (12)

Similarly, the total transmission time of WD1, denoted by T^{tran}_1, can be expressed as

  T^{tran}_1 = \sum_{i=1}^{M+1}\big[a_{i,1}(1-a_{i-1,1})\tau^u_{i,1} + (1-a_{i,1})a_{i-1,1}\tau^d_{i,1}\big] = \sum_{i=1}^{M+1}\big[a_{i,1}\tau^u_{i,1} + a_{i-1,1}\tau^d_{i,1} - a_{i-1,1}a_{i,1}(\tau^u_{i,1}+\tau^d_{i,1})\big].  (13)

Note that there is no communication delay for the i-th task if a_{i-1,1} = a_{i,1}, i.e., the two consecutive tasks are computed at the same device. Otherwise, if a_{i-1,1} = 0 and a_{i,1} = 1, the communication delay is due to the uplink transmission time τ^u_{i,1}, whereas if a_{i-1,1} = 1 and a_{i,1} = 0, the communication delay is due to the downlink transmission time τ^d_{i,1}. Therefore, the total task execution time of WD1 is

  T_1 = T^{comp}_1 + T^{tran}_1.  (14)

Furthermore, we can calculate the total energy consumption of WD1 by

  E_1 = \sum_{i=1}^{M}\big[(1-a_{i,1})e^l_{i,1} + a_{i,1}(1-a_{i-1,1})e^u_{i,1}\big] + (1-a_{M,1})e^u_{M+1,1},  (15)

which consists of the total execution energy of the tasks and the energy consumed on offloading the final result when the M-th task is computed locally, i.e., when a_{M,1} = 0. Note that the energy cost for the uplink transmission occurs in (15) only if a_{i,1} = 1 and a_{i-1,1} = 0.
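The bookkeeping in (12)-(15) is mechanical; a short Python sketch may help. The helper below evaluates WD1's total time and energy from per-task timing and energy arrays, which are assumed precomputed from (2)-(8); all names and numbers are illustrative, not from the paper.

```python
# Sketch of eqs. (12)-(15) for WD1. a[0..M+1] is the offloading vector
# (a[0] = a[M+1] = 0 for the auxiliary entry/exit tasks); tau_l, tau_c,
# tau_u, tau_d, e_l, e_u are per-task arrays indexed as in the paper.

def wd1_time_energy(a, tau_l, tau_c, tau_u, tau_d, e_l, e_u, M):
    T_comp = sum((1 - a[i]) * tau_l[i] + a[i] * tau_c[i]
                 for i in range(1, M + 1))                       # eq. (12)
    T_tran = sum(a[i] * (1 - a[i - 1]) * tau_u[i]
                 + (1 - a[i]) * a[i - 1] * tau_d[i]
                 for i in range(1, M + 2))                       # eq. (13)
    E = sum((1 - a[i]) * e_l[i] + a[i] * (1 - a[i - 1]) * e_u[i]
            for i in range(1, M + 1)) \
        + (1 - a[M]) * e_u[M + 1]                                # eq. (15)
    return T_comp + T_tran, E                                    # eq. (14)

# Toy example (illustrative numbers): task 1 offloaded, task 2 local.
a = [0, 1, 0, 0]
tau_l = [0, 1.0, 2.0, 0.0]; tau_c = [0, 0.5, 1.0, 0.0]
tau_u = [0, 0.2, 0.3, 0.4]; tau_d = [0, 0.1, 0.15, 0.0]
e_l = [0, 1.0, 2.0, 0.0];   e_u = [0, 0.5, 0.6, 0.7]
T1, E1 = wd1_time_energy(a, tau_l, tau_c, tau_u, tau_d, e_l, e_u, M=2)
# T1 = 0.5 + 2.0 + 0.2 + 0.15 = 2.85;  E1 = 0.5 + 2.0 + 0.7 = 3.2
```

Note how the final term (1 − a[M]) e_u[M+1] charges WD1 for uploading its last output only when the M-th task is computed locally, matching Cases 1 and 3 of Fig. 2.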

Similarly, the total computation energy consumption of WD2 can be expressed as

  E_2 = \sum_{i=1}^{N}\big[(1-a_{i,2})e^l_{i,2} + a_{i,2}(1-a_{i-1,2})e^u_{i,2}\big].  (16)

As for the execution time of WD2, we first consider the waiting time until the output of the M-th task of WD1 reaches WD2, denoted by T^{wait}_1, as follows.

  T^{wait}_1 = \sum_{i=1}^{M}\big[(1-a_{i,1})\tau^l_{i,1} + a_{i,1}(\tau^c_{i,1}+\tau^u_{i,1}) + a_{i-1,1}\tau^d_{i,1} - a_{i-1,1}a_{i,1}(\tau^u_{i,1}+\tau^d_{i,1})\big] + (1-a_{M,1})\tau^u_{M+1,1} + (1-a_{k,2})\tau^d_{k',2}.  (17)

It consists of the total execution time of the tasks of WD1, and the transmission time of the output of the M-th task, as shown in the four cases of Fig. 2.

Meanwhile, the waiting time until the output of the (k−1)-th task of WD2 is ready for the computation of the k-th task, denoted by T^{wait}_2, is given by

  T^{wait}_2 = \sum_{i=1}^{k-1}\big[(1-a_{i,2})\tau^l_{i,2} + a_{i,2}(\tau^c_{i,2}+\tau^u_{i,2}) + a_{i-1,2}\tau^d_{i,2} - a_{i-1,2}a_{i,2}(\tau^u_{i,2}+\tau^d_{i,2})\big] + a_{k,2}\tau^u_{k,2} + a_{k-1,2}\tau^d_{k,2} - a_{k-1,2}a_{k,2}(\tau^u_{k,2}+\tau^d_{k,2}).  (18)

The overall waiting time before the k-th task of WD2 can be executed is then

  T^{wait} = \max\{T^{wait}_1, T^{wait}_2\}.  (19)

Accordingly, the total task execution time of WD2 equals T^{wait} plus the execution and transmission time of the tasks from the k-th one onward, i.e.,

  T_2 = T^{wait} + \sum_{i=k}^{N}\big[(1-a_{i,2})\tau^l_{i,2} + a_{i,2}\tau^c_{i,2}\big] + \sum_{i=k+1}^{N+1}\big[a_{i,2}\tau^u_{i,2} + a_{i-1,2}\tau^d_{i,2} - a_{i-1,2}a_{i,2}(\tau^u_{i,2}+\tau^d_{i,2})\big].  (20)

In this paper, we consider the energy-time cost (ETC) as the performance metric [2, 9], which is defined as the weighted sum of total energy consumption and execution time, i.e.,

  \eta_1 = \beta^E_1 E_1 + \beta^T_1 T_1,  (21)

where β^E_1 and β^T_1 denote the weights of energy consumption and computation completion time for WD1, respectively. Without loss of generality, it is assumed that the weights satisfy β^E_1 + β^T_1 = 1. Accordingly, the ETC of WD2 is

  \eta_2 = \beta^E_2 E_2 + \beta^T_2 T_2,  (22)

where β^E_2 and β^T_2 denote the two weighting parameters satisfying β^E_2 + β^T_2 = 1. It is worth noting that β^T_1 = 0 represents a special case which will be discussed in Section III, while β^T_2 = 0 leads to a trivial solution in which WD2 takes infinitely long to finish its task executions.

Denoting a = {a_{i,j}}, p = {p_{i,j}}, and f = {f_{i,j}}, we are interested in minimizing the total ETC of the two WDs by solving the following problem:

  (P1): \min_{(a,p,f)} \ \eta_1 + \eta_2
  s.t. \ 0 \le p_{i,j} \le P_{peak},  (23)
       0 \le f_{i,j} \le f_{peak},
       a_{i,j} \in \{0,1\}, \ \forall i,j,

where the first two constraints correspond to the peak transmit power and the peak CPU frequency. We assume common peak limits P_peak and f_peak for both WDs in this paper. Because of the one-to-one mappings between p_{i,j} and τ^u_{i,j} in (6) and between f_{i,j} and τ^l_{i,j} in (2), it is equivalent to optimize (P1) over the time allocation {τ^u_{i,j}, τ^l_{i,j}}. By introducing an auxiliary variable t for the waiting time in (19), (P1) can be equivalently expressed as

  (P2): \min_{(a,\{\tau^u_{i,j}\},\{\tau^l_{i,j}\},t)} \ \eta_1 + \eta_2
  s.t. \ t \ge T^{wait}_1, \ t \ge T^{wait}_2,  (24)
       \tau^u_{i,j} \ge \frac{O_{i-1,j}}{W\log_2(1+P_{peak}h_{i,j}/\sigma^2)},
       \tau^l_{i,j} \ge \frac{L_{i,j}}{f_{peak}},
       a_{i,j} \in \{0,1\}, \ \forall i,j.

Suppose that we have obtained the optimal solution of (P2). Then, we can easily retrieve the unique f_{i,j} and p_{i,j} in (P1) using (2) and (6), respectively. Notice that (P2) is non-convex in general due to the binary variables a_{i,j}. However, it can be seen that for any given a, the remaining optimization over ({τ^u_{i,j}}, {τ^l_{i,j}}, t) is a convex problem. In the following section, we assume that the offloading decision is given and study some interesting properties of the optimal CPU frequencies and transmit power of each WD, based on which an efficient method is proposed to obtain the optimal solutions.

### III-A Optimal Solution of (P2) Given a

Suppose that a is given. A partial Lagrangian of Problem (P2) is given by

  \mathcal{L}(\{\tau^u_{i,j}\},\{\tau^l_{i,j}\},t,\lambda,\mu) = \eta_1 + \eta_2 + \lambda(T^{wait}_1 - t) + \mu(T^{wait}_2 - t),  (25)

where λ ≥ 0 and μ ≥ 0 denote the dual variables associated with the corresponding constraints in (24).

Let λ* and μ* denote the optimal dual variables. We derive the closed-form expressions of the optimal CPU frequencies and transmit power of each WD as follows.

Proposition 3.1: Given the offloading decision a, the optimal CPU frequencies of the two WDs satisfy

  f^*_{i,1} = \min\Big\{\sqrt[3]{\frac{\beta^T_1+\lambda^*}{2\kappa\beta^E_1}},\ f_{peak}\Big\}, \quad \forall i \in \{1,\ldots,M\},  (26)

  f^*_{i,2} = \begin{cases} \min\Big\{\sqrt[3]{\frac{\mu^*}{2\kappa\beta^E_2}},\ f_{peak}\Big\}, & i \in \{1,\ldots,k-1\}, \\ \min\Big\{\sqrt[3]{\frac{\beta^T_2}{2\kappa\beta^E_2}},\ f_{peak}\Big\}, & i \in \{k,\ldots,N\}. \end{cases}  (27)
###### Proof.

Please refer to Appendix A. ∎

From Proposition 3.1, we have the following observations:

• The optimal local CPU frequencies are identical for all tasks of the same type, i.e., f*_{i,1} for the tasks of WD1, and f*_{i,2} for the tasks of WD2 on either side of task k, regardless of the wireless channel conditions and workloads.

• For each task of WD1, when β^T_1 or λ* increases (a larger λ* corresponds to a tighter task dependency constraint at the optimum), the optimal strategy is to speed up local computing. However, with the increase of β^E_1, WD1 prefers to save energy with a lower optimal f*_{i,1}.

• For the i-th task of WD2 with i < k, a larger μ* leads to a higher optimal f*_{i,2}. On the other hand, the optimal f*_{i,2} is not related to μ* for i ≥ k, as the corresponding executions are not constrained by the WDs' dependency.

Proposition 3.2: Given the optimal dual variable λ*, the optimal transmit power of WD1 is expressed in (28); likewise, given μ*, the optimal transmit power of WD2 is expressed in (29). The auxiliary quantities in (28) and (29) depend on the channel gains, the noise power, and the weights.

Here, W(·) denotes the Lambert W function, which is the inverse function of x e^x, i.e., W(x e^x) = x.

###### Proof.

Please refer to Appendix B. ∎

From Proposition 3.2, we obtain the following observations:

• The optimal transmit power is inversely proportional to the channel gain when the channel gain is above a threshold, and equals the peak power P_peak when the channel gain is below the threshold.

• As the peak transmit power P_peak increases, the threshold decreases. This means that for a larger P_peak, the WDs tend to transmit at the maximum power only when meeting worse channel conditions.
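The Lambert W function appearing in the closed-form power expressions (28) and (29) is easy to evaluate numerically. Below is a minimal Newton-iteration sketch for the principal branch on x ≥ 0 (a library routine such as `scipy.special.lambertw` would do equally well); it only illustrates the defining property W(x) e^{W(x)} = x and is not taken from the paper.

```python
import math

def lambert_w(x, tol=1e-12):
    """Principal branch of the Lambert W function for x >= 0, via Newton
    iteration on F(w) = w * e^w - x."""
    assert x >= 0.0
    w = math.log1p(x)  # reasonable starting point for x >= 0
    for _ in range(100):
        ew = math.exp(w)
        step = (w * ew - x) / (ew * (w + 1.0))  # Newton step: F / F'
        w -= step
        if abs(step) < tol:
            break
    return w

w = lambert_w(1.0)
# Defining property: w * e^w == 1, i.e. w is the omega constant ~ 0.567143
```

In practice, the threshold structure of Propositions 3.1-3.2 means only scalar evaluations like this are needed per task, keeping the resource-allocation step cheap.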

Based on Propositions 3.1 and 3.2, our earlier conference paper [1] applies an ellipsoid method [16] to search for the optimal dual variables (λ*, μ*). The ellipsoid method is guaranteed to converge because (P2) is a convex problem given a. In general, however, the ellipsoid method may take a long time to converge.

In this paper, we further study some interesting properties of an optimal solution in the following Lemmas 3.1 and 3.2, based on which a reduced-complexity one-dimensional bi-section search method is proposed in the following subsection.

Lemma 3.1: T^{wait}_1 ≤ T^{wait}_2 and t* = T^{wait}_2 hold at the optimum of (P2).

###### Proof.

We prove this lemma by contradiction. Suppose that there exists an optimal solution with T^{wait}_1 > T^{wait}_2. According to the KKT conditions λ*(T^{wait}_1 − t*) = 0 and μ*(T^{wait}_2 − t*) = 0, we have t* = T^{wait}_1 and μ* = 0. As λ* is finite, according to (26) and (28), the optimal f*_{i,1} and p*_{i,1} are finite, which means that τ^{l*}_{i,1} and τ^{u*}_{i,1} are finite for all i. Hence, T^{wait}_1 is finite. However, when μ* = 0, we have the optimal f*_{i,2} → 0 from (27) and p*_{i,2} → 0 from (29), for i ∈ {1,…,k−1}. Thus, we have T^{wait}_2 → ∞ > T^{wait}_1. This contradicts the assumption that T^{wait}_1 > T^{wait}_2, and thus completes the proof. ∎

The above lemma indicates that the k-th task's waiting time for the input data from WD1 is not larger than that for the other input from WD2. In other words, WD2 always receives the task output from WD1 first and then waits until its own local tasks finish before computing the k-th task. In addition to the results in Lemma 3.1, the following Lemma 3.2 shows two special cases where T^{wait}_1 = T^{wait}_2 is satisfied.

Lemma 3.2: T^{wait}_1 = T^{wait}_2 holds at the optimum of (P2) if one of the following two sufficient conditions is satisfied:

1. β^T_1 = 0;

2. a_{M,1} = 0, i.e., the M-th task of WD1 is computed locally.

###### Proof.

The proof is similar to that of Lemma 3.1 and is omitted here. ∎

Specifically, in the first case, the role of WD1 is solely to provide the needed data to WD2, and minimizing its own execution time is not an objective. Nonetheless, the execution time of WD1 still affects that of WD2, which is to be minimized. In the second case, the M-th task of WD1 chooses to perform local computing, i.e., a_{M,1} = 0.

### III-B A Low-complexity Bi-section Search Method

According to Lemma 3.1, we have t* = T^{wait}_2. Therefore, Problem (P2) is simplified as

  (P3): \min_{(a,p,f)} \ \eta_1 + \eta_2
  s.t. \ T^{wait}_1 \le T^{wait}_2,
       0 \le p_{i,j} \le P_{peak},
       0 \le f_{i,j} \le f_{peak},
       a_{i,j} \in \{0,1\}, \ \forall i,j.

Similarly, the Lagrangian of Problem (P3) is

  \mathcal{L}'(p,f,\nu) = \eta_1 + \eta_2 + \nu(T^{wait}_1 - T^{wait}_2),  (30)

where ν ≥ 0 denotes the dual variable associated with the constraint T^{wait}_1 ≤ T^{wait}_2.

By applying the KKT conditions to (P3), we can obtain the optimal solutions of p and f; the details are omitted here. By combining them with the optimal solutions in Propositions 3.1 and 3.2, we have the following proposition.

Proposition 3.3: The optimal dual variables (λ*, μ*) in (P2) and ν* in (P3) are related by

  \lambda^* = \nu^*, \qquad \mu^* = \beta^T_2 - \nu^*,  (31)

where 0 ≤ ν* ≤ β^T_2. In other words, we have

  \lambda^* + \mu^* = \beta^T_2.  (32)

Note that (P3) is convex given the offloading decision a. Thus, the KKT conditions are sufficient for optimality. By defining g(ν) ≜ T^{wait}_1 − T^{wait}_2, evaluated at the optimal primal variables for a given ν, we can efficiently obtain the optimal ν* based on the following proposition.

Proposition 3.4: g(ν) is a monotonically decreasing function of ν. Besides, a unique ν* that satisfies g(ν*) = 0 exists when g(0) > 0.

###### Proof.

It can be proved that the optimal τ^{l*}_{i,2} and τ^{u*}_{i,2}, i ∈ {1,…,k−1}, are monotonically increasing functions of ν, while the optimal τ^{l*}_{i,1} and τ^{u*}_{i,1}, i ∈ {1,…,M+1}, are monotonically decreasing functions of ν. Therefore, all terms in g(ν) decrease with ν, and thus g(ν) is a monotonically decreasing function of ν. Meanwhile, when ν → β^T_2, it holds that μ* → 0 and τ^{l*}_{i,2}, τ^{u*}_{i,2} → ∞, i ∈ {1,…,k−1}, which leads to g(ν) < 0. Together with the result that g(ν) is a monotonically decreasing function, there must exist a unique ν* that satisfies g(ν*) = 0 when g(0) > 0. ∎

With Proposition 3.4, when g(0) > 0, the optimal ν* can be efficiently obtained via a bi-section search over [0, β^T_2] for the ν* that satisfies g(ν*) = 0. If g(0) ≤ 0, we have ν* = 0 according to the KKT condition ν*(T^{wait}_1 − T^{wait}_2) = 0. Once ν* is obtained, the optimal solution can be directly calculated using (31), (26), (27), (28) and (29). Due to the convexity, the primal and dual optimal values are the same for (P3) given a.

The pseudo-code of the bi-section search method is illustrated in Algorithm 1. Given a precision parameter ε, Algorithm 1 takes O(log_2(1/ε)) iterations to converge. In each iteration, the computational complexity is proportional to the total number of tasks at the WDs, i.e., O(M+N). Therefore, the overall complexity of Algorithm 1 is O((M+N) log_2(1/ε)).
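The skeleton of Algorithm 1 can be sketched as a plain bi-section over the dual variable ν. Since the inner resource-allocation step is not reproduced here, g(ν), standing for T^{wait}_1 − T^{wait}_2 under the optimal primal variables for a given ν, is replaced by a toy monotonically decreasing function; all names are illustrative.

```python
def bisect_nu(g, nu_max, eps=1e-9):
    """Return nu* in [0, nu_max] with g(nu*) ~ 0, assuming g is
    monotonically decreasing. If g(0) <= 0 the dependency constraint is
    slack and nu* = 0 by complementary slackness."""
    if g(0.0) <= 0.0:
        return 0.0
    lo, hi = 0.0, nu_max
    while hi - lo > eps:
        mid = 0.5 * (lo + hi)
        if g(mid) > 0.0:
            lo = mid  # WD1's waiting time still dominates: push nu higher
        else:
            hi = mid
    return 0.5 * (lo + hi)

nu_star = bisect_nu(lambda nu: 0.3 - nu, nu_max=1.0)
# nu_star ~ 0.3 for this toy g
```

The loop halves the interval each pass, which is the source of the O(log_2(1/ε)) iteration count stated above.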

In Section III, we showed how to efficiently obtain the optimal resource allocation of (P1) once the offloading decision a is given. Intuitively, one can enumerate all feasible a and choose the one that yields the minimum objective in (P2). However, this brute-force search quickly becomes computationally prohibitive as the number of tasks increases, since the number of candidate offloading decisions grows exponentially with M + N. In this section, we propose an efficient approximate algorithm to reduce the complexity.

### IV-A One-climb Policy

Here, we first show in the following Theorem 1 that the optimal offloading decision has a one-climb structure.

Theorem 1 (one-climb policy): Assuming that the edge server computes faster than the WDs, the execution for each WD migrates at most once from the WD to the edge server at the optimum.

###### Proof.

In the following, we prove the one-climb policy by contradiction. Suppose that the optimal offloading decision allows a WD to offload its data two times, as shown in Fig. 3(a). Under the two-time offloading scheme, tasks from m_j to q_j−1 are migrated to the edge server for execution. Then, tasks from q_j to s_j execute at WD j, followed by tasks from s_j+1 to n_j being migrated to the edge server, where j is the index of the WD. As for the one-climb scheme in Fig. 3(b), tasks m_j to n_j of WD j are all executed on the edge server.

We denote the optimal offloading decisions, local CPU frequencies and transmit powers of WD j under the two-time and one-climb offloading schemes by (â_j, f̂_j, p̂_j) and (ã_j, f̃_j, p̃_j), respectively. By the optimality assumption, we have

  \eta_1(\hat{a}_1,\hat{f}_1,\hat{p}_1) + \eta_2(\hat{a}_2,\hat{f}_2,\hat{p}_2) < \eta_1(\tilde{a}_1,\tilde{f}_1,\tilde{p}_1) + \eta_2(\tilde{a}_2,\tilde{f}_2,\tilde{p}_2).  (33)

For the two-time offloading policy of WD1, the total execution time from the m_1-th task to the n_1-th task can be expressed as

  \hat{T}^{\,m_1\sim n_1}_1 = \sum_{i=m_1}^{q_1-1}\tau^c_{i,1} + \tau^d_{q_1,1} + \sum_{i=q_1}^{s_1}\tau^l_{i,1} + \tau^u_{s_1+1,1} + \sum_{i=s_1+1}^{n_1}\tau^c_{i,1}.  (34)

As for the one-climb policy in WD1, we have

  \tilde{T}^{\,m_1\sim n_1}_1 = \sum_{i=m_1}^{n_1}\tau^c_{i,1}.  (35)

Since the computing speed of the edge server is higher than that of the WDs, the following inequalities hold for the q_1-th and s_1-th tasks:

  \tau^c_{q_1,1} < \tau^l_{q_1,1} < \tau^l_{q_1,1} + \tau^d_{q_1,1},  (36)
  \tau^c_{s_1,1} < \tau^l_{s_1,1} < \tau^l_{s_1,1} + \tau^u_{s_1+1,1}.  (37)

In addition, we have τ^c_{i,1} < τ^l_{i,1} for the tasks of WD1 between q_1 and s_1. Therefore, it can be shown that \tilde{T}^{\,m_1\sim n_1}_1 < \hat{T}^{\,m_1\sim n_1}_1.
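A practical consequence of the one-climb policy is a drastic reduction of the offloading search space: for a WD with M actual tasks, any candidate decision vector contains at most a single contiguous block of 1s (edge execution), so only 1 + M(M+1)/2 candidates remain instead of 2^M. The following minimal enumeration sketch is illustrative and is not the paper's Gibbs sampling algorithm.

```python
def one_climb_decisions(M):
    """Enumerate all one-climb offloading vectors a[0..M-1] (1 = edge):
    the offloaded tasks, if any, form one contiguous block."""
    yield [0] * M  # all-local decision
    for start in range(M):
        for end in range(start, M):
            a = [0] * M
            for i in range(start, end + 1):
                a[i] = 1
            yield a

decisions = list(one_climb_decisions(6))
# 1 + 6*7//2 = 22 candidates, versus 2**6 = 64 unrestricted vectors
```

Restricting the Gibbs sampler's moves to this quadratically sized candidate set is what yields the reduced complexity claimed in the contributions.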