# A Novel Cross Entropy Approach for Offloading Learning in Mobile Edge Computing

In this paper, we propose a novel offloading learning approach to balance energy consumption and latency in a multi-tier network with mobile edge computing. To solve this integer programming problem, instead of using conventional optimization tools, we apply a cross entropy approach that iteratively learns the probability distribution of elite solution samples. Compared to existing methods, the proposed approach permits a parallel computing architecture and is verified to be computationally very efficient. Specifically, it achieves near-optimal performance and is robust to different choices of the hyperparameter values in the proposed learning approach.


## I Introduction

With the rapid development of electronics and wireless networks, various services are currently supported by modern mobile devices (MDs). However, most real-time applications require huge computation efforts from the MDs. Mobile edge computing (MEC) is a very promising technology to resolve this dilemma in next-generation wireless networks. In MEC, edge servers, which connect mobile terminals with a cloud server, provide high storage capacity as well as fast computation.

In an MEC architecture, both latency and energy consumption contribute to the network performance, and it is of common interest to investigate the problem of balancing these factors with optimized offloading policies. Scanning the open literature, the authors in [1] proposed to offload a task from a single MD to multiple computational access points (CAPs), and a weighted sum of energy and latency was optimized by using convex optimization. This problem was recently extended in [2] to a scenario where multiple MDs perform offloading; the multiple tasks were scheduled based on a queuing state in order to adapt to channel variations. Alternatively, in [3], the latency was minimized by scheduling the MEC offloading, while the energy consumption was treated as an individual constraint at the MDs.

Recently, machine learning (ML) has attracted much attention from both academia and industry as an efficient tool to solve traditional problems in wireless communication [4]-[7]. Specifically, the authors in [4] proposed a payoff game framework to maximize the network performance through reinforcement learning. Furthermore, a deep Q-network was utilized in [5] to optimize offloading without complete knowledge of the network. Most of these methods, including those using deep neural networks (DNNs), focused on the offloading design from a perspective of long-term optimization, at the cost of complexity and robustness [6], [7]. Moreover, these methods can hardly track fast channel changes, due to the requirement of offline learning. Thus, in general, they cannot be applied to real-time applications in time-varying channels, and it remains a problem of common interest to optimize offloading policies with a time-efficient method that simultaneously ensures high-quality performance.

In this work, we introduce the cross entropy (CE) approach to solve the offloading association problem by generating multiple samples and learning the probability distribution of elite samples. In contrast to conventional algorithms, the proposed CE learning approach can exploit a parallel computing architecture to reduce computational complexity, and it supports short-term offloading with stringent real-time requirements through an online learning architecture. Our work generalizes the CE learning approach to solve the offloading problem with low complexity. The proposed approach is promising since it can effectively replace traditional convex optimization tools.

## Ii System Model

Consider an MEC network in which each of $N$ tasks at the MD can be computed locally ($m=0$) or offloaded to one of $M$ CAPs. The offloading decision is indicated by

 x_{nm} = \begin{cases} 1, & \text{if task } n \text{ offloads to CAP } m, \\ 0, & \text{otherwise,} \end{cases} \qquad (1)

and the matrix $X$ ensembles all the indices $x_{nm}$. Assume that each task can be offloaded to only a single CPU. In this case, it holds that

 \sum_{m=0}^{M} x_{nm} = 1. \qquad (2)
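As an illustration, the feasibility constraint (2) can be checked as follows; the sizes and the helper `is_feasible` are our own toy assumptions, not from the paper:

```python
import numpy as np

# Hypothetical sizes: N = 4 tasks, M = 3 CAPs; column 0 is the local CPU.
N, M = 4, 3

def is_feasible(X: np.ndarray) -> bool:
    """Check constraint (2): each task is served by exactly one CPU."""
    return bool(np.all((X == 0) | (X == 1)) and np.all(X.sum(axis=1) == 1))

# Example policy: tasks 0 and 1 run locally, tasks 2 and 3 offload to CAP 1.
X = np.zeros((N, M + 1), dtype=int)
X[0, 0] = X[1, 0] = 1
X[2, 1] = X[3, 1] = 1
assert is_feasible(X)
```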

### Ii-a Latency

The execution latency consists of two components: transmission latency and computation time. The transmission time includes task data preparation at the MD, the data transmission duration over the air, and received data processing at the CAP before computation. The transmission time depends on the achievable rate of the physical links. The uplink and downlink data rates can be defined as

 R^{y_1}_m = \log_2\left(1 + \frac{P^{y_2} h_m}{\eta}\right), \quad m = 1, \cdots, M, \qquad (3)

where $y_1 \in \{\mathrm{UL}, \mathrm{DL}\}$, $y_2 \in \{t, r\}$, $P^{t}$ ($P^{r}$) is the transmitting (receiving) power, $h_m$ is the channel gain between CAP $m$ and the MD, and $\eta$ is the noise power. The local rates $R^{\mathrm{UL}}_0$ and $R^{\mathrm{DL}}_0$ are set infinitely large, because computing at the local CPU leaves out the process of offloading. Let $\alpha_n$ denote the input data size in bits, $\gamma_n$ the computation data size (number of cycles required at the CPU), and $\beta_n$ the output data size after computation. Then, for the offloaded task $n$, the computation time, the uplink transmission time, and the downlink transmission time are

 t^{\mathrm{Comp}}_{nm} = \frac{\gamma_n}{r_m}, \quad t^{\mathrm{UL}}_{nm} = \frac{\alpha_n}{R^{\mathrm{UL}}_m}, \quad t^{\mathrm{DL}}_{nm} = \frac{\beta_n}{R^{\mathrm{DL}}_m}, \qquad (4)

where CAP $m$ serves the tasks with a fixed rate of $r_m$ cycles/sec.

In fact, the CAP can start computing after either one or all scheduled tasks are offloaded. Here, we consider the case where computation starts after each task's offloading completes, so there is no intra-CAP overlap when evaluating the overall latency. This latency model is simple in expression, but the proposed algorithm remains effective for other, more general expressions. The three steps, offloading, computing, and transmitting back, take place sequentially, which results in the overall latency at CAP $m$ as follows:

 T_m(X) = \sum_{n \in \mathcal{N}} x_{nm} \left( \frac{\alpha_n}{R^{\mathrm{UL}}_m} + \frac{\beta_n}{R^{\mathrm{DL}}_m} + \frac{\gamma_n}{r_m} \right). \qquad (5)

Note that since all CAPs evaluate their tasks in parallel, the delay is the maximum one, given as

 T(X) = \max_{m \in \mathcal{M}} T_m(X). \qquad (6)
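The latency model in (4)–(6) can be sketched numerically as follows; all parameter values are illustrative assumptions rather than the paper's simulation settings, and the maximum is taken over all CPU columns, including the local one, for simplicity:

```python
import numpy as np

# Toy parameters (assumed): 3 tasks, local CPU (m=0) plus 2 CAPs.
alpha = np.array([8e6, 4e6, 6e6])        # input data sizes in bits
beta  = np.array([1e6, 2e6, 1e6])        # output data sizes in bits
gamma = np.array([1e9, 2e9, 1.5e9])      # required CPU cycles
r     = np.array([1e9, 4e9, 6e9])        # CPU rates in cycles/sec
R_ul  = np.array([np.inf, 72e6, 72e6])   # uplink rates; infinite for m=0
R_dl  = np.array([np.inf, 72e6, 72e6])   # downlink rates; infinite for m=0

def latency(X):
    """Overall delay (6): maximum over the per-CPU latencies (5)."""
    per_task = X * (alpha[:, None] / R_ul[None, :]
                    + beta[:, None] / R_dl[None, :]
                    + gamma[:, None] / r[None, :])
    return per_task.sum(axis=0).max()

X = np.eye(3, dtype=int)  # toy policy: task n is served by CPU n
print(latency(X))
```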

### Ii-B Energy Consumption

An MD consumes battery power to compute tasks locally or to transmit and receive task data. The energy consumption in the two cases can be written as

 E_1(X) = P_0 \sum_{n \in \mathcal{N}} x_{n0} t^{\mathrm{Comp}}_{n0}, \qquad (7)

 E_2(X) = \sum_{m=1}^{M} \sum_{n \in \mathcal{N}} x_{nm} \left( P^{t} t^{\mathrm{UL}}_{nm} + P^{r} t^{\mathrm{DL}}_{nm} \right), \qquad (8)

where $P_0$ denotes the power for local computation. Then, the total energy consumption is

 E(X) = E_1(X) + E_2(X). \qquad (9)
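A minimal sketch of the energy model (7)–(9); the power and time values are our own illustrative assumptions:

```python
import numpy as np

# Toy parameters (assumed): powers in watts, per-task times in seconds.
P0, Pt, Pr = 0.8, 0.1, 0.05               # local-compute, transmit, receive power
t_comp0 = np.array([1.0, 2.0, 1.5])       # local computation times gamma_n / r_0
t_ul    = np.array([0.11, 0.06, 0.08])    # uplink times alpha_n / R^UL
t_dl    = np.array([0.014, 0.028, 0.014]) # downlink times beta_n / R^DL

def energy(x_local, x_offload):
    """Total MD energy (9): local computing (7) plus transmit/receive (8)."""
    E1 = P0 * np.sum(x_local * t_comp0)
    E2 = np.sum(x_offload * (Pt * t_ul + Pr * t_dl))
    return E1 + E2

x_local = np.array([1, 0, 0])   # task 0 computed locally
x_off   = 1 - x_local           # tasks 1 and 2 offloaded
print(energy(x_local, x_off))
```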

### Ii-C Optimization Problem

Low computational latency and low energy consumption are the two main objectives of MEC. Unfortunately, these objectives cannot be minimized simultaneously, and the problem turns out to be a multi-objective optimization. We define the weights $\lambda_t$ and $\lambda_e$ to compromise between the two objectives. Then, the weighted objective can be defined as [1]

 \Psi(X) = \lambda_t T(X) + \lambda_e E(X), \qquad (10)

where $T(X)$ is defined in (6) as the maximum delay over all the CAPs, instead of the sum or average one.

We aim to solve the computation resource allocation problem in the specific situation where $\lambda_t$ and $\lambda_e$ are fixed. The joint minimization of both energy and latency can be formulated as

 \min_{X} \; \Psi(X) \quad \text{s.t.} \; \sum_{m=0}^{M} x_{nm} = 1, \; x_{nm} \in \{0, 1\}. \qquad (11)

The problem in (11) is a binary integer program, which can be optimally solved via the branch-and-bound (BnB) algorithm [8] at exponentially large computational complexity, especially when $X$ is large. In future wireless networks, the number of tasks will increase and more CAPs will be involved; then, the BnB algorithm can hardly satisfy the requirements of real-time applications. Besides, there are studies that try to solve the problem by using conventional optimization methods. The most popular solution is to use convex relaxation, e.g., to relax $x_{nm} \in \{0, 1\}$ as $x_{nm} \in [0, 1]$ through linear programming relaxation (LPr), or to relax

 T(X) = \max_{m \in \mathcal{M}} T_m(X) \quad \text{as} \quad T(X) \geq T_m(X), \; \forall m \in \mathcal{M},

by semidefinite relaxation (SDR) [1]. The relaxation, however, causes performance degradation compared to the BnB algorithm.

Besides the above methods, the problem in (11), with its discrete optimization variables, can be solved by a probabilistic model based method that learns the probability of each policy $X$. The CE approach is such a probability learning technique in the ML area [9], [10]. To solve (11), we propose a CE approach with adaptive sampling, namely the adaptive sampling cross entropy (ASCE) algorithm.

## Iii The Proposed ASCE Approach

### Iii-a The CE Concept

Cross entropy, closely related in probability theory to the Kullback-Leibler (K-L) divergence, serves as a metric of the distance between two probability distributions. For two distributions $q$ and $p$, the K-L divergence is defined as

 D(q \| p) = \underbrace{\sum q(x) \ln q(x)}_{-H(q)} \; \underbrace{- \sum q(x) \ln p(x)}_{H(q,p)}. \qquad (12)

Note that in our proposed CE-based learning method, $p$ represents a theoretically tractable distribution model that we try to learn for obtaining the optimal solutions, while $q$ is the empirical distribution which characterizes the true distribution of the optimal solutions. Particularly, in machine learning, the distribution $q$ is known from observations and $H(q)$, the entropy of $q$, is a constant, which makes minimizing the K-L divergence in (12) equivalent to minimizing the cross entropy $H(q,p)$.

We are inspired by the definition of CE, a popular cost function in machine learning, to solve problem (11) via probability learning. We learn $p$ by iteratively training on samples, and then generate the optimal policy $X$ according to $p$, which is close to the empirical distribution $q$.

### Iii-B The ASCE-based Offload Learning

For probability learning, the probability distribution function $p$ is usually parameterized with an indicator $u$; e.g., $p$ can be a Gaussian distribution whose indicator $u$ contains its mean and variance [11]. Denoting by $L = N(M+1)$ the number of entries of $X$, the indicator $u$ is a vector of $L$ dimensions, defined as $u = [u_1, \cdots, u_L]^T$, where $u_l \in [0, 1]$ represents the probability of $x_l = 1$. With this method, we learn $p$ by learning its parameter $u$. Accordingly, $X$ is vectorized as $x = [x_1, \cdots, x_L]^T$, where $x_l \in \{0, 1\}$. Following the Bernoulli distribution, we have the distribution function of $x$ as [13]

 p(x, u) = \prod_{l=1}^{L} u_l^{x_l} (1 - u_l)^{(1 - x_l)}. \qquad (13)
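Sampling a candidate vector from the Bernoulli model (13) can be sketched as follows (toy dimension assumed):

```python
import numpy as np

rng = np.random.default_rng(0)

def draw_sample(u, rng):
    """Draw x ~ p(x, u) in (13): independent Bernoulli entries x_l ~ B(u_l)."""
    return (rng.random(u.shape) < u).astype(int)

u = np.full(8, 0.5)     # L = 8, uninformative initial indicator
x = draw_sample(u, rng)
print(x)
```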

According to (2), one task associates with at most one CAP. Thus, if a task is assigned to one CAP, its probability of being associated with any other CAP becomes zero. Aiming to reduce the redundancy of generated samples, we divide one sample, i.e., a vector of $L$ dimensions, into $M+1$ independent blocks $x^{(0)}, x^{(1)}, \cdots, x^{(M)}$, each of which associates with one CPU; e.g., the block $x^{(m)}$ indicates the assignment of the $N$ tasks to CAP $m$. Let $\mathcal{I}$ denote the set of indices of the blocks already selected during sampling, and let $\mathcal{S}$ be the set that stores the samples satisfying the constraints in each iteration. We first uniformly choose an index $m \notin \mathcal{I}$. To generate the block $x^{(m)}$ given $\mathcal{I}$, we draw its entries according to the probability density function $p$: each entry $x_l$ of $x^{(m)}$ is drawn according to the Bernoulli distribution with parameter $u_l$. The indicators of the remaining blocks are then adjusted based on (2); that is, if $x_{nm} = 1$, we set $u_{nm'} = 0$ for all $m' \neq m$. When the cardinality of $\mathcal{I}$ equals $M+1$, one valid sample is generated. Note that we draw the sample block by block, while non-feasible samples are excluded along the way. All valid samples gather in $\mathcal{S}$, and the sampling repeats until the cardinality of $\mathcal{S}$ reaches $S$.
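The block-wise sampling above can be sketched as follows. This is a simplified illustration: it enforces constraint (2) by masking already-assigned tasks, and it greedily assigns any leftover task to its highest-indicator CPU instead of resampling, which departs slightly from the exclusion step described in the text:

```python
import numpy as np

def draw_feasible_sample(U, rng):
    """Draw one feasible assignment X block by block (one block per CPU),
    zeroing the effective indicator of already-assigned tasks so (2) holds."""
    N, M1 = U.shape                     # N tasks, M+1 CPUs
    X = np.zeros((N, M1), dtype=int)
    for m in rng.permutation(M1):       # visit the CPU blocks in random order
        free = X.sum(axis=1) == 0       # tasks not yet assigned anywhere
        X[free, m] = (rng.random(free.sum()) < U[free, m]).astype(int)
    # Simplification: assign each leftover task to its highest-indicator CPU.
    for n in np.flatnonzero(X.sum(axis=1) == 0):
        X[n, np.argmax(U[n])] = 1
    return X

rng = np.random.default_rng(1)
U = np.full((4, 3), 0.5)                # toy indicator: 4 tasks, 3 CPUs
X = draw_feasible_sample(U, rng)
print(X)
```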

In the proposed CE approach, the computations in each iteration can be conducted in parallel, while the iterations are implemented sequentially. As will be shown in the simulation results, we can adjust the hyperparameters of the proposed algorithm, including the number of samples $S$, to compromise between the amount of parallel computation per iteration and the number of iterations until convergence. This offers a flexible tradeoff between performance and latency.

Now we take the CE in (12) as the loss function. The smaller $H(q,p)$ is, the smaller the distance between $p$ and $q$ is. This implies

 \min H(q,p) = \max \sum q(x) \ln p(x) = \max \frac{1}{S} \sum \ln p(x, u), \qquad (14)

where $q(x)$ is $1/S$, since the probability of each independent solution in the set of samples is $1/S$, with $S$ the cardinality of the set [9]. Regarding the problem in (14), the objective is equivalent to finding the optimal indicator $u$ minimizing $H(q,p)$. During the $t$-th iteration, a series of random samples serving as candidates are drawn according to the probability $p(x, u^{(t)})$. The feasible samples generated by the adaptive sampling are then evaluated: we compute the objective of (11) for each sample and sort them as

 \Psi(x^{[1]}) \leq \Psi(x^{[2]}) \leq \cdots \leq \Psi(x^{[S]}).

Then, the $S_{\mathrm{elite}}$ samples $x^{[1]}, \cdots, x^{[S_{\mathrm{elite}}]}$ yielding the minimum objectives are selected as elites. Now, the best indicator for the policy can be determined as

 u^* = \arg\max_{u} \frac{1}{S} \sum_{s=1}^{S_{\mathrm{elite}}} \ln p(x^{[s]}, u). \qquad (15)

Using (13) and (15), and by forcing the derivative with respect to $u_l$ to zero, the saddle point can be evaluated as

 u^*_l = \frac{1}{S_{\mathrm{elite}}} \sum_{s=1}^{S_{\mathrm{elite}}} x^{[s]}_l. \qquad (16)
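The elite selection and the saddle point (16) amount to averaging the elite samples; a minimal sketch with toy objective values of our own choosing:

```python
import numpy as np

def update_indicator(samples, objectives, n_elite):
    """Saddle point (16): u*_l is the empirical mean of x_l over the elites."""
    order = np.argsort(objectives)      # sort as Psi(x[1]) <= ... <= Psi(x[S])
    elites = samples[order[:n_elite]]   # the S_elite samples with minimum objective
    return elites.mean(axis=0)

samples = np.array([[1, 0, 1],
                    [1, 1, 1],
                    [0, 0, 1],
                    [1, 0, 0]])
objectives = np.array([0.2, 0.9, 0.5, 0.4])   # toy Psi values
u_star = update_indicator(samples, objectives, n_elite=2)
print(u_star)
```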

In the proposed learning algorithm, we choose the CE-based metric for updating the probability. Considering the randomness of sampling, especially when the number of samples is small, we update the indicator in the $t$-th iteration not only on the basis of $u^*$, obtained from (15) and (16), but also on the indicator $u^{(t)}$ learned in the last iteration. It follows that

 u^{(t+1)} = \alpha u^* + (1 - \alpha) u^{(t)}, \qquad (17)

where $\alpha$ is the learning rate [10]. In general, for the CE-based method, the iterations converge to an optimized solution of the problem [14].

The proposed algorithm is summarized in Algorithm 1. The CE approach, combined with the indicator updating mechanism, can replace conventional convex optimization methods to compromise between complexity and performance.
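Under the simplifying assumption of unconstrained binary variables (i.e., omitting the adaptive sampling that enforces (2)), the main loop of Algorithm 1 can be sketched as follows, with a toy separable objective standing in for (10):

```python
import numpy as np

def asce(objective, L, S=100, n_elite=20, alpha=0.6, iters=40, seed=0):
    """Sketch of the CE loop: sample, select elites, smooth the indicator via (17)."""
    rng = np.random.default_rng(seed)
    u = np.full(L, 0.5)                              # uninformative initialization
    for _ in range(iters):
        X = (rng.random((S, L)) < u).astype(int)     # S Bernoulli samples from (13)
        psi = np.array([objective(x) for x in X])    # evaluate each candidate
        elites = X[np.argsort(psi)[:n_elite]]        # elites with minimum objective
        u = alpha * elites.mean(axis=0) + (1 - alpha) * u   # smoothed update (17)
    return (u > 0.5).astype(int), u

# Toy objective (Hamming distance) whose unique minimizer is the target vector.
target = np.array([1, 0, 1, 0, 1, 0])
x_best, u = asce(lambda x: np.abs(x - target).sum(), L=6)
print(x_best)
```

Note that the per-iteration work (sampling and the $S$ objective evaluations) is embarrassingly parallel, which is the source of the parallel-architecture advantage discussed above.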

## Iv Simulations and Discussion

This section validates the efficiency of the proposed approach through simulations, using the same parameters as in [1]: the MD is equipped with a local CPU, and the CPU frequencies of the three CAPs, the transmitting and receiving powers, and the uplink and downlink data rates follow the settings therein. The average objective shown in the figures is the value of the objective in (10) averaged over a number of trials.

Fig. 1 shows the convergence of the proposed ASCE algorithm under various choices of the hyperparameters. From Fig. 1, it is evident that the algorithm converges fast, and the average objective reduces with the number of samples $S$, approaching the optimum. Moreover, the average objective converges to almost the same optimal value for all the considered hyperparameter choices. We therefore conclude that the proposed ASCE algorithm is robust to the values of its parameters.

In Fig. 2, we compare the proposed ASCE algorithm with the LPr-based offloading algorithm in [1], BnB [8], No MEC, and Full MEC, where No MEC and Full MEC denote that all the tasks are assigned to the local CPU and to CAP 1, respectively. The proposed ASCE algorithm greatly outperforms the LPr method and approaches the theoretically global optimum obtained by BnB. Contrasting "Full MEC" with "No MEC", the latter is far inferior, which implies that MDs with multiple tasks can work efficiently with the assistance of MEC. From [12], the complexity of the CE approach is far lower than that of the BnB algorithm, since the CE method with its parallel architecture optimizes the parameters within one iteration, while BnB solves for the parameters sequentially. Besides, the BnB algorithm requires much more memory for storage.

The offloading policy reflects, to a certain extent, a tradeoff between latency and energy consumption. The weights are swept over a range of values with a fixed step size. As $\lambda_t$ grows, the latency plays an increasing role in the objective function, and the corresponding curve presents an increasing trend. For the case where only one CAP serves the MD, the minimized latency is much higher than in the cases with multiple CAPs. Because the minimized energy reduces to the energy consumption of all the tasks being computed locally, which is the same for all values of $\lambda_t$, the energy curve finally decreases to this minimized value.

## V Conclusion

In this paper, we presented an efficient computational offloading approach for a multi-tier Het-MEC network. We proposed the ASCE algorithm, which occupies less memory and has lower computational complexity than traditional algorithms. The proposed algorithm performs robustly while closely approaching the optimal performance.

## References

• [1] T. Q. Dinh et al., “Offloading in mobile edge computing: Task allocation and computational frequency scaling,” IEEE Trans. Commun., vol. 65, no. 8, pp. 3571–3584, Aug. 2017.
• [2] X. Lyn et al., “Energy-efficient admission of delay-sensitive tasks for mobile edge computing,” IEEE Trans. Commun., vol. 66, no. 6, pp. 2603–2616, Jun. 2018.
• [3] J. Liu et al., “Delay-optimal computation task scheduling for mobile-edge computing systems,” in Proc. IEEE ISIT, Barcelona, Spain, Jul. 2016.
• [4] T. Q. Dinh et al., “Learning for computation offloading in mobile edge computing,” IEEE Trans. Commun., vol. 66, no. 12, pp. 6353–6367, Dec. 2018.
• [5] X. Chen et al., “Optimized computation offloading performance in virtual edge computing systems via deep reinforcement learning,” IEEE IoT J., vol. 6, no. 3, Jun. 2019.
• [6] C. Lu et al., “MIMO channel information feedback using deep recurrent network,” IEEE Commun. Lett., vol. 23, no. 1, pp. 188–191, Jan. 2019.
• [7] C. Lu et al., “Bit-level optimized neural network for multi-antenna channel quantization,” IEEE Wireless Commun. Lett., early access, 2019.
• [8] P. M. Narendra and K. Fukunaga, “A branch and bound algorithm for feature subset selection,” IEEE Trans. Comput., vol. C-26, no. 9, pp. 917–922, Sept. 1977.
• [9] X. Huang et al., “Learning oriented cross-entropy approach in load-balanced HetNet,” IEEE Wireless Commun. Lett., vol. 7, no. 6, pp. 1014–1017, Dec. 2018.
• [10] P.-T. de Boer et al., “A tutorial on the cross-entropy method,” Annals of Operations Research, vol. 134, no. 1, pp. 19–67, 2005.
• [11] M. Kovaleva et al., “Cross-entropy method for electromagnetic optimization with constraints and mixed variables,” IEEE Trans. Antennas Propag., vol. 65, no. 10, pp. 5532–5540, Oct. 2017.
• [12] D. MacKay, Information Theory, Inference, and Learning Algorithms. Cambridge University Press, 2003.
• [13] A. Vinciarelli, “Role recognition in broadcast news using bernoulli distributions,” in Proc. IEEE ICME, Beijing, China, Jul. 2007.
• [14] S. Mannor et al., “The cross entropy method for fast policy search,” in Proc. ICML, Washington, DC, Aug. 2003.