I Introduction
The explosive growth of the Internet of Things is driving the emergence of new mobile applications that demand intensive computation and stringent latency, such as intelligent navigation, online gaming, virtual reality (VR), and augmented reality (AR) [3]. The limited computation and energy resources of mobile devices pose a great challenge for supporting these new applications [4]. Mobile edge computing (MEC) is envisioned as a promising network architecture to address this challenge by providing cloud-computing services at the edge nodes (ENs) of mobile networks, such as wireless access points and base stations [5, 6]. By offloading computation-intensive tasks from mobile users to their nearby server-enabled ENs for processing, MEC systems have great potential to prolong the battery lifetime of mobile devices and reduce the overall task execution latency.
Task offloading and resource allocation are crucial problems in MEC systems. This is because offloading a task from a user to its associated EN involves extra overhead in transmission energy and communication latency due to the input data uploading and computation result downloading [7, 8, 9]. In the existing literature, the task offloading and resource (both radio resource and computation resource) allocation problems have been studied for single-user single-server MEC systems in [10, 11], for multi-user single-server MEC systems in [12, 13, 14, 15, 16, 17, 18, 19], for single-user multi-server MEC systems in [20], and for multi-user multi-server MEC systems in [21]. These problems are often formulated as minimizing the energy consumption under latency constraints or minimizing the latency subject to energy constraints, so as to strike a good balance between communication efficiency and computation efficiency. The task offloading strategy also depends on whether the task is dividable, known as partial offloading [11, 12, 13, 14, 15], or has to be executed as a whole, known as binary offloading [16, 17, 18, 19, 20, 21].
Note that in the aforementioned literature [10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21], each task or subtask is executed once at one EN only. As a result, the computed result of the offloaded task or subtask is available at only one EN, without any diversity. In a scenario where the downlink channel from the EN to the user experiences deep fading or suffers from strong interference, downloading the computed results back to the user may incur a very low transmission rate and consequently cause a significantly long delay in completing the overall task. This work aims to address this issue by exploiting computation replication when there are multiple edge servers.
The main idea of computation replication is to let mobile users offload their tasks or subtasks to multiple ENs for repeated execution, so as to create multiple copies of the computed result at different ENs, which then enables the ENs to cooperatively transmit the computed result back to users in the downlink. This transmission cooperation can mitigate interference across users, overcome deep fading, and hence increase the transmission rate of downlink channels. Thus, the downloading time can be reduced. A side effect of replicating tasks on multiple ENs, however, is that it introduces more data traffic in the uplink and hence may increase the uploading time compared with offloading to a single EN. That is, computation replication induces a tradeoff between uploading time and downloading time. Take a 3-user 3-server MEC network as an example. First, consider that each user offloads its task to all servers for repeated computing. In the uploading phase, the users take turns to multicast their task inputs to all ENs, consuming 3 time slots in total. In the downloading phase, all the ENs have the same messages to transmit, so the downlink channel becomes a virtual MISO broadcast channel, and all computed results can be delivered to users within 1 time slot by using zero-forcing precoding [22]. For the baseline scheme where each task is offloaded to an individual EN, both the uplink and downlink channels are 3-user interference channels [23]. Both task uploading and downloading can be completed within 2 time slots by using interference alignment. It is seen that, with computation replication, the downloading time is reduced from 2 time slots to 1 time slot while the uploading time is increased from 2 time slots to 3 time slots. For computation tasks whose output data size is much larger than the input data size, or whose input data size is negligible, computation replication can bring significant benefits in reducing the overall communication latency.
In general, computation replication should be carefully chosen by considering the balance between upload and download times, despite the obvious increase in the computation load.
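The 3-user 3-server comparison can be checked with a few lines of arithmetic. The sketch below is illustrative only. It assumes the slot counts implied by standard DoF results: interference alignment on a 3-user interference channel gives 1/2 DoF per user (2 slots per phase), and zero-forcing over the virtual 3-antenna MISO broadcast channel gives 1 DoF per user (1 slot). The function names and the `output_to_input_ratio` parameter are hypothetical and do not come from the paper.

```python
from fractions import Fraction

# (upload slots, download slots) for the 3-user 3-server example.
# These counts are assumptions taken from the DoF results cited in the text.
REPLICATION = (3, 1)  # TDMA multicast uplink, zero-forcing downlink
BASELINE = (2, 2)     # interference alignment in both directions


def total_latency(slots, output_to_input_ratio):
    """Total time, measured in slots of input-sized data.

    `output_to_input_ratio` is a hypothetical knob: the output data size
    of each task divided by its input data size.
    """
    upload, download = slots
    return upload + download * Fraction(output_to_input_ratio)


def replication_wins(output_to_input_ratio):
    """True if full replication gives strictly lower total latency."""
    return (total_latency(REPLICATION, output_to_input_ratio)
            < total_latency(BASELINE, output_to_input_ratio))
```

Under these assumptions the two schemes tie when output and input sizes are equal, and replication pays off only once the output is larger than the input, which matches the qualitative claim in the text.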
Computation replication, i.e., replicating a task on multiple servers, has already demonstrated significant advantages in modern computer systems. It can mitigate random server straggling and thus reduce the task service delay in queuing systems [24, 25, 26]. It can also create coded multicasting for data shuffling and thus reduce the communication load in distributed computing frameworks such as MapReduce and Spark [27, 28]. Our work is an attempt to exploit this idea of computation replication in MEC systems to enable transmission cooperation for speeding up the computed result downloading. To the best of our knowledge, the only work that uses a similar idea of computation replication to enable transmission cooperation for computed result downloading in MEC systems is [29]. However, [29] relies on a strong assumption that the computation functions are linear so that the tasks can be executed on some linear combinations of the inputs, i.e., coded inputs, and then the computed coded outputs on all ENs are utilized to zero-force the downlink interference at each user. In addition, [29] ignores the task uploading phase.
In this work, we exploit computation replication in a general multi-user multi-server MEC system with a general computation function. Unlike [29], the transmission cooperation in the downlink enabled by computation replication does not assume linearity of the computation function. It relies purely on the replication of the computed result of each individual task at multiple ENs. Moreover, we consider both task uploading and results downloading and adopt the upload-download latency pair as the performance metric. Specifically, we consider an MEC network where a set of mobile users offload their tasks to a set of computing-enabled ENs. Each task has input data and output data. We define the computation load as the average number of ENs that compute a task, i.e., the degree of replication for executing a task. The communication latency is defined as the upload-download time pair, denoted as . A fundamental question we would like to address is: Given a computation load , what is the minimum achievable communication latency boundary ?
Our work attempts to address the above question for a general MEC network with any () ENs and any () users, in both binary and partial offloading cases from an information-theoretic perspective. We reveal a fundamental tradeoff between computation load and communication latency, and also present the uploading time and downloading time tradeoff.
We first consider binary offloading where the computation tasks are not dividable. We propose a task assignment scheme where each task is offloaded to different ENs for repeated computing, and each EN has an even assignment of tasks. By utilizing the duplicated computation results on multiple ENs, transmission cooperation in the form of interference neutralization can be exploited in the data downloading phase. We characterize the communication latency by the pair of the normalized uploading time (NULT) and normalized downloading time (NDLT) . The main distinction in the communication latency analysis lies in the degree of freedom (DoF) analysis of the so-called circular cooperative interference-multicast channels. We obtain the optimal per-receiver DoF for the uplink channel, and an order-optimal per-receiver DoF for the downlink channel. Based on these DoF regions, we then develop an order-optimal achievable communication latency pair at any integer computation load. In particular, the NULT is exactly optimal and the NDLT is within a multiplicative gap of 2 to the optimum. We show that the NDLT is an inversely proportional function of the computation load in the interval , which presents the computation-communication tradeoff. We also reveal that the decrease of the NDLT comes at the expense of increasing the NULT linearly, which forms another NULT-NDLT tradeoff. Part of this result is submitted to IEEE ISIT 2019 [2].
Next, we consider partial offloading where each task can be divided arbitrarily. We propose a task partition scheme where the task generated by each user is partitioned into subtasks, and each is offloaded to a distinct subset of ENs chosen from the total ENs for repeated computing. The uplink is formed as the X-multicast channel whose optimal per-receiver DoF is obtained. The downlink is the cooperative X channel, and an achievable per-receiver DoF is derived with order optimality. We thus develop an order-optimal achievable communication latency pair at any given computation load, and both the achievable NULT and NDLT are within multiplicative gaps of to their lower bounds. Moreover, the NDLT decreases linearly with the computation load in the interval , which also comes at the expense of increasing the NULT linearly. Part of this result is presented in [1].
The rest of this paper is organized as follows. Section II presents the problem formulation and definitions. The computation-communication tradeoffs are presented in Section III for binary offloading and Section IV for partial offloading. The conclusions are drawn in Section V.
Notations: denotes the set of complex numbers. denotes the set of positive integers. denotes the transpose. denotes the set of indexes . denotes the largest integer no greater than while denotes the minimum integer no smaller than . denotes the cardinality of set . denotes the set of integers . denotes the set of integers .
denotes the vector
. denotes the set .
II Problem Formulation
II-A MEC Network Model
We consider an MEC network consisting of single-antenna ENs and single-antenna users, as shown in Fig. 1. Each EN is equipped with a computing server, and all ENs communicate with all users via a shared wireless channel. Denote by the set of ENs and the set of users. The communication link between each EN and each user experiences both channel fading and additive white Gaussian noise. Let denote the uplink (downlink) channel fading from user (EN ) to EN (user ). It is assumed to be independent and identically distributed (i.i.d.) according to some continuous distribution.
The network is time-slotted. At each time slot, each user generates an independent computation task to be offloaded to the ENs for execution. The computation task on each user , for , is characterized by the input data to be computed, denoted as , with size bits, and the computed output data, denoted as , with size bits. (Footnote 1: Equal input and output data sizes across all tasks are assumed for analytical tractability. In the general case where each task has a distinct input or output data size, the problem may not be tractable.) We consider both binary and partial computation task offloading models. Binary offloading requires a task to be executed as a whole. Partial offloading, on the other hand, allows a task to be partitioned into multiple modules and executed at different nodes in a distributed manner. While binary offloading is suitable for simple tasks that are tightly integrated in structure and are not separable, partial offloading is more suitable for data-oriented applications that can be separated into multiple parts and executed at distributed nodes in parallel, such as image processing, voice-to-text conversion, and component rendering in VR video display. For partial offloading, it is further assumed that the task partition is exclusive without intra-task or inter-task coding and that the computed output size of each subtask is proportional to its corresponding input data size.
II-B Task Offloading Procedure
Before the task offloading procedure begins, the system needs to decide which EN or set of ENs each task (for binary offloading) or subtask (for partial offloading) should be assigned to for execution. We denote by the part of task that is assigned exclusively to the set of ENs for computation with repetition order . Every task must be computed. Thus, for , we have , and for . With such a task assignment, the set of tasks or subtasks to be computed at each EN can be denoted as .
Definition 1.
For a given task assignment scheme , the computation load , , is defined as the total number of task input bits computed at all the ENs, normalized by the total number of task input bits from all the users, i.e.,
(1) 
Similar to [27], the computation load can be interpreted as the average number of ENs to compute each task (for binary offloading) or each input bit (for partial offloading) and hence is a measure of computation repetition.
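Definition 1 can be made concrete with a short sketch: sum the input bits computed at every EN and divide by the total input bits generated by the users. The data layout below (a dictionary from EN index to a list of `(user, bits)` pairs) and the function name are hypothetical choices made for illustration, not notation from the paper.

```python
def computation_load(assignment, num_users, input_bits):
    """Computation load in the sense of Definition 1.

    assignment: dict mapping an EN index to a list of (user, bits) pairs,
        where `bits` is the number of that user's input bits computed at
        the EN (hypothetical representation).
    num_users: number of users, each generating one task.
    input_bits: input data size of every task, in bits.
    """
    computed = sum(bits for parts in assignment.values() for _, bits in parts)
    return computed / (num_users * input_bits)


# Binary offloading, 3 users and 3 ENs, each 100-bit task computed at 2 ENs.
replicated = {
    0: [(0, 100), (2, 100)],
    1: [(0, 100), (1, 100)],
    2: [(1, 100), (2, 100)],
}
# No replication: each task computed at exactly one EN.
single = {0: [(0, 100)], 1: [(1, 100)], 2: [(2, 100)]}
```

Here `computation_load(replicated, 3, 100)` evaluates to 2 and `computation_load(single, 3, 100)` to 1, consistent with reading the load as the average number of ENs that compute each task.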
Given a feasible task assignment strategy , the overall offloading procedure contains two communication phases, an input data uploading phase and an output data downloading phase.
II-B1 Uploading phase
Each user employs an encoding function to map its task inputs and channel coefficients to a length codeword , where is the transmitted symbol at time . Each codeword has an average power constraint of , i.e., . Then, the received signal of each EN at time is given by
(2) 
where is the noise at EN . Each EN uses a decoding function to map received signals and channel coefficients
to the estimate
of its assigned task inputs. The error probability is given by
(3) 
II-B2 Downloading phase
After receiving the assigned task input data and executing them at the server, each EN obtains the output data of its assigned tasks, , and begins to transmit these computed results back to users. The computed results downloading is similar to the task uploading operation. Briefly, each EN maps the task outputs and channel coefficients into a codeword of block length over the downlink interference channel, with an average power constraint of . Each user decodes its desired task output data from its received signals and obtains the estimate . The error probability is given by
(4) 
A task offloading policy with computation load , denoted as , consists of a sequence of task assignment schemes , task input uploading schemes with time , and task output downloading schemes with time , indexed by the task input and output data size pair . It is said to be feasible when the error probabilities and approach zero as and .
II-C Performance Metric
We characterize the performance of the considered MEC network by the computation load as well as the asymptotic communication time for task input uploading and output downloading.
Definition 2.
The normalized uploading time (NULT) and normalized downloading time (NDLT) for a given feasible task offloading policy with computation load are defined, respectively, as
(5)  
(6) 
Further, the minimum NULT and NDLT are defined, respectively, as
(7)  
(8) 
Note that (or ) is the reference time to transmit the input (or output) data of (or ) bits for one task in a Gaussian point-to-point baseline system in the high SNR regime. Thus, an NULT (or NDLT) of or indicates that the time required to upload (or download) the tasks of all users is or times this reference time.
Definition 3.
A communication latency pair at a computation load is said to be achievable if there exists a feasible task offloading policy . The optimal communication latency region is the closure of the set of all achievable communication latency pairs at all possible computation load ’s, i.e.,
(9) 
Our goal is to characterize the optimal communication latency pair at any given computation load for both binary offloading and partial offloading.
III Communication Latency Analysis for Binary Offloading
In this section, we present the analysis of the optimal communication latency pair at any given computation load, including both achievable scheme and converse, for binary offloading.
III-A Main Results
Theorem 1.
(Achievable result). An achievable communication latency pair at an integer computation load , for binary task offloading in the MEC network with ENs and users, is given by
(10)  
(11) 
when . If is not an integer, one can always find two integers and so that is the closest integer to and the above results still hold by adding more users and deactivating ENs.
Theorem 2.
(Converse). The optimal communication latency pair at any given computation load , for binary task offloading in the MEC network with ENs and users, is lower bounded by
(12)  
(13) 
Based on Theorem 1 and Theorem 2, we can obtain an inner bound denoted as and an outer bound denoted as , respectively, of the optimal communication latency region by collecting the latency pairs at all the considered computation loads ’s. Fig. 2 shows the bounds in the MEC networks with .
Corollary 1.
Now, we demonstrate how the computation load affects the achievable communication latency . By discussing the function terms in (10) and (11), we have the monotonicity of the achievable computation-communication function :

The NULT increases strictly with the computation load for , and then remains constant for .

The NDLT remains constant for , and then is inversely proportional to the computation load for .
Remark 1.
The achievable computation-communication function has two corner points and , corresponding to and , respectively. They are explained as follows:

For input data uploading, before increases to , the NULT is increasing since more traffic is introduced in the uplink. When grows to more than , there is no need to increase the NULT since all tasks can be uploaded within time slots by using TDMA.

For output data downloading, before increases to , the potential transmission cooperation gain brought by computation replication cannot exceed the existing interference alignment gain without computation replication, and thus the NDLT remains fixed. When grows to more than , interference neutralization can be exploited, which outperforms interference alignment, and thus the NDLT begins to decrease with .
It can be easily proved that for all . Hence, we have the following remark to characterize the envelope of the inner bound of the optimal communication latency region, present the tradeoff between computation load and communication latency, and illustrate the interaction between the NULT and NDLT.
Remark 2.
The envelope of the inner bound of the optimal communication latency region for binary offloading can be divided into three sections, each corresponding to a distinct interval of the computation load :

Constant-NDLT section: , , when ;

NULT-NDLT tradeoff section: , , when ;

Constant-NULT section: , , when .
In particular, in the NULT-NDLT tradeoff section, as the computation load increases, the NDLT decreases in an inversely proportional way, at the expense of increasing the NULT linearly.
It is seen from Fig. 2(b) that the envelope of the inner bound is composed of three sections corresponding to three different intervals of the computation load, and the middle section at (dotted line) presents the NULT-NDLT tradeoff, in an inversely proportional form.
III-B Achievable Task Offloading Scheme
III-B1 Task assignment and uploading
Suppose that the system parameters and satisfy such that holds for , where is the given integer computation load and is an integer in . In the proposed task assignment method, we let each task be executed at exactly different ENs and let each EN execute distinct tasks with even load. Note that if is not an integer, we can inject () tasks, let () ENs be idle, and use the remaining ENs for task offloading, such that is the integer closest to , denoted as . In this way, we still have for , and can use the new and to replace and to obtain the corresponding analytical results.
To ensure even task assignment on each EN, we perform circular assignment. Specifically, the set of tasks assigned to EN is given by
(14) 
An example of the task uploading for and is shown in Fig. 3.
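A circular assignment with even load can be sketched as follows. This is an illustrative rule under stated assumptions, not necessarily identical to (14): indices are zero-based, the number of users is assumed to be a multiple of the number of ENs, and task j is placed on the r cyclically consecutive ENs starting at EN (j mod num_ens). The function and parameter names are hypothetical.

```python
def circular_assignment(num_ens, num_users, r):
    """Assign each task to r cyclically consecutive ENs.

    Illustrative rule (an assumption, not the paper's exact formula):
    task j goes to ENs (j + t) mod num_ens for t = 0, ..., r - 1.
    Returns a dict mapping each EN index to the list of its task indices.
    """
    if num_users % num_ens != 0:
        raise ValueError("sketch assumes num_users is a multiple of num_ens")
    if not 1 <= r <= num_ens:
        raise ValueError("r must be an integer between 1 and num_ens")
    tasks_at_en = {i: [] for i in range(num_ens)}
    for j in range(num_users):
        for t in range(r):
            tasks_at_en[(j + t) % num_ens].append(j)
    return tasks_at_en
```

For instance, `circular_assignment(3, 3, 2)` places task 0 on ENs 0 and 1, task 1 on ENs 1 and 2, and task 2 on ENs 2 and 0, so every EN holds exactly two tasks. In general each EN receives num_users * r / num_ens tasks under this rule, since the total of num_users * r task copies is spread evenly.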
Given the above task assignment in (14), the uplink channel formed by uploading the tasks to their corresponding ENs is referred to as the circular interference-multicast channel with multicast group size . This channel is different from the X-multicast channel with multicast group size defined in [30, 31], where any subset of receivers can form a multicast group, resulting in multicast groups, and each transmitter needs to communicate with all the multicast groups. In our considered circular interference-multicast channel, there are only multicast groups, which are formed circularly by the receivers, and each transmitter only needs to communicate with one multicast group. The optimal per-receiver DoF of this uplink channel is given as follows.
Lemma 1.
The optimal per-receiver DoF of the circular interference-multicast channel with transmitters and receivers satisfying and multicast group size is given by
(15) 
Proof.
First, we use a partial interference alignment scheme to achieve a DoF of for each receiver, which is similar to the achievability proof of the DoF of user interference channels in [23]. Then, we compare it to the DoF of achieved by TDMA. The detailed achievable scheme and proof of optimality are given in Appendix A. ∎
The per-receiver rate of this channel in the high SNR regime can be approximated as . The traffic load for each EN to receive its assigned tasks is bits, so the uploading time can be approximated as . Letting and , by Definition 2, the NULT for each EN at computation load is given by
(16) 
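The step from a per-receiver DoF to a normalized time can be sketched numerically. The sketch assumes the usual high-SNR reasoning: the transfer time is roughly the traffic in bits divided by the DoF times the log of the power, and the reference time for one task is the task size divided by the same log term, so the logarithm and the task size cancel. The function name and arguments are hypothetical.

```python
from fractions import Fraction


def normalized_time(tasks_per_receiver, per_receiver_dof):
    """Transfer time in units of the point-to-point reference time.

    tasks_per_receiver: traffic each receiver must obtain, counted in
        whole tasks (traffic in bits divided by the task size).
    per_receiver_dof: per-receiver degrees of freedom of the channel.
    Both arguments are illustrative stand-ins for the paper's symbols.
    """
    return Fraction(tasks_per_receiver) / Fraction(per_receiver_dof)
```

As a sanity check against the 3-user 3-server example in the introduction: one task per EN at 1/2 DoF per receiver gives a normalized time of 2, and three tasks per EN at 1 DoF per receiver (TDMA multicast) gives 3.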
III-B2 Results downloading
After computing all offloaded tasks, the ENs begin to transmit the computed results back to users via downlink channels. Recall that each task is computed at different ENs and each EN has the computed results of different tasks , as given in (14). Each user wants the computed results for , which are held at different ENs. Multiple ENs with the same computed results can exploit transmission cooperation to neutralize interference across users [22, 32]. The computation results downloading for and is shown in Fig. 3. We refer to the downlink channel formed by downloading the tasks as the circular cooperative interference channel with transmitter cooperation group size . This channel is different from the cooperative X channel with transmitter cooperation group size defined in [30, 33], where any subset of transmitters can form a cooperation group, resulting in groups in total, and each transmitter cooperation group has messages to send to all receivers. In our considered downlink channel, there are only cooperation groups, which are formed circularly by the transmitters, and each group only needs to communicate with one receiver. An achievable per-receiver DoF of this downlink channel is given below.
Lemma 2.
An achievable per-receiver DoF of the circular cooperative interference channel with transmitters and receivers satisfying and transmitter cooperation group size is given by
(17) 
and it is within a multiplicative gap of 2 to the optimal DoF.
Proof.
When , we use a partial interference alignment scheme to achieve a DoF of for each receiver. The achievable scheme is similar to that for the user X channel [34]. We then compare it to a DoF of achieved via TDMA. When , we prove that the achievable per-receiver DoF is , where we first use interference neutralization to achieve a DoF of for each receiver, and then compare it with the per-receiver DoF of achieved by only using interference alignment. Summarizing these two cases, we have (17). Please refer to Appendix B for the detailed achievable scheme and optimality proof. ∎
The per-receiver channel rate in the high SNR regime can be approximated as . The traffic load for each user to download its task output data is bits, so the downloading time can be approximated as . Letting and , by Definition 2, the NDLT for each user at computation load is given by
(18) 
III-C Proof of Converse
III-C1 Lower bound and optimality of NULT
We prove the lower bound of the NULT at any given computation load , i.e., . First, we use genie-aided arguments to derive a lower bound on the NULT of any given feasible task assignment policy with computation load . Then, we optimize the lower bound over all feasible task assignment policies to obtain the minimum NULT for a given computation load .
Given a computation load . Consider an arbitrary task assignment policy where the number of tasks assigned to each EN is denoted as , , and satisfies
(19)  
(20) 
Note that we only need to consider the case since means no task is assigned to EN and we can remove EN from the EN set , which will not change the results. Consider the following three disjoint subsets of task input data (or messages):
(21)  
(22)  
(23) 
where denotes the input message of task that is assigned to all ENs in subset , and denotes one of the users that do not offload their tasks to EN , i.e., . It is seen that the set indicates the messages that EN needs to decode, i.e., . The set is a nonempty set with cardinality when EN is not assigned all tasks (or ), since user exists in this case; otherwise, we have for . We will show that set has the maximum number of messages that can be decoded by EN .
Let a genie provide the messages to all ENs, and additionally provide messages to ENs in . The received signal of EN can be represented as
(24) 
where , , are diagonal matrices representing the channel coefficients from user to EN , the signal transmitted by user , and the noise received at EN , over the block length , respectively. Note that we reduce the noise at EN from to by a fixed amount such that its received signal can be replaced by . The ENs in have messages , which do not include the message of user . Using this genie-aided information, each EN can compute the transmitted signals and subtract them from the received signal. Thus, the received signal of EN can be rewritten as
(25) 
Since the message is intended for some ENs in , denoted as , the ENs in can decode it. By Fano’s inequality and (25), we have
(26) 
Consider EN , it can decode messages intended for it. By Fano’s inequality, we have
(27) 
Using genie-aided messages and decoded messages , EN can compute the transmitted signals , and subtract them from the received signal. We thus have
(28) 
By reducing noise and multiplying the constructed signal at EN by , we have
(29) 
where represents the reduced noise. It is seen that is a degraded version of at EN in , so EN must be able to decode the messages that ENs in can decode. Thus, we have
(30) 
All the above changes, including genie-aided information, receiver cooperation, and noise reduction, can only improve capacity. Therefore, we have the following chain of inequalities,
(31)  
(32)  
(33)  
(34)  
(35)  
(36)  
(37) 
where (a) is due to the independence of messages, (b) and (c) follow from the chain rule, (d) uses Fano’s inequalities (27) and (30), (e) is the data processing inequality, and (f) uses the DoF bound of the MAC channel. Dividing by , and taking and , we have . Thus, for any given feasible task assignment , the NULT satisfies for , i.e., the minimum NULT of the task assignment policy is lower bounded by
(38) 
Hence, the minimum NULT over all feasible task assignments is given by . It can be lower bounded by the optimal solution of the following linear programming problem,
By relaxing the integer constraint into a real-valued constraint , the optimal solution is still a lower bound of the minimum NULT . Since the objective is equivalent to minimizing the term , the optimal solution can be obtained easily as , . Hence, the minimum NULT is lower bounded by
(39) 
The proof of the lower bound of NULT is thus completed. Comparing (39) with (10) in Theorem 1, we see that they are the same. Thus, the achievable NULT in (10) is optimal.
III-C2 Lower bound and gap of NDLT
Let denote the signal transmitted by each EN , and the signal received at each user , over the block length . Consider the computed results decoded by users, we have the following chain of inequalities,
(40)  
(41)  
(42)  
(43) 
where follows from , follows from the data processing inequality and Fano’s inequality, and uses the capacity bound of the MISO broadcast channel with a antenna transmitter and single-antenna receivers. Dividing by , and taking and , we have
(44) 
Hence, the minimum NDLT is lower bounded by . It can be easily proved that the multiplicative gap between the achievable NDLT in Theorem 1 and this lower bound is within for , i.e., . This completes the proof of the lower bound and gap of the NDLT for binary offloading.
IV Communication Latency Analysis for Partial Offloading
In this section, we present the analysis of the optimal communication latency pair at any given computation load, including achievable scheme and converse, for partial offloading.
IV-A Main Results
Theorem 3.
(Achievable result). An achievable communication latency pair at an integer computation load , for partial task offloading in the MEC network with ENs and users, is given by
(45)  
(46) 
For general , the achievable communication latency pair is given by the lower convex envelope of the above points .
Theorem 4.
(Converse). The optimal communication latency pair at any given computation load , for partial task offloading in the MEC network with ENs and users, is lower bounded by
(47)  
(48) 