Future wireless networks are expected to meet diverse service requirements concerning throughput, latency, reliability, and availability, as well as operational requirements such as energy efficiency and cost efficiency (Latva-aho and Leppänen, 2019; Rost et al., 2017). These service requirements arise from mobile networks and novel application areas such as Industry 4.0, airborne communication, vehicular communication, and smart grid.
The International Telecommunication Union (ITU) has categorized these services into three primary use cases: enhanced mobile broadband (eMBB), massive machine-type communications (mMTC), and ultra-reliable and low latency communications (URLLC) (Series, 2015). To provide cost-efficient solutions, several telecommunication organizations, including the Third Generation Partnership Project (3GPP) and the Next Generation Mobile Networks (NGMN) Alliance, have agreed to converge the use cases onto a shared physical infrastructure instead of deploying an individual network solution for each use case (Alliance, 2016).
To satisfy the requirement of improving cost efficiency, the concept of network slicing has been proposed. The fundamental idea of network slicing is to logically isolate network resources and functions customized for specific requirements on a common physical infrastructure (Rost et al., 2017). A network slice, as a virtual end-to-end (E2E) network for efficiently implementing resource isolation and increasing statistical multiplexing, is self-contained with its own virtual network resources, topology, traffic flow, and provisioning rules (Rost et al., 2017; Foukas et al., 2017b). Owing to its significant role in constructing flexible and scalable future wireless networks, network slicing for mMTC, eMBB, and URLLC service multiplexing has received much attention from academia (Albonda and Pérez-Romero, 2019; Alsenwi et al., 2019; Matera et al., 2018).
However, most existing work has not studied the impact of time-varying channels on the creation of slices, or the benefits of exploiting advanced radio access techniques (RATs) in network slicing systems. For example, the actual channel may vary on short timescales (e.g., milliseconds), while the creation of network slices may be conducted on relatively long timescales (e.g., minutes or hours). Therefore, network slicing needs to address a multi-timescale issue. Additionally, the utilization of advanced RATs (e.g., coordinated multipoint, CoMP) has been considered a promising way of addressing spectrum challenges and improving system throughput (Georgakopoulos et al., 2019; MacCartney and Rappaport, 2019).
A recent work (Tang et al., 2019) developed a CoMP-based radio access network (RAN) slicing framework for eMBB and URLLC service multiplexing and proposed to tackle the multi-timescale issue of RAN slicing via an alternating direction method of multipliers (ADMM). However, this work assumed that URLLC traffic was generated without interruption and ignored the significant bursty characteristic of URLLC traffic (Azari et al., 2019). Bursty URLLC traffic further exacerbates the difficulty of slicing the RAN for URLLC-involved service multiplexing in the following two aspects:
Resource efficiency: one common proposal for handling uncertainty (including burstiness) in future wireless communication networks is to reserve network resources, which may waste a large amount of valuable network resources. Therefore, it is important to develop resource orchestration schemes with high utilization for future networks, especially for resource-constrained networks.
Immediate resource orchestration: bursty URLLC packets need to be scheduled immediately if there are available resources and the system utility can be maximized. Therefore, under the premise of improving resource efficiency, immediate resource orchestration schemes that adapt to the number of bursty URLLC packets should be developed.
These difficulties motivate us to investigate CoMP-enabled RAN slicing for bursty URLLC and eMBB service provision. The primary contributions of this paper can be summarized as follows:
We re-cut physical resource blocks (PRBs) and derive the minimum upper bound of network bandwidth orchestrated for bursty URLLC traffic transmission to guarantee that the bursty URLLC packet blocking probability is of the order of a low value.
After correlating CoMP beamforming with channel uses according to the network capacity result for the finite blocklength regime, we derive the minimum upper bound of channel uses for transmitting a URLLC packet with a low codeword error decoding probability.
We define eMBB and URLLC long-term slice utilities and formulate the CoMP-enabled RAN slicing for bursty URLLC and eMBB service multiplexing as a resource optimization problem. The objective of the problem is to maximize the long-term total slice utility under constraints on total transmit power and network bandwidth. Solving this problem is highly challenging because it requires future channel information and involves a two-timescale issue.
To address these challenges, we propose a bandwidth and beamforming optimization algorithm. In this algorithm, we approximately transform the service multiplexing problem into a non-convex global consensus problem via a sample average approximation (SAA) technique. We exploit an ADMM method to solve the global consensus problem. Meanwhile, a semidefinite relaxation (SDR) scheme combined with a variable slack scheme is applied to transform the non-convex problem into a semidefinite programming (SDP) problem. We also perform theoretical analysis on the tightness and convergence of the proposed algorithm.
Finally, the performance of the proposed algorithm is validated through comparison with a state-of-the-art algorithm.
The rest of the paper is organized as follows. In Section 2, we review the related work. In Section 3, we describe our system model, and in Section 4, we formulate the studied problem. In Sections 5 and 6, we discuss the problem-solving method. Simulation results are given in Section 7, and this paper is concluded in Section 8.
2. Related work
Network slicing and resource management. Enabling network slicing in 5G and beyond networks faces many challenges, in part owing to the difficulty of virtualizing and apportioning the RAN into several slices. To tackle these challenges, a rich body of previous work has been developed. In the following, we introduce some representative works on slice virtualization and resource apportionment.
In the research domain of slice virtualization, for example, a RAN slicing system for a single-RAT setting was developed to enable the dynamic virtualization of base stations (BSs) in (Foukas et al., 2017a). A control framework focusing on the balance of realistic traffic load and deployment of virtual network functions was designed in (Ni et al., 2019). Based on network function virtualization services, the work in (Landi et al., 2019; Buyakar et al., 2018) proposed to automatically scale virtual network slices for content delivery (e.g., eMBB and mMTC traffic). Based on SDN and NFV technologies, slow startup and virtual internet of things (IoT) network slices were created in (Wang et al., 2018) to meet different quality of service (QoS) requirements in IoT systems. To tackle the low speed of constructing virtual network slices, a lightweight network slicing orchestration architecture was developed in (Li et al., 2019).
In the research domain of resource apportionment, most of the literature focused on resource abstraction and sharing. For instance, many recent works mapped resource sharing problems as the interaction between network resource providers and network slice brokers (or tenants). Scheduling mechanisms (Mandelli et al., 2019; Marquez et al., 2018), game frameworks (Caballero et al., 2017, 2018; Zheng et al., 2018), optimization frameworks (Leconte et al., 2018; D’Oro et al., 2018; Sciancalepore et al., 2017; Jiang et al., 2016; Bega et al., 2017; Sciancalepore et al., 2019; Alsenwi et al., 2019; Matera et al., 2018; Liu and Han, 2019), and artificial intelligence-based methods (Gutterman et al., 2019; Albonda and Pérez-Romero, 2019) were then developed to help infrastructure providers improve profits (or utilities) and help tenants reap the benefits of resource sharing while guaranteeing their subscribers’ service requirements. Turning to resource abstraction, the work in (Ksentini and Nikaein, 2017) proposed a network slicing architecture featuring RAN resource abstraction, where a scheduling mechanism was crucial for abstracting network resources among slices. However, scheduling processes were not explored in detail in that work. By leveraging diverse resource abstraction types, an approach to virtualizing radio resources for multiple services was developed in (Chang et al., 2018) under the assumption that the traffic arrival rate of each slice equalled the number of requested radio resources. However, few of the above works studied the benefit of slicing a RAN equipped with advanced RATs, e.g., CoMP.
Coordinated multipoint. Recently, several papers have studied CoMP on its own, without exploiting network slicing (Ali et al., 2018; Michaloliakos et al., 2017; Michaloliakos et al., 2016; Cha et al., 2017; Wu et al., 2019; Navarro-Camba et al., 2018). The fundamental principle of CoMP is similar to that of a distributed multiple-input multiple-output (MIMO) system, where CoMP cells act as a distributed antenna array under a virtual BS in the MIMO system (Ali et al., 2018; Georgakopoulos et al., 2019). In (Michaloliakos et al., 2017; Michaloliakos et al., 2016), a CoMP architecture coupled with a user-beam selection scheme, aiming at achieving high performance gains without generating high overhead, was developed, where all beams were assumed to transmit at the same power level. The work in (Cha et al., 2017) discussed the frequent inter-beam handover issue, which was caused by covering high-speed moving devices, in a CoMP mobile communication system with a single BS. To improve ground users’ QoS, fronthaul bandwidth allocation and CoMP were jointly optimized in (Wu et al., 2019) without considering the impact of time-varying channels on the bandwidth allocation scheme. Besides, measurement-based studies on CoMP to mitigate user outage and improve network reliability were conducted in (MacCartney and Rappaport, 2019) and (Navarro-Camba et al., 2018), respectively.
Unlike the literature mentioned above, this paper exploits the CoMP-enabled RAN slicing for bursty URLLC and eMBB service multiplexing, which is quite challenging.
3. System model
We consider a CoMP-enabled RAN slicing system for URLLC and eMBB multiplexing service provision. In this system, the time is discretized and partitioned into time slots and minislots, and a time slot includes minislots. There are and ground eMBB user equipments (UEs) and URLLC UEs and BSs. The eMBB UE set and URLLC UE set are denoted as , , respectively. We assume that eMBB and URLLC UEs are randomly distributed in a considered communication area, and BSs are regularly deployed. Besides, each BS is assumed to be equipped with
antennas, and each UE is equipped with a single antenna. All BSs cooperate to transmit signals to a UE such that its signal-to-noise ratio (SNR) can be significantly enhanced. (This paper focuses on the optimization of transmit beamformers; the issues of beam alignment and beam selection are out of the scope of this paper.) Meanwhile, a flexible frequency division multiple access (FDMA) technique is exploited to achieve inter-slice and intra-slice interference isolation.
Boldface uppercase letters represent matrices, and boldface lowercase letters represent column vectors. Superscripts and denote transpose and conjugate transpose, respectively. , , , and represent the operators of trace, rank, absolute value, and Euclidean norm, respectively. denotes that the matrix is Hermitian positive semidefinite.
3.1. RAN slicing system
Figure 1 shows the architecture of the RAN slicing system adopted in this paper, which consists of four parts: end UEs, RAN coordinator (RAN-C), network slice management, and network providers. At the beginning of each time slot, the RAN-C will decide whether to accept or reject the received slice requests for serving end eMBB and URLLC UEs after checking the available resource information (e.g., PRBs and transmit power) and computing. If a slice request can be accepted, network slice management will be responsible for creating or activating the corresponding types of virtual slices, a process that is time-consuming and usually takes minutes to hours. Next, if a slice request admission arrives, network providers will find the optimal servers and paths to place virtual network functions to satisfy the required E2E service of the slice.
On the other hand, at the beginning of each minislot, coordinated BSs will generate beamformers matching time-varying channels for each accepted slice. In this RAN slicing system, we consider two types of slices, i.e., multicast eMBB slices and unicast URLLC slices. The set of eMBB slices is denoted by , and the set of URLLC slices is denoted by .
3.2. eMBB slice model
According to the above-mentioned concept of a network slice (especially from the perspective of the QoS requirement of a slice), we can define an eMBB network slice request as follows.
DEFINITION 3.1 (Multicast eMBB slice request).
A multicast eMBB slice request can be characterized as a tuple for any multicast slice , where is the number of multicast eMBB UEs in and is the data rate requirement of each UE in .
In this definition, eMBB UEs are partitioned into groups according to the data rate requirement of a UE. UEs in the same slice have the same data rate requirement. The slice request of each group of eMBB UEs will always be admitted by the RAN-C in this paper, and coordinated beamformers and PRBs will be effectively configured to accommodate data rate requirements of all eMBB UEs by way of multicast transmission.
We next describe data rate requirements of eMBB UEs. The generated transmit beamformers for UEs of slice () on BS at minislot is denoted by . The channel coefficient between BS and eMBB UE of at minislot is denoted by , which varies once every minislot. Suppose that the instantaneous channel coefficient can be exactly measured at the beginning of minislot and the channel fading process is ergodic over a time slot for each pair. The SNR received at UE of slice at can then be written as
where denotes the noise power, and is the set of eMBB UEs of . Since multicast transmission and a flexible FDMA mechanism are exploited, no interference term appears in the SNR.
According to the Shannon formula, the achievable data rate of UE of slice at can be expressed as
where denotes the bandwidth allocated to at .
If the service request of an eMBB UE can be admitted, then the following data rate condition should be satisfied
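The eMBB model above (coherent CoMP transmission, Shannon rate, per-UE rate requirement) can be illustrated with a minimal sketch. It assumes, as is common for joint transmission, that the received SNR is the squared magnitude of the sum over BSs of the channel-beamformer inner products, divided by the noise power; the paper's exact expression is not reproduced here, and the function names `comp_snr`, `shannon_rate`, and `slice_rate_satisfied` are hypothetical.

```python
import math


def comp_snr(channels, beamformers, noise_power):
    """SNR at one UE when all BSs transmit coherently (assumed model).

    channels[j] and beamformers[j] are lists of complex antenna
    coefficients for BS j.
    """
    combined = sum(
        sum(h.conjugate() * v for h, v in zip(h_j, v_j))
        for h_j, v_j in zip(channels, beamformers)
    )
    return abs(combined) ** 2 / noise_power


def shannon_rate(bandwidth_hz, snr):
    """Achievable rate in bit/s over the bandwidth allocated to the slice."""
    return bandwidth_hz * math.log2(1.0 + snr)


def slice_rate_satisfied(ue_channels, beamformers, bandwidth_hz,
                         noise_power, rate_requirement):
    """Multicast check: every UE of the slice must reach the required rate."""
    return all(
        shannon_rate(bandwidth_hz, comp_snr(ch, beamformers, noise_power))
        >= rate_requirement
        for ch in ue_channels
    )


# Two BSs with two antennas each, one multicast beamformer per BS.
beams = [[0.5 + 0j, 0.5 + 0j], [0.5 + 0j, 0.5 + 0j]]
ue = [[1 + 0j, 1 + 0j], [1 + 0j, 1 + 0j]]
print(slice_rate_satisfied([ue], beams, 1e6, 1.0, 2e6))
```

Because the slice is multicast, the worst UE in the group determines whether the slice's rate requirement is met, which is why the check uses `all`.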
3.3. URLLC slice model
Similar to the definition of an eMBB slice request, we define the unicast URLLC slice request as follows.
DEFINITION 3.2 (Unicast URLLC slice request).
A unicast URLLC slice request can be characterized as four tuples for any unicast slice , where denotes the number of unicast URLLC UEs in , represents the latency requirement of each UE in , and are denoted as the data packet blocking probability and the packet error decoding probability of each URLLC UE, respectively.
In this definition, URLLC UEs are classified into clusters according to the latency requirement of each UE. Owing to the ultra-low latency requirement, URLLC traffic should be scheduled immediately upon arrival; thus, URLLC slice requests will always be accepted by the RAN-C in this paper. Then, coordinated beamformers will be correspondingly generated to cover UEs by way of unicast transmission at the beginning of each minislot.
It is well known that designing a RAN slicing system to support the transmission of URLLC traffic is challenging owing to URLLC UEs’ stringent QoS requirements. What makes the issue more difficult is that URLLC traffic may be bursty. Bursty URLLC traffic, which may cause severe packet blocking, may significantly degrade the system performance of RAN slicing when URLLC slices are not well configured. To understand the characteristic of bursty URLLC traffic and mitigate its effect on RAN slicing, we will address the following two questions.
How to model bursty URLLC traffic?
What schemes can be developed for the RAN slicing system such that the URLLC packet blocking probability can be significantly reduced?
During a time slot, bursty URLLC data packets destined to UEs of each URLLC slice and aggregated at the RAN-C are modelled as a Poisson arrival process in this paper, which has the merit of simplicity and tractability. The vector of URLLC packet arrival rates is denoted by , where is a constant and represents the average arrival rate of packets destined to UEs of slice during a unit of time.
On the basis of the URLLC traffic model, we next discuss how to reduce the URLLC packet blocking probability via re-cutting PRBs. To satisfy the QoS requirements of URLLC UEs, a portion of PRBs should be allocated to them. In the RAN slicing system, a URLLC UE of will be allocated a block of network bandwidth of size for a period of time at minislot . Since URLLC packets in have a deadline of seconds for E2E transmission latency, we shall always choose . Besides, a packet destined to the UE will be coded before being sent out to improve reliability, and the transmission of a codeword needs channel uses in PRBs. The channel uses and the bandwidth are related by , where is a constant representing the number of channel uses per unit time per unit bandwidth of the FDMA frame structure and numerology. We denote the channel use set of URLLC UEs as with .
Let us model the aggregation and departure of URLLC packets in the RAN-C as an queueing system with a finite bandwidth and arrival data rate . Owing to stochastic variations in the packet arrival process, there may occasionally not be enough spare bandwidth to serve newly arrived URLLC packets; as a result, URLLC packets may be blocked. Thus, the PRBs should be effectively re-cut such that the URLLC packet blocking probability can be greatly reduced. Denote as the blocking probability experienced by arriving packets of UEs in at minislot where and . The following theorem provides a clue for re-cutting PRBs for URLLC packet transmission.
Theorem 3.1.
At any minislot , for the given , , and a positive integer , define and . If , then there exists a bandwidth such that for we have (Anand and de Veciana, 2018).
This theorem tells us that if we shorten the packet latency, then fewer resource blocks will be available in the frequency plane, which causes a more severe queueing effect and significantly increases the packet blocking probability. If we narrow resource blocks in the frequency plane, then more concurrent transmissions are available, which is beneficial for decreasing the packet blocking probability.
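The intuition behind this theorem can be checked numerically with a simplified, single-class stand-in. The sketch below assumes a plain Erlang-B loss system (Poisson arrivals, a fixed number of concurrent transmission blocks, blocked packets dropped), which is not the paper's exact queueing model. Stretching the per-packet transmission time by a factor `scale` while narrowing the per-packet bandwidth by the same factor keeps the channel uses per packet fixed, but multiplies both the number of concurrent blocks and the offered load by `scale`. The names `erlang_b` and `blocking_after_scaling` are hypothetical.

```python
def erlang_b(servers, offered_load):
    """Erlang-B blocking probability via the standard stable recursion."""
    blocking = 1.0  # with zero servers every packet is blocked
    for k in range(1, servers + 1):
        blocking = offered_load * blocking / (k + offered_load * blocking)
    return blocking


def blocking_after_scaling(scale, base_servers=10, base_load=8.0):
    """Blocking when block count and offered load both grow by `scale`."""
    return erlang_b(scale * base_servers, scale * base_load)


for scale in (1, 2, 4):
    print(scale, blocking_after_scaling(scale))
```

Under these assumptions the blocking probability falls as `scale` grows even though the utilization per block is unchanged, which is the statistical multiplexing effect the theorem exploits.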
Therefore, we should scale up and select and for any URLLC slice at any minislot according to the following equations
With (4), we can obtain the minimum upper bound of bandwidth allocated to URLLC slices via the following lemma.
Lemma 3.2.
At any minislot , for a given queueing system with packet arrival rates and packet transmit speeds , let denote the minimum upper bound of bandwidth allocated to all URLLC slices to ensure that and is of the order of , where represents the queueing probability. If , then
where , , , and
Please refer to Appendix A. ∎
In (5), the first summation item denotes the mean value of the bandwidth allocated to URLLC slices and the second summation item can be regarded as the redundant bandwidth allocated to mitigate the impact of stochastic variations in the arrival process.
We next discuss the URLLC capacity and channel uses. For URLLC slice , let be the transmit beamformer to UE from BS at , is the corresponding channel coefficient, the corresponding SNR received at can then be expressed as
The acquisition of channel state information (CSI) or the channel fading distribution may require signal exchange before transmission, which entails extra transmit latency and potential reliability loss. Therefore, it may be impossible to obtain perfect CSI for URLLC service provision, and a constant is involved in (7) to model the SNR loss for URLLC traffic transmission (Liu et al., 2014). Meanwhile, interference signals are not included in (7) as a flexible FDMA mechanism is exploited.
On the other hand, owing to the stringent low latency requirement, URLLC packets typically have very short blocklength. We therefore utilize the capacity result for the finite blocklength regime in (Polyanskiy et al., 2010; Schiessl et al., 2015) to calculate the URLLC capacity rather than the Shannon formula that cannot effectively capture the reliability of packet transmission. Particularly, for each in , the number of information bits of a URLLC packet that is transmitted at with a codeword error decoding probability of the order of in channel uses can be calculated by
where is the AWGN channel capacity per Hz, is the channel dispersion.
The expression in (8) is complicated; yet, the following lemma gives an approximate expression of channel uses in terms of codeword error decoding probability and SNR.
Lemma 3.3.
For any UE in , the required channel uses for transmitting a URLLC packet of size to can be approximated as
Please refer to Appendix B. ∎
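To make the relationship between channel uses, SNR, and decoding error probability concrete, the following sketch evaluates the widely used normal approximation for the AWGN channel in the finite blocklength regime (Polyanskiy et al., 2010) and inverts it numerically. It is only an illustration under stated assumptions: it omits the SNR-loss constant and any higher-order terms, it does not reproduce the closed form of Lemma 3.3, and the function names `fbl_bits` and `min_channel_uses` are hypothetical.

```python
import math
from statistics import NormalDist


def fbl_bits(channel_uses, snr, error_prob):
    """Information bits deliverable in `channel_uses` uses (normal approx.)."""
    capacity = math.log2(1.0 + snr)  # bits per channel use
    dispersion = (1.0 - 1.0 / (1.0 + snr) ** 2) * math.log2(math.e) ** 2
    q_inv = NormalDist().inv_cdf(1.0 - error_prob)  # inverse Gaussian Q
    return channel_uses * capacity - math.sqrt(channel_uses * dispersion) * q_inv


def min_channel_uses(packet_bits, snr, error_prob, max_uses=100000):
    """Smallest number of channel uses that carries `packet_bits` bits."""
    for n in range(1, max_uses + 1):
        if fbl_bits(n, snr, error_prob) >= packet_bits:
            return n
    return None  # not achievable within max_uses


n_fbl = min_channel_uses(256, 10.0, 1e-5)
n_shannon = math.ceil(256 / math.log2(1.0 + 10.0))
print(n_fbl, n_shannon)
```

The gap between `n_fbl` and `n_shannon` is the price of reliability at short blocklength, which is why the Shannon formula alone would under-provision channel uses for URLLC packets.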
4. Problem formulation
On the basis of the above system model, this section aims to formulate the problem of RAN slicing for URLLC and eMBB multiplexing service provision.
As each BS has a limitation on the maximum transmit power , we can obtain the following power constraint:
Besides, since the multicast eMBB and the unicast URLLC service provisions are considered, and network bandwidths allocated to eMBB and URLLC slices are separated in the frequency plane, the network bandwidth constraint can be written as
where represents the bandwidth allocated to eMBB slice over a time slot, denotes the maximum network bandwidth.
We next discuss the design of the objective function of service multiplexing. To achieve the maximum utility of service multiplexing, utilities of eMBB and URLLC service provisions should be maximized simultaneously. In this paper, we leverage a key performance indicator, i.e., energy efficiency, which is popularly exploited in resource allocation problems to model the utility.
On the one hand, as network states of any two adjacent slots can be seen as independent in the time-discrete RAN slicing system, we focus on the problem formulation in a time slot of duration of . On the other hand, during a time slot, channel coefficients followed by the beamforming may change over minislots; as a result, time-varying utility functions in terms of channel coefficients and beamforming should be designed. Specifically, the following two definitions describe the expression of objective function.
DEFINITION 4.1 (eMBB long-term utility).
Over a time slot, the eMBB long-term utility is defined as the time-average energy efficiency of serving all eMBB UEs, which is calculated as
where is an energy efficiency coefficient reflecting the tradeoff between energy consumption and gain.
DEFINITION 4.2 (URLLC long-term utility).
Over a time slot, the URLLC long-term utility is defined as the time-average energy efficiency of serving all URLLC UEs, which can be calculated as
With the above models, constraints, and utility definitions, we can formulate the problem of RAN slicing for bursty URLLC and eMBB service multiplexing as follows
where is a slice priority coefficient representing the priority of serving inter-slices, denotes the long-term total slice utility, beamformers , and .
Solving (14) is highly challenging mainly because
future channel information is needed: the optimization should be conducted at the beginning of the time slot; yet the objective function needs to be exactly computed according to channel information during the time slot.
two-timescale issue: the bandwidth and the beamformers and should be optimized at two different timescales. needs to be optimized at the beginning of the time slot. and should be optimized at the beginning of each minislot.
In the following sections, we discuss how to address the challenging problem effectively.
5. Problem solution with system generated channel coefficients
To tackle the issue of requiring future status information at the beginning of the time slot, we resort to an SAA technique (Kim et al., 2015). Based on the results of the SAA, we exploit an ADMM method (Boyd et al., 2011) to address the two timescale issue.
5.1. Sample average approximation
Owing to the ergodicity of the channel fading process over the time slot, the objective function can be approximated as
where denotes a set of all channel coefficient samples collected at the beginning of the time slot.
The fundamental idea of SAA is to approximate the expectation of a random variable by its sample average. The following proposition shows that if the number of samples is reasonably large, then for all , converges to uniformly on the feasible region constructed by constraints (14b) and (14c).
Proposition 5.1.
We omit the proof here as a similar proof can be found in the convergence proof of SAA in (Kim et al., 2015). ∎
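The idea behind SAA can be seen on a toy quantity. The sketch below assumes unit-variance Rayleigh fading, so the channel power gain is exponentially distributed with mean one, and replaces the expected spectral efficiency by an average over independently drawn channel samples. The fading model, the fixed mean SNR, and the name `saa_rate` are illustrative assumptions, not the paper's setup.

```python
import math
import random


def saa_rate(mean_snr, num_samples, seed=0):
    """Sample-average estimate of E[log2(1 + mean_snr * |h|^2)]."""
    rng = random.Random(seed)  # seeded for reproducibility
    total = 0.0
    for _ in range(num_samples):
        gain = rng.expovariate(1.0)  # |h|^2 under unit-variance Rayleigh fading
        total += math.log2(1.0 + mean_snr * gain)
    return total / num_samples


for num_samples in (10, 100, 1000, 20000):
    print(num_samples, saa_rate(10.0, num_samples))
```

As the number of samples grows, the estimate settles around the true expectation, which is the behaviour the proposition formalizes for the slice utility.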
Therefore, given a set of samples of channel coefficients with that are assumed to be independent and identically distributed, the original problem (14) can be approximated as
Problem (16) can be considered as a global consensus problem with being a family of global consensus variables and being a family of local variables. Therefore, an ADMM method can be exploited to solve the problem effectively.
5.2. Alternating direction method of multipliers
According to the fundamental principle of the ADMM method, ADMM for the problem (16) can be derived from the following augmented partial Lagrange problem
where, is a Lagrangian multiplier, is a penalty coefficient.
In our RAN slicing system, the RAN-C is responsible for executing the ADMM-based framework, and virtual machines (VMs) are activated to conduct (18) and (20). A central VM (CVM) is utilized to calculate the consensus variable. Additionally, in this framework, local dual variables are updated to drive local variables into consensus, and quadratic items in (18) help pull towards their average value.
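The update pattern described above (local updates on worker VMs, averaging on the central VM, dual updates that drive consensus) follows the standard global consensus form of ADMM (Boyd et al., 2011). A minimal sketch on a toy convex problem is given below: each worker `i` holds a scalar local objective `(x_i - a_i)^2` and all workers must agree on a common value `z`. The quadratic objectives, the closed-form local update, and the name `consensus_admm` are assumptions for illustration; the paper's local subproblems are SDPs rather than scalar quadratics.

```python
def consensus_admm(targets, rho=1.0, iterations=100):
    """Global consensus ADMM for min sum_i (x_i - a_i)^2 s.t. x_i = z."""
    n = len(targets)
    x = [0.0] * n  # local variables, one per worker
    y = [0.0] * n  # local dual variables
    z = 0.0        # global consensus variable
    for _ in range(iterations):
        # Local step: closed-form minimizer of the augmented Lagrangian.
        x = [(2.0 * a - y_i + rho * z) / (2.0 + rho)
             for a, y_i in zip(targets, y)]
        # Central step: average of the shifted local variables.
        z = sum(x_i + y_i / rho for x_i, y_i in zip(x, y)) / n
        # Dual step: penalize disagreement with the consensus value.
        y = [y_i + rho * (x_i - z) for x_i, y_i in zip(x, y)]
    return z, x


z, x = consensus_admm([1.0, 2.0, 6.0])
print(z, x)
```

For this toy problem the consensus value converges to the mean of the targets, and the quadratic penalty is what pulls each local variable towards the average.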
5.3. Semidefinite relaxation scheme
Let for all , , and for all , , . Next, if we recall the properties: , and , (18) can then be reformulated as
where , , is a square matrix with blocks, and each block in is a matrix. Besides, in , the block in the -th row and -th column is a identity matrix, and all other blocks are zero matrices.
As power matrices (, ) and (, , ) are positive semidefinite, we then resort to the SDR scheme to handle the low-rank non-convex constraints (22f) and (22g). That is, we directly drop the constraints (22f) and (22g). However, owing to the relaxation, power matrices and obtained by solving the problem (22) without low-rank constraints will not satisfy the low-rank constraint in general. This is due to the fact that the (convex) feasible set of the relaxed (22) is a superset of the (non-convex) feasible set of (22). If they do satisfy it, then the relaxation is tight; if not, then some manipulation, e.g., a randomization/scale method (Ma, 2010), should be performed on them to obtain approximate solutions.
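When the relaxation is tight, the beamformer can be read off the rank-one power matrix as its scaled principal eigenvector. The sketch below shows one way to test tightness and recover the beamformer with plain power iteration. It assumes the input is a Hermitian positive semidefinite matrix given as nested lists of complex numbers; the tolerance, the helper names `principal_component` and `recover_beamformer`, and the choice of power iteration are illustrative, and the randomization step for the non-tight case is not implemented.

```python
import math


def principal_component(matrix, iterations=200):
    """Largest eigenvalue and unit eigenvector of a Hermitian PSD matrix."""
    n = len(matrix)
    x = [complex(1.0 / math.sqrt(n), 0.0)] * n  # unit-norm starting vector
    for _ in range(iterations):
        y = [sum(a * b for a, b in zip(row, x)) for row in matrix]
        norm = math.sqrt(sum(abs(c) ** 2 for c in y))
        if norm == 0.0:
            break
        x = [c / norm for c in y]
    ax = [sum(a * b for a, b in zip(row, x)) for row in matrix]
    eigenvalue = sum((c.conjugate() * d).real for c, d in zip(x, ax))
    return eigenvalue, x


def recover_beamformer(matrix, tol=1e-6):
    """Return (beamformer, is_tight) for a candidate SDR solution."""
    eigenvalue, x = principal_component(matrix)
    trace = sum(matrix[i][i].real for i in range(len(matrix)))
    # For a PSD matrix, trace == largest eigenvalue iff the rank is one.
    is_tight = trace > 0.0 and abs(trace - eigenvalue) <= tol * trace
    beamformer = [math.sqrt(max(eigenvalue, 0.0)) * c for c in x]
    return beamformer, is_tight


v = [1 + 1j, 2 - 1j, 0.5j]
power_matrix = [[a * b.conjugate() for b in v] for a in v]
print(recover_beamformer(power_matrix))
```

The recovered beamformer equals the original vector up to a common phase rotation, which does not affect the received SNR.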
Although the non-convex constraints are removed, the constraints related to are complicated, which hinders the optimization of the relaxed (22). Therefore, we next discuss how to equivalently transform the complicated constraints via a variable slack scheme.
5.4. Variable slack scheme
Lemma 5.2.
Given the family of slack variables , (16d) is equivalent to the following inequalities,
for all , , .
On the one hand, if the constraint (24) is active, then (23) and (24) are equivalent to (16d); on the other hand, if at the optimal solution to (22) constrained by (23) and (24), there is a sample (or UE , ) such that (24) is non-active, then we can always pull the value of towards without violating (23) and changing the value of the objective function. The constraints (23) and (24) are therefore equivalent to (16d). ∎
Besides, the objective function (22a) is convex, because it is linear w.r.t. variables and with an addition of affine terms and nonnegative quadratic terms w.r.t. . Constraint (22c) is affine. The other constraints are non-linear. Based on the above equivalent transformation, we show in the following lemma that (22) can be further transformed into a standard convex problem.
Lemma 5.3.
By introducing a family of slack variables, the problem (22) without low-rank constraints can be equivalently transformed into the following SDP problem.
Please refer to Appendix C. ∎
5.5. Performance analysis
In this subsection, we analyze the performance of DBO. We first present a lemma about the optimality of solving (22) and then state the computational complexity and the convergence of DBO.
Lemma 5.4.
For all , , , the SDR for both and in problem (22) is tight, that is,
Moreover, and are optimal solutions to (22).
Please refer to Appendix D. ∎
The computational complexity of DBO is dominated by that of solving the SDP problem. The SDP problem has matrices of size and one-dimensional variables. An interior-point method can then be exploited to efficiently solve the SDP problem at the worst-case computational complexity of (Ye, 1997). Nevertheless, the actual complexity will usually be much smaller than the worst case.
The following lemma presents the convergence of the algorithm.
Lemma 5.5.
Let denote the optimal solutions, under the ADMM-based distributed algorithm, , , we have that is bounded and
For all , to prove that is bounded, we should prove that variables ,