Fifth-generation (5G) wireless networks are expected to offer high spectral efficiency, improved reliability, massive connectivity, and low end-to-end (E2E) latency [1, 2]. In particular, realizing tactile Internet services requires a very low E2E latency of 1 ms together with reliable service quality. Accordingly, the International Telecommunication Union (ITU) has defined ultra-reliable low latency communication (URLLC) as one of the usage scenarios in 5G networks. With the proliferation of smart devices, particularly in Internet of Things (IoT) networks, URLLC should not only provide sufficiently high system throughput and low latency, but also support machine type communications on a massive scale [5, 6]. In addition, power-efficiency becomes critical, especially for small, low-capability battery-powered IoT devices [7, 8, 9]. Flexibility is also important for communicating with diverse machine type devices as well as human users while meeting a variety of quality of service (QoS) requirements. Many researchers have studied the technical issues mentioned above, and this work is still in progress.
Delay-constrained communication has long been a major challenge and research interest. Given a delay constraint, the tradeoff between reliability and delay is studied in , and throughput analysis is also performed in . The packet delay can be reduced by designing a short frame structure [14, 13] and/or adjusting the transmission policy . The E2E delay consists of uplink (UL)/downlink (DL) transmission delays and queueing delay , and a short frame structure reduces UL/DL transmission durations. Meanwhile, deterministic queueing delay analysis is generally considered difficult because queue dynamics in the medium access control (MAC) layer are influenced by the randomness of time-varying channels and stochastic geometry in the physical (PHY) layer. In , the effective capacity link-layer model is presented to define the statistical delay requirement. Based on the effective capacity link model , cross-layer transmission design for achieving queueing delay requirements has been investigated under the assumption of a constant service rate and a static channel over the transmission time [18, 19]. Further, based on Little’s theorem , which establishes that the time-average queueing delay is proportional to the average queue backlog, dynamic resource allocations and scheduling policies for reducing queueing delay have been actively pursued [15, 21, 22, 23].
Since there exists a fundamental power-delay tradeoff studied in [24, 25], power efficiency also becomes critical for URLLC, especially where a massive number of devices are battery-powered . Energy-efficient resource allocations and scheduling policies for delay-constrained communications have been studied in [27, 28, 29, 30] and a delay-optimal scheduling policy for power-constrained transmission has been proposed in . In addition, system throughput maximization subject to a given constraint for low queueing delay is addressed in . Furthermore, the tradeoff between energy and delay to adapt to changes of network state distribution is discussed in [33, 34] based on a stochastic network optimization framework.
In IoT networks, it has become increasingly difficult for orthogonal multiple access (OMA), which allocates limited orthogonal resources to individual communication links, to handle the growing number of wireless devices. To overcome this issue, non-orthogonal multiple access (NOMA) has been actively researched as one of the promising methods for efficient and flexible use of energy and spectrum, as well as for system throughput improvements [37, 36]. Power-multiplexing NOMA serves multiple users on the same time/frequency/code resources by additionally exploiting the power domain ; it can thus provide better system throughput than OMA, using successive interference cancellation (SIC) to remove the superposed signal components of other users . In addition, NOMA has the advantage of allowing massive connectivity for IoT services [35, 40], and NOMA in short packet transmissions for achieving low latency has been discussed in [41, 42].
Since not all users can be served by NOMA due to the high complexity of SIC, hybrid multiple access (MA), which allows the coexistence of NOMA and OMA, has been considered for next-generation communication systems. As a representative example, multi-user superposition transmission (MUST), which employs both power-domain NOMA and orthogonal frequency division multiple access (OFDMA), has been adopted by the 3rd generation partnership project (3GPP) for 5G networks . For hybrid MA, user scheduling and resource allocation are critical issues. In [44, 45, 46], user pairing schemes for NOMA signaling among multiple users have been studied. The schemes proposed in [48, 47, 50, 49] focus on joint optimization of sub-channel assignment and power allocation. Further, cognitive-radio-inspired power control for NOMA is proposed in [44, 51] to guarantee secondary users’ QoS requirements. However, power efficiency and low latency are not considered in [44, 45, 46, 47, 50, 49, 51]. The authors of  and  proposed power-efficient resource allocation policies, but they did not consider user pairing and latency problems.
This paper proposes dynamic algorithms for joint user pairing and power control that maximize power-efficiency while achieving low latency and sufficient reliability in hybrid MA. In particular, the long-term average data rate is considered as a user QoS requirement for sufficient reliability. In addition, user scheduling and flexible use of resources are also captured in the proposed technique, since the proposed optimization framework makes it possible to determine whether each communication link is activated or not. This paper shows that the proposed technique works well with the short frame structure designed for URLLC. The main contributions of the proposed technique can be summarized as follows:
This paper constructs the stochastic network optimization framework for the transmission scheme of URLLC, which adaptively operates depending on time-varying channel and queue states. The proposed framework focuses on reducing the queueing delay, which is a main factor of E2E delay.
This paper contributes to URLLC based on NOMA. The proposed transmission scheme exploits the advantage of NOMA over OMA to increase data rate for reducing the queueing delay. Further, flexible use of resources is enabled for both OMA and NOMA users in the proposed framework.
Data-intensive simulation results show that the proposed technology can achieve an E2E delay smaller than 1 ms, on the basis of the short packet structure designed for URLLC .
In summary, the proposed algorithm pursues low latency, high power-efficiency, and the satisfaction of diverse QoS requirements.
The rest of the paper is organized as follows. The hybrid MA system and queue model are described in Section II. In Section III, we formulate the joint optimization problem of user pairing and power allocation in hybrid MA. The optimal power allocation rule with fixed user pairing is proposed in Section IV-B, and the matching algorithm for user pairing is presented in Section V. Simulation results are shown in Section VI, and Section VII concludes the paper.
II System Model
II-A Hybrid Multiple Access Model
This paper considers hybrid MA for power-efficient IoT networks where transmitters are battery-powered. Let a transmitter serve users by either OMA or NOMA, as shown in Figs. 2 and 2. The transmitter is deployed with queues in which data packets are waiting for transmissions to respective users. Assume that data packets for user are accumulated in queue . Each transmitter queue has a power budget of , and the transmit power for user is , so .
The Rayleigh fading channel is assumed for communication links from the transmitter to users. Denote the channel of user with . The path loss model is , where is the distance between the transmitter and user , and the fast fading component has a complex Gaussian distribution, i.e.,. Let be the data rate of user , and denote as a threshold of user ’s instantaneous data rate that determines the outage event. In other words, when , the outage event occurs at user . In addition, represents the long-term average data rate as a QoS requirement for user , and the QoS constraint can be written as
Fig. 2 shows the system architecture employing only OMA. For example, many packets are in queue 2 and queue 3, so these links have a risk of excessive queueing delays. The transmitter can reduce queue backlogs by increasing the transmission rates of these links. Most simply, link 2 can consume more power to increase its transmission rate and reduce its queue backlog. On the other hand, link 3 experiences the outage event even with the maximum transmit power in Fig. 2, so NOMA can be employed. Since NOMA is well known to improve system throughput compared to OMA with identical power consumption, the link rate for user 3 can be increased by NOMA. In Fig. 2, link 3 and link 4 are paired for NOMA transmission. Meanwhile, the link outage occurs at user 1, but its queue is almost empty, so excessive queueing delay is not a concern. Since we suppose that the transmitter observes the current channel state information (CSI) and queue state information (QSI), the outage occurrence can be anticipated. In this case, the transmitter can deactivate link 1 to save transmit power, as shown in Fig. 2. In this way, the system dynamically adjusts the transmission rates of all communication links by controlling power consumption and employing NOMA, for low queueing delay and high power-efficiency.
Since this model determines link activation by allocating no transmit power, as shown by link 1 in Fig. 2, user scheduling is also performed in the system model. Further, when two users are paired for NOMA signaling, the transmitter can even allocate no power to one of the paired users. In this case, the resource of the user with no transmit power becomes available for another user. System resources are thus used more flexibly in this model.
The main issues of hybrid MA are summarized in Fig. 2. First, the queueing delay should be reduced to satisfy the E2E latency constraint by achieving stability of the queueing system, i.e., limiting queue backlogs. Then, the user pairing problem arises for the transmitter to serve several users by NOMA. In addition, power allocation for both OMA and NOMA users should be jointly considered with the user pairing problem. In this paper, only a two-user NOMA scenario is considered, because a low-capability device in the IoT network can hardly handle the high computational complexity of SIC processes in the multi-user NOMA scenario. Further, as the number of users for NOMA signaling grows, the power budget should be large enough to provide reliable signal-to-interference-plus-noise ratios (SINRs) to NOMA users. However, a small battery-powered device is likely to have a limited power budget.
II-B Transmitter Queue Model
In general, the transmitter queue model has its own arrival and departure processes. When the departures are less frequent than the arrivals, the queue backlog grows. For each user , the queue dynamics in each unit time can be represented as follows:
where , , and stand for the queue backlog, the arrival process, and the departure process of user at time , respectively. The queue states are updated in each time slot . In this paper, the interval of each slot is assumed to be the channel coherence time, .
In this paper, queue backlog counts the number of data bits accumulated in queue . and semantically mean the numbers of arrived and transmitted bits. Simply, suppose that is randomly generated for all . On the other hand, obviously depends on the data rate of user :
is an i.i.d. uniform random variable, i.e.,, indicating the number of data packets arrived in queue at time . Also, is the packet size in bits, and represents the index of the user paired with user . If OMA is employed for user , then , whereas for means that users and are paired for NOMA. is the data rate of user when transmit power is consumed and user is paired with user at time . is the indicator function, so is 0 if the outage occurs at user , or 1, otherwise. Since we suppose that the transmitter can observe the current CSI, if the outage is expected, no transmit power is allocated and the departure becomes zero.
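The queue behavior described above can be sketched as follows. This is a minimal illustration that assumes the update form commonly used in Lyapunov optimization, Q(t+1) = max(Q(t) - D(t), 0) + A(t), with packet arrivals drawn uniformly from an integer range and departures set to zero in outage. The function names, the arrival range, and all numerical values are hypothetical and are not taken from this paper.

```python
import random


def departure_bits(rate_bps, slot_s, rate_threshold_bps):
    """Bits served in one slot; zero when the rate is below the outage threshold."""
    if rate_bps < rate_threshold_bps:
        return 0.0
    return rate_bps * slot_s


def update_queue(backlog_bits, arrival_bits, departed_bits):
    """Assumed update: serve first (never below zero), then add new arrivals."""
    return max(backlog_bits - departed_bits, 0.0) + arrival_bits


def simulate(num_slots, rate_bps, slot_s, rate_threshold_bps,
             max_packets, packet_bits, seed=0):
    """Track the backlog of one queue over num_slots slots at a fixed link rate."""
    rng = random.Random(seed)
    backlog = 0.0
    history = []
    for _ in range(num_slots):
        # Hypothetical arrival model: 0..max_packets packets per slot, uniform.
        arrivals = rng.randint(0, max_packets) * packet_bits
        served = departure_bits(rate_bps, slot_s, rate_threshold_bps)
        backlog = update_queue(backlog, arrivals, served)
        history.append(backlog)
    return history
```

Running `simulate` with a rate below the threshold shows the backlog growing without bound, which is the situation the power control and pairing policy are meant to avoid.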
Remark: If is too long, it is better to update power allocation and user pairing more frequently than the channel varies. Consider a transmitter queue that is almost empty, so that excessive queueing delay is not a concern. In this case, the transmitter usually consumes little power to improve power efficiency. However, if this situation persists for a long time , packets will eventually accumulate in the queue and the queueing delay will increase. Therefore, several updates of power allocation and user pairing are required over the time interval of .
II-C E2E Delay Requirement
Denote the E2E delay bound with . mainly consists of UL/DL transmission delays and the queueing delay . For low latency communications, a small packet structure is preferred because the UL/DL transmission durations can be reduced, and the summation of UL/DL durations becomes identical to the transmit time interval (TTI), denoted by . Therefore, the margin of the queueing delay is , i.e., data transmission is successful only when the queueing delay is smaller than . Although the instantaneous queueing delay determines whether data transmission is successful or not, making the instantaneous queueing delay bounded to a deterministic value is very difficult due to the time-varying transmission scheme and channel.
To this end, this paper focuses on limiting the time-average queueing delay. According to Little’s theorem , a low time-average queueing delay can be achieved by reducing the expected value of queue backlogs. This paper introduces the concept of strong stability of a queue to keep queue backlogs bounded, as follows:
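For orientation, the standard forms of these two statements in the Lyapunov optimization literature can be sketched as below. The symbols $Q_k(t)$ (backlog of queue $k$), $\lambda_k$ (average arrival rate), $\overline{Q}_k$ (time-average backlog), and $\overline{W}_k$ (time-average queueing delay) are notation assumed for this sketch and may differ from the notation used elsewhere in the paper.

```latex
% Little's theorem: time-average delay equals time-average backlog
% divided by the average arrival rate.
\overline{W}_k = \frac{\overline{Q}_k}{\lambda_k}

% Strong stability of queue k: the time-average expected backlog is finite.
\limsup_{t \to \infty} \frac{1}{t} \sum_{\tau=0}^{t-1}
  \mathbb{E}\left[ \, |Q_k(\tau)| \, \right] < \infty
```

Under a fixed arrival rate, bounding the average backlog therefore also bounds the average queueing delay.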
III Joint Optimization Problem Formulation of User Pairing and Power Allocation in Hybrid Multiple Access
This paper pursues both high power efficiency and low queueing delay. In addition, the long-term average data rates are considered as one of the QoS requirements. Specifically, the transmit power of depends on whether the transmitter serves user by OMA or NOMA and which user is paired with user for NOMA. Thus, we can formulate the joint optimization problem to find the optimal power allocation and user pairing:
where , is the user index set, and is the subset of . and denote the column vectors of and for all , respectively. Note that the transmit power depends on user pairing and time . The constraint (8) represents strong stability of the queueing system, which keeps queue backlogs bounded. In addition, sufficient time-average data rates of for user for are guaranteed as one of the QoS requirements by the constraint (9). The power budget of is independently assumed for each communication link, so the constraint (10) is given. For simplicity, will be written simply as .
The problem (7)-(10) can be solved by the theory of Lyapunov optimization . We first transform the inequality constraint (9) into the form of queue stability. Specifically, define the virtual queue for , with the update equation:
The strong stability of the virtual queue pushes the average of to be close to the QoS guarantee .
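The role of the virtual queue can be illustrated with a short sketch. It assumes the usual virtual-queue form Z(t+1) = max(Z(t) + r_req - R(t), 0), where r_req is the required average rate and R(t) the achieved rate in slot t; the function names and this exact form are assumptions made for illustration.

```python
def update_virtual_queue(z, required_rate, achieved_rate):
    """Assumed update: backlog grows by the rate shortfall and is floored at zero."""
    return max(z + required_rate - achieved_rate, 0.0)


def average_rate_and_backlog(required_rate, achieved_rates):
    """Return (time-average achieved rate, final virtual backlog) starting from Z = 0."""
    z = 0.0
    for rate in achieved_rates:
        z = update_virtual_queue(z, required_rate, rate)
    return sum(achieved_rates) / len(achieved_rates), z
```

Because each update satisfies Z(t+1) >= Z(t) + r_req - R(t), summing over T slots gives an average rate of at least r_req - Z(T)/T. If the virtual backlog stays bounded, the shortfall term vanishes as T grows, which is why stabilizing the virtual queue enforces the average-rate requirement.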
Let and denote the column vectors of and for and at time , respectively, and let be a concatenated vector of actual and virtual queue backlogs. Define the quadratic Lyapunov function as follows:
Then, let be the conditional Lyapunov drift, which can be formulated as , i.e., the drift on . The dynamic policy is designed to solve the given optimization problem (7)-(10) by observing the current queue state, , and determining power allocation and user pairing to minimize an upper bound on the drift-plus-penalty :
First, find the upper bound on the change in the Lyapunov function.
Then, the upper bound on the conditional Lyapunov drift is given by
where we assume that departure and arrival rates are bounded, and is a constant such that . According to (13), minimizing a bound on drift-plus-penalty is consistent with minimizing
because is not controllable and all values of for are constants.
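The steps of this bound follow a standard pattern, sketched below under assumed notation: $Q_k$ and $Z_k$ are the actual and virtual backlogs, $A_k$ and $D_k$ the arrivals and departures, $R_k$ the achieved rate, $r_k^{\mathrm{req}}$ the required average rate, $p_k$ the transmit power, $V$ the penalty weight, and $\Theta(t)$ the concatenated queue state. The update forms used here are the common ones from the Lyapunov optimization literature and are assumptions of this sketch.

```latex
% Assumed queue updates:
%   Q_k(t+1) = \max[Q_k(t) - D_k(t), 0] + A_k(t)
%   Z_k(t+1) = \max[Z_k(t) + r_k^{req} - R_k(t), 0]

% Per-queue bounds on the squared backlog:
Q_k(t+1)^2 \le Q_k(t)^2 + A_k(t)^2 + D_k(t)^2 + 2 Q_k(t)\,[A_k(t) - D_k(t)]
Z_k(t+1)^2 \le Z_k(t)^2 + [r_k^{\mathrm{req}} - R_k(t)]^2
              + 2 Z_k(t)\,[r_k^{\mathrm{req}} - R_k(t)]

% With L(\Theta) = \tfrac{1}{2}\sum_k [Q_k^2 + Z_k^2], summing over k and taking
% conditional expectations gives the drift-plus-penalty bound:
\Delta(\Theta(t)) + V\,\mathbb{E}\Big[\textstyle\sum_k p_k(t) \,\Big|\, \Theta(t)\Big]
\le B + V\,\mathbb{E}\Big[\textstyle\sum_k p_k(t) \,\Big|\, \Theta(t)\Big]
  + \sum_k Q_k(t)\,\mathbb{E}\big[A_k(t) - D_k(t) \,\big|\, \Theta(t)\big]
  + \sum_k Z_k(t)\,\mathbb{E}\big[r_k^{\mathrm{req}} - R_k(t) \,\big|\, \Theta(t)\big]
```

Only the terms containing $p_k$, $D_k$, and $R_k$ depend on the control decision, which is why the remaining terms can be dropped from the per-slot minimization.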
We now use the concept of opportunistically minimizing the expectations and specifically go after the following drift-plus-penalty problem:
Since the number of possible user pairing combinations grows rapidly with the number of users, it is very difficult to minimize the optimization metric of (21) by exhaustive search. Therefore, we first find the optimal power allocation for a fixed user pairing policy. Then, pairs of two users are generated for NOMA to minimize the optimization metric of (21), based on matching theory.
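To give a sense of scale for the pairing search space, the sketch below counts the ways to split a set of users into unpaired (OMA) users and unordered two-user (NOMA) pairs. It assumes that any two users may be paired and that the roles within a pair are fixed by the channel ordering, so each pair is counted once; the function name is hypothetical. The count obeys the recurrence T(n) = T(n-1) + (n-1) T(n-2): the n-th user either stays alone or pairs with one of the other n-1 users.

```python
def count_pairings(num_users):
    """Number of ways to split users into OMA singles and unordered NOMA pairs."""
    prev2, prev1 = 1, 1  # counts for 0 users and 1 user
    for n in range(2, num_users + 1):
        # User n is either unpaired, or paired with one of the n-1 others.
        prev2, prev1 = prev1, prev1 + (n - 1) * prev2
    return prev1
```

For example, `count_pairings(4)` gives 10 and `count_pairings(10)` gives 9496, and each configuration would additionally require its own power optimization under exhaustive search.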
IV Optimal Power Allocation for Hybrid MA
For simplicity, notations for the dependency of all parameters on are omitted in this section, because the optimal power allocation depends only on CSI and QSI at current time . Therefore, , in this section.
IV-A Optimal Power Allocation for OMA
First, the power allocation policy for OMA users is presented. The data rate of user which employs OMA is given by
where , is single-sided noise spectral density, and is bandwidth. represents the degradation coefficient of channel capacity due to finite blocklength codes appropriate for the short packet structure [16, 54]. We assume that the bandwidth is equally allocated to users. Note that for all in OMA. The power interval for avoiding the outage, i.e. , can be obtained, as written by
If , then .
Remark: Since Shannon capacity assumes channel codes of infinite length, it is not appropriate to directly apply Shannon capacity to low latency communications with the short packet structure. The authors of [54, 55] obtained channel capacity with finite blocklength codes in a variety of channel models. Strictly, the capacity with finite blocklength codes has a different form from (22); however, we approximate the data rate by weighting the degradation factor to Shannon capacity in a similar way to that in , and is assumed in this paper.
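The rate approximation and the outage-avoiding power threshold can be sketched numerically as follows. The sketch assumes the rate has the form eta * B * log2(1 + p * g / (N0 * B)), where g is the channel power gain (path loss and fading combined), B the bandwidth allotted to the user, N0 the noise spectral density, and eta the degradation factor. Parameter names are hypothetical, and eta is left as an explicit argument rather than fixed to any particular value.

```python
import math


def oma_rate(power_w, channel_gain, bandwidth_hz, noise_psd, eta):
    """Approximate OMA rate: Shannon capacity scaled by the degradation factor eta."""
    snr = power_w * channel_gain / (noise_psd * bandwidth_hz)
    return eta * bandwidth_hz * math.log2(1.0 + snr)


def min_power_no_outage(rate_threshold, channel_gain, bandwidth_hz, noise_psd, eta):
    """Smallest transmit power whose rate reaches the outage threshold."""
    snr_needed = 2.0 ** (rate_threshold / (eta * bandwidth_hz)) - 1.0
    return snr_needed * noise_psd * bandwidth_hz / channel_gain
```

If the value returned by `min_power_no_outage` exceeds the power budget, the link is in outage at any admissible power, which corresponds to the case in which no power is allocated.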
When the transmitter serves all users by OMA, the users’ data rates are independent of each other, so the optimization problem (18)-(19) can be solved by independently minimizing for all . When , let , . Therefore, the optimization problem (18)-(19) can be transformed into
for all in OMA system, where
Assume that , i.e., the outage event does not occur. Then, differentiating (26) by ,
where . and the local minimizer is obtained from , i.e.,
Further, is shown to be the global minimizer in the region of by
However, when , , so is the minimizer and . Therefore, if , always. Otherwise, i.e. when , the relative value of to and determines .
When , is still the global minimizer. However, when , the minimizer in the interval of becomes , and still is the minimizer in . Therefore, if , is the global minimizer in . Otherwise, , i.e., no power is allocated to user .
In the case of , the global minimizer is in the outage region. Then, is the minimizer in the interval of . Thus, if , becomes the minimizer in , and if not, is the solution. Finally, (27) is obtained.
Remark: As mentioned earlier, when the outage is expected at user , the transmitter can save power, i.e., . Further, when the queue backlogs of and are small, so that the second and third terms of (26) are small compared to the system parameter , the transmitter may choose not to schedule the link of user in order to save power, even though the link is not in outage. In this way, link activation is determined by power allocation, so user scheduling is also performed.
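The case analysis above can be illustrated with a simplified sketch. It assumes a per-user objective of the form V * p - w * rate(p) in the no-outage region and V * p in the outage region, where `weight` (w) is a hypothetical stand-in for the combined actual and virtual backlog terms; the exact weighting in the paper's objective is not reproduced here. Under this assumption the objective is convex in the no-outage region, so the best active power is the stationary point clipped to the feasible interval, and it is then compared with allocating no power.

```python
import math


def oma_power_choice(weight, V, channel_gain, bandwidth_hz, noise_psd,
                     eta, rate_threshold, p_max):
    """Pick the power minimizing the assumed objective V*p - weight*rate(p)."""
    noise_w = noise_psd * bandwidth_hz
    # Smallest power that avoids outage.
    p_min = (2.0 ** (rate_threshold / (eta * bandwidth_hz)) - 1.0) * noise_w / channel_gain
    if p_min > p_max:
        return 0.0  # outage is unavoidable, so no power is spent
    # Stationary point of V*p - weight*eta*B*log2(1 + p*g/noise).
    p_stat = weight * eta * bandwidth_hz / (V * math.log(2.0)) - noise_w / channel_gain
    # Convexity: the minimizer on [p_min, p_max] is the clipped stationary point.
    p_active = min(max(p_stat, p_min), p_max)
    rate = eta * bandwidth_hz * math.log2(1.0 + p_active * channel_gain / noise_w)
    cost_active = V * p_active - weight * rate
    # Allocating zero power has cost 0 under the assumed objective.
    return p_active if cost_active < 0.0 else 0.0
```

With a small `weight` relative to `V`, the function returns zero even when the link could avoid outage, which mirrors the scheduling behavior described in the remark.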
IV-B Optimal Power Allocation for Two-user NOMA
In this section, the optimal power allocation is obtained for a given NOMA pair of user and user , i.e., and . Assume that . For employing NOMA, the larger power is usually allocated to the user with weaker channel condition. Throughout the paper, the user with the weaker channel who does not perform SIC and the user with the stronger channel who performs SIC will be referred to as non-SIC user and SIC user, respectively. Let user and user be the non-SIC user and the SIC user, respectively, without loss of generality, with the assumption of . The data rates of NOMA users are given by
Suppose that signals for other users are orthogonally multiplexed with the NOMA signaling of user and user . Then, the power allocation problem for user and user can be independently formulated from the power allocation problem of (18)-(19), as follows:
where , which is the optimization metric of user when user is paired with user and . represents the optimal transmit power for user , when user is paired with user for NOMA. However, is not concave, so the optimization problem of (33)-(34) is not a convex problem. Therefore, the auxiliary variable is introduced. represents power summation of a user pair, so should be satisfied. Then, the problem of (33)-(34) can be resolved by solving two sequential subproblems.
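As a numerical illustration of the two-user setting, the sketch below evaluates the standard downlink power-domain NOMA rate expressions: the non-SIC (weaker) user treats the SIC user's signal as interference, while the SIC (stronger) user is assumed to cancel the non-SIC user's signal perfectly. The scaling by eta mirrors the OMA approximation; the function and parameter names, and the assumption of perfect SIC, are choices made for this sketch.

```python
import math


def noma_rates(p_weak, p_strong, gain_weak, gain_strong,
               bandwidth_hz, noise_psd, eta):
    """Return (non-SIC user rate, SIC user rate) for one two-user NOMA pair."""
    noise_w = noise_psd * bandwidth_hz
    # Non-SIC user: the SIC user's power appears as interference over its own channel.
    sinr_weak = p_weak * gain_weak / (p_strong * gain_weak + noise_w)
    # SIC user: interference from the non-SIC user's signal is assumed fully removed.
    snr_strong = p_strong * gain_strong / noise_w
    rate_weak = eta * bandwidth_hz * math.log2(1.0 + sinr_weak)
    rate_strong = eta * bandwidth_hz * math.log2(1.0 + snr_strong)
    return rate_weak, rate_strong


def split_sum_power(p_sum, fraction_to_weak):
    """Split a pair's total power between the non-SIC and SIC users."""
    return p_sum * fraction_to_weak, p_sum * (1.0 - fraction_to_weak)
```

The non-SIC user's rate depends on both powers through the interference term, which is the coupling that makes the joint problem non-convex; fixing the sum power and optimizing the split, as `split_sum_power` suggests, reflects the role of the auxiliary variable.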
The first subproblem is to find the power allocation for NOMA users with the fixed value of , as formulated by:
The power intervals for avoiding the outage event at both NOMA users are considered to solve the subproblem of (35)-(37). should be guaranteed for user to avoid the outage; in other words, the transmit power should be
Let denote the outage region of user . When , the objective function of (38) is given by
Similarly, user can prevent the outage event when , and it corresponds to
Let denote the outage region of user . When , the objective function of (38) is given by
When and , let , then
When and , let , then
When and , let , then
When , then
When , let , then
When and , then
When and , then