With the explosive proliferation of mobile devices, future wireless networks are expected to support a growing number of multimedia applications with increasingly stringent Quality of Service (QoS) requirements. Among the various QoS metrics, latency and energy efficiency are two of key interest[1, 2]. Low latency is essential for delay-sensitive or time-critical applications such as the Tactile Internet, and it is an important metric in Ultra-Reliable Low Latency Communications (URLLC), a new feature introduced by 5G. Meanwhile, high energy efficiency is urgently required, especially for mobile devices powered by rechargeable batteries of finite capacity. Thus, it is of great importance to study delay-optimal and energy/power-efficient transmission strategies for users in wireless communications[5, 6]. Intuitively, to reduce latency, the transmitter would transmit more frequently or increase the transmission data rate, which inevitably costs more transmission power. Therefore, there exists a fundamental tradeoff between the average queueing delay and the average power/energy consumption.
In general, it is very challenging to derive the delay-power tradeoff in wireless communication systems because of the randomness of data packet arrivals and the time-varying characteristics of wireless channels. These sources of randomness arise in different layers of the transmitter, which makes the delay-power tradeoff harder to characterize. To deal with this issue, the cross-layer design framework, first presented in , has been developed over the last decades to capture the uncertainties occurring at different layers[9, 10, 11, 12, 13, 14].
Within the cross-layer architecture, many works have focused on revealing the delay-power tradeoff, and they can be classified into two major categories. One line of work attempts to find analytical delay-power tradeoffs under idealized or simplified assumptions on the system model[15, 16, 17, 18, 19]. In , the authors proposed a scheduling policy named Lazy scheduling, which assigns transmission chances based on the backlog in the queue under the assumption that the arrival times of the packets are known in advance. In , the authors minimized the transmission power with QoS constraints by assuming that the data arrival is known ahead of schedule and the channel is static or slow fading. The power-constrained delay minimization problem was studied for a cognitive multi-access channel and a two-state block fading channel in  and , respectively. This line of work offers theoretical rather than engineering value, since the assumptions are too idealized to be practical. Nevertheless, it provides deeper insights that can guide engineering applications such as protocol design.
The works in the other category consider more complex and practical system models[9, 10, 11, 20, 21, 22]. In , Berry and Gallager proposed adapting the users' transmission rate and power by regulating the average power and average buffer delay over a wireless fading channel. They also studied cross-layer resource allocation in wireless fading channels in  and derived the optimal power-delay tradeoff for a single user in the regime of asymptotically small delays in . Ata investigated the power minimization problem subject to the packet drop rate in , assuming a fixed channel state, Poisson packet arrivals and exponentially distributed packet sizes. In[21, 22], the authors studied the delay-bounded packet scheduling problem with bursty traffic arrival over wireless channels. This line of work studied the delay-power curve and analyzed its properties under certain circumstances, although it is difficult to derive theoretical solutions in general cases. It mainly focuses on resource allocation solutions and efficient algorithms for practical usage, which is of great importance in designing delay/power-efficient wireless transmission strategies.
More recently, a simple probabilistic scheduling policy was proposed to achieve the minimum queueing delay under a power constraint in our previous work , where Bernoulli packet arrivals and a two-state fading channel model were considered. Further, arbitrarily random packet arrival patterns were considered to capture the impact of bursty network traffic in [23, 24], and adaptive transmission was considered in . In these works, we proved that the optimal delay-power tradeoff can be achieved by applying optimal scheduling policies that determine packet transmissions based on a threshold imposed on the queue length. Such a structured policy is appealing for the scheduler thanks to its ease of deployment, which motivates us to investigate this topic further. We naturally wonder whether the optimal solution still has a special structure in more general scenarios and, if so, what kind of structure it has.
In this paper, we study the delay-power tradeoff in wireless packet transmissions in a more realistic but complex communication system, where data packets are generated from an arbitrarily bursty traffic and a multi-state wireless fading channel is considered. Some potential application scenarios are shown in Fig.1. The major challenges of this work lie in two aspects: 1) how to perform probabilistic scheduling jointly based on the randomness of the data packet arrival, the occupancy of the transmission data queue, and the time-varying characteristics of the wireless channel, and 2) how to reveal the structure of the optimal policy.
For the first task, the major challenge is to build a proper cross-layer framework that includes all the system dynamics. Incorporating all these effects, our proposed scheduling policy performs joint scheduling based on the time-varying environment. Hence, it is very challenging to formulate the optimal cross-layer scheduling problem while facilitating theoretical analysis of its optimal solution. To deal with this difficulty, we propose a stochastic scheduling policy that is aware of the packet arrival, buffer and channel states. Then, we formulate a non-linear optimization problem to find the optimal probabilistic scheduling parameters. The challenge behind the second task is how to solve the optimal scheduling problem and derive the closed-form solution. The difficulty is that the dimensionality of the optimal scheduling problem grows significantly, because the number of scheduling parameters increases linearly with the number of channel and packet arrival states. By solving the resulting non-linear problem, we can obtain the optimal delay-power tradeoff. However, it is not trivial to search for the optimal solution to the non-linear optimization problem, let alone derive the optimal scheduling solution theoretically. To deal with this challenge, we first convert it to a linear programming (LP) problem, through which we can further analyze the structure of the optimal solution and reveal, step by step, that the optimal scheduling policy has a dual-threshold-based structure. By dual-threshold-based, we mean that packets should be transmitted based on thresholds imposed not only on the queue state but also on the channel state.
The remainder of this paper is organized as follows. The system setting is introduced in Section II. In Section III, we propose the probabilistic scheduling policy to schedule packet transmissions based on the buffer and the channel states simultaneously. In Section IV, we formulate a non-linear power-constrained delay minimization problem and then convert it to an equivalent LP problem. In Section V, we reveal with a rigorous mathematical proof that the optimal scheduling policy is dual-threshold-based, and propose an algorithm to find a simplified suboptimal policy. Simulation results are presented in Section VI to validate the dual-threshold-based policy, and concluding remarks are given in Section VII. Some frequently used notations are explained as follows. Given a positive integer , the notation denotes an integer set while denotes integer set . Sets and , and are defined in the same way.

Footnote 1: Part of this work was published in , where the main results were presented while most of the important derivations for some conclusions towards the dual-threshold-based structure were omitted due to limited space.
II System Model
We consider a wireless communication system where the source node transmits to the destination over a time-varying wireless link. As shown in Fig.2, packets of bursty traffic generated by higher-layer applications arrive at the network layer randomly, and are stored at the buffer in the data link layer. In the physical layer, the transmitter determines when to transmit the queued packets over a multi-state wireless channel, with the aid of efficient scheduling policies.
Let denote the number of packets randomly arriving in the th slot. To capture the burstiness and variability of real-time applications, we assume an arbitrary packet arrival pattern, i.e., the number of newly arriving packets could follow any distribution. Suppose that is independent and identically distributed (i.i.d.). Thus, the probability mass function of can be characterized by
where . Considering traffic shaping and admission control adopted in the system, the number of packets newly arriving in each time slot must be upper-bounded by a large integer . In other words, there exists a positive integer such that , for all , and . The average packet arrival rate is obtained as
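As a rough illustration of the arrival model, the mean and variance of a bounded arrival distribution can be computed directly from its probability mass function. This is only a sketch: the list `theta` (with `theta[n]` standing for the probability of `n` arrivals in a slot), the function names and the two example distributions are illustrative assumptions rather than the paper's notation or data.

```python
def arrival_rate(theta):
    """Mean number of arrivals per slot for a pmf theta[0..N] (assumed layout)."""
    if min(theta) < 0 or abs(sum(theta) - 1.0) > 1e-9:
        raise ValueError("theta must be a valid probability mass function")
    return sum(n * p for n, p in enumerate(theta))


def arrival_variance(theta):
    """Variance of the number of arrivals per slot."""
    mean = arrival_rate(theta)
    return sum((n - mean) ** 2 * p for n, p in enumerate(theta))


# Two hypothetical patterns with the same mean but different burstiness.
bernoulli = [0.5, 0.5]       # 0 or 1 packet per slot
bursty = [0.75, 0.0, 0.25]   # 0 or 2 packets per slot
```

Both example patterns have an average rate of 0.5 packets per slot, while the second has the larger variance, which is the kind of comparison made later in the numerical results.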
At the source node, a buffer is employed to store the backlogged packets that cannot be sent immediately. The queue state, denoted by , is characterized by the number of packets in the buffer at the end of the th slot and is updated as
where is the number of packets transmitted in the th time slot and is the capacity of the buffer.

Footnote 2: Packet overflow will occur if is quite small. In this work, we assume that is a sufficiently large constant such that no packet overflow will occur. In Section V, we give the conclusion that if is greater than a threshold, the queue length will never reach the capacity under our proposed scheduling scheme. Thus, the max operation in (3) can be omitted.
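The queue recursion can be sketched in a few lines. The form below is an assumption: it clips the backlog to the range from zero to the buffer capacity using the max and min operators that the text associates with Eq. (3), and the argument names are hypothetical.

```python
def next_queue(q_prev, arrivals, sent, capacity):
    """One-slot queue update, clipped to [0, capacity] (assumed form of the recursion)."""
    if sent not in (0, 1):
        raise ValueError("at most one packet can be sent per slot")
    return min(max(q_prev + arrivals - sent, 0), capacity)
```

For instance, a backlog of 3 packets with 2 arrivals and 1 departure becomes 4, while arrivals beyond the capacity are dropped by the `min` clip.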
We adopt a -state block fading channel model, where is a positive integer. Let denote the channel state in the th time slot. By block fading, we mean that the channel state stays invariant during each time slot and follows an i.i.d. fading process across the time slots. Here, the discrete channel states indicate different wireless channel qualities. Let be the channel power gain levels. If the channel gain in the th time slot lies in the interval , we say that the wireless channel is at state . Since the channel quality improves as the index increases, and represent the worst and the best channel conditions, respectively. The probability mass function of is described as
where and .
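The mapping from a measured channel power gain to a discrete channel state can be sketched with a sorted list of gain levels. The half-open interval convention (state k covers gains from the k-th level up to, but not including, the next one, with the last state unbounded above) and the example levels are assumptions made for illustration only.

```python
from bisect import bisect_right


def channel_state(gain, levels):
    """Return the 1-based state index for a gain, given increasing lower edges `levels`."""
    if gain < levels[0]:
        raise ValueError("gain lies below the lowest level")
    # bisect_right counts how many lower edges are <= gain, which is the state index.
    return bisect_right(levels, gain)


levels = [0.0, 0.5, 2.0]  # hypothetical lower edges for a three-state channel
```

With these levels, a gain of 0.3 maps to state 1 (the worst), 0.5 maps to state 2, and 3.0 maps to state 3 (the best).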
Suppose that there exists a feedback channel through which the Channel State Information (CSI) is sent back from the receiver to the transmitter. Intuitively, the transmission power should be adapted to the channel state to meet the requirement of successful packet delivery. Let () denote the power needed to transmit one packet successfully in channel state . Since more power is required to combat wireless channel fading when the channel condition is worse, it is reasonable to assume .
In our model, we consider a fixed-rate transmission scheme which has been widely adopted in practice . Without loss of generality, we assume the transmission rate is one packet per slot. Hence, at most one data packet can be delivered in each slot, namely, .
In the cross-layer design framework shown in Fig.2, the scheduler schedules packet transmissions in each slot based on the packet arrival state , the queueing state , and the channel state , subject to a power constraint. The details are discussed in the next section, where the scheduling problem is treated as a power-constrained Markov Decision Process (MDP), and in Section IV.
III Probabilistic Scheduling Policy
In this section, we introduce a probabilistic scheduling policy based on which the transmitter decides whether or not to deliver one data packet to its receiver in each slot.
III-A Probabilistic Scheduling
To improve power efficiency, the transmitter should exploit better channel states so that packets are delivered with much less power. Thus, the source tends to keep silent until the channel state improves. However, waiting for good channel states may induce undesirably large latency, which is intolerable for delay-sensitive or time-critical traffic. To overcome this issue, some backlogged packets should be transmitted immediately at the cost of higher power, even when the channel state is not so good. Hence, the proposed scheduler must strike a balance between the average delay and the power consumption.
In this work, a probabilistic cross-layer scheduling policy is proposed to schedule packet transmissions in each time slot. At the beginning of the th time slot, the scheduler collects the current system state including the queueing state , the packet arrival state , and the channel state . Given , , and , it decides to transmit one packet with probability or keep silent with probability . By , we mean that the scheduler can schedule packet transmissions based on the updated queue state after one packet arrival. The reason lies in the fact that one of the packets newly arriving at this slot can be delivered immediately. Hence, it is not necessary to distinguish between the backlogged packets and the newly arriving packets. Clearly, the transmission probability lies in the interval .
According to the above probabilistic scheduling policy, the number of transmitted packets for the current slot is a random variable, the probability mass function of which is given by
where and the abbreviation is short for .

Footnote 3: In Eq. (5), when , there is no packet waiting to be transmitted, and when , packet loss will happen. Thus, () and () are set to zero for notational consistency.
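The per-slot decision can be sketched as a single Bernoulli draw. The table layout `f[k][m]` (probability of sending one packet when the channel index is `k` and the updated queue length, backlog plus new arrivals, is `m`) is an assumed representation of the scheduling probabilities, and the function name is hypothetical.

```python
import random


def sample_transmission(f, m, k, rng=random):
    """Return 1 (send one packet) with probability f[k][m], else 0."""
    if m <= 0:
        return 0  # nothing is waiting, so nothing can be sent
    # rng.random() lies in [0, 1), so probability 1.0 always sends and 0.0 never does.
    return 1 if rng.random() < f[k][m] else 0
```

Only the empty-queue case is forced to zero here; how the overflow case is handled is left open in this sketch.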
We aim to find the optimal policy with a set of optimal transmission probabilities that can minimize the average queueing delay under an average transmission power constraint.
III-B Markov Decision Process
Based on the scheduling policy in Section III-A, the scheduler decides whether to transmit a packet in every slot. The transmission decision affects the number of packets queueing in the buffer as well as the transmission power. In this sense, we model the scheduling problem as a constrained MDP with the queue length being the system state. The decision, either waiting or transmitting (), is treated as one candidate action taken at the current state. Executing each action incurs system costs, namely, the delay cost associated with the queue length and the power cost associated with the packet transmission. Let denote the one-step state transition probability from state to state , i.e.,
The transition probabilities of the underlying Markov chain are presented in Lemma 1.
Lemma 1: The forward and backward state transition probabilities, denoted by and , are obtained as
where and . The state transition probability is the probability that the queue length remains the same, given by
Proof: From Eq. (3), the transition from state to takes place with probability when . This happens in two cases. In Case I, when packets arrive at the queue over channel state , no packet is delivered () with probability . In this situation, the transition probability is . In Case II, when packets arrive at the queue over channel state , one packet is transmitted () with probability . Accordingly, the transition probability is . Combining these two cases, can be calculated using the law of total probability, as shown in Eq. (7). Similarly, the probability can be derived, as given by Eq. (8). The probability given by Eq. (9) is obtained using probability normalization. ∎

Footnote 4: Due to the assumption of a sufficiently large buffer made in the footnote to Eq. (3), the operators "max" and "min" in Eq. (3) can be omitted here.
Notice that holds for , since the queue length increases from up to after one packet arrival. In Fig.3, we show an example of the MDP model with . In each time slot, increases by no more than due to one new data arrival, while decreases by one since at most one packet can be delivered. Let matrix denote the -by- transition probability matrix of the underlying Markov chain. The -th element of is the transition probability . The transition probability matrix is a banded matrix, since the numbers of newly arriving and departing packets are limited in one slot.
Let denote the steady-state probability of the queue length being equal to . The stationary distribution of the system state is denoted by the vector, where the superscript denotes matrix transpose. Vectors and are used to denote the -dimensional column vectors whose entries are zero and one, respectively. According to the property of the steady-state probability, we have and . Hence, the stationary distribution is the solution to the following linear equations
where is a matrix consisting of the first rows of the generator matrix . From Eq. (10) and Lemma 1, we can see that the steady-state probability is determined by the scheduling policy with the parameters .
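The dependence of the stationary distribution on the scheduling probabilities can be illustrated numerically. The sketch below builds the queue-length transition matrix from an arrival distribution, a channel distribution and a table of transmission probabilities, and then approximates the stationary vector by repeated multiplication instead of solving the linear equations directly. The names `theta`, `eta`, `f` and `Q`, the clipping at the buffer capacity, and the toy policy are all assumptions made for illustration.

```python
def transition_matrix(theta, eta, f, Q):
    """Return P with P[i][j] = Pr{queue length j after a slot | length i before it}.

    theta[n]: probability of n arrivals; eta[k]: probability of channel index k;
    f[k][m]: probability of sending one packet when the updated queue length
    (backlog plus new arrivals) is m, for m = 0 .. Q + len(theta) - 1.
    """
    size = Q + 1
    P = [[0.0] * size for _ in range(size)]
    for q in range(size):
        for n, p_arrival in enumerate(theta):
            m = q + n  # updated queue length after the arrivals
            # Transmission probability averaged over the channel states.
            send = sum(p_k * f[k][m] for k, p_k in enumerate(eta)) if m > 0 else 0.0
            P[q][min(max(m - 1, 0), Q)] += p_arrival * send  # one packet leaves
            P[q][min(m, Q)] += p_arrival * (1.0 - send)      # nothing leaves
    return P


def stationary_distribution(P, iterations=5000):
    """Approximate the stationary row vector by power iteration.

    This converges for the aperiodic toy chain below; a periodic chain would
    need a direct linear solve instead.
    """
    size = len(P)
    pi = [1.0 / size] * size
    for _ in range(iterations):
        pi = [sum(pi[i] * P[i][j] for i in range(size)) for j in range(size)]
    return pi


# Toy example: Bernoulli arrivals, two equally likely channel states.
theta = [0.5, 0.5]
eta = [0.5, 0.5]
Q = 4
f = [
    [0.0, 0.0, 0.0, 1.0, 1.0, 1.0],  # poor channel: send only if at least 3 packets wait
    [0.0, 1.0, 1.0, 1.0, 1.0, 1.0],  # good channel: send whenever a packet waits
]
P = transition_matrix(theta, eta, f, Q)
pi = stationary_distribution(P)
```

In this toy case the queue never stays above two packets in the long run, and the stationary distribution is uniform over queue lengths 0, 1 and 2.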
IV Delay and Power Tradeoff
In this section, we first analyze the two key performance metrics: the average queueing delay and the average power consumption. Then, based on the stationary probability of the constructed Markov Decision Process, we formulate an optimization problem for delay-minimal scheduling under a power constraint.
IV-A Delay and Power Metrics
With every transmission action , the scheduler incurs system costs due to queue occupation and packet transmission. Given action , the queueing cost for buffer occupation is denoted by and the power cost for packet transmission is denoted by , respectively, expressed as
As time goes by, the time-average costs can be built up as
respectively. Since the action enters the queueing cost with a minus sign and the power cost with an implicit plus sign in Eq. (11), it exerts opposite influences on buffer occupation and power consumption, which naturally leads to a tradeoff between the average delay (via Little's Law) and the average power.
The above analysis explains the average delay and the average power from the cost perspective of the scheduling policy. The tradeoff is easy to see from the expressions of the two metrics given in Eq. (11). To derive the two metrics mathematically, we refer to the MDP model built in Section III-B. Once the stationary distribution is obtained, the average queueing delay and power consumption can be derived, as shown in the following theorem.
Theorem 1: Given a probabilistic scheduling policy , the average queueing delay and power consumption can be expressed as
Proof: Given the stationary probability distribution of the Markov chain, the average queue length can be expressed as . Then, according to Little's Law , the average queueing delay can be derived as and shown in Eq. (13).
With , we have and , respectively, when one packet is transmitted over the channel state , i.e., , and no transmission takes place, i.e., . Let denote the conditional probability of given the queue state and channel state . It can be expressed as
By the law of total probability, the average power can be derived as
We notice that the steady-state probability is an implicit function of the transmission probabilities, since it is uniquely determined by the transmission probabilities based on the analysis in Section III-B. Thus, from Theorem 1, the average queueing delay and the average power consumption are both functions of the transmission probabilities.
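The two metrics can be evaluated numerically once a stationary distribution is available. The sketch below applies Little's Law (average delay equals average queue length divided by the average arrival rate) and the law of total probability for the power. The argument layout (`pi[q]`, `theta[n]`, `eta[k]`, `power[k]`, `f[k][m]` with `m` the backlog plus new arrivals) and the numbers are illustrative assumptions, not the paper's notation or results.

```python
def average_delay(pi, theta):
    """Average queueing delay via Little's Law: mean queue length / mean arrival rate."""
    rate = sum(n * p for n, p in enumerate(theta))
    mean_queue = sum(q * p for q, p in enumerate(pi))
    return mean_queue / rate


def average_power(pi, theta, eta, power, f):
    """Average power: sum over queue, arrival and channel states of
    probability x per-packet power x transmission probability."""
    total = 0.0
    for q, p_q in enumerate(pi):
        for n, p_n in enumerate(theta):
            m = q + n  # updated queue length after the arrivals
            if m == 0:
                continue  # nothing to send
            for k, p_k in enumerate(eta):
                total += p_q * p_n * p_k * power[k] * f[k][m]
    return total


# Toy inputs: the stationary distribution is taken as given here.
pi = [1.0 / 3.0, 1.0 / 3.0, 1.0 / 3.0, 0.0, 0.0]
theta = [0.5, 0.5]
eta = [0.5, 0.5]
power = [4.0, 1.0]  # the poor channel (index 0) needs more power per packet
f = [
    [0.0, 0.0, 0.0, 1.0, 1.0, 1.0],
    [0.0, 1.0, 1.0, 1.0, 1.0, 1.0],
]
delay = average_delay(pi, theta)
avg_power = average_power(pi, theta, eta, power, f)
```

For these toy inputs the mean queue length is 1 packet and the arrival rate is 0.5 packets per slot, so the delay is 2 slots, and the average power evaluates to 0.75.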
IV-B Delay-Power Tradeoff
To find the optimal scheduling policy with a set of transmission probabilities , we formulate an optimization problem to minimize the average queueing delay under the power constraint as follows:
where , and symbol ’’ represents the component-wise inequality between vectors. In problem (17), the objective is to minimize the average queueing delay. Constraint (17.a) denotes the maximum power constraint. Constraint (17.b) indicates the range of the optimization variables . Constraints (17.c-17.e) are derived from the properties of the Markov chain. Constraint (17.e) specifies the range of the steady-state probabilities. Since problem (17) is a non-linear programming problem, it is rather difficult to obtain the optimal solution analytically. To make it tractable, we first convert problem (17) into an equivalent LP problem via variable substitution.
IV-C LP Problem Formulation
To formulate an LP problem, we introduce a set of new variables as
In Eq. (18), is the probability of transmitting one packet, i.e., , when there are data packets in the buffer and data packets newly arriving at the transmitter. Thus, is the probability that there are packets backlogged in the queue after one packet transmission over channel state . This procedure allows us to express the objective function and the constraints of (17) as linear functions of . Hence, we are able to convert the non-linear problem (17) into a more tractable LP problem, as shown below.

Footnote 5: We assume that the steady-state probability whose subscript is negative is zero for notational convenience. Otherwise, variable should be defined as .
Theorem 2: Let be a constant. The optimization problem (17) is equivalent to the following LP problem:
where is the -th element of matrix which describes the relationship between the steady-state probabilities of the Markov chain and the variables , as given by
Proof: The details are given in Appendix A. ∎
As shown in problem (19), there exists a minimum queueing delay for any feasible power constraint . Hence, the optimal queueing delay can be expressed as a function of , i.e., . In the following theorem, we reveal the decreasing property of the delay-power function to discuss the structure of the optimal scheduling policy in the next section.
Theorem 3: The delay function monotonically decreases with the maximum transmission power .
Proof: The details are given in Appendix B. ∎
So far, we have constructed an LP problem that describes the delay-minimal scheduling problem under a power constraint. After deriving the optimal solution , we can then obtain the steady-state probability by Eq. (20) and the optimal scheduling probability by Eq. (18). In the sequel, we show how to derive the optimal solution as well as the optimal probabilities.
V Dual-threshold-based Policy
In this section, we focus on revealing the dual-threshold-based structure of the optimal scheduling policy. We first present the definition of the threshold-based structure.
Definition 1: Let denote an integer set. A probability set has a -threshold-based structure if and only if there exists an optimal threshold such that and .
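A threshold-based structure is easy to test numerically. The sketch below reads the definition in the orientation used on the channel state dimension, which is an assumption here: probabilities are zero below the threshold index, one above it, and arbitrary at the threshold itself. The function name and tolerance are hypothetical.

```python
def find_threshold(probs, tol=1e-9):
    """Return the smallest index t such that probs[i] == 0 for i < t and
    probs[i] == 1 for i > t, or None if no such index exists."""
    for t in range(len(probs)):
        below_is_zero = all(abs(p) <= tol for p in probs[:t])
        above_is_one = all(abs(p - 1.0) <= tol for p in probs[t + 1:])
        if below_is_zero and above_is_one:
            return t
    return None
```

For example, `[0, 0, 0.3, 1, 1]` has its threshold at index 2, whereas `[0, 1, 0, 1]` has no threshold-based structure and yields `None`.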
In what follows, we show that the optimal scheduling policy has such a structure on both the buffer state dimension and the channel state dimension, and is therefore referred to as a dual-threshold-based policy. An example of the structure is illustrated in Fig.4, where positive scheduling probabilities are plotted against the indexes of the buffer and channel states, and zero scheduling probabilities are omitted for brevity. In particular, given the queue state , the optimal scheduling probabilities follow a threshold-based structure, i.e., for and for , where is the optimal threshold on the channel state dimension. Similarly, given the channel state , the optimal scheduling probabilities have a threshold-based structure on the queue state dimension. That is, there exists an optimal threshold on the queue state such that for and for , respectively. The proof of the dual-threshold-based policy is presented in two steps in subsections A and B, in accordance with the two dimensions of the channel and buffer states. Moreover, in subsection C we show that there is at most one threshold state () at which the optimal scheduling probability is non-zero. A simplified threshold policy that achieves suboptimal performance is proposed in subsection D.
V-A Threshold-based Structure on the Channel State Dimension
We first reveal the non-decreasing property of the optimal solution to problem (19). Then, we equivalently transform problem (19) into a new problem, which allows us to prove that has a -threshold-based structure. By mapping back to , the optimal scheduling policy is shown to have a threshold-based structure.
Lemma 2: The optimal solution to problem (19) has the following property, for any queue length ,
Proof: The details are given in Appendix C. ∎
Recall that is the probability that there are packets left in the queue after one packet transmission over channel . Thus, the physical meaning of Lemma 2 is that the optimal policy tends to exploit a better channel state whenever a transmission has to be performed.
Lemma 3: The LP problem (19) is equivalent to the following problem
where is a constant.
Proof: The details are given in Appendix D. ∎
With the above two lemmas, we derive the threshold structure that the optimal solution to problem (19) exhibits on the channel state for a given queue length, as follows:
Theorem 4: For any queue length , there exists an optimal integer threshold such that the variables have a -threshold-based structure, i.e.,
Proof: The details are given in Appendix E. ∎
On the one hand, Theorem 4 is a stronger conclusion than Lemma 2, which only reveals the tendency of the optimal policy. It shows that one packet can be transmitted only if the channel state is better than a threshold. On the other hand, through the relation between and , the threshold structure in Theorem 4 reflects the structure of the optimal scheduling policy . Specifically, the optimal scheduling probability is derived according to Eq. (18) and given as 1) , if ; 2) , if ; 3) . Thus, also satisfies Definition 1 and the optimal scheduling policy has a threshold-based structure on the channel state dimension for any given queue length .
V-B Threshold-based Structure on the Queue Length Dimension
It is not trivial to reveal the threshold structure on the queue state dimension directly, due to the highly complicated relationship between the variables . Thus, we turn to the scheduling action taken by the optimal policy. Then, we map the transmission action back to the scheduling probability and find that the optimal policy also has an -threshold-based structure on the queue state dimension.
Lemma 4: For a given channel state , there exists an optimal integer threshold such that the optimal transmission action has the -threshold structure, namely
where denotes the updated queue state after one new packet arrival in the th time slot.
Proof: The details are given in Appendix F. ∎
In Lemma 4, we show that the optimal transmission action is determined based on the updated queue state and the optimal threshold . Together with Eq. (5), we can connect to the scheduling probability , and reveal that the probabilities also depend on the updated queue state and the optimal threshold : 1) , if ; 2) , if . Thus, the optimal policy is proved to have a threshold-based structure on the queue length for any given channel state.
V-C Dual-threshold-based Policy
The optimal scheduling policy turns out to be a dual-threshold-based policy, as illustrated in Fig.4. We complete it as follows by specifying the values on the threshold points.
Theorem 5: (1) The optimal scheduling policy corresponds to a dual-threshold policy. In detail, a) for any queue length , there exists a threshold , for and for ; b) there exists . (2) There is at most one threshold state () at which the optimal scheduling probability is non-zero.
Proof: Conclusion (1-a) is exactly the threshold structure obtained in subsection A. Combining it with the threshold structure imposed on the queue length for a given channel state, we obtain conclusion (1-b), which describes the non-increasing property of . The proof of conclusion (2) is given in Appendix G. ∎
According to our proposed scheduling scheme, once the queue length exceeds , one packet will be transmitted regardless of the channel state. Thus, if we set the buffer capacity , the queue length will never reach the capacity and no packet overflow will occur. The threshold structure results from the tradeoff between reducing the queueing delay and saving power. An intuitive explanation of why the policy has such a structure can be found in Appendix H.
V-D The Suboptimal Policy
It is not trivial to obtain closed-form expressions of the thresholds, even though we have revealed their properties in Theorem 5. By solving the LP problem, we can obtain the optimal thresholds and the non-zero scheduling parameter that might exist at one of the joint threshold points. Otherwise, we have to resort to search methods to find these optimal thresholds directly. In what follows, we describe two search methods for two different scenarios.
Scenario I: we develop a structured search algorithm to find a suboptimal solution by fully exploiting the non-increasing property of the optimal thresholds presented in Theorem 5. This property helps to reduce the search space of the candidate threshold points significantly. In detail, combining the non-increasing property of the threshold points , i.e., if , with the fact that the buffer capacity is usually greater than the number of channel states , we know that some neighboring queue lengths are likely to share the same threshold . Based on this property, we can reduce the number of threshold points that need to be searched. Specifically, the queue length range can be divided into several small intervals, each of which is assigned one threshold imposed on the channel states. Thus, we only need to determine how to divide the queue states and assign one threshold to each small interval. The simplified suboptimal policy is given in Algorithm 1, where the whole range of queue states is divided into two sub-intervals. A table can be built to store the induced delay and power metrics for all the simple policies. Then, to obtain the suboptimal policy for a given power constraint, we only need to look up the table and return the thresholds. The performance can be further improved by assigning one scheduling probability to some threshold points.
Scenario II: we find the optimal thresholds by looking up a preset table. Specifically, we first set up a table containing the delay and power information of all possible candidates of the optimal thresholds by performing intensive computations. Building the table costs considerable computation resources, but it needs to be done only once. Afterwards, when designing the optimal scheduling policy for a given power constraint , we can first find the optimal thresholds by comparing with the power information in the table. The optimal scheduling parameter can then be determined from the obtained and . Different power constraints lead to different optimal thresholds and hence different optimal scheduling parameters.
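The table lookup used in both scenarios can be sketched as a feasibility filter followed by a minimum-delay selection. The table layout (one tuple of thresholds, delay and power per candidate) and all numbers below are made-up placeholders for illustration; they are not results from this paper.

```python
def best_policy(table, p_max):
    """Pick the entry with the smallest delay among those whose power is <= p_max.

    table: list of (thresholds, delay, power) tuples; returns None if none is feasible.
    """
    feasible = [row for row in table if row[2] <= p_max]
    if not feasible:
        return None
    return min(feasible, key=lambda row: row[1])


# Hypothetical precomputed table: (thresholds, average delay, average power).
table = [
    ((1, 1), 0.0, 1.25),
    ((2, 1), 1.0, 1.00),
    ((3, 1), 2.0, 0.75),
]
```

With this placeholder table, a power budget of 1.1 selects the entry with thresholds `(2, 1)`, and a budget below 0.75 returns `None` because no candidate is feasible.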
VI Numerical Results
In this section, simulation results are given to validate the derived dual-threshold-based scheduling policy and to demonstrate its potential. For performance comparison, theoretical results of the optimal delay-power function are obtained by solving the LP problem (19). Meanwhile, simulation results are obtained by applying the dual-threshold-based scheduling policy with the optimal transmission parameters. In simulations, data packets are generated following a given probabilistic distribution . The -state block fading channel model is adopted and follows with probability . Each simulation runs over time slots. As shown in Fig.5-8, the theoretical and simulation results are plotted by lines (solid or dashed) and marked by red square dots, respectively.
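A slot-by-slot simulation of this kind can be sketched as follows. This is a minimal illustration under assumed conventions: the queue length is recorded at the end of each slot, the delay is estimated with Little's Law from the empirical arrival rate, and `f[k][m]` is the probability of sending one packet in channel index `k` when the backlog plus new arrivals equals `m`. The parameter values are toy choices, not those used for the figures.

```python
import random


def simulate(theta, eta, power, f, Q, slots, seed=0):
    """Return (estimated average delay in slots, average power per slot)."""
    rng = random.Random(seed)
    q = 0
    queue_sum = 0
    power_sum = 0.0
    arrivals_total = 0
    for _ in range(slots):
        n = rng.choices(range(len(theta)), weights=theta)[0]  # arrivals this slot
        k = rng.choices(range(len(eta)), weights=eta)[0]      # channel index this slot
        m = q + n                                             # updated queue length
        s = 1 if m > 0 and rng.random() < f[k][m] else 0      # probabilistic decision
        power_sum += power[k] * s
        q = min(m - s, Q)                                     # excess packets are dropped
        queue_sum += q
        arrivals_total += n
    if arrivals_total == 0:
        return 0.0, power_sum / slots
    # Little's Law: mean queue length divided by the mean arrival rate.
    return queue_sum / arrivals_total, power_sum / slots


theta = [0.5, 0.5]   # Bernoulli arrivals
eta = [0.5, 0.5]     # two equally likely channel states
power = [4.0, 1.0]   # the poor channel (index 0) costs more per packet
Q = 5
always_send = [[0.0] + [1.0] * 6, [0.0] + [1.0] * 6]
wait_for_good = [[0.0, 0.0, 0.0, 1.0, 1.0, 1.0, 1.0], [0.0] + [1.0] * 6]

delay_a, power_a = simulate(theta, eta, power, always_send, Q, slots=50000)
delay_b, power_b = simulate(theta, eta, power, wait_for_good, Q, slots=50000)
```

For this toy setup, a direct calculation gives a delay of 0 slots and a power of 1.25 for the always-send policy, against 2 slots and 0.75 for the policy that uses the poor channel only when at least three packets are waiting, which illustrates the delay-power tradeoff on a small scale.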
Fig.5 plots the delay-power tradeoff curves under different average packet arrival rates. The simulation results are in good agreement with the theoretical results, which validates the optimality of the derived dual-threshold-based policy. The delay-power tradeoff curve is piecewise linear, since the threshold-based policy is obtained as a linear combination of deterministic scheduling parameters. Besides, the average delay monotonically decreases with the maximum average power, as stated in Theorem 3. When the power constraint decreases to zero, the queueing delay increases dramatically to infinity, which implies that the queueing system is unstable. Given the same power constraint, the queueing delay increases with the packet arrival rate, since more packets are detained in the buffer due to the lack of transmission opportunities.
In Fig.6, we evaluate the effect of the burstiness of the packet arrivals on the optimal delay-power tradeoff curves, considering two packet arrival patterns, namely, the Bernoulli arrival and the bursty arrival. We can see that, for the same average arrival rate, the proposed scheduling policy achieves a better delay-power tradeoff when the packet arrivals follow the Bernoulli distribution than when they follow the more bursty distribution (with larger variance). This is because bursty packet arrivals bring more randomness to the queueing system. The average queueing delay decreases as the power constraint increases and remains constant once the power constraint exceeds a certain value. In other words, the delay-power curve becomes flat after an inflection point , where is the globally minimum delay and denotes the power that the source consumes when it keeps transmitting packets as long as the buffer is not empty, regardless of the channel state. However, the value of is identical for the two arrival patterns. The value of is able to reach zero for the Bernoulli arrival since the transmission rate is fixed as one packet per slot, and is greater than zero for the bursty arrival due to the burstiness.
Inspired by the observation in Fig.6, we further demonstrate in Fig.7 the delay-power tradeoffs for packet arrivals that have the same average arrival rate but different variances. It is observed that a larger arrival variance induces a higher queueing delay. With burstier arrivals, some packets have to wait longer before they are transmitted, which leads to a larger queueing delay.
In Fig.8, we present the theoretical results to validate the dual-threshold-based structure of the optimal scheduling policy, which agree with the structure shown in Fig.4. The transmission probabilities reveal a threshold-based structure in both the channel state dimension and the queue length dimension. In Fig.8(a), the threshold is in channel state and queue length while in Fig.8(b), it is in channel state and queue length . Thus, transmission occurs more readily in Fig.8(b), which corresponds to a higher power consumption. That is, for substantially different power constraints, the scheduler makes use of the power resource mainly by adjusting the threshold point. In Fig.8(b) and Fig.8(c), the threshold is calculated to be in channel state and queue length for both scenarios. However, at the threshold point, the scheduler transmits one packet with probability in Fig.8(b) and in Fig.8(c), respectively. That is, for slightly different power constraints, the scheduler makes full use of the power resource mainly by adjusting the transmission probability at the threshold point.
In Fig.9, we plot the optimal delay-power curve of our proposed scheme together with 1000 delay-power points of deterministic policies whose binary transmission parameters are randomly generated. As can be seen from this figure, the delay-power tradeoff curve is the lower boundary of the convex hull of the achievable delay-power region, which is in accordance with the conclusion proved in  that the optimal probabilistic policy can be constructed as a convex combination of deterministic scheduling policies. Hence, our proposed optimal scheduling policy outperforms any deterministic scheduling policy given the same power constraint. Meanwhile, our proposed stochastic scheduling policy with the optimal thresholds and scheduling parameters achieves the same optimal delay-power tradeoff performance as the optimal scheduling policies found by the DP method.
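The geometric statement above can be illustrated with a short sketch that extracts the lower boundary of the convex hull from a set of (power, delay) points. It uses the standard monotone-chain construction rather than any procedure from this paper, and the function names and sample points are hypothetical.

```python
def cross(o, a, b):
    """Cross product of vectors o->a and o->b (positive for a left turn)."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def lower_convex_hull(points):
    """Return the vertices of the lower convex hull, sorted by increasing power.

    points: iterable of (power, delay) pairs, e.g. one pair per deterministic policy.
    """
    pts = sorted(set(points))
    hull = []
    for p in pts:
        # Drop the last vertex while it lies on or above the segment hull[-2] -> p.
        while len(hull) >= 2 and cross(hull[-2], hull[-1], p) <= 0:
            hull.pop()
        hull.append(p)
    return hull


# Hypothetical (power, delay) points of six deterministic policies.
points = [(1.0, 6.0), (2.0, 3.0), (4.0, 1.0), (2.0, 5.0), (3.0, 4.0), (3.0, 2.5)]
print(lower_convex_hull(points))
```

Every point on a segment between two neighbouring hull vertices corresponds to time sharing between the two associated deterministic policies, which is how a probabilistic policy attains delay-power pairs that no single deterministic policy reaches.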
In Fig.10, we plot the delay-power tradeoff curves induced by the optimal policy and by the suboptimal policy proposed in subsection V-D. Since the same suboptimal policy is assigned to nearby power constraints, the suboptimal curve is flat over some intervals. Moreover, the suboptimal policy may coincide with the optimal policy for some power constraints, so the two curves intersect at those points. The suboptimal policy determines the scheduling with fewer threshold points than the optimal one; thus, it is easier to find and apply, at the cost of accuracy. One can also assign a transmission probability to the threshold points to make further use of the available power, as shown by the purple curves.
In this paper, we studied the power-constrained delay-optimal scheduling problem in wireless systems, where arbitrary packet arrivals and multi-state block-fading channels were considered. A probabilistic queue-aware and channel-aware scheduling policy was proposed to schedule packet transmissions over a -state wireless fading channel and was investigated in the framework of constrained MDP. Through theoretical analysis, we revealed the dual-threshold-based structure of the optimal scheduling policy. It was found that the optimal scheduler always seeks to exploit a good channel while keeping the queue as short as possible to reduce the latency. To this end, the scheduler should schedule packet transmissions based on both the queue state and the channel state. Specifically, given a channel state, if the queue length exceeds the threshold, the transmitter should transmit to decrease the latency; otherwise, it should keep silent to save power. In the future, we will extend this work to more general scenarios with adaptive-rate transmission and/or multi-user scheduling.
Appendix A The proof of Theorem 2
In this appendix, we show that problem (17) can be equivalently converted into LP problem (19) with variables being the optimization variables. To make this clear, we explain the transformation procedure in the following five steps. First, we present the equivalent expressions of the average queueing delay and the power constraint in Part A-1. Second, we specify the ranges of the optimization variables corresponding to constraint (17.b) in Part A-2. Then, we reformulate constraints (17.c) and (17.d) in Part A-3 and Part A-4, respectively. Finally, we explain in Part A-5 why constraints (17.c) and (17.e) are not shown in the LP problem (19).
A-1 The average queueing delay and the power constraint can be re-expressed as
Firstly, we re-express the average queueing delay. Adding the weighted sum of the terms with being the weight, we have
where equality ① is derived by substituting (30), equality ② is obtained by expressing each term in separately for , equality ③ comes from the definition of the average queue length, equality ④ is obtained by substituting , and equality ⑤ stems from . Thus, the average queue length is:
According to Little's Law, we obtain the average queueing delay as given in Eq. (25).
A-2 The variable satisfies the following inequalities
We know that probability takes its value from the interval . By substituting and