Queueing Subject To Action-Dependent Server Performance: Utilization Rate Reduction

by Michael Lin, et al.

We consider a discrete-time system comprising a first-come-first-served queue, a non-preemptive server, and a stationary non-work-conserving scheduler. New tasks arrive at the queue according to a Bernoulli process. At each instant, the server is either busy working on a task or is available, in which case the scheduler either assigns a new task to the server or allows it to remain available (to rest). In addition to the aforementioned availability state, we assume that the server has an integer-valued activity state. The activity state is non-decreasing during work periods, and is non-increasing otherwise. In a typical application of our framework, the server performance (understood as task completion probability) worsens as the activity state increases. In this article, we expand on stabilizability results recently obtained for the same framework to establish methods to design scheduling policies that not only stabilize the queue but also reduce the utilization rate, which is understood as the infinite-horizon time-averaged expected portion of time the server is working. This article has a main theorem leading to two main results: (i) Given an arrival rate, we describe a tractable method, using a finite-dimensional linear program (LP), to compute the infimum of all utilization rates achievable by stabilizing scheduling policies. (ii) We propose a tractable method, also based on finite-dimensional LPs, to obtain stabilizing scheduling policies that are arbitrarily close to the aforementioned infimum. We also establish structural and distributional convergence properties, which are used throughout the article, and are significant in their own right.




I Introduction

In this article, we adopt the discrete-time framework proposed in [14], in which a scheduler governs when tasks waiting in a first-come-first-serve queue are assigned to a server. The server is non-preemptive, and it has an internal state comprising two components labelled as availability state (busy or available) and activity state, which accounts for the intensity of the effort put in by the server. The activity state depends on current and previous scheduling decisions, and it is useful for modelling performance-influencing factors, such as the state of charge of the batteries of an energy harvesting module that powers one or more components of the server. As a rule, the activity state may increase while the server is busy and, otherwise, decrease gradually while the server is available (or resting). In our approach [14], a service rate function ascribes to each possible activity state, out of finitely many, a probability that the server can complete a task in one time-step. According to our model of non-preemptivity, once the server becomes busy working on a task it gets to be available again only when the task is completed. When the server is available, the scheduler decides, based on the activity state and the size of the queue, whether to assign a new task to the server. Although our results remain valid for any service rate function, in many applications it is decreasing, which causes the server performance (understood as task completion probability) to worsen as the activity state increases. A vital trade-off the scheduler faces, in this case, is whether to assign a new task when the server is available (resting) or allow it to remain available to possibly ameliorate the activity state as a way to improve future performance.

I-A Problem Statements and Comparison to [14]

Besides introducing and justifying in detail the formulation adopted here, in [14] the authors characterize the supremum of all arrival rates for which there is a scheduler that can stabilize the queue. The analysis in [14] also shows that such a supremum can be computed by a finite search and it identifies simple stabilizing scheduler structures, such as those with a threshold-type configuration.

In this article, we build on the analysis in [14] to design schedulers that not only guarantee stability but also lessen the rate at which the server is used, which we call the utilization rate and understand as the average portion of time in which the server is busy working on a task. More specifically, throughout this article, we investigate and provide solutions to the following two problems.

Problem 1

Given a server and a stabilizable arrival rate, determine a tractable method to compute the infimum of all utilization rates achievable by a stabilizing scheduling policy. Such a fundamental limit is important to determine how effective any given policy is in terms of the utilization rate.

Problem 2

Given a server and a stabilizable arrival rate, determine a tractable method to design policies whose utilization rate is arbitrarily close to the fundamental limit.

I-B Overview of Main Results and Technical Approach

In §III, Theorem 1 states our main result, from which we obtain Corollaries 1 and 2 that constitute our solutions to Problems 1 and 2, respectively. The following are key consequences of these corollaries:

  • According to Corollary 1, the infimum utilization rate (alluded to in Problem 1) can be computed by solving a finite-dimensional linear program.

  • If the arrival rate is stabilizable by the server, then Corollary 2 guarantees that for each positive gap there is a stabilizing scheduling policy whose utilization rate exceeds the infimum (characterized by Corollary 1) by at most that gap. Notably, such a scheduling policy can be obtained from a solution of a suitably specified finite-dimensional linear program.

Our main technical approach builds on that of [14]. Inspired by it, we solve Problems 1 and 2 by tackling simplified versions of them, suitably adapted for an appropriately constructed finite-state controlled Markov chain (denoted as the reduced process).

This article is mathematically more intricate than [14], which is unsurprising considering that it tackles not only stabilization but also regulation of the utilization rate. Among the new concepts and techniques put forth to prove Theorem 1, the distributional convergence results of §V, and the potential-like method used to establish them, are of singular importance —they are also original and relevant in their own right.

I-C Related literature

As mentioned earlier, to the best of our knowledge, our work is the first to study the problem of minimizing the utilization rate of a server whose performance is time-varying and dependent on an internal state that reflects its activity history. For this reason, there are no other results to which we can compare our findings.

An earlier study that examined a system closely resembling ours is that of Savla and Frazzoli [19]. They studied the problem of designing a maximally stabilizing task release control policy, using a differential system model. Under the assumption that the service time function is convex, they derived bounds on the maximum throughput achievable by any admissible policy for a fixed task workload distribution. In addition, they showed the existence of a maximally stabilizing threshold policy when tasks have identical workloads. Finally, they also demonstrated that the maximum achievable throughput increases when the task workload is not deterministic. However, they did not consider the problem of minimizing the utilization rate in their study.

In addition to the aforementioned study, there are a few research fields that share a key aspect of our problem, which is to design a scheduling policy that optimizes the performance with respect to one objective, subject to one or more constraints. For instance, wireless energy transfer has emerged as a potential solution for powering small devices that have low-capacity batteries or cannot be easily recharged, e.g., Internet-of-Things (IoT) devices [3, 18]. Since the devices need to collect sufficient energy before they can transmit and the transmission rate is a function of transmit power, a transmitter has to decide (i) when to harvest energy and (ii) when to transmit and what transmission rate it should use. The studies reported in [13, 6, 23] examined the problem of maximizing throughput in wireless networks in which communication devices are powered by hybrid access points via wireless energy transfer. In a related study, Shan et al. [21] studied the problem of minimizing the total transmission delay or completion time of a given set of packets.

Integrated production scheduling and (preventive) maintenance planning in manufacturing, where machines can fail with time-varying rates, shares similar issues with scheduling devices powered by wireless energy transfer [4, 16, 24]. In more traditional approaches, the problems of production scheduling and maintenance scheduling are considered separately, and equipment failures are treated as random events that need to be coped with. When the machine failure probability or rate is time-varying and depends on the age since the last (preventive) maintenance, the overall production efficiency can be improved by jointly considering both problems. For instance, the authors of [24] formulated the problem using an MDP model with the state consisting of the system’s age (since the last preventive maintenance) and the inventory level, and investigated the structural properties of optimal policies.

Another area that shares a similar objective is maximum hands-off control, or sparse control [17, 5, 10, 11, 9]. The goal of maximum hands-off control is to design a control signal that maximizes the length of time over which the control signal is equal to zero and inactive. For instance, the authors of [17] showed that, under the normality condition, the optimal solution sets of a maximum hands-off control problem and an associated $L^1$-optimal control problem coincide. Moreover, they proposed a self-triggered feedback control algorithm for infinite-horizon problems, which leads to a control signal with a provable sparsity rate, while achieving practical stability of the system. In another study [5], Chatterjee et al. provided both necessary conditions and sufficient conditions for the maximum hands-off control problem. Ikeda and Nagahara [10] considered a linear time-invariant system and showed that, if the system is controllable and the dynamics matrix is nonsingular, the optimal value of the optimal control problem for maximum hands-off control is continuous and convex in the initial condition.

Finally, another research problem, which garnered much attention in wireless sensor networks and is somewhat related to maximum hands-off control, is duty-cycle scheduling of sensors. A common objective for the problem is to minimize the total energy consumption subject to performance constraints on delivery reliability and delays [7]. The authors of [15] proposed using a reinforcement learning-based control mechanism for inferring the states of neighboring sensors in order to minimize the active periods. In another study, Vigorito et al. studied the problem of achieving energy neutral operation (i.e., keeping the battery charge at a sufficient level) while maximizing the awake times [22]. In order to design a good control policy, they formulated the problem as an optimal tracking problem, more precisely a linear quadratic tracking problem, with the aim of keeping the battery level around some target value.

I-D Paper structure

This article has five sections. After the introduction, in §II we describe the technical framework, including the controlled Markov chain that models the server, specify a relevant auxiliary reduced process, define key quantities and maps that quantify the utilization rate, characterize key policy sets, specify the notion of stability used throughout the article, and state and prove certain preliminary results. Our main theorem and key results are stated in §III, while §IV and §V present continuity and distributional convergence properties, respectively, that are required in the proof of our main theorem. We defer the most intricate proofs, some of which also require additional auxiliary results, to appendices at the end of the article.

II Technical Framework and Key Definitions

This section starts with a synopsis of the framework put forth thoroughly in [14]. It replicates from [14] only what is strictly necessary to make this article self-contained. In this section, we also introduce the concepts, sets, operators and notation that are required to formalize and solve Problems 1 and 2.

We adopt a discrete-time approach in which each discrete-time instant can be associated with a continuous-time epoch, as described in [14].


II-A Stochastic Discrete-Time Framework

As in [14], we consider that the server is represented by the MDP . The state of the server at time has two components representing the activity state and the availability state, respectively. There are possible activity states. The server is either available or busy at time , as indicated by or , respectively. Consequently, the state-space of the server is represented as:


where and are the sets of possible activity and availability states, respectively.

The MDP represents the overall system comprising the server and the queue length. More specifically, the state of the system is , where is the length of the queue at time , and the state-space of is:


Notice that excludes the case in which the server would be busy while the queue is empty.

The action of the scheduler at time is represented by , which takes values in the set . The scheduler directs the server to work at time when and instructs the server to rest otherwise. The assumption that the server is non-preemptive and the fact that no new tasks can be assigned when the queue is empty, lead to the following set of available actions for each possible state in :


We assume that tasks arrive according to a Bernoulli process . The arrival rate is denoted with and takes values in .
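The admissible action sets described above follow from two rules stated in the text: a busy server must keep working (non-preemption), and no task can be assigned when the queue is empty. A minimal sketch of these rules is given below. The names `Availability`, `Action`, and `admissible_actions` are hypothetical and introduced only for illustration; the sketch also assumes that the queue length counts the task currently in service, which is consistent with the state space excluding a busy server with an empty queue.

```python
from enum import Enum


class Availability(Enum):
    AVAILABLE = "available"
    BUSY = "busy"


class Action(Enum):
    REST = "rest"
    WORK = "work"


def admissible_actions(availability, queue_len):
    """Return the set of actions the scheduler may choose in a given state.

    Assumption (not taken verbatim from the source): queue_len counts the
    task in service, so a busy server always has queue_len >= 1.
    """
    if queue_len < 0:
        raise ValueError("queue length must be non-negative")
    if availability is Availability.BUSY:
        if queue_len == 0:
            raise ValueError("busy server with empty queue is not a valid state")
        # Non-preemptive server: an ongoing task must be continued.
        return {Action.WORK}
    if queue_len == 0:
        # Nothing to assign: the server can only rest.
        return {Action.REST}
    # Available server and non-empty queue: the scheduler is free to choose.
    return {Action.REST, Action.WORK}
```

Only the last case leaves a genuine decision to the scheduler, which is where the trade-off between working now and resting to improve future performance arises.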

II-A1 Action-Dependent Server Performance

In our formulation, the efficiency or performance of the server during an epoch is modeled with the help of a service rate function . More specifically, if the server works on a task during epoch , the probability that it completes the task by the end of the epoch is . This holds irrespective of whether the task is newly assigned or inherited as ongoing work from a previous epoch. Thus, the service rate function quantifies the effect of the activity state on the performance of the server. The results presented throughout this article are valid for any choice of with codomain .

II-A2 Dynamics of the activity state

We assume that (i) is equal to either or when is and (ii) is either or if is . This is modeled by the following transition probabilities specified for every and in .


where the parameters , which take values in , model the likelihood that the activity state will transition to a greater or lesser value, depending on whether the action is or , respectively.
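To make the activity-state dynamics concrete, the following sketch returns the one-step PMF of the next activity state. It is illustrative only: the function name, the dictionaries `rho_up` and `rho_down`, and the labelling of activity states as 1, ..., `n_states` are assumptions, as is the boundary behaviour (the state stays put when it cannot move further up or down).

```python
def activity_transition_pmf(s, action, n_states, rho_up, rho_down):
    """PMF of the next activity state, given the current state and action.

    rho_up[s]   : assumed probability of moving from s to s + 1 under "work".
    rho_down[s] : assumed probability of moving from s to s - 1 under "rest".
    Boundary handling is an assumption: the highest state cannot increase
    and the lowest state cannot decrease.
    """
    if not 1 <= s <= n_states:
        raise ValueError("activity state out of range")
    if action == "work":
        if s == n_states:
            return {s: 1.0}
        p = rho_up[s]
        return {s + 1: p, s: 1.0 - p}
    if action == "rest":
        if s == 1:
            return {s: 1.0}
        p = rho_down[s]
        return {s - 1: p, s: 1.0 - p}
    raise ValueError("action must be 'work' or 'rest'")
```

For instance, with `rho_up = {1: 0.25, 2: 0.5}`, working in activity state 1 yields state 2 with probability 0.25 and state 1 otherwise.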

II-A3 Transition probabilities for

We consider that is independent of when conditioned on . Under this assumption, the transition probabilities for can be written as follows:

for every , in and in .

We assume that, within each epoch , the events that (a) there is a new task arrival during the epoch and (b) a task being serviced during the epoch is completed by the end of the epoch are independent when conditioned on and . Hence, the transition probability in (II-A3) is given by the following:

Definition 1

(MDP ) The MDP with input and state , which at this point is completely defined, is denoted by .
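The conditional independence of arrivals and task completions stated above means that the joint PMF of the next availability state and queue length factors into a product. The sketch below enumerates that PMF under assumptions that are ours, not the source's: the queue length counts the task in service, an arrival during an epoch is added to the queue at the end of that epoch, and the symbols `lam` (arrival rate) and `mu` (service rate function, as a dictionary) are illustrative names.

```python
def availability_queue_pmf(s, q, action, lam, mu):
    """PMF over (next availability, next queue length).

    s      : current activity state (key into mu).
    q      : current queue length, assumed to include the task in service.
    action : "work" or "rest" (must be admissible in the current state).
    lam    : Bernoulli arrival probability per epoch.
    mu     : dict mapping activity state to completion probability.
    """
    if action == "rest":
        # No service this epoch: the server stays available.
        return {("A", q + 1): lam, ("A", q): 1.0 - lam}
    if q < 1:
        raise ValueError("cannot work with an empty queue")
    pmf = {}
    done = mu[s]
    for arrival, p_arr in ((1, lam), (0, 1.0 - lam)):
        for completed, p_done in ((1, done), (0, 1.0 - done)):
            # Independence: multiply the arrival and completion probabilities.
            nxt = ("A" if completed else "B", q + arrival - completed)
            pmf[nxt] = pmf.get(nxt, 0.0) + p_arr * p_done
    return pmf
```

Combining this PMF with the activity-state PMF by a further product (the activity state being conditionally independent of the other components) would give the full one-step transition law under the same assumptions.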

Table I summarizes the notation for MDP .

set of activity states
server availability ( available, busy)
server availability at epoch (takes values in )
server state components
server state at epoch (takes values in )
natural number system .
queue size at epoch (takes values in )
state space formed by
system state at epoch (takes values in )
possible actions ( = rest, = work)
MDP whose state is at epoch
set of actions available at a given state in
action chosen at epoch .
PMF probability mass function
TABLE I: A summary of notation describing MDP .

II-A4 Stationary Policies, Stability and Stabilizability

We start by defining the class of policies that we consider throughout the paper.

Definition 2

A stationary randomized policy is specified by a mapping that determines the probability that the server is assigned to work on a task or rest, as a function of the system state, according to

Definition 3

The set of stationary randomized policies satisfying (3) is denoted by .


Although the statistical properties of subject to a given policy depend on the parameters specifying , including , we simplify our notation by not representing this dependence, unless noted otherwise. With the exception of , which we do not pre-select, we assume that all the other parameters for are given and fixed throughout the paper.

From (II-A3) - (6), we conclude that subject to a policy in evolves according to a time-homogeneous Markov chain (MC), which we denote by . Also, provided that it is clear from the context, we refer to as the system.

The following is the notion of system stability we adopt throughout this article.

Definition 4 (System stability)

For a given policy in , the system is stable if it satisfies the following properties:

  • There exists at least one recurrent communicating class.

  • All recurrent communicating classes are positive recurrent.

  • The number of transient states is finite.

We find it convenient to define to be the set of randomized policies in , which stabilize the system for an arrival rate .

Before we proceed, let us point out a useful fact under any stabilizing policy in .

Lemma 1

[14, Lemma 1] A stable system has a unique positive recurrent communicating class, which is aperiodic. Therefore, there is a unique stationary probability mass function (PMF) for .

Definition 5

Given an arrival rate and a stabilizing policy in , we denote the unique stationary PMF and positive recurrent communicating class of by and , respectively.

II-B Utilization Rate: Definition and Infimum

Subsequently, we proceed to define the concepts and maps required to formalize the analysis and computation of the utilization rate, and its infimum alluded to in the statements of Problems 1 and 2.

Definition 6

(Utilization rate function) The function that determines the utilization rate in terms of a given stabilizable arrival rate and a stabilizing policy , is defined as:


The utilization rate quantifies the proportion of the time in which the server is working. Notably, the expected utilization rate , computed for with arrival rate and stabilized by , coincides with the probability limit of the utilization rate, as defined for instance in [12] (with ), when the averaging horizon tends to infinity. Using our notation, the aforesaid probability limit can be stated as follows:


where is when and otherwise.
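The time-average interpretation of the utilization rate can be illustrated by simulation. The sketch below is not the authors' method; it is a Monte Carlo estimate of the fraction of epochs in which the server works, under our own assumptions about timing (the queue counts the task in service, and arrivals join at the end of the epoch) and with hypothetical names throughout. The argument `policy(s, q)` returns the probability of assigning a task when the server is available and the queue is non-empty.

```python
import random


def simulate_utilization(lam, mu, rho_up, rho_down, policy, horizon, seed=0):
    """Estimate the fraction of epochs spent working over a finite horizon."""
    rng = random.Random(seed)
    n_states = len(mu)
    s, w, q = 1, "A", 0  # assumed initial state: rested, available, empty queue
    work_epochs = 0
    for _ in range(horizon):
        # Admissible actions: busy -> work, empty queue -> rest.
        if w == "B":
            action = "work"
        elif q == 0:
            action = "rest"
        else:
            action = "work" if rng.random() < policy(s, q) else "rest"
        arrival = rng.random() < lam
        if action == "work":
            work_epochs += 1
            completed = rng.random() < mu[s]  # uses the current activity state
            w = "A" if completed else "B"
            q += int(arrival) - int(completed)
            if s < n_states and rng.random() < rho_up[s]:
                s += 1
        else:
            w = "A"
            q += int(arrival)
            if s > 1 and rng.random() < rho_down[s]:
                s -= 1
    return work_epochs / horizon
```

As a sanity check, when the service rate function is constant, every task needs on average the reciprocal of that constant in work epochs, so for a stable system the estimate should approach the arrival rate divided by the service probability, regardless of the policy.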

Definition 7

The infimum utilization rate for a given stabilizable arrival rate is defined as:


II-C Auxiliary MDP

We proceed with describing an auxiliary MDP whose state takes values in and is obtained from by artificially removing the queue-length component. We denote this auxiliary MDP by and its state at epoch by in order to emphasize that it takes values in . The action chosen at epoch is denoted by . We use the overline to denote the auxiliary MDP and any other variables associated with it, in order to distinguish them from those of the server state in .

Under certain conditions, which we will determine later on, we can determine important properties of by analysing . Notably, we will use the fact that is finite to compute via a finite-dimensional linear program, and also to simplify the proofs of our main results.

As the queue size is no longer a component of the state of , we eliminate the dependence of admissible action sets on , which was explicitly specified in (3) for MDP , while still ensuring that the server is non-preemptive. More specifically, the set of admissible actions at each element of is given by


Consequently, for any given realization of the current state , is required to take values in .

We define the transition probabilities that specify , as follows:

where and are in , and is in . Subject to these action constraints, the right-hand terms of (II-C) are defined, in connection with , as follows:


II-D Stationary policies and stationary PMFs of

Analogously to the MDP , we only consider stationary randomized policies for , which are defined below.

Definition 8 (Stationary randomized policies for )

We restrict our attention to stationary randomized policies acting on , which are specified by a mapping , as follows:

for every in and in . The set of all stationary randomized policies for which honor (10) is defined to be .

Following the approach in [14], henceforth we restrict our analysis to the subset of defined as follows:


The main benefit of focusing on policies in , as stated in [14, Corollary 1], is that has a unique stationary PMF for every in . More specifically, the fact that policies in rule out the case in which is an absorbing state guarantees the uniqueness of the stationary PMF. Furthermore, from [14, Lemmas 2 and 4] we conclude that restricting to any search that seeks to determine bounds or fundamental limits with respect to stabilizing policies incurs no loss of generality.

II-E Service rate of and précis of stabilizability results

We start by defining the service rate of for a given policy in :


The maximal service rate for is defined below.


As stated in [14, Theorems 3.1 and 3.2], any arrival rate lower than is stabilizable. Furthermore, these theorems also assert that any arrival rate above is not stabilizable and that can also be computed by determining which threshold policy , among the finitely many defined in [14, (6)], maximizes .
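The finite search over threshold policies mentioned above can be sketched as follows. This is an illustration under several assumptions of ours: the threshold policy assigns a task when the server is available and the activity state is at most a threshold `tau` (the exact form in [14, (6)] is not reproduced here), the service rate is taken to be the stationary task-completion probability per epoch of the queue-free chain, and the stationary PMF is approximated by plain power iteration. All function names are hypothetical.

```python
def reduced_chain(mu, rho_up, rho_down, work_prob):
    """Transition rows of the queue-free chain on (activity, availability)."""
    n = len(mu)
    rows = {}
    for s in range(1, n + 1):
        up = rho_up[s] if s < n else 0.0
        down = rho_down[s] if s > 1 else 0.0
        for w in ("A", "B"):
            p_work = 1.0 if w == "B" else work_prob(s)  # busy server must work
            row = {}
            for s2, ps in ((min(s + 1, n), up), (s, 1.0 - up)):
                for w2, pw in (("A", mu[s]), ("B", 1.0 - mu[s])):
                    row[(s2, w2)] = row.get((s2, w2), 0.0) + p_work * ps * pw
            for s2, ps in ((max(s - 1, 1), down), (s, 1.0 - down)):
                row[(s2, "A")] = row.get((s2, "A"), 0.0) + (1.0 - p_work) * ps
            rows[(s, w)] = row
    return rows


def stationary_pmf(rows, iters=5000):
    """Approximate the stationary PMF by power iteration from the uniform PMF."""
    pi = {x: 1.0 / len(rows) for x in rows}
    for _ in range(iters):
        nxt = dict.fromkeys(rows, 0.0)
        for x, row in rows.items():
            for y, p in row.items():
                nxt[y] += pi[x] * p
        pi = nxt
    return pi


def service_rate(mu, rho_up, rho_down, work_prob):
    """Stationary probability that a task is completed in an epoch."""
    pi = stationary_pmf(reduced_chain(mu, rho_up, rho_down, work_prob))
    return sum(p * (1.0 if w == "B" else work_prob(s)) * mu[s]
               for (s, w), p in pi.items())


def best_threshold(mu, rho_up, rho_down):
    """Search the finitely many thresholds for the largest service rate."""
    rates = {tau: service_rate(mu, rho_up, rho_down,
                               lambda s, tau=tau: 1.0 if s <= tau else 0.0)
             for tau in range(1, len(mu) + 1)}
    tau_star = max(rates, key=rates.get)
    return tau_star, rates[tau_star]
```

When the service rate function is constant, resting brings no benefit, so the search should select the largest threshold (always work); with a decreasing service rate function, a smaller threshold may win.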

Definition 9

We define the map as follows:




It follows from its definition that yields a policy for that acts as the given in when the queue is not empty and imposes rest otherwise.
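The map just described, which turns a policy for the queue-free auxiliary MDP into a policy for the full system, amounts to a simple wrapper. The sketch below uses hypothetical names and represents a policy as a function returning the probability of assigning work.

```python
def lift_policy(theta_bar):
    """Build a queue-aware policy from a queue-free one.

    theta_bar(s, w) gives the probability of working in the auxiliary MDP;
    the returned policy also takes the queue length q.
    """
    def theta(s, w, q):
        if q == 0:
            return 0.0  # empty queue: rest is imposed
        return theta_bar(s, w)  # otherwise act as the auxiliary policy does
    return theta
```

For example, lifting a threshold rule such as `lambda s, w: 1.0 if w == "B" or s <= 2 else 0.0` gives a policy that follows the threshold whenever tasks are waiting and rests when the queue is empty.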


We reserve , without a superscript, to denote a design parameter. It acts as a constraint in the definition of the following policy sets.

Definition 10

(Policy sets and ) Given in , we define the following policy sets:


where is defined as:


We also define the following class of policies generated from and through :


The following proposition establishes important stabilization properties for the policies in .

Proposition 1

Let the arrival rate in be given. If is in then is stable, irreducible and aperiodic for any in .

Stability of can be established using the same method adopted in [14] to prove [14, Theorem 3.2], which uses [14, Lemma 8] to establish a contradiction when is assumed not to be stable. That is irreducible follows from the fact that, under any policy in , all states of communicate with . That the probability of transitioning away from is less than one implies that the chain is aperiodic.

An immediate consequence of Proposition 1 is that is a nonempty subset of when . This implies that, as far as stabilizability is concerned, there is no loss of generality in restricting our analysis to policies with the structure in (18). More interestingly, from Theorem 1, which will be stated and proved later on in Section III, we can conclude that restricting our methods for solving Problem 2 to policies of the form (18) also incurs no loss of generality.

The following projection map will be important going forward.

Definition 11 (Policy projection map )

Given in , we define a mapping , where




Notice that although the map depends on , for simplicity of notation, we chose not to denote that explicitly. It is worthwhile to note that the map , for a given less than , allows us to establish the following remark comparing the service rate notions for and .

Remark 1

Given in and in , our analysis in [14] implies that the following hold:


Notably, and follow from [14, Lemma 4]. Using a similar argument, follows from the fact that is stabilizing, as guaranteed by Proposition 1 when is in .

II-F Utilization rate of and computation via LP

We now proceed to defining the utilization rate of for a given in . Subsequently, we will define and propose a linear programming approach to computing the infimum of the utilization rates attainable by any policy for subject to a given service rate.

Definition 12

Given a policy in , the following function determines the utilization rate of :

Definition 13

(Infimum utilization rate and ) The infimum utilization rate of for a given departure rate is defined as:


We also define the following approximate infimum utilization rates:


Notice that the infimum that determines and is well-defined because there is a unique stationary PMF for each policy in .
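Since each policy under consideration has a unique stationary PMF, the utilization rate of the auxiliary MDP can be read as the stationary probability that the server works. The sketch below computes that quantity from a given stationary PMF. The function and argument names are hypothetical, and the reading of Definition 12 as "stationary mass times work probability, summed over states" is our interpretation of the surrounding text.

```python
def utilization_rate(pi, work_prob):
    """Stationary probability of working in the queue-free chain.

    pi        : dict mapping (activity state, availability) to stationary mass,
                with availability "A" (available) or "B" (busy).
    work_prob : function giving the probability of working when available.
    """
    total = 0.0
    for (s, w), mass in pi.items():
        # A busy server always works; an available one works per the policy.
        total += mass * (1.0 if w == "B" else work_prob(s))
    return total
```

The infimum in Definition 13 would then be taken over the values of this quantity across the policies that meet the prescribed departure rate.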

Remark 2

Notice that since , we conclude that the following holds:


We now proceed to outlining efficient ways to compute , which is relevant because, as Corollary 1 indicates in §III, we can use it to compute when . Hence, below we follow the approach in [1, Chapter 4] to construct approximate versions of that are computable using a finite-dimensional linear program (LP). Subsequently, we will obtain the policies in  corresponding to solutions of the LP, as is done in [1, Chapter 4]. The policies obtained in this way will form a set for each in that will be useful later on.

Definition 14

(-LP utilization rate )
Let be a given constant in and be a pre-selected departure rate in . The -LP utilization rate is defined as:


where the minimization is carried out over the following set:

Every solution is subject to the following constraints and is compactly represented as :


and the equality below guarantees that every solution will be consistent with :

Definition 15

(Solution set ) For each in and in , we use  to represent the set of solutions of the LP specified by (30). We adopt the convention that is empty if and only if the LP is not feasible.

II-G LP-based policy sets

For each solution in we can obtain a corresponding policy in for as follows:

Remark 3

Subject to the definition in (31), the constraint (30b) is equivalent to , which will, then, hold for every solution in .
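In the standard linear-programming treatment of constrained MDPs, which the text attributes to [1, Chapter 4], an LP solution can be read as a stationary mass over state-action pairs, and a policy is recovered by normalizing that mass at each state. The sketch below shows this generic normalization; it is not a reproduction of (31). The names are hypothetical, and the value used at states carrying zero mass (`default_work_prob`) is an arbitrary choice on our part.

```python
def policy_from_occupation_measure(y, default_work_prob=1.0):
    """Recover work probabilities from a state-action occupation measure.

    y maps (state, action) to non-negative mass, with action in
    {"work", "rest"}. States with zero total mass receive
    default_work_prob, which is an assumption of this sketch.
    """
    states = {state for (state, _action) in y}
    policy = {}
    for state in states:
        work_mass = y.get((state, "work"), 0.0)
        total = work_mass + y.get((state, "rest"), 0.0)
        policy[state] = work_mass / total if total > 0 else default_work_prob
    return policy
```

Under this reading, a lower bound on the work probability at a particular state translates into a linear inequality between the corresponding entries of `y`, which is the kind of equivalence Remark 3 refers to.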

Definition 16

(Policy set ) For each in and in , we define the following set of policies :


Here, we adopt the convention that  is empty if and only if  is empty.

The following proposition will justify choices for we will make at a later stage to guarantee that is nonempty for in .

Proposition 2

If in is such that is nonempty then is nonempty for any in and in .

We start by invoking [14, Lemma 7] to conclude that is nonempty, and consequently that