A token-based central queue with order-independent service rates

02/06/2019 ∙ by U. Ayesta, et al. ∙ University of Amsterdam 0

We study a token-based central queue with multiple customer types. Customers of each type arrive according to a Poisson process and have an associated set of compatible tokens. Customers may only receive service when they have claimed a compatible token. If upon arrival, more than one compatible token is available, an assignment rule determines which token will be claimed. The service rate obtained by a customer is state-dependent, i.e., it depends on the set of claimed tokens and on the number of customers in the system. Our first main result shows that, provided the assignment rule and the service rates satisfy certain conditions, the steady-state distribution has a product form. We show that our model subsumes known families of models that have product-form steady-state distributions including the order-independent queue of Krzesinski (2011) and the model of Visschers et al. (2012). Our second main contribution involves the derivation of expressions for relevant performance measures such as the sojourn time and the number of customers present in the system. We apply our framework to relevant models, including an M/M/K queue with heterogeneous service rates, the MSCCC queue, multi-server models with redundancy and matching models. For some of these models, we present expressions for performance measures that have not been derived before.




1 Introduction

The discovery of queueing systems with a steady-state product-form distribution is probably one of the most fundamental contributions in queueing theory. In a pioneering work, [17] showed that in a queueing network formed by M/M/1 nodes, the joint steady-state distribution is given by the product of the marginal distributions of the individual nodes. Roughly speaking, this implies that the stationary distribution of the network can be obtained by multiplying the stationary distributions of the individual nodes assuming that each node is in isolation. Due to this property, the analysis of a queueing network reduces to that of single-node queues, simplifying the analysis tremendously. Product-form distributions provide insight into the impact of parameters on the performance and allow efficient calculation of performance measures. As a consequence, since Jackson’s discovery, considerable effort has been put into understanding the conditions under which a stochastic model has a product-form steady-state distribution. An important step forward was made by [8] and [19], who introduced BCMP networks and Kelly networks, respectively, which have product-form steady-state distributions. These networks demonstrate that models with multiple types of customers and general service time distributions can also have a product-form distribution. Since then, further studies have shown that networks with negative arrivals, instantaneous signals and blocking may have a product-form distribution, see [10] for an overview.

Recent years have witnessed a surge of interest in parallel server models with different types of customers. The main application is in the study of data centers, which consist of a pool of resources that are interconnected by a communication network. Indeed, data centers provide the main infrastructure to support many internet applications, enterprise operations and scientific computations. In two relevant studies, [22] and [20], sufficient conditions have been obtained for a multi-server system to have a product form. We note that these product-form distributions are not expressed as the product of per-type or per-server terms. In fact, they are expressed as a product of terms that correspond to a unique customer in the system. In that respect, they do not allow an interpretation in terms of a product of marginal distributions, as is the case with classical product-form distributions for Jackson, BCMP and Kelly networks. A notable difference between the two papers is in the state descriptor considered therein. In the multi-type customer and server model of [22], the authors consider an aggregated descriptor that keeps track of the servers being active but not of the type of customers being served or waiting. On the other hand, in the order-independent queue of [20], the state descriptor keeps track of the type of customers in the system, but not of the servers being active. These two modelling approaches have led to two separate streams of papers, where each of the approaches covers applications that are not covered by the other. Some of the applications studied are systems with blocking, redundancy and computer clusters, see Section 7 for more details. A natural question that arises is whether the original models of [22] and [20] can be generalised while preserving the product-form distribution in steady state.

We answer this question in the affirmative in this paper. We analyse a token-based central queue with multiple types of customers and multiple tokens. As will be proved in the paper, this model is a generalisation of both the model of [22] and the order-independent queue of [20]. Customers of each type arrive according to a Poisson process and have an associated set of compatible tokens. To receive service, a customer must claim a compatible token. Therefore, an arriving customer will immediately claim a compatible token if there is one available; otherwise, it will wait until it can claim one. As will become clear later on in the paper, the meaning of a token is application-dependent. It might represent a physical server or it might represent the total service rate allocated to a given customer type. If upon a customer arrival more than one compatible token is available, an assignment rule determines which token will be claimed by the customer. A customer without a token receives no service and a customer holding a token receives service at a rate given by a state-dependent service rate function that satisfies certain conditions. As we will show later, these conditions are reminiscent of those in the order-independent queue.

Our first main result shows that, provided the assignment rule and the service rate function satisfy the required conditions, the steady-state distribution has a product form. As in the case of [22] and [20], this product-form distribution cannot be expressed as the product of per-type or per-token terms. We further show that the order-independent queue and the multi-type customer and server model of [22] are particular instances of our model and that our model includes examples that were not covered by either. In other words, our model and main results provide a unifying framework for parallel-server models with a product-form distribution. Our second main contribution is that we use the steady-state distribution of the general model to characterise transforms of relevant performance measures, including the sojourn time and the number of customers in the system. We illustrate the applicability of the framework by computing the steady-state distribution and analysing the performance of many relevant models, including an M/M/K queue with heterogeneous service rates, the MSCCC queue, multi-server models with redundancy and matching models. For some of these models, we present expressions for performance measures that have not been derived before. It is important to note that, even though our model is based on a central-queue architecture, some of the applications, in particular the redundancy models, correspond to topologies without a central queue, where instead every server has its own queue. We explain this in more detail in Section 7.

The rest of the paper is organised as follows. In the next section, we discuss studies related to this paper. Section 3 then describes the token-based central queue that we study in more detail and introduces the required notation. Section 4 shows that the token-based central queue has a product-form stationary distribution, which allows for the calculation of other performance measures in Section 5. We show in Section 6 that the models of [20] and [22] are captured by our model, after which we discuss applications of our model in Section 7.

2 Related work

As mentioned in the introduction, there has been a surge of interest in multi-server queueing models in recent years. The two main references related to our work are [22] and [20], which identify classes of models that have a product-form stationary measure.

Subsequently, several studies have used the results of these two models to analyse a variety of other models. An important application area that has received a lot of attention is formed by redundancy models. While there are several variants of a redundancy-based system, the general notion of redundancy is to create multiple copies of the same customer that will be sent to a subset of servers. Depending on when replicas are deleted, there are two classes of redundancy systems: cancel-on-start (COS) and cancel-on-completion (COC). In redundancy systems with COC, once a copy has completed service, the other copies are deleted and the customer is said to have received service. On the other hand, in redundancy systems with COS, copies are removed as soon as one copy starts being served. In [9], the authors observe that the COC model is a special case of the order-independent queue [20], which enables the authors to derive the steady-state distribution directly. We also refer to [14] for a thorough analysis of the COC system. On the other hand, [7] shows that while the COS-based redundancy system is not an order-independent queue, it fits within the multi-type customer and server model of [22]. They also show that, while the COC model does not fit the framework of [22], it does fit an extension of it, where the state descriptor used in [22] is endowed with a more general service rate function. We will use the resulting state descriptor also in this paper (see Section 3 for more details).

An important application area, which fits the framework of [22], is that of matching models, which have been studied in several recent papers, see for instance [1]. We also refer to [4] and [3], where the authors explore the relation between redundancy and matching models. In Section 7, we apply our token-based approach to derive the steady-state distribution of a large family of matching models.

Another important related work is [6]. The model considered therein is similar to that of [22], except that the assignment policy ‘assign longest idle server’ (ALIS) is used. Under the ALIS policy, a new arrival that could be served by more than one inactive server is assigned to the longest-idle server. To implement this policy, the state descriptor is enriched with information on the idleness of every inactive server. The authors prove that the steady-state distribution of this model has a product form. We do not consider the ALIS variant in this paper. However, based on the analysis of [6], we expect that all our results would carry over to this case. We discuss this in more detail in Section 4.

3 Model description

In this section, we describe the token-based central queue model in more detail.

Customers and tokens.  The model that we study represents a central-queue system where the customers may be of mutually different types (or classes). The set of all customer types is denoted by and customers of type arrive according to a Poisson process with rate . As a result, the total arrival rate of customers to the system is . A distinguishing feature of this model is the fact that in order for customers to receive service, they must hold a token. To this end, a set of tokens denoted by is also associated with the model. In particular, a customer type is characterised by a token set which consists of the compatible tokens that can be held by customers of type . Similarly, associated with a token is a set of customer types that can choose the token, denoted henceforth by . Clearly, and

Assignment of customers to tokens.  At any point in time, the set of available tokens is denoted by , , while the set of unavailable tokens is given by . To receive service, customers are required to claim a compatible token. Hence, when a customer of type arrives, it will claim a single token from the set (if it is non-empty), and then join the central queue. In case no compatible token is available upon arrival (i.e. ), the customer will join the queue and wait until a token in the set becomes available. If multiple compatible tokens are available, i.e., then an assignment rule decides which of the tokens will be claimed by the arriving customer. More particularly, this assignment rule constitutes a randomised policy which, given and the class of the arriving customer, dictates the probability with which the customer should claim a particular token. We assume this assignment rule to satisfy a so-called assignment condition, which we elaborate on later in this section. Once a token is selected by a customer, it is no longer available for selection (i.e. ) until the customer has completed service. Upon release, the token will immediately be reclaimed by the longest waiting tokenless customer of a type from the set . If there are no such customers, the token is added back to the set such that . We shall refer to customers with tokens as active customers and identify such customers with the token associated with them. Customers in the central queue without tokens will be referred to as inactive customers.
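The claim-and-release mechanism described above can be sketched in a few lines of Python. This is a minimal illustration rather than part of the model definition: the class `TokenQueue`, its methods and the default uniform assignment rule are hypothetical names and choices, and the sketch does not check whether the chosen rule satisfies the assignment condition.

```python
import random
from dataclasses import dataclass
from typing import Optional


@dataclass(eq=False)  # compare customers by identity, not by field values
class Customer:
    ctype: str
    token: Optional[str] = None  # None means the customer is inactive


class TokenQueue:
    """Central queue in which customers must hold a compatible token to be served."""

    def __init__(self, compat, rule=None):
        self.compat = compat  # customer type -> set of compatible tokens
        self.all_tokens = set().union(*compat.values())
        self.queue = []  # all customers present, in order of arrival
        # Assumed default rule: choose uniformly among the free compatible tokens.
        self.rule = rule or (lambda ctype, free: random.choice(sorted(free)))

    def available(self):
        """Tokens that are currently not held by any customer."""
        return self.all_tokens - {c.token for c in self.queue if c.token is not None}

    def arrive(self, ctype):
        """An arrival claims one free compatible token, or joins as inactive."""
        free = self.compat[ctype] & self.available()
        customer = Customer(ctype, self.rule(ctype, free) if free else None)
        self.queue.append(customer)
        return customer

    def depart(self, customer):
        """An active customer leaves; its token goes to the longest-waiting
        inactive customer of a compatible type, if there is one."""
        if customer.token is None:
            raise ValueError("inactive customers receive no service and cannot depart")
        self.queue.remove(customer)
        for other in self.queue:  # queue is ordered by arrival, oldest first
            if other.token is None and customer.token in self.compat[other.ctype]:
                other.token = customer.token
                return
        # Otherwise the token simply becomes available again.
```

For instance, with types `x` (compatible with token `a` only) and `y` (compatible with `a` and `b`), a type-`x` customer that arrives while `a` is held stays inactive until the holder of `a` departs, at which point it reclaims `a` immediately.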

Departure rates of customers.  We assume service requirements of customers to be exponentially distributed. In light of the model’s token mechanism, this means that the departure rate of active customers from the system is non-negative, while that of inactive customers is zero. Throughout the paper, we assume that the departure rates associated with active customers satisfy a certain condition, which is specified below. Since this condition is reminiscent of the order-independent queue as introduced in [20], we call this the order-independent condition.

Markovian state descriptor.  Due to the memoryless properties of the arrival and departure processes, the token-based central queue can be interpreted as a Markov process. We now introduce the state descriptor that we use to analyse this model. We will show in Section 4.1 that this state descriptor leads to a Markov system by stating its balance equations. The state descriptor that we use for the token-based central queue is of the form . This descriptor retains the order of arriving customers in the central queue from left to right. When the model is in state , it has active customers which have claimed tokens . Furthermore, there are inactive customers in the central queue that have arrived between the two customers that have claimed tokens and , respectively, for . Inactive customers at the end of the queue are denoted by . Since tokens are always claimed by the longest waiting eligible customer, we have that e.g. represents inactive customers which have token as their only compatible token. The set of such customer types is denoted by . In general, for , we denote the set of customer types that can claim tokens only from the set by . Thus, the customer types of the customers between those with tokens and must belong to the set . As the state descriptor retains the order of arrival, the oldest customer in a state is represented by token . The newest customer is one of the customers, or in case , it is the active customer with token . Furthermore, when , all the customers between and have arrived before the customers between and . We henceforth denote the state space of the resulting Markov process by , where any generic state is of the type . The only exception is the empty state with no customers present, which we denote by .
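To make the structure of this descriptor concrete, the following sketch builds it from the list of customers in order of arrival. It assumes, as the description above suggests, that the descriptor alternates between a claimed token and the number of inactive customers that arrived after its holder and before the next active customer. The function name `state_descriptor` and the use of `None` for an inactive customer are hypothetical, and the empty tuple stands in for the empty state.

```python
def state_descriptor(customers):
    """Build the alternating (token, count, token, count, ...) descriptor.

    `customers` lists the customers in order of arrival, oldest first: each
    entry is the token held by an active customer, or None if it is inactive.
    """
    state = []
    for token in customers:
        if token is not None:
            # A new active customer opens a new part of the queue.
            state.extend([token, 0])
        else:
            if not state:
                # The oldest customer always holds a token in this model.
                raise ValueError("the oldest customer must be active")
            # Count the inactive customer behind the most recent active one.
            state[-1] += 1
    return tuple(state)


def total_customers(state):
    """Number of customers in a state: active ones plus all inactive counts."""
    return len(state[0::2]) + sum(state[1::2])
```

For example, an arrival order of `a`, two inactive customers, `b`, `c` and one inactive customer yields `("a", 2, "b", 0, "c", 1)`, a state with six customers in total.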

Assignment rule and assignment condition.  Recall that in case multiple compatible tokens are available upon the arrival of a customer, the assignment rule of the system determines the probability with which any of these tokens is assigned to the customer. Furthermore, in state , the arrival rate of customers that will initially be inactive is given by , while the arrival rate of customers that become active immediately is given by . Given the nature of the assignment rule, we denote by the rate at which arriving customers claim token , provided that is the set of all unavailable tokens. While depends on the assignment rule, it holds for any assignment rule that


As in [22], for the system to have a product-form stationary distribution, we require that any assignment rule satisfies the following assignment condition.

Condition 1.

An assignment rule is said to satisfy the assignment condition if for any possible combination of unavailable tokens , , it holds that


for every permutation of

It is shown in [2] that there always exists at least one assignment rule for which the assignment condition is satisfied. As we will also see in Section 7, the assignment condition generally allows for a rather large set of assignment rules.
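A brute-force check of such a condition can be sketched as follows. The sketch assumes that the condition takes the same form as in [22]: for every set of claimed tokens, the product of the activation rates along an ordering of those tokens must be the same for every permutation of that ordering. All names (`activation_rate`, `satisfies_assignment_condition`, the two example rules) are hypothetical, and the enumeration is only practical for small token sets.

```python
from itertools import combinations, permutations
from math import isclose, prod


def activation_rate(token, unavailable, arrival_rates, compat, rule):
    """Rate at which arrivals claim `token` when `unavailable` tokens are held."""
    rate = 0.0
    for ctype, lam in arrival_rates.items():
        free = frozenset(compat[ctype] - unavailable)
        if token in free:
            # rule(...) returns a dict: free token -> claim probability.
            rate += lam * rule(ctype, free)[token]
    return rate


def satisfies_assignment_condition(tokens, arrival_rates, compat, rule):
    """Check permutation invariance of the product of activation rates."""
    for size in range(2, len(tokens) + 1):
        for subset in combinations(sorted(tokens), size):
            products = [
                prod(
                    activation_rate(t, frozenset(order[:j]), arrival_rates, compat, rule)
                    for j, t in enumerate(order)
                )
                for order in permutations(subset)
            ]
            if not all(isclose(p, products[0], rel_tol=1e-9, abs_tol=1e-12) for p in products):
                return False
    return True


def uniform_rule(ctype, free):
    """Claim each free compatible token with equal probability."""
    return {t: 1.0 / len(free) for t in free}


def biased_rule(ctype, free):
    """Hypothetical rule that strongly prefers the alphabetically first token."""
    first = min(free)
    if len(free) == 1:
        return {first: 1.0}
    return {t: 0.9 if t == first else 0.1 / (len(free) - 1) for t in free}
```

With a single customer type that is compatible with two tokens, the uniform rule passes this check while the biased rule fails it, because the two orderings of the tokens then give different products.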

Order-independent condition.  To state the order-independent condition, we require additional notation. For any state , let denote the departure rate of the active customer holding token . Furthermore, let denote the total departure rate in state . Additionally, we denote by the total number of customers in state . The order-independent condition now reads as follows.

Condition 2.

The departure rates of the model are said to satisfy the order-independent condition if in a given state , each of the rates , , can be written as



  1. is a non-negative real-valued function for which , ,

  2. is independent of any permutation of and

  3. is a non-negative real-valued function for which for .

These restrictions on the functions , and have the following implications. First, by the restriction , the departure rate of an active customer may depend on the types of the active customers ahead of it, but not on those behind. Note that may equal zero, so that it is possible for active customers to still receive no service. Second, is defined such that the total departure rate of customers from the system is the same for any permutation of the active customers. Finally, the function allows the departure rate of customers to depend on the total number of customers present in the system, but at the same time the departure rate is indifferent to the types of the inactive customers. Next, based on the definition of , we conclude that


As mentioned earlier, this order-independent condition is reminiscent of the order-independent queue as introduced in [20]. The difference, however, stems from the fact that we consider a different state descriptor, which captures a broader set of systems (cf. Section 6.2). It is also important to note that this condition allows our model to be more general than that of [22], as will become clear in Section 6.1.
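The structure of such departure rates can be illustrated with a short sketch. It assumes, in the spirit of the order-independent queue of [20], that the rate of the j-th active customer is a population-dependent factor multiplied by the increment of a permutation-invariant set function over the first j claimed tokens. The names `departure_rates`, `mu` and `eta` are hypothetical, and the example rates are made up.

```python
def departure_rates(tokens, n_total, mu, eta):
    """Departure rates of the active customers, listed in order of arrival.

    tokens  -- claimed tokens, oldest active customer first
    n_total -- total number of customers in the state (active and inactive)
    mu      -- set function; taking a frozenset makes it permutation-invariant
    eta     -- factor depending only on the total number of customers
    """
    rates = []
    for j in range(1, len(tokens) + 1):
        # Increment caused by the j-th active customer: it depends only on the
        # tokens held by customers ahead of it, not on those behind.
        increment = mu(frozenset(tokens[:j])) - mu(frozenset(tokens[:j - 1]))
        rates.append(eta(n_total) * increment)
    return rates


# Example (an assumption): heterogeneous servers, each token being one server.
server_rate = {"a": 1.0, "b": 2.0, "c": 0.5}


def mu_servers(claimed):
    return sum(server_rate[t] for t in claimed)


def eta_constant(n):
    return 1.0
```

In this example each active customer is served at the rate of its own server, and the total departure rate telescopes to `eta(n) * mu(all claimed tokens)`, which is the same for every ordering of the active customers.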

Further notation.  We conclude this section with notation needed to describe several important performance measures. At an arbitrary point in time, let denote the number of inactive customers in the system. More particularly, denotes the number of inactive customers in the central queue between the two customers that have claimed tokens and . Thus, when the system is in state , it holds that and . Moreover, the number of type- customers among these customers is denoted by . As a consequence, the total number of inactive type- customers, denoted by , satisfies . Using the same style of notation, denotes the total number of customers present in the system. Furthermore, for , represents the number of customers in the ‘-th’ part of the system, where the added single customer is the one that holds token . Of these customers, are of type , so that , the number of type- customers present in the system, satisfies . Next, we define the time-till-token of a customer to be the duration of the period between its arrival and the moment the customer claims a token. Then, the time-till-token and the sojourn time of a type- customer are denoted by and , respectively. Likewise, the quantities and refer to the time-till-token and the sojourn time of an arbitrary customer. Finally, the indicator function on the event returns one if event is true, and zero otherwise.

4 Product-form stationary distribution

In this section, we derive the stationary distribution of the token-based central queue and find that it has a product form. In doing so, we use techniques from [22]. We describe the transition rates and the global balance equations pertaining to this process in Section 4.1. Then, we proceed to derive the product-form stationary distribution in Section 4.2. Section 4.3 subsequently points out how this stationary distribution can be computed efficiently for models with indistinguishable tokens by aggregation of states. We conclude with a note on the stability conditions of the token-based central queue in Section 4.4.

4.1 Transition rates and balance equations

The transitions associated with the model are organised into the following three categories. The first category contains transitions that are caused by the arrival of a customer. The second and third categories pertain to transitions due to a departure of a customer (i.e. a completion of service). In particular, the second category contains departure transitions where a token becomes available. The third category contains the remaining departure transitions, where the released token is immediately reclaimed by an inactive customer. We now proceed to describe the transition rates within each of these categories.

4.1.1 Arrival transitions

Recall from Section 3 that an arriving customer either joins the central queue as an inactive customer (when it finds no compatible tokens in the set ) or joins it as an active customer. In a given state , the arrival rate of inactive customers is . Hence, this is also the transition rate from state to state . Customers that immediately claim a token upon arrival, , arrive at rate . Therefore, the transition rate from state to state is given by . Recall that these rates satisfy Condition 1. By virtue of (1), we conclude that the total arrival rate into the system in any given state equals , as expected.
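The two kinds of arrival transitions can be enumerated with a small helper. This is a sketch under stated assumptions: states are tuples that alternate between claimed tokens and inactive counts, the assignment rule is passed as a function returning claim probabilities, and the names `arrival_transitions` and `uniform_rule` are hypothetical.

```python
def arrival_transitions(state, arrival_rates, compat, rule):
    """Map each state reachable by one arrival to its transition rate.

    state         -- tuple (token, count, token, count, ...), () when empty
    arrival_rates -- customer type -> Poisson arrival rate
    compat        -- customer type -> set of compatible tokens
    rule          -- rule(ctype, free) -> dict: free token -> claim probability
    """
    taken = frozenset(state[0::2])
    transitions = {}
    for ctype, lam in arrival_rates.items():
        free = frozenset(compat[ctype] - taken)
        if not free:
            if not state:
                raise ValueError("customer type without any compatible token")
            # The arrival joins as an inactive customer at the end of the queue.
            target = state[:-1] + (state[-1] + 1,)
            transitions[target] = transitions.get(target, 0.0) + lam
        else:
            # The arrival claims one of the free compatible tokens immediately.
            for token, prob in rule(ctype, free).items():
                if prob > 0:
                    target = state + (token, 0)
                    transitions[target] = transitions.get(target, 0.0) + lam * prob
    return transitions


def uniform_rule(ctype, free):
    """Claim each free compatible token with equal probability."""
    return {t: 1.0 / len(free) for t in free}
```

Whatever the rule, the rates returned for a given state add up to the total arrival rate, in line with the observation above.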

4.1.2 Departure transitions where tokens become available

To describe the departure rates, additional notation is required. Transitions to a state due to a departure of a customer where a token is released to the set are possible from states of the form

where , and . For future reference, it is worth noting that , as the population sizes of the two states differ only by the single active customer that releases token . Furthermore, we have that


Note, however, that is merely the rate in state at which the customer holding leaves the system. To obtain the transition rate from state to , this quantity must be multiplied by the probability that, after this departure, the token is indeed released from activity. This probability is given by , where


is the probability that a customer waiting in the -th portion of the central queue cannot be served by token . As a special case, we define for any token . It now follows that the transition rate from to is given by .

4.1.3 Departure transitions where tokens are reassigned

We proceed to consider the departure transitions, where a token is immediately reclaimed by another customer waiting further down the central queue. Transitions of this type to state are possible from states of the form

where , and . These transitions describe the event that the customer which holds departs the system, and token is subsequently reclaimed by an inactive customer between the customers holding and . Again, and


Similar to the previous case, the transition rate from a state to state can be argued to be equal to , where the latter factor is given by

with as defined in (6).

4.1.4 Global balance equations

Now that all transition rates have been described, the global balance equations can be obtained. Using results from Sections 4.1.1-4.1.3, denoting the stationary distribution by and recalling that the total departure rate from a state is simply , the global balance equations are, for , given by


For , however, several of these terms can be omitted, so that


4.2 Product-form stationary distribution

We now present one of the main contributions of this paper. When both the assignment condition and the order-independent condition (cf. Conditions 1 and 2) are satisfied, the token-based central queue together with its state descriptor allows for a product-form stationary distribution. This distribution is given in the following theorem.

Theorem 1.

If the token-based central queue is stable and Conditions 1 and 2 are satisfied, then, for each , the stationary distribution is given by



The normalising constant is given by


where denotes the set of all possible combinations of tokens from the set .


Proof. We verify that (10) satisfies the global balance equations (8) and (9), which guarantees that (10) represents the unique stationary distribution. It is straightforward to show that (10) satisfies (9). To see that (10) satisfies (8), we show in Appendix A that (10) satisfies the following three equations for every and :




Summing (13) over all available tokens and adding (12) and (14), we conclude using (1) that (10) satisfies (8). The theorem now follows. ∎

Remark 1.

Note that the expression for the stationary distribution in (10) is not in closed form. This is due to the fact that the normalising constant contains infinite sums. For some specific cases of the function , though, given that the token set is finite, the normalising constant allows for a closed-form expression. For example, when , (11) reduces to


which is in closed form. We will see in Section 7 that is a constant function in many applications.

Remark 2.

While we assume our model to satisfy Condition 1, a different assignment mechanism, called ALIS (‘Assign Longest Idle Server’), has been studied in [6]. Stated in the context of the token-based central queue, the key feature of an ALIS queue is that an arriving customer who finds multiple eligible tokens upon arrival will activate the token that has been available the longest. Since this mechanism cannot be captured by an assignment rule as described in Section 3, we must extend the state descriptor to keep track of which token has been available the longest in order to capture this mechanism. The new state descriptor is of the form , where are the available tokens in ascending order of the time they have been idle. In other words, if an arriving customer is eligible to claim token , it will do so. Otherwise, it will claim if it is able to do so, and so on. By using the same proof techniques as in [6], we expect it can be shown that the stationary distribution for the token-based central queue with an ALIS mechanism also has a product form.

4.3 Aggregation of states for indistinguishable tokens

When the model contains tokens which are indistinguishable from one another, computation of the stationary distribution in Theorem 1, and especially its normalising constant in (11) can be made more efficient. To define the notion of indistinguishability, we write the token set as a union of disjoint token sets , where it holds for any two tokens , that , , and for . We then call tokens which belong to the same token set indistinguishable from one another.

If the number of disjoint token sets is much smaller than , the computational burden of Theorem 1 can be relieved by state aggregation. To this end, let us say that a token has a token label whenever , . Then, since any two tokens and from the set are indistinguishable, we may as well address both of them by their token label . This leads to the state descriptor of the form , where now represents the label of the token which is held by the -th active customer in the system, but not the identity of the actual token. We denote the state space under this state descriptor with . Let denote the label of token , i.e. if . Then, by aggregation of states, one can derive the following stationary distribution for the aggregated state descriptor from (10):


where (with ) represents the arrival rate of customers that immediately claim a token with label , when there are active customers that have claimed tokens from labels . Likewise, when considering tokens such that for , we define and represents the same set of customer types as . The normalising constant as given in (11) remains unchanged, but can now alternatively be written as

where represents any possible combination of token labels. In the sequel, when working with the aggregated state descriptor, we will use and as notation for the equivalents of and .

4.4 Stability

From the stationary distribution (10), conditions for stability can be derived. In particular, in case the function has a limit , the system will be stable if for each and . Under this condition, Equation (10) constitutes a non-null and convergent solution of the equilibrium equations of the irreducible Markov process underlying the model. As such, it is implied by [12, Theorem 1] that the Markov process is ergodic, leading to stability. When for some and , we have by (11) that , implying that the expected return time to state (0) is infinite. As such, the Markov process is not ergodic and the token-based central queue is unstable. Finally, in case , the question of whether the process is ergodic depends on the way (and possibly the speed at which) the function converges to its limit .

5 Performance analysis

Now that we have derived the (product-form) stationary distribution in Section 4, we study several performance measures of the token-based central queue. In particular, we study the (per-type) number of inactive customers in Section 5.1. Likewise, we study the (per-type) number of customers present in the system in Section 5.2. Then, making use of the distributional form of Little’s law (cf. [18]), we obtain results for the time-till-token of type- customers (i.e., the time it takes for customers to claim a token) in Section 5.3. As we will see in Section 7, coincides with the waiting time of type- customers in many applications of our model. Finally, we also consider the sojourn time of customers in Section 5.4.

5.1 Number of inactive customers

This section considers the number of inactive customers in the system. For applications where the time-till-token represents the waiting time, this number coincides with the number of customers in the system waiting for service. The main theorem of this section concerns the probability generating function (PGF) of , the number of type- customers that are inactive.

Theorem 2.

Let for and . Then, the joint PGF of is, for , given by


Proof. The proof relies heavily on Theorem 1 and can be found in Appendix B. ∎

An expression for , the total number of inactive customers in the system, now follows from the fact that and .

Corollary 3.

The total number of inactive customers in the system satisfies, for ,


5.2 Number of customers in the system

We now study the number of customers present in the system, both per-type () and in general (), by noting that these customers are comprised of inactive customers on one hand and active customers whose service is yet to be completed on the other hand.

Theorem 4.

Let be the type of the customer that holds token and define . Then, the joint PGF of , representing the per-class number of customers present in the system, is, for given by