I Introduction
Infinite server queues are an important stochastic modeling tool for a diverse array of disciplines: they are used to study Internet services [1], fragmentation in dynamic storage allocation [2], software reliability [3], and call centers [4]. Although having an infinite number of servers is typically only an approximation of reality, such approximations are often very useful. In the case of Internet services, these approximations are also becoming increasingly reasonable. Cloud based web hosting from companies like Amazon, Google, and Microsoft allows businesses to dynamically provision their web services so they can accommodate almost arbitrarily large customer demand [5]. Because customer demand is (almost) always met immediately, infinite server queues offer both natural and tractable stochastic models.
While infinite server queues are useful by themselves, they become even more powerful when used in conjunction with queueing network theory. In particular, for many types of queueing networks it is easy to compute product-form stationary distributions for the number of customers at each node in the network. The term product-form refers to the fact that the joint distribution is merely a product of the marginals, and this drastically simplifies the analysis of seemingly intractable systems. The first major theoretical result on product-form distributions for queueing networks was presented by Jackson [6]. This seminal work was later extended by Baskett, Chandy, Muntz, and Palacios in the study of what are now called BCMP networks [7]. Subsequently, Kelly used the idea of quasireversibility to study Jackson and BCMP networks in the more general framework of Kelly networks [8]. Kelly networks have been used and built upon in several ways; see [9, Chapter 4] and the references therein for more details.
A particularly interesting development is the notion of insensitivity to the service time distributions (see [10, Chapter 3] and the references therein). Insensitivity refers to the fact that many product-form distributions depend on the service time distributions only through their means and hence, general service times can easily be incorporated into queueing network models. For instance, even if the service times are not exponential, applying Jackson’s Theorem [6] as if they were exponential will often yield the correct product-form distribution. Insensitivity results still typically require Poisson arrivals, but allowing for general service time distributions offers considerable modeling flexibility.
Although the product-form results mentioned above are very general and powerful, they typically require Markovian routing. Formally, this means that when a customer completes service at a particular node and is routed elsewhere, the routing decision must be independent of the past evolution of the network. In particular, requiring Markovian routing precludes routing that is dependent on previous service times. This is limiting because we are interested in a model in which routing decisions are based on service times through service time constraints, e.g. deadlines.
In particular, we are motivated by Internet services with timeouts [11]. For example, consider a customer who uses an online financial service with multi-factor authentication for logging in. There is a deadline for the verification process – if the customer takes too long then the verification process will fail and the customer will need to try again. In this scenario, we can model the login step as an infinite server queue because (assuming a cloud hosting service is used) an arbitrary number of customers can sign in without waiting. The total service time is given by the minimum of two independent service times – the amount of time required to sign in and a deadline. Moreover, the routing decision depends on which service time achieves this minimum. Indeed, if the timeout occurs and the service time is the deadline, then the customer will be rerouted back to the sign-in page. If the timeout does not occur then the customer will successfully enter the system. Even in this very simple scenario, we see that timeouts and deadlines can drive routing decisions. As a result, the routing is generally not Markovian and we are unable to apply the typical queueing theory to find the equilibrium distribution of the system.
Although computing the equilibrium distribution of a queueing system with non-Markovian routing is not necessarily straightforward, knowing the equilibrium distribution can be quite useful. In the case of Internet services, the equilibrium distribution can be used for marketing. For instance, the average number of customers on a page and the average amount of time they spend on that same page are useful for pricing display advertisements [12]. While this information can be collected empirically, getting the full distribution can be time consuming if the service times are heavy-tailed, as is often the case [13]. Consequently, there is substantial value in having an analytic model that can be used to understand and design web services without running full experiments.
Given this motivation and background, the remainder of the paper is organized as follows. In Section II, we present our model and position it relative to the existing theory. Since our model cannot be immediately solved by applying existing results, in Section III we show that our model does in fact have a product-form equilibrium distribution. In Section IV, we verify this analytic result with a simulation. We conclude in Section V.
II Model Formulation and Limitations of Existing Network Theory
In this section, we formally outline the stochastic model of interest. We then discuss the applicability of the theory of Jackson and Kelly networks. In particular, we show that if all service times are exponentially distributed then we can apply Jackson’s Theorem [6] to find the product-form distribution. However, if the service times are general, then we cannot apply insensitivity and quasireversibility results (e.g. [10, Chapter 3]) to find a product-form equilibrium distribution. This demonstrates the nuances of our model.
II-A Model Formulation and Motivation
We want to model infinite capacity service systems in which customers’ service times and routing probabilities are impacted by deadlines. For example, consider a system in which customers experience a natural service time but are ejected if their service time exceeds a fixed deadline. In this case, the total service time is the minimum of the natural service time and the deadline. Moreover, the customer will be routed differently based on whether or not he experienced the deadline or the natural service time. More generally, we consider systems in which customers experience a service time that is the minimum of several independent service times. Customers are routed stochastically but the routing matrix will change based on which of the competing service times achieved the minimum.
Formally, we consider a queueing network with $N$ nodes. Each node has an infinite number of servers. At node $i$, customers experience $K$ independent competing service times
$$S_{i,1}, S_{i,2}, \ldots, S_{i,K}$$
with a total service time of
$$S_i = \min_{1 \le k \le K} S_{i,k}.$$
We assume that the probability that more than one of the service times achieves the minimum is zero. We allow for some of these competing service times to be infinite. Let $S_{i,k}^{(c)}$ denote the $k$-th competing service time for customer $c$ at node $i$ and
$$S_i^{(c)} = \min_{1 \le k \le K} S_{i,k}^{(c)}.$$
We assume that each customer’s service times are independent of all other customers’ service times. Furthermore, if a customer is served by the same node multiple times, then these service times are also independent. If
$$S_i^{(c)} = S_{i,k}^{(c)}$$
then customer $c$ is routed from node $i$ to node $j$ with probability $r_{ij}^{(k)}$ and the customer exits the network with probability $1 - \sum_{j=1}^{N} r_{ij}^{(k)}$. Customers arrive to node $i$ according to a Poisson process with rate $\lambda_i$. We assume that every customer spends only a finite amount of time in the network (i.e. the network is open).
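The service and routing mechanism described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `serve_once`, the data layout, and the example node (an exponential time with rate 1 competing against a fixed deadline of 1) are hypothetical and not taken from the paper.

```python
import random

def serve_once(samplers, routing, rng):
    """Simulate one visit to an infinite-server node with competing service times.

    samplers[k](rng) draws the k-th competing service time.
    routing[k] maps a destination node to its probability when time k is the
    minimum; any leftover probability mass means the customer exits (None).
    Returns (total service time, index of the minimum, next node or None).
    """
    times = [draw(rng) for draw in samplers]
    k = min(range(len(times)), key=times.__getitem__)  # which time achieved the minimum
    u, acc = rng.random(), 0.0
    for node, prob in routing[k].items():
        acc += prob
        if u < acc:
            return times[k], k, node
    return times[k], k, None

# Hypothetical node: exponential natural time (rate 1) against a fixed deadline of 1.
rng = random.Random(0)
samplers = [lambda r: r.expovariate(1.0), lambda r: 1.0]
routing = [{2: 1.0}, {1: 1.0}]  # finish in time -> node 2, timeout -> node 1
visits = [serve_once(samplers, routing, rng) for _ in range(10000)]
```

The routing table is indexed by the competing time that achieved the minimum, which is exactly what makes the routing depend on the service time.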
We note that because we are interested in infinite server nodes, we can focus on a single class of customers without any loss of generality. This is because the jobs do not interact in a queueing buffer. Indeed, if we had multiple classes, we could consider a “copy” of the network for each class. As long as customers do not change classes, these copies will not interact just as the different customer classes do not interact. The arrival rates would differ for each class but the model would not fundamentally change.
II-B Limitations of Jackson and Kelly Network Theory
Jackson’s Theorem [6] is a celebrated result for attaining product-form equilibrium distributions of queueing networks. Although Jackson’s Theorem applies only to queueing networks with exponential service times, the results can often be extended to the case of general service times (e.g. Kelly networks [8]) with insensitivity arguments, e.g. [10, Chapter 3]. We first show how Jackson’s Theorem can be applied to our model in the case that the service times are exponential. We then show that, unlike many other models, the equilibrium distribution depends on the service time distributions in their entirety, not merely through their means, and hence insensitivity arguments do not apply to our model.
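Theorem 1 is the version of Jackson’s Theorem presented in [9, Chapter 2]. Jackson’s Theorem was originally presented in [6].
Theorem 1 (Jackson’s Theorem)
Suppose we have an $N$ node queueing network in which jobs arrive at rate $\alpha$. Arriving jobs are independently routed to node $i$ with probability $p_{0i}$ where $\sum_{i=1}^{N} p_{0i} = 1$. Upon service completion at node $i$, a job is routed to node $j$ with probability $p_{ij}$ and leaves the network with probability $1 - \sum_{j=1}^{N} p_{ij}$. We assume that the routing matrix $P = [p_{ij}]$ is substochastic (i.e. at least one of the row sums is strictly less than one so the network is open).
When there are $n$ jobs at node $i$, the exponential service time has rate $\mu_i(n)$ where $\mu_i : \mathbb{Z}_+ \to \mathbb{R}_+$ with $\mu_i(0) = 0$ and $\mu_i(n) > 0$ for all $n \ge 1$.
Let $\lambda \in \mathbb{R}^N$ where the $i$-th entry is $\lambda_i = \alpha p_{0i}$ and define $\Lambda$ as the solution to the following equation (a unique solution exists because $P$ is substochastic; furthermore, this solution is nonnegative; see [9, Chapter 2] for details):
$$\Lambda = \lambda + P^{\top} \Lambda.$$
Let $M_i$ be defined as
$$M_i(n) = \prod_{m=1}^{n} \mu_i(m)$$
and assume the following:
$$\sum_{n=1}^{\infty} \frac{\Lambda_i^n}{M_i(n)} < \infty \quad \text{for all } i.$$
Let $Q_i$ be the number of jobs at node $i$ in equilibrium. Then
$$P(Q_1 = n_1, \ldots, Q_N = n_N) = \prod_{i=1}^{N} P(Q_i = n_i)$$
where
$$P(Q_i = n) = P(Q_i = 0)\, \frac{\Lambda_i^n}{M_i(n)}$$
and
$$P(Q_i = 0) = \left[ 1 + \sum_{n=1}^{\infty} \frac{\Lambda_i^n}{M_i(n)} \right]^{-1}.$$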
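To make the ingredients of Jackson's Theorem concrete, the following sketch solves the flow equations by fixed-point iteration and evaluates a truncated marginal distribution. It is a minimal illustration, not the construction used in [9]: the function names, the truncation level, and the two-node example with feedback are assumptions chosen here.

```python
def solve_traffic(external, P, tol=1e-12, max_iter=100000):
    """Solve flow_j = external_j + sum_i flow_i * P[i][j] by fixed-point iteration.

    The iteration converges when the network is open (spectral radius of P below one).
    """
    n = len(external)
    flow = list(external)
    for _ in range(max_iter):
        new = [external[j] + sum(flow[i] * P[i][j] for i in range(n)) for j in range(n)]
        if max(abs(a - b) for a, b in zip(new, flow)) < tol:
            return new
        flow = new
    raise RuntimeError("traffic equations did not converge; is the network open?")

def jackson_marginal(flow_i, rate, n_max):
    """Truncated marginal pmf proportional to flow_i**n / (rate(1) * ... * rate(n))."""
    weights, w = [1.0], 1.0
    for n in range(1, n_max + 1):
        w *= flow_i / rate(n)
        weights.append(w)
    total = sum(weights)
    return [x / total for x in weights]

# Hypothetical two-node network: unit external arrivals to node 1, with feedback.
flows = solve_traffic([1.0, 0.0], [[0.5, 0.5], [0.5, 0.0]])
# Infinite-server node whose customers are each served at rate 2: rate(n) = 2n.
pmf = jackson_marginal(flows[0], lambda n: 2.0 * n, 60)
```

With a service rate function of the form `rate(n) = c * n`, the marginal reduces to a Poisson distribution with mean `flow_i / c`, which is the infinite-server case used in the rest of the section.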
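To apply Theorem 1, we need to focus on the case when $S_{i,k}$ is exponentially distributed with rate $\mu_{i,k}$. Now we will need to make use of the following elementary result:
Lemma 1 ([10, Fact 2.3.1])
Let $X_1, \ldots, X_n$ be independent exponential random variables with rates $a_1, \ldots, a_n$ and let $a = a_1 + \cdots + a_n$. Then for every $k$ and every $t \ge 0$,
$$P\left( \min_m X_m = X_k,\ \min_m X_m > t \right) = \frac{a_k}{a}\, e^{-a t}.$$
To quote [10], this lemma “states that the index of the smallest of independent exponential random variables is independent of the value of that minimum which is also exponentially distributed.” This tells us that a customer at node $i$ experiences an exponential service time with rate $\mu_i = \sum_{k=1}^{K} \mu_{i,k}$ and that after service is complete the customer is routed to station $j$ with probability
$$r_{ij} = \sum_{k=1}^{K} \frac{\mu_{i,k}}{\mu_i}\, r_{ij}^{(k)}$$
independently of the service time. In addition, we know that the service rate function at node $i$ is $\mu_i(n) = n \mu_i$. If $Q_i$ is the number of customers at node $i$ in equilibrium, then applying Theorem 1 gives us that
$$P(Q_i = n) = e^{-\rho_i}\, \frac{\rho_i^n}{n!}$$
where $\rho_i = \Lambda_i / \mu_i$ and $\Lambda$ solves
$$\Lambda_i = \lambda_i + \sum_{j=1}^{N} \Lambda_j r_{ji}.$$
Furthermore, Theorem 1 tells us that
$$P(Q_1 = n_1, \ldots, Q_N = n_N) = \prod_{i=1}^{N} P(Q_i = n_i)$$
i.e. we have a product-form distribution.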
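The independence property stated in Lemma 1 can be checked numerically. The sketch below is a simple Monte Carlo experiment with hypothetical rates (1 and 2) and a hypothetical threshold; it estimates how often the first variable is the minimum, both overall and conditional on the minimum being large. Under the lemma, both estimates should be close to the ratio of the first rate to the sum of the rates.

```python
import random

def min_index_stats(rates, threshold, n_samples=200000, seed=1):
    """Estimate P(variable 0 is the minimum), overall and given minimum > threshold."""
    rng = random.Random(seed)
    wins = wins_late = late = 0
    for _ in range(n_samples):
        draws = [rng.expovariate(a) for a in rates]
        m = min(draws)
        first = draws[0] == m          # True when variable 0 achieved the minimum
        wins += first
        if m > threshold:
            late += 1
            wins_late += first
    return wins / n_samples, wins_late / late

overall, conditional = min_index_stats([1.0, 2.0], threshold=0.5)
```

Repeating the experiment with a deterministic second variable (a deadline) instead of an exponential one makes the two estimates diverge, which is the reason the argument does not carry over to general service times.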
This seems to be a very powerful result. Not only do we have a product-form distribution, the distribution depends on the service times only through their first moments. It is tempting to assume an insensitivity result such as [10, Theorem 3.3.2]:
Theorem 2
Suppose the nodes in Theorem 1 are infinite server queues with general and independent service times. Assume $1/\mu_i$ is the mean service time at node $i$. We continue to assume that the arrivals to the network are Poisson and that the routing decisions are stochastic and independent of the past evolution of the network. Then
$$P(Q_1 = n_1, \ldots, Q_N = n_N) = \prod_{i=1}^{N} e^{-\rho_i}\, \frac{\rho_i^{n_i}}{n_i!}$$
where
$$\rho_i = \Lambda_i / \mu_i$$
and $\Lambda$ solves the same linear flow equations from Theorem 1.
Informally, this means that we can apply Theorem 1 even when the service times are non-exponential. This kind of result applies for queues besides the infinite server queue; the key requirement is that the queue be quasireversible (see [10, Chapter 3] and the references therein for a discussion of quasireversible queues and insensitivity results).
For our model, when considering exponential service times, the probability that a customer at node $i$ will go to node $j$ after service is $r_{ij} = \sum_{k} (\mu_{i,k} / \mu_i)\, r_{ij}^{(k)}$ with $\mu_i = \sum_{k} \mu_{i,k}$. However, for general service times the probability is
$$\tilde{r}_{ij} = \sum_{k=1}^{K} P(S_i = S_{i,k})\, r_{ij}^{(k)}$$
and in general $\tilde{r}_{ij} \neq r_{ij}$. Consequently, assuming full insensitivity does not seem correct. Another approach would be to use $\tilde{r}_{ij}$ instead of $r_{ij}$ but continue to use $\mu_i$ as the service rate at node $i$. Indeed, the term “insensitivity” typically refers to insensitivity of the service time distributions so it makes sense that the routing matrix should change. Although less naïve than assuming full insensitivity, we will see that this approach is also incorrect.
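As a concrete instance of the gap between the two routing probabilities, consider a node with two competing times: an exponential time $S$ with rate $\mu$ and a deterministic deadline $d$ (these symbols are chosen here for illustration). Treating the deadline as if it were exponential with the same mean $d$ gives the first expression below, while the true probability of finishing before the deadline is the second:

\[
\frac{\mu}{\mu + 1/d}
\qquad \text{versus} \qquad
P(S < d) = 1 - e^{-\mu d}.
\]

With $\mu = d = 1$ these evaluate to $1/2$ and $1 - e^{-1} \approx 0.632$, so the exponential routing probability cannot simply be reused once the deadline is deterministic.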
Both naïve insensitivity approaches are incorrect for two reasons. First note that if the service distributions change, then $1/\mu_i$ will generally not be the mean service time at node $i$. However, even if the mean service times were preserved, applying Theorem 2 would still be incorrect because the routing is not Markovian. As mentioned earlier, Markovian routing is a form of probabilistic routing for which the routing decisions do not depend on the past evolution of the network. Markovian routing is required for both Theorem 1 and Theorem 2. In our model, the routing decisions are based on the service time that achieves the minimum and consequently the routing decisions are not independent of the past evolution of the network.
We see that Lemma 1 and Theorem 1 provide us with a product-form result when the service times are exponential, but these same arguments do not extend to the case of non-exponential service times. We will demonstrate by simulation in Section IV that not only are the two naïve approaches (assuming full insensitivity and assuming insensitivity of the service times) not mathematically justified, they also give incorrect forms for the equilibrium distribution.
III A Product-Form Equilibrium Distribution
The discussion in Section II-B explains why typical mathematical arguments cannot be applied to find a product-form distribution for the queueing model described in Section II-A. However, in this section we are able to explicitly characterize the stationary distribution as a product-form. The key insight is to construct a queueing network for which the routing is Markovian and which also has a stochastically equivalent equilibrium distribution. We are able to do this because infinite server queueing nodes can be represented as several infinite server queueing nodes acting in parallel. Although it may seem that adding more nodes adds more servers, because we are dealing with infinite server queues to begin with, the total number of servers is actually preserved.
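Theorem 3
Consider the queueing model from Section II-A and let $Q_i$ be the number of customers at node $i$ in equilibrium. For $i \in \{1, \ldots, N\}$ and $k \in \{1, \ldots, K\}$, define
$$p_{i,k} = P(S_i = S_{i,k}), \qquad \frac{1}{\nu_{i,k}} = E\left[ S_i \mid S_i = S_{i,k} \right]$$
and let $\Lambda_{i,k}$ be defined by the following equations:
$$\Lambda_{i,k} = \lambda_i p_{i,k} + \sum_{j=1}^{N} \sum_{l=1}^{K} \Lambda_{j,l}\, r_{ji}^{(l)}\, p_{i,k} \qquad (1)$$
Then
$$P(Q_i = n) = e^{-\rho_i}\, \frac{\rho_i^n}{n!}$$
where $\rho_i = \sum_{k=1}^{K} \Lambda_{i,k} / \nu_{i,k}$. Furthermore,
$$P(Q_1 = n_1, \ldots, Q_N = n_N) = \prod_{i=1}^{N} P(Q_i = n_i)$$
i.e. we have a product-form distribution.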
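A hedged sketch of how the quantities in Theorem 3 might be computed numerically is given below. The function name `theorem3_loads`, the nested-list data layout, and the single-node retry example are assumptions made for illustration. The code uses the fact that the flow into each sub-node is the total flow into the node multiplied by the probability that the corresponding competing time achieves the minimum.

```python
from math import exp

def theorem3_loads(lam, p, nu, r, tol=1e-12, max_iter=100000):
    """Per-node Poisson means for a network with service-time-dependent routing.

    lam[i]      external Poisson arrival rate to node i
    p[i][k]     probability that competing time k is the minimum at node i
    nu[i][k]    1 / E[total service time at node i | time k is the minimum]
    r[i][k][j]  probability of moving from i to j when time k is the minimum
    """
    N = len(lam)
    total = list(lam)  # total[i] approximates the total arrival rate into node i
    for _ in range(max_iter):
        new = [lam[i] + sum(total[j] * p[j][l] * r[j][l][i]
                            for j in range(N) for l in range(len(p[j])))
               for i in range(N)]
        converged = max(abs(a - b) for a, b in zip(new, total)) < tol
        total = new
        if converged:
            break
    else:
        raise RuntimeError("flow equations did not converge; is the network open?")
    # The flow into sub-node (i, k) is p[i][k] * total[i]; divide by its service rate.
    return [sum(total[i] * p[i][k] / nu[i][k] for k in range(len(p[i])))
            for i in range(N)]

# Hypothetical single node: exponential time (rate 1) against a deadline of 1.
# Finishing in time means leaving the network; a timeout means retrying.
q = exp(-1.0)  # probability that the deadline is hit first
loads = theorem3_loads(
    lam=[1.0],
    p=[[1 - q, q]],
    nu=[[1.0 / (1.0 - q / (1 - q)), 1.0]],
    r=[[[0.0], [1.0]]],
)
```

In this example the equilibrium number of customers at the node is Poisson with mean `loads[0]`.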
Proof: Consider the following $NK$ node network of infinite server queues. At node $(i,k)$, the service time is distributed as
$$(S_i \mid S_i = S_{i,k})$$
so the service rate at node $(i,k)$ is $\nu_{i,k}$. Upon completing service at node $(i,k)$ a customer is routed in a Markovian fashion to node $(j,l)$ with probability
$$r_{ij}^{(k)}\, p_{j,l}.$$
With probability
$$1 - \sum_{j=1}^{N} \sum_{l=1}^{K} r_{ij}^{(k)}\, p_{j,l} = 1 - \sum_{j=1}^{N} r_{ij}^{(k)}$$
a customer leaves the network after service at node $(i,k)$. Customers arrive externally to node $(i,k)$ according to a Poisson process with rate $\lambda_i p_{i,k}$.
Because the arrivals are Poisson and Poisson Arrivals See Time Averages (PASTA) [14], we can show that the equilibrium distribution of this network is stochastically equivalent to the equilibrium distribution of the network in Section II-A by considering the perspective of customers who arrive to various nodes. We will refer to nodes $(i,1), \ldots, (i,K)$ as supernode $i$ and we will show that in equilibrium supernode $i$ is stochastically equivalent to node $i$ in the original network. Furthermore, we will show that customers are routed between supernodes in a manner that is stochastically equivalent to the manner in which customers are routed between nodes in the original network.

Service Times: Among all (internal and external) arrivals to supernode $i$, the service time distribution is
$$\sum_{k=1}^{K} p_{i,k}\, P(S_i \le t \mid S_i = S_{i,k}) = P(S_i \le t).$$
In addition, in the original network a customer will experience service time $(S_i \mid S_i = S_{i,k})$ with probability $p_{i,k}$. Therefore, we have that the service times in supernode $i$ are stochastically equivalent to the service times at node $i$ in the original network.

External Arrivals: The total external arrival rate to supernode $i$ is
$$\sum_{k=1}^{K} \lambda_i p_{i,k} = \lambda_i$$
which is the external arrival rate to node $i$ in the original network.

Routing: Now consider how customers are routed from supernode $i$ to supernode $j$. First note that since a customer at node $(i,k)$ is routed to node $(j,l)$ with probability $r_{ij}^{(k)} p_{j,l}$, the probability of being routed from supernode $i$ to supernode $j$ is
$$\sum_{k=1}^{K} p_{i,k} \sum_{l=1}^{K} r_{ij}^{(k)}\, p_{j,l} = \sum_{k=1}^{K} p_{i,k}\, r_{ij}^{(k)}$$
which is the same probability of being routed from node $i$ to node $j$ in the original network. This is true for all $i$ and $j$, and this also implies that the probability of exiting the system after service at supernode $i$ is the same as the probability of exiting the system after leaving node $i$ in the original network.
Now consider the relationship between service times and routing decisions. If a customer at supernode $i$ experiences a service time that is distributed according to $(S_i \mid S_i = S_{i,k})$, that customer is then routed to node $(j,l)$ with probability $r_{ij}^{(k)} p_{j,l}$. Since
$$\sum_{l=1}^{K} r_{ij}^{(k)}\, p_{j,l} = r_{ij}^{(k)}$$
this shows that the relationship between service times and routing decisions is maintained.
Because the external arrivals and the routing decisions are equivalent, the customer flows between supernodes are equivalent to the corresponding customer flows between nodes in the original network. We can conclude that the supernodes in the $NK$ node network are equivalent to the nodes in the $N$ node network.
Because we have shown equivalence of the networks, we can find the stationary distribution of the original network by finding the stationary distribution of the new network. Each node in the new network is an infinite server queue with general service times and the routing between nodes is Markovian. As a result, we can apply Theorem 2. Let $Q_{i,k}$ be the number of customers at node $(i,k)$ in equilibrium. Equation 1 defines the flows in the network: $\Lambda_{i,k}$ is the total arrival rate to node $(i,k)$. Indeed, $\lambda_i p_{i,k}$ is the external arrival rate to node $(i,k)$. In addition, customers are routed from node $(j,l)$ to node $(i,k)$ with probability $r_{ji}^{(l)} p_{i,k}$ so the second term in Equation 1 is the internal arrival rate. We are assuming that the network is open so we know that Equation 1 has a unique solution that is nonnegative (see [9, Chapter 2] for details). Since $\nu_{i,k}$ is the service rate at node $(i,k)$ and node $(i,k)$ is an infinite server queue, we have that
$$P(Q_{i,k} = n) = e^{-\rho_{i,k}}\, \frac{\rho_{i,k}^n}{n!}$$
where $\rho_{i,k} = \Lambda_{i,k} / \nu_{i,k}$. Since
$$Q_i = \sum_{k=1}^{K} Q_{i,k}$$
we have that $Q_i$ is Poisson with mean $\rho_i = \sum_{k=1}^{K} \rho_{i,k}$. The stochastic equivalence of the equilibrium distribution of the entire network allows us to conclude that the desired product-form distribution holds.
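A short consequence of Equation 1 may help in reading the result. The notation here is assumed for illustration: $p_{i,k}$ is the probability that the $k$-th competing time achieves the minimum at node $i$, $\Lambda_{i,k}$ is the flow into the corresponding sub-node, and $1/\nu_{i,k}$ is the conditional mean service time given that event. Because every arrival to node $i$ is split with probabilities $p_{i,k}$, the sub-node flows are proportional to $p_{i,k}$, and the Poisson mean at node $i$ collapses to the total flow times the unconditional mean service time, as Little's law would suggest:

\[
\Lambda_{i,k} = p_{i,k}\, \Lambda_i, \quad \Lambda_i = \sum_{k} \Lambda_{i,k}
\quad \Longrightarrow \quad
\rho_i = \sum_{k} \frac{\Lambda_{i,k}}{\nu_{i,k}}
= \Lambda_i \sum_{k} p_{i,k}\, E\left[ S_i \mid S_i = S_{i,k} \right]
= \Lambda_i\, E[S_i].
\]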
IV Simulation Verification
In this section we focus on a simple example. We illustrate the proof of Theorem 3 by explicitly showing how the constructed network corresponds to the original network. We then simulate the original network to demonstrate that the product-form in Theorem 3 is correct. We also compare these simulation results with the two naïve approaches from Section II-B (assuming full insensitivity and assuming insensitivity of the service times) to show that the naïve approaches yield incorrect answers.
We consider the two node network in Figure 1(a). This network is a simple model of customers arriving to an online web service to complete two tasks in sequence (e.g. a login process followed by a credit card transaction). If a customer does not complete either task within a fixed deadline, then the customer must start again at the beginning. Specifically, the network can be described as follows:

New customers arrive at node 1 according to a Poisson process of rate $\lambda$.

At node 1, customers have a service time $S_{1,1}$ that is exponentially distributed with rate $\mu_1$. There is also a deterministic deadline of $S_{1,2} = d_1$.

If $S_{1,1} < d_1$, then service is completed and the customer is routed with probability 1 to node 2. If $S_{1,1} > d_1$ then the service time exceeded the deadline and the customer is routed back to node 1.

At node 2, customers have a service time $S_{2,1}$ that is exponentially distributed with rate $\mu_2$ and a deterministic deadline of $S_{2,2} = d_2$.

If $S_{2,1} < d_2$ then the customer exits the system and if $S_{2,1} > d_2$ then the deadline was exceeded and the customer is routed back to node 1.
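The equivalent network is shown in Figure 1(b). Writing $p_{i,1} = 1 - e^{-\mu_i d_i}$ and $p_{i,2} = e^{-\mu_i d_i}$ for $i = 1, 2$, this network can be described as follows:

New customer arrivals are routed to node $(1,1)$ with probability $p_{1,1}$ and to node $(1,2)$ with probability $p_{1,2}$.

At node $(1,1)$ the service time is $(S_{1,1} \mid S_{1,1} < d_1)$. Since this is a truncated exponential, the service rate at $(1,1)$ is $\nu_{1,1} = \left[ \frac{1}{\mu_1} - \frac{d_1 e^{-\mu_1 d_1}}{1 - e^{-\mu_1 d_1}} \right]^{-1}$.

After service at node $(1,1)$, customers are routed to node $(2,1)$ with probability $p_{2,1}$ and to node $(2,2)$ with probability $p_{2,2}$.

The service rate at node $(1,2)$ is $\nu_{1,2} = 1/d_1$.

After service at node $(1,2)$, customers are routed to node $(1,1)$ with probability $p_{1,1}$ and to node $(1,2)$ with probability $p_{1,2}$.

At node $(2,1)$ the service time is $(S_{2,1} \mid S_{2,1} < d_2)$. Since this is a truncated exponential, the service rate at $(2,1)$ is $\nu_{2,1} = \left[ \frac{1}{\mu_2} - \frac{d_2 e^{-\mu_2 d_2}}{1 - e^{-\mu_2 d_2}} \right]^{-1}$.

After service at node $(2,1)$, customers exit the system.

At node $(2,2)$ the service time is $d_2$.

After service at node $(2,2)$, customers are routed to node $(1,1)$ with probability $p_{1,1}$ and to node $(1,2)$ with probability $p_{1,2}$.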
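The four-node construction listed above can be assembled and solved directly. The sketch below assumes unit values for all parameters (an assumption made here, chosen because it reproduces the exact row of Table I) and uses variable names that are not from the paper. It builds the expanded routing matrix, solves the flow equations by fixed-point iteration, and sums the sub-node loads within each supernode.

```python
from math import exp

lam, mu1, mu2, d1, d2 = 1.0, 1.0, 1.0, 1.0, 1.0  # assumed unit parameters

def split(mu, d):
    """Return (P(finish before deadline), truncated-exponential rate, deadline rate)."""
    p_done = 1.0 - exp(-mu * d)
    mean_done = 1.0 / mu - d * exp(-mu * d) / p_done  # E[S | S < d]
    return p_done, 1.0 / mean_done, 1.0 / d

p1, nu11, nu12 = split(mu1, d1)
p2, nu21, nu22 = split(mu2, d2)

# Sub-nodes in the order (1,1), (1,2), (2,1), (2,2).
external = [lam * p1, lam * (1 - p1), 0.0, 0.0]
to_node1 = [p1, 1 - p1, 0.0, 0.0]   # split of a customer entering supernode 1
to_node2 = [0.0, 0.0, p2, 1 - p2]   # split of a customer entering supernode 2
# Row m gives the destination probabilities after service at sub-node m;
# the all-zero row for (2,1) means the customer leaves the network.
R = [to_node2, to_node1, [0.0] * 4, to_node1]
rates = [nu11, nu12, nu21, nu22]

flow = list(external)
for _ in range(10000):  # fixed-point iteration on the flow equations
    flow = [external[j] + sum(flow[i] * R[i][j] for i in range(4)) for j in range(4)]

loads = [f / v for f, v in zip(flow, rates)]
mean_node1 = loads[0] + loads[1]
mean_node2 = loads[2] + loads[3]
```

Under these assumed parameters, `mean_node1` and `mean_node2` are the Poisson means of the number of customers at node 1 and node 2.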
Given this information, we can simulate the original network and compare the empirical distribution to the distribution from Theorem 3. In addition, we can compare the distribution attained by assuming full insensitivity and the distribution attained by assuming insensitivity of the service times. For simplicity, we focus on the case of $\lambda = \mu_1 = \mu_2 = d_1 = d_2 = 1$. We simulate for time units with a time discretization of
. We take a single run of the simulation and take time averages to estimate the true distributions. We note that by relying on a single simulation run, we are using the fact that the system is ergodic.
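For readers who want to reproduce the time-average estimates, the following is a minimal event-based sketch. It is not the time-discretized simulation described above: because customers at infinite server nodes never wait for one another, each customer's path can be generated independently and the time spent at each node accumulated. The function name, the horizon of 20000 time units, the seed, and the unit parameters are all assumptions made here.

```python
import random

def simulate_means(lam=1.0, mu1=1.0, mu2=1.0, d1=1.0, d2=1.0,
                   horizon=20000.0, seed=0):
    """Estimate the time-average number of customers at node 1 and node 2."""
    rng = random.Random(seed)
    busy = [0.0, 0.0]  # customer-time accumulated at each node within [0, horizon]
    t = rng.expovariate(lam)
    while t < horizon:
        now, node = t, 0
        while node is not None and now < horizon:
            rate, deadline = (mu1, d1) if node == 0 else (mu2, d2)
            natural = rng.expovariate(rate)
            service = min(natural, deadline)
            busy[node] += min(now + service, horizon) - now
            now += service
            if natural >= deadline:
                node = 0        # timeout: start again at node 1
            elif node == 0:
                node = 1        # first task finished in time
            else:
                node = None     # second task finished in time: exit
        t += rng.expovariate(lam)
    return busy[0] / horizon, busy[1] / horizon

sim_node1, sim_node2 = simulate_means()
```

The network starts empty, so a short initial transient is included in the averages; over a long horizon its effect is small.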
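Table I: Average number of customers in each node.

| Method | Node 1 | Node 2 |
|---|---|---|
| Simulated | 1.591 | 1.003 |
| Exact | 1.582 | 1.000 |
| Assuming Full Insensitivity | 2.000 | 1.000 |
| Assuming Insensitivity of the Service Times | 1.251 | 0.791 |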
First we compare the average number of customers in each node. Table I shows the results. We see that although assuming full insensitivity gives an accurate numerical result for node 2, the naïve methods are both wildly incorrect for node 1. In contrast, the exact result that is computed using Theorem 3 agrees with the simulation.
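The two naïve rows of Table I can be reproduced with a few lines of arithmetic. The sketch below assumes unit parameters and treats the deadline, for the naïve calculations, as if it were exponential with the same mean; the helper name `tandem_with_retry` is hypothetical. Flow balance gives the node flows: everything that leaves the network must complete node 2 in time, and node 2 is fed only by customers that complete node 1 in time.

```python
from math import exp

def tandem_with_retry(lam, success, rates):
    """Mean customers at nodes 1 and 2 under Markovian routing.

    success[i] is the probability of moving forward from node i+1 (otherwise
    the customer returns to node 1); rates[i] is the per-customer service rate.
    """
    flow2 = lam / success[1]   # all departures pass node 2 successfully
    flow1 = flow2 / success[0] # node 2 is fed by successes at node 1
    return flow1 / rates[0], flow2 / rates[1]

lam = mu = d = 1.0                 # assumed unit parameters
naive_rate = mu + 1.0 / d          # rate of the minimum if both times were exponential
# Full insensitivity: exponential routing probabilities and exponential rates.
full = tandem_with_retry(lam, [mu / naive_rate] * 2, [naive_rate] * 2)
# Insensitivity of service times only: true routing probabilities, same rates.
partial = tandem_with_retry(lam, [1 - exp(-mu * d)] * 2, [naive_rate] * 2)
```

Rounded to three decimal places, `full` and `partial` match the last two rows of the table.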
Now that the naïve methods are seen to be inadequate, we can now verify that the exact distribution agrees with the simulated distribution. First we consider the marginal distributions of customers at node 1 and at node 2. The results are shown in Figure 2. As expected, the exact result agrees with the simulation.
Theorem 3 also says that the distribution is the product of the marginals. Let $\hat{p}(n_1, n_2)$ be the empirical probability of having $n_1$ customers at node 1 and $n_2$ customers at node 2. Let $\hat{p}_1(n_1)$ be the empirical probability of having $n_1$ customers at node 1. Let $\hat{p}_2(n_2)$ be the empirical probability of having $n_2$ customers at node 2. Note that $\hat{p}_1$ and $\hat{p}_2$ are shown in Figure 2 and they agree with the result from Theorem 3. In our simulation
when rounded to three significant figures. This shows that the empirical equilibrium distribution of the network is (approximately) a product-form distribution. This agrees with the product-form result from Theorem 3.
V Conclusions and Future Work
Motivated by timeouts in Internet services, we have formulated a model for infinite server queueing networks in which routing decisions are based on service time deadlines. In spite of the fact that the usual theory does not apply, we have shown that the model does indeed have a product-form stationary distribution. In addition to providing an analytic proof, we have also provided a simulation which verifies the results.
We have already noted that our proof is heavily dependent on the fact that each node in the network has an infinite number of servers. As a result, our analysis will not immediately transfer to more general networks in which the nodes have finitely many servers. However, our analysis does extend to the case in which only a portion of the overall network has infinite server queues with deadline-based routing. We may also be able to extend our analysis to the case of closed networks of infinite server queues.
A less straightforward and more mathematically demanding extension of these results would be to the case of more general arrival processes. Nonstationary Poisson arrivals to infinite server queueing networks were considered in [15] with the main result being that product-form results still hold but are time-varying. Given these previous results, this seems like a fruitful direction for future work.
Finally, we note that deadlines are essential to many applications besides Internet services such as wireless communication [16, 17], patient scheduling [18], low latency computing [19, 20], and utility computing [21, 22, 23]. Consequently, we feel that we may be able to apply similar modeling ideas to other application domains.
References
 [1] B. Urgaonkar, G. Pacifici, P. Shenoy, M. Spreitzer, and A. Tantawi, “An analytical model for multi-tier internet services and its applications,” in ACM SIGMETRICS Performance Evaluation Review, vol. 33, pp. 291–302, ACM, 2005.
 [2] E. G. Coffman, Jr, T. Kadota, and L. A. Shepp, “A stochastic model of fragmentation in dynamic storage allocation,” SIAM Journal on Computing, vol. 14, no. 2, pp. 416–425, 1985.
 [3] T. Dohi, T. Matsuoka, and S. Osaki, “An infinite server queuing model for assessment of the software reliability,” Electronics and Communications in Japan (Part III: Fundamental Electronic Science), vol. 85, no. 3, pp. 43–51, 2002.
 [4] W. Whitt, “Dynamic staffing in a telephone call center aiming to immediately answer all calls,” Operations Research Letters, vol. 24, no. 5, pp. 205–212, 1999.
 [5] Q. Zhang, L. Cheng, and R. Boutaba, “Cloud computing: state-of-the-art and research challenges,” Journal of Internet Services and Applications, vol. 1, no. 1, pp. 7–18, 2010.
 [6] J. R. Jackson, “Jobshop-like queueing systems,” Management Science, vol. 10, no. 1, pp. 131–142, 1963.
 [7] F. Baskett, K. M. Chandy, R. R. Muntz, and F. G. Palacios, “Open, closed, and mixed networks of queues with different classes of customers,” Journal of the ACM (JACM), vol. 22, no. 2, pp. 248–260, 1975.
 [8] F. P. Kelly, Reversibility and stochastic networks. Cambridge University Press, 1979.
 [9] H. Chen and D. D. Yao, Fundamentals of queueing networks: Performance, asymptotics, and optimization, vol. 46. Springer Science & Business Media, 2013.
 [10] J. Walrand, An introduction to queueing networks. Prentice Hall, 1988.
 [11] A. Russo and A. Sabelfeld, “Securing timeout instructions in web applications,” in Computer Security Foundations Symposium, 2009. CSF’09. 22nd IEEE, pp. 92–106, IEEE, 2009.
 [12] A. Goldfarb and C. Tucker, “Online display advertising: Targeting and obtrusiveness,” Marketing Science, vol. 30, no. 3, pp. 389–404, 2011.
 [13] A. B. Downey, “Evidence for long-tailed distributions in the internet,” in Proceedings of the 1st ACM SIGCOMM Workshop on Internet Measurement, pp. 229–241, ACM, 2001.
 [14] R. W. Wolff, “Poisson arrivals see time averages,” Operations Research, vol. 30, no. 2, pp. 223–231, 1982.
 [15] W. A. Massey and W. Whitt, “Networks of infinite-server queues with nonstationary Poisson input,” Queueing Systems, vol. 13, no. 1–3, pp. 183–250, 1993.
 [16] N. Master and N. Bambos, “Power control for wireless streaming with HOL packet deadlines,” in 2014 IEEE International Conference on Communications (ICC), pp. 2263–2269, IEEE, 2014.
 [17] N. Master and N. Bambos, “Service rate control for jobs with decaying value,” in 2015 American Control Conference (ACC), pp. 3255–3260, IEEE, 2015.
 [18] N. Master, C. W. Chan, and N. Bambos, “Myopic policies for nonpreemptive scheduling of jobs with decaying value,” arXiv preprint arXiv:1606.04136, 2016.
 [19] N. Master and N. Bambos, “Low latency policy iteration via parallel processing and randomization,” in 2015 54th IEEE Conference on Decision and Control (CDC), pp. 1084–1091, IEEE, 2015.
 [20] N. Master and N. Bambos, “Randomized iterations for low latency fixed point computation,” in 53rd IEEE Conference on Decision and Control, pp. 5208–5215, IEEE, 2014.
 [21] Z. Zhou and N. Bambos, “A general model for resource allocation in utility computing,” in 2015 American Control Conference (ACC), pp. 1746–1751, IEEE, 2015.
 [22] Z. Zhou, B. Yolken, R. A. Miura-Ko, and N. Bambos, “A game-theoretical formulation of influence networks,” in 2016 American Control Conference (ACC), pp. 3802–3807, IEEE, 2016.
 [23] Z. Zhou and N. Bambos, “Target-rate driven resource sharing in queueing systems,” in 2015 54th IEEE Conference on Decision and Control (CDC), pp. 4940–4945, IEEE, 2015.