We are currently seeing the effect COVID-19 has on healthcare services in the vast majority of countries in the world. Healthcare services are also under increasing pressure as demographics change and populations age. Public health services are struggling, and sometimes failing, to maintain services under this increasing load. Given this context of high demand for tightly constrained resources, it is instructive to reassess the rationales for the prioritization regimes currently in use, and to contrast them with other possibilities. Specifically, in this paper we characterize the performance of the very commonly used static priority regime, and contrast it with the recently proposed accumulating priority regime, under critical loadings.
Healthcare systems have traditionally used static priority queues in a range of settings, from triage in emergency departments (EDs) to organizing access to elective surgeries such as hip and knee replacements (see, e.g., Arnett and Hadorn). In a static priority regime, patients are assigned to a priority class, and must wait to be treated until all patients in higher priority classes have been treated. Patients may sometimes be moved to a higher priority class if their condition deteriorates, but in practice there is often no automatic mechanism for making such transitions, and any adjustments may rely on patients proactively approaching their healthcare provider. Accumulating priority queues (APQs) have recently been proposed to overcome some of the inherent drawbacks of static priority queues in healthcare (Stanford et al.). In accumulating priority queues, patients accumulate priority with time spent in the queue, at a rate that depends on their priority class, with higher priority patients accumulating priority faster than lower priority patients. Priority can accumulate linearly, or in a nonlinear fashion. Observational studies of behaviour in emergency departments have revealed that in practice physicians may operate a regime that is similar to an APQ, with the likelihood of being seen increasing more rapidly as waiting times approach threshold targets; see, e.g., Ding et al.
The accumulating priority regime was first proposed by Kleinrock, who obtained expressions for the expected waiting times for all classes. A large-deviations principle has been established by Stolyar and Ramanan. More recently, Stanford et al. derived expressions for the Laplace-Stieltjes transform of the waiting time distribution, which can then be inverted numerically. A later paper by Li et al. showed that a wide range of possible accumulation functions (including, for instance, exponential and logarithmic ones) have an equivalent linear regime, in the sense that the order in which patients are seen is the same in both the nonlinear and the linear formulation.
This paper considers the performance of a single server queue with total arrival rate $\lambda$ and service rate 1, where $\rho = \lambda$, the load on the server, either satisfies $\rho \uparrow 1$ or $\rho > 1$. The heavy traffic regime $\rho \uparrow 1$ has been intensively studied, although not for the accumulating priority queue. When $\rho > 1$, queues are overloaded and hence unstable, and no equilibrium exists. Unstable queues, if unchecked, grow without bound, which is unrealistic for almost every application, and of course, an infinite queue never exists in practice. However, there are many applications where arrival rates are greater than service rates for shorter or longer periods. Traffic networks are an immediate example. In many cities rush hour queues build up, and then decay, not because the system can cope with the increased volume of traffic but simply because the flow into the network has reduced as rush hour passes.
At the time of writing, one cannot overstate the importance of looking at healthcare systems in the situations when demand suddenly exceeds capacity. We see healthcare systems of a very large number of countries being overwhelmed with an influx of patients.
Healthcare can, however, suffer from a similar excess of demand over capacity in other situations, particularly in winter, and for those healthcare systems which experience loss of capacity over longer periods, addressing the issue of how best to organize patient prioritization is vital (see, e.g., Rolewicz and Palmer). Emergency departments may also see increased demand due to patients who could have been treated by their GP, but were deterred by cost (e.g. in New Zealand, where hospital visits in the public health system are free, but GP visits are not), or by long waiting times for GP appointments. No patient arriving in a period when $\rho > 1$ actually sees an infinite queue; rather, they see a large, possibly very large, queue ahead of them. Their waiting time for treatment, given they have joined the queue, is also not infinite, but as a patient's condition deteriorates, it may nevertheless be far too long. In practice also, if $\rho$ is close to $1$, then over relatively short periods of time it may be difficult to determine whether $\rho < 1$ or $\rho > 1$. Thus there is a strong practical need to address the question of how best to organize patient prioritization in this transient regime.
Indeed, we will see below that the two cases, a) $\rho < 1$ and b) $\rho \to 1$ or $\rho > 1$, require fundamentally different approaches to patient prioritization. If $\rho < 1$, then the accumulating priority regime ensures that all patients are seen in a timely manner, while ensuring health targets are met (if those targets are feasible). On the other hand, if $\rho \to 1$ or $\rho > 1$, then it is impossible to limit waiting times for all patients, and the static priority regime provides a mechanism for ensuring that healthcare is still available to the most acute, and vulnerable, patients (see Fig. 2 below), whereas under APQ the expected waiting times for all classes increase (Fig. 1 below). For this scenario we propose below a mixture of the two prioritization schemes.
Section 2 gives a detailed description of the models we consider. Section 3 considers the case where $\rho < 1$, as $\rho \to 1$, while Section 4 considers the case where $\rho > 1$. We conclude with a short discussion of related models and potential future research directions in Section 5.
We consider a service system with a single server and $N+1$ classes of customers. Customers (patients) of class $i$ arrive according to a Poisson process with rate $\lambda_i$ and their priority class is associated with a positive real number $b_i$ for $i = 1, \ldots, N+1$. The higher the number $b_i$, the higher the priority class, and without loss of generality we assume $b_1 > b_2 > \cdots > b_{N+1}$. Thus arrivals of class $1$ are in the highest priority class and arrivals of class $N+1$ are in the lowest priority class.
If a customer finds the server idle upon arrival, the server immediately starts serving this customer. Priority is non-preemptive, that is, if the server is busy when a new customer arrives, the customer joins the waiting room regardless of its priority class. The waiting room is assumed to be of infinite size. Whenever the server finishes serving a customer and the waiting room is non-empty, the server starts to serve the customer with the highest current priority among those in the waiting room.
We consider two different priority policies:
Static priorities (SP): in this case a customer of class $i$ has priority level $b_i$, which does not change;
Accumulating priorities (AP): in this case a customer of class $i$ that has spent $t$, $t \ge 0$, units of time in the waiting room has priority level $b_i t$.
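To make the difference between the two policies concrete, here is a minimal Python sketch of the choice made at a service completion. The function name `next_class`, the argument names and the numerical values are hypothetical and chosen only for illustration; the accumulating rule is assumed to be linear, so that a customer whose class has parameter `b[i]` holds priority `b[i] * w` after waiting `w`.

```python
from typing import Optional, Sequence

def next_class(b: Sequence[float],
               waits: Sequence[Optional[float]],
               policy: str) -> Optional[int]:
    """Return the index of the class to serve next, or None if nobody waits.

    b[i]     -- priority parameter of class i (larger means more urgent)
    waits[i] -- time already waited by the longest-waiting customer of
                class i, or None if that class has nobody waiting
    policy   -- "SP" (static) or "AP" (accumulating, assumed linear)
    """
    present = [i for i, w in enumerate(waits) if w is not None]
    if not present:
        return None
    if policy == "SP":
        # static: the priority level is b[i], whatever the waiting time
        return max(present, key=lambda i: b[i])
    if policy == "AP":
        # accumulating: the priority level after waiting w is b[i] * w
        return max(present, key=lambda i: b[i] * waits[i])
    raise ValueError("policy must be 'SP' or 'AP'")

b = [3.0, 2.0, 1.0]          # illustrative values only
waits = [1.0, None, 5.0]     # class 0 has waited 1, class 2 has waited 5
print(next_class(b, waits, "SP"))   # 0: the highest class always goes first
print(next_class(b, waits, "AP"))   # 2: 1.0 * 5.0 exceeds 3.0 * 1.0
```

Within a class the longest-waiting customer always has the largest priority under either rule, which is why only one waiting time per class is needed.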
We assume, without loss of generality, that service times for customers are independent and exponentially distributed with mean $1$. Therefore
$$\rho = \sum_{i} \lambda_i$$
is the load on the system, that is, the average amount of new work arriving per unit of time, and $\rho < 1$ is necessary for stability.
We will consider the system under high loads in two scenarios. In the first scenario we assume that $\rho < 1$ and $\rho \to 1$. The system is stable for any $\rho < 1$, but, as $1 - \rho$ decreases to $0$, the average number of customers waiting in the queue in the stationary regime (and, by Little's formula, the average waiting time of a typical customer arriving in the system) tends to infinity. In the second scenario $\rho > 1$ and the system is thus unstable.
Our goal is to study the behavior of the expected waiting time of a customer from each of the classes, in the two scenarios described above, and under the two different priority policies.
3 Loads approaching capacity
In this section we consider a sequence of systems indexed by such that the loads in these systems increase to . More precisely, we assume that are non-decreasing functions of such that as for all ,
for all , and
An interesting special case of the setting above is the scenario where are fixed and , but we do not restrict ourselves to this case.
Regardless of the service discipline chosen, the systems are stable for any . Denote by
the vector of steady-state queue lengths and by the vector of steady-state waiting times (inclusive of the service time), for a particular value of . We also let and be the expected queue lengths and waiting times respectively, for all . Regardless of the service discipline,
as . The total queue lengths thus increase to infinity, and we are interested in how queue lengths, and waiting times, of the individual classes behave.
3.1 Static priorities
Consider first the SP discipline. Let , with . Let also . From Cobham, we obtain that the expected waiting times for the priority classes are given by
and we can write
Thus, we see that as ,
Thus, as the load approaches $1$, in the (SP) case, the expected queue lengths and waiting times for classes $1$ to $N$ are bounded from above, while for class $N+1$ they grow without bound. The durations of busy periods also increase without bound.
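The boundedness claim can be checked numerically with a short Python sketch of Cobham's formula for the non-preemptive case. With exponential services of mean $1$ the formula reads $W_i = W_0 / ((1-\sigma_{i-1})(1-\sigma_i))$, where $\sigma_i = \lambda_1 + \cdots + \lambda_i$, $\sigma_0 = 0$ and $W_0 = \rho$. The notation $W_i$, $\sigma_i$, the function name `cobham_waits` and the arrival rates are assumptions made for this illustration; $W_i$ here is the delay in the queue, so the mean service time $1$ must be added to obtain a waiting time inclusive of service.

```python
def cobham_waits(lam):
    """Expected queueing delays (service excluded) under non-preemptive
    static priorities, with lam[0] the highest class and Exp(1) services."""
    rho = sum(lam)
    if rho >= 1:
        raise ValueError("total load must be below 1")
    w0 = rho   # mean residual work: sum(lam_i * E[S^2]) / 2 with E[S^2] = 2
    waits, sigma_prev = [], 0.0
    for lam_i in lam:
        sigma = sigma_prev + lam_i           # load of this class and all above it
        waits.append(w0 / ((1.0 - sigma_prev) * (1.0 - sigma)))
        sigma_prev = sigma
    return waits

# Hypothetical rates: only the lowest class pushes the load towards 1.
for low in (0.30, 0.39, 0.399):
    print(low, [round(w, 2) for w in cobham_waits([0.3, 0.3, low])])
```

In this example the delays of the two higher classes stay below their limiting values (about 1.43 and 3.57), while the delay of the lowest class is proportional to $1/(1-\rho)$.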
3.2 Accumulating priorities
For the AP case we can conclude from the Kleinrock formula (Kleinrock; see also Stanford et al.) that waiting times for all customers grow without bound as the load approaches $1$, so that if the accumulation rates $b_i$ are held constant, this regime does not offer the same protection for the higher priority classes as SP does. Indeed we can prove the following exact statement.
Consider an accumulating priority queue with classes, and accumulation rates . Then
We present a proof of Lemma 1 below but first comment on its implications. The result may be interpreted as follows: a customer from class $i$ entering service after waiting for time $t$ has, at that time, priority $b_i t$. Lemma 1 essentially states that the (AP) discipline makes all these priorities just before service equal on average, across classes, in heavy traffic. This is similar to the behaviour in heavy traffic of the MaxWeight protocol (see Stolyar), which equates scaled queue lengths (we discuss this connection further below).
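The equalisation of priorities can be observed numerically. The sketch below evaluates Kleinrock's recursion for the expected delays in a non-preemptive accumulating priority queue, in its standard textbook form with classes sorted by increasing accumulation rate: $W_p = \big[W_0/(1-\rho) - \sum_{i<p} \rho_i W_i (1 - b_i/b_p)\big] / \big[1 - \sum_{i>p} \rho_i (1 - b_p/b_i)\big]$. Services are assumed exponential with mean $1$, so $\rho_i = \lambda_i$ and $W_0 = \rho$. The function name `kleinrock_waits` and the numerical values are hypothetical, and the delays exclude the service time.

```python
def kleinrock_waits(lam, b):
    """Expected queueing delays (service excluded) in a non-preemptive
    accumulating priority queue with linear rates b and Exp(1) services.
    Classes may be given in any order; results follow the input order."""
    rho = sum(lam)
    if rho >= 1:
        raise ValueError("total load must be below 1")
    w0 = rho                                  # mean residual work, as E[S^2] = 2
    order = sorted(range(len(lam)), key=lambda i: b[i])   # increasing rate
    waits = [0.0] * len(lam)
    for pos, p in enumerate(order):
        slower, faster = order[:pos], order[pos + 1:]
        num = w0 / (1.0 - rho) - sum(
            lam[i] * waits[i] * (1.0 - b[i] / b[p]) for i in slower)
        den = 1.0 - sum(lam[i] * (1.0 - b[p] / b[i]) for i in faster)
        waits[p] = num / den
    return waits

b = [3.0, 2.0, 1.0]                           # illustrative rates only
for low in (0.30, 0.39, 0.399):
    lam = [0.3, 0.3, low]
    w = kleinrock_waits(lam, b)
    scaled = [round((1.0 - sum(lam)) * bi * wi, 3) for bi, wi in zip(b, w)]
    print(low, scaled)   # the entries move closer together as the load nears 1
```

The printed quantities are $(1-\rho)\, b_i W_i$ for each class; their convergence to a common value is the numerical counterpart of the statement that mean priorities at the start of service equalise in heavy traffic.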
Lemma 1 also implies that as for all , and hence
which also means that
We use a proof by induction on . First write
which implies the statement of the lemma for . Assume now the statement is valid for all and let us prove it for :
These results suggest that in the AP regime, the accumulation rates need to be adjusted if the system is subjected to increasingly heavy loads. A natural solution we propose is to take for fixed , , with . This effectively applies the SP regime to the lowest priority class, while maintaining the positive benefits of the AP regime for the remaining classes. More generally, we could consider regimes where for fixed , for any . If , then the split between classes following AP and those following SP will still provide benefits to the higher priority classes. Whatever the split, only the lowest priority class will experience delays that grow without bound.
3.4 Numerical illustrations
We illustrate the conclusions above with typical sample paths of a system with $N = 2$ (there are therefore $3$ classes of customers) under different priority regimes. Assume that arrival intensities are given by and , and priorities , and .
In the case of accumulating priorities one can see (Fig. (1)) that the numbers of customers in all classes become large.
If static priorities are used (Fig. (2)), only the number of customers of class $3$ grows large, whereas the numbers of customers in classes $1$ and $2$ remain reasonably small at all times.
We can also see from Fig. (3) below that if our proposed solution is applied, the sample path looks similar to that of the system with static priorities.
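Sample paths of this kind can be generated with a short event-driven simulation. The Python sketch below is not the code behind the figures: the arrival rates, priority parameters, run length and seed are hypothetical, and the function name `simulate` is introduced only for this illustration. It reports the mean delay in the queue per class under either policy, with class 0 the highest.

```python
import random
from collections import deque

def simulate(lam, b, policy, n_served, seed=1):
    """Mean queueing delay per class in a non-preemptive single-server queue
    with Poisson arrivals, Exp(1) services and class 0 the highest."""
    rng = random.Random(seed)
    n, total = len(lam), sum(lam)
    queues = [deque() for _ in range(n)]     # arrival times, FIFO within class
    wait_sum, served = [0.0] * n, [0] * n
    t = 0.0                                  # current service-completion epoch
    next_arrival = rng.expovariate(total)

    def admit():
        # place the pending arrival in a class chosen with probability lam_i / total
        nonlocal next_arrival
        cls = rng.choices(range(n), weights=lam)[0]
        queues[cls].append(next_arrival)
        next_arrival += rng.expovariate(total)

    for _ in range(n_served):
        while next_arrival <= t:             # arrivals during the last service
            admit()
        if not any(queues):                  # idle server: wait for an arrival
            t = next_arrival
            admit()
        waiting = [i for i in range(n) if queues[i]]
        if policy == "SP":
            k = min(waiting)                 # lowest index = highest class
        else:                                # "AP": largest b_i * (time waited)
            k = max(waiting, key=lambda i: b[i] * (t - queues[i][0]))
        wait_sum[k] += t - queues[k].popleft()
        served[k] += 1
        t += rng.expovariate(1.0)
    return [wait_sum[i] / served[i] if served[i] else float("nan")
            for i in range(n)]

lam = [0.3, 0.3, 0.35]      # hypothetical rates, total load 0.95
b = [3.0, 2.0, 1.0]         # hypothetical priority parameters
for policy in ("SP", "AP"):
    print(policy, [round(w, 2) for w in simulate(lam, b, policy, 50_000)])
```

With a load this close to $1$ the estimates are noisy, but the qualitative picture should match the discussion above: under SP only the lowest class experiences long delays, whereas under AP the delays of all classes are long.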
A further example (see Fig. (4)) illustrates how, by changing the priority regime, one can avoid long waiting times for high-priority customers if there is a sudden surge of lower-priority customers. There are, as before, three priority classes, for the entire simulation. In the first quarter of the simulation time , so the total load of the system is , the system is in relatively light traffic and accumulating priorities are used. All queue lengths are small. In the second quarter of the simulation time the arrival rate of the low priority customers suddenly jumps (this could represent for instance seasonal effects) to , so the total load in the system is and the system is in heavy traffic. One can observe all queues becoming large, including that of the highest priority customers. At this point we switch to the static priority regime and apply it for the remainder of the simulation time (arrival rates remain such that the system is in heavy traffic). One can observe that the static priority regime ensures that only the low-priority queue is large, while the queues of higher-priority customers are small.
4 Loads above capacity
In this section we assume that is fixed. We assume in addition that and is also fixed, and further that for any (i.e. the system without any class would be stable). We have two particular cases in mind but do not restrict our attention to these. The first case is similar to the one we had in mind when studying the system in heavy traffic: an increase in the lowest-class arrival rate takes the system load above capacity. Another case may illustrate a catastrophic event, such as for instance a pandemic where a sudden jump in the highest-priority patients may lead to a system operating above capacity.
Since , and the system is therefore unstable, in this section we study a fluid version of the model in which we consider separately the queue for each customer class.
We suppose that the level of queue at time , , is given by
where denotes instantaneous service rate enjoyed by queue .
4.1 Static priorities
In the case of (SP) for . As long as , the queue for class 1 is strictly positive, all the available service capacity is directed to class 1. Once has emptied, the new arrivals of class 1 are assigned a dedicated service rate of . Since , this guarantees that for all for some finite .
If , then for . For values of , we have ), that is, a fraction of the available rate is used to keep at zero, and the remaining service capacity is all assigned to queue , while it is positive. Once drops to 0, only a fraction of the available capacity is required to drain the queue length at the same rate at which arrivals occur. Thus, since , for for some finite .
Similarly to the above, we can conclude that there exists a finite such that for . For class , on the other hand, for . When , , and thus , and as .
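The successive emptying times described above can be computed directly. The following Python sketch assumes the fluid dynamics in the form $q_i(t) = q_i(0) + \lambda_i t - \int_0^t \mu_i(s)\,ds$ with unit total capacity, and assumes that a class receives no service until every higher class has emptied. The function name `sp_fluid_drain_times`, the notation and the numerical values are hypothetical.

```python
def sp_fluid_drain_times(lam, q0):
    """Fluid static-priority model with unit capacity, highest class first.

    Returns (drain_times, growth_rate): drain_times[i] is the time at which
    class i empties, for every class except the lowest, and growth_rate is
    the eventual slope rho - 1 of the lowest class."""
    if sum(lam[:-1]) >= 1:
        raise ValueError("the system without the lowest class must be stable")
    times, t_prev, used = [], 0.0, 0.0
    for lam_i, q_i in zip(lam[:-1], q0[:-1]):
        # class i only grows (at rate lam_i) until the classes above it empty
        level = q_i + lam_i * t_prev
        # it then receives the spare capacity 1 - used and drains at
        # net rate 1 - used - lam_i
        t_i = t_prev + level / (1.0 - used - lam_i)
        times.append(t_i)
        t_prev, used = t_i, used + lam_i
    return times, sum(lam) - 1.0

# Hypothetical overloaded example: rho = 1.2, initial levels 3, 1 and 0.
times, slope = sp_fluid_drain_times([0.4, 0.3, 0.5], [3.0, 1.0, 0.0])
print([round(t, 3) for t in times], round(slope, 3))
```

In this example the highest class empties at time $5$, the second class at time $40/3$, and from then on the lowest class grows linearly at rate $\rho - 1 = 0.2$.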
4.2 Accumulating priorities
When considering accumulating priorities we define the maximal priority process for each queue , , as
Here we have replaced by , which is the age of the oldest fluid particles in the system – for the fluid model considered here, these are equivalent.
If is unique, then
On the other hand, if is not unique, let
Under the accumulating priority regime, if two or more classes have priority then service capacity should be divided between them in such a way that their priorities remain equal (and maximal). Therefore, if , then , say, for some constant for all . Thus
for all . But and hence
Recall that for all , and recall that we assume for any . Thus for any as long as does not consist of the entire set and as long as .
Thus we can now understand the dynamics of the process of priorities: if we start at time 0 with a unique class with the highest fluid priority, its priority decreases until it equalises with the priority of another class. From that point onwards, the two priorities stay equal, and both decrease at the same rate, until they equalise with the priority of a further class. This continues until all priorities equalise, from which point onwards these priorities grow without bound. This of course implies that the fluid levels also grow without bound.
Note also that the above may be summarised for the level of queue as follows: once all priorities have equalised,
which shows that the relative queue lengths are exactly as in (5).
As before, a solution to the possibility of queues growing without bound if the accumulating priority regime is applied to all classes is to either employ a static priority regime, or a mixture of accumulating and static priority regimes, but in either case the lowest priority class needs to be operating under the static priority regime. Both the pure static priority regime, and the mixture, yield identical fluid solutions for classes as , with , for some . On the other hand, as , under any of the regimes.
We have seen that in heavy traffic, the highest priority classes need greater protection than is afforded by the accumulating priority queue with fixed accumulation rates. This can be achieved either by permitting accumulation rates to grow in inverse proportion to in the case , or by applying a static priority regime to the lowest priority class. In either case the lowest priority class suffers from increasing waiting times, but higher priority classes are protected from this growth.
These results have implications for other scenarios where prioritisation of tasks is a feature. We discuss below two other important areas of application, but we believe that the potential applications are considerably wider.
Prioritisation of tasks has been introduced in models of human dynamics where, upon completing a task, a person chooses the task with the highest priority from their to-do list to be performed next. A variant of static priorities has been considered by Barabási, and a version of accumulating priorities by Blanchard and Hongler. Few people would disagree with the observation that, at least at some points in our lives, we all experience an overload of our to-do lists. This may be modelled as the arrival rate being (perhaps temporarily) close to, or even above, the completion rate, which are exactly the settings considered in this paper. Our results can therefore be interpreted as follows: when the number of tasks on the to-do list grows, if time-dependent priorities are used, the number of outstanding high-importance tasks will grow. In order to prevent this, either static priorities, or a combination of accumulating and static priorities as suggested here, should be used.
Another connection we would like to highlight is to wireless transmission protocols, namely the celebrated MaxWeight algorithm introduced by Tassiulas and Ephremides. A simple version of it may be described as follows: there are a number of queues, each with its own exogenous stream of arriving jobs, and a single server which, upon completing a job, chooses the next one to perform from the queue with the largest number of outstanding jobs. Other priorities have also been discussed, in particular weighted queue lengths. If one views our model as tasks from the same class forming a queue, then in the case of accumulating priorities the server chooses the next task from the queue with the highest weighted waiting time of the longest-waiting task. Situations considered in this paper are such that the numbers of outstanding tasks in all queues grow to infinity. In this case, the waiting time of the longest-waiting customer is proportional to the number of outstanding tasks. Therefore, in the regimes considered here, the behaviour of the accumulating-priority queue is the same as that of the system governed by an appropriately weighted MaxWeight algorithm.
In this note we focused on average waiting times and queue lengths. It is of course important to study their distributions, which is a subject of our ongoing research. Another research direction we are currently pursuing is a more realistic scenario where customers abandon the system if they waited longer than a certain (perhaps random and perhaps class-dependent) threshold. Strategies minimizing the abandonment rate are of great practical interest.
-  G. Arnett and D. Hadorn. Developing priority criteria for hip and knee replacement surgery: Results from the Western Canada waiting list project. Canadian Journal of Surgery, 46(4):290–296, 2003.
-  A.-L. Barabási. The origin of bursts and heavy tails in human dynamics. Nature, 435(7039):207, 2005.
-  P. Blanchard and M.-O. Hongler. Modeling human activity in the spirit of Barabasi’s queueing systems. Physical Review E, 75(2):026102, 2007.
-  A. Cobham. Priority assignment in waiting line problems. Journal of the Operations Research Society of America, 2(1):70–76, 1954.
-  Y. Ding, E. Park, M. Nagarajan, and E. Grafstein. Patient prioritization in emergency department triage systems: An empirical study of the Canadian Triage and Acuity Scale (CTAS). Manufacturing and Service Operations Management, pages 1–19, 2019.
-  L. Kleinrock. A delay dependent queue discipline. Naval Research Logistics Quarterly, 11(3-4):329–341, 1964.
-  N. Li, D. Stanford, P. Taylor, and I. Ziedins. Nonlinear accumulating priority queues with linear equivalent proxies. Operations Research, 65(6):1613–1628, 2017.
-  L. Rolewicz and B. Palmer. The NHS workforce in numbers. 2019.
-  D. A. Stanford, P. Taylor, and I. Ziedins. Waiting time distributions in the accumulating priority queue. Queueing Systems, 77(3):297–330, 2014.
-  A. L. Stolyar. MaxWeight scheduling in a generalized switch: State space collapse and workload minimization in heavy traffic. The Annals of Applied Probability, 14(1):1–53, 2004.
-  A. L. Stolyar and K. Ramanan. Largest weighted delay first scheduling: Large deviations and optimality. Annals of Applied Probability, pages 1–48, 2001.
-  L. Tassiulas and A. Ephremides. Stability properties of constrained queueing systems and scheduling policies for maximum throughput in multihop radio networks. In 29th IEEE Conference on Decision and Control, pages 2130–2132. IEEE, 1990.