Nonparametric estimation of service time characteristics in infinite-server queues with nonstationary Poisson input

05/25/2018 ∙ by A. Goldenshluger, et al. ∙ 0

This paper provides a mathematical framework for estimation of the service time distribution and the expected service time of an infinite-server queueing system with a nonhomogeneous Poisson arrival process, in the case of partial information, where only the number of busy servers are observed over time. The problem is reduced to a statistical deconvolution problem, which is solved by using Laplace transform techniques and kernels for regularization. Upper bounds on the mean squared error of the proposed estimators are derived. Some concrete simulation experiments are performed to illustrate how the method can be applied and to provide some insight in the practical performance.



There are no comments yet.


page 1

page 2

page 3

page 4

This week in AI

Get the week's most popular data science and artificial intelligence research sent straight to your inbox every Saturday.

1 Introduction

The advance of more and larger datasets leads to new questions in operations research and statistics. This paper can be placed in the intersection of these two fields. In particular, we study the statistical problems of estimating the distribution function and expectation of service times in an infinite-server queueing model in case of partial information. The information is incomplete in the sense that the number of busy servers is observed, but the individual customers cannot be tracked.

Infinite-server queue

First we provide some background on the queueing model, which is well studied and could be considered as a standard model. In such queues there are arrivals according to a homogeneous Poisson process, each customer is served independently of all other customers and customers do not have to queue for service, because there is an infinite number of servers. The model has a wide variety of applications in e.g. telecommunication, road traffic and hospital modeling. It can be used as an approximation to systems, where is relatively large with respect to the arrival rate, but the model is also interesting in its own right. For example, it can be interpreted outside queueing theory as a model for the size of a population. In this paper we will use queueing terminology (servers, customers, etc.), but these terms could be adjusted according to the application. For example, ‘customers’ could be cars travelling between two locations and their ‘service time’ could be the travel time. In many applications it is seen that the arrival rate is not constant, but it varies over time. We will provide some examples below. This observation motivates studying the queue, where the arrival rate is assumed to be a nonhomogeneous Poisson process. This model is still particularly tractable (cf. [EMW1993]) and amenable for statistical analysis, as shown in this paper.

Statistical queueing problems

Queueing theory studies probabilistic properties of random processes in service systems on the basis of a complete specification of system parameters. In this paper we are interested in inverse problems when unknown characteristics of a system should be inferred from observations of the associated random processes. Typically such observations are incomplete in the sense that individual customers cannot be tracked as they go through the service system. The importance of such statistical inverse problems with incomplete observations was emphasized in [Baccelli].

The service time distribution and its expected value are important performance metrics of the infinite-server queue (note that waiting times are identical to service times in infinite-server queues). Our goal is to estimate these characteristics of the system from observations of the queue–length process. Specifically, let

be arrival epochs constituting a realization of the non–homogeneous Poisson point process on the positive real half line with intensity

. The service times

are positive independent random variables with common distribution

, independent of . Suppose that we observe the queue length (or: number of busy servers) restricted to the time interval . The goal is to construct estimators of the service time distribution  and the expected service time

with provable accuracy guarantees.

In case of complete data, where it can be seen when each customer arrives and leaves, the above estimation problems are trivial. The difficulties of the incomplete data problems lie in the fact that only macro level data is observed, i.e. only the number of busy servers are recorded.


A non-exhaustive list of example applications of our analysis is given below:

  • Traffic: Toll plazas or sensors often count the number of cars entering and leaving a certain highway segment. Cars can overtake each other and therefore do not leave in the same order as they arrive. The problem is to determine the distribution of the speed of the cars over that segment. A solution that would make the problem trivial is to identify the car by recording its license plate. This comes with several downsides. For example, it could be costly to implement and there could be privacy concerns. There have been several attempts in the literature to solve this problem in another way, usually by trying to match vehicles at the upstream and downstream points based on non-unique signatures (for example, vehicle length), which can be obtained by a particular type of dual-loop detectors cf. e.g. [Coifman]. Our approach applies to the harder case where single-loop detectors are used, i.e. where only the vehicle counts at the upstream and downstream points are known.

  • Communication systems: If two internet routers only track the timestamp of when a packet arrived, then what is the distribution of the packet flow duration between those routers? The importance of such statistical inverse problems was emphasized by [Baccelli]. In particular, the recent paper [Antunes] proposes a sampling framework for measuring internet traffic based on the model.

  • Biology: As noted in [Jansen2016], the production of molecules may be ‘bursty’, in the sense that periods of high production activity can be followed by periods of low activity. In [Dob2009], this is modeled using an interrupted Poisson process, i.e. a Poisson process that is modulated by a stochastic on/off

    switch. The number of active molecules can then be modeled as the number of jobs in a modulated infinite-server queue. By expectation maximization and maximum likelihood techniques it is possible to filter the most likely

    on/off arrival rate sample path, cf. [Kris1997]. Subsequently, using such filtered paths, our methods can provide estimates of the lifetime distribution of molecules, when the molecules cannot be tracked separately but only the aggregate amount.

In all of these examples it would be unreasonable to assume that the arrival rate is constant. It is known that traffic is busier during rush hours, production of molecules is bursty, and internet activity is higher during the day than at night. Thus inhomogeneity of the Poisson process is an important feature in a variety of applications.

The paper contribution

In this paper we deal with the statistical inversion problem of estimating the service time distribution and the expected service time in the queue, using only observations of the queue-length process. We approach the problem as a statistical deconvolution problem and develop nonparametric kernel estimators based on Laplace transform techniques. Their accuracy is analyzed over suitable nonparametric classes of service time distributions. In particular, we derive upper bounds on the worst–case root mean squared error (rmse) of the proposed estimators, and show how properties of the arrival rate function and service time distribution affect the estimation accuracy. Furthermore, we provide details on the implementation of the estimators. For example, we describe an adaptive estimation procedure for the distribution function and confirm its efficiency by a simulation study.

Our results are based on a formula for the joint moment generating function of the queue–length process at different time instances. We provide a derivation of this result, which to the best of our knowledge has not appeared in the literature before and is of independent interest.

Current literature

Our contribution is related to two different strands of research. First, a similar type of statistical inference questions in queueing theory was studied before, but then for the homogeneous case, i.e. the system; see, e.g.,  [Brown], [Bingham], [Wichelhaus], [Goldenshluger], [Goldenshluger-b] and references therein. The analysis in case of the system is vastly different. This is due to the non-stationary nature of the queue, while the analysis for the relied on stationary measures. Second, similar deconvolution problems, such as density deconvolution, have been studied in statistical literature; see, e.g., [Zhang] and [fan]

. However, this strand of research typically considers models with independent observations, and advocates the use of Fourier transform techniques. In contrast, our setting is completely different: the queue–length process in the

model is intrinsically dependent, and Fourier–based techniques are not applicable since the arrival rate need not be (square) integrable. A connection can be drawn with [Bigot] where a deconvolution problem of estimating intensity of a non–homogeneous Poisson process on a finite interval was considered. We also mention recent work [Abramovich] where Laplace transform techniques were applied for signal deconvolution in a Gaussian model.


The rest of the paper is organized as follows. In Section 2 we present the formula for the joint moment generating function of the queue–length process at different time instances, from which known results of [EMW1993] can be derived. In particular, the covariance of the queue length at different points in time can be found, which plays an important role in the subsequent statistical analysis. In Section 3 we formulate the estimation problems and introduce necessary notation and assumptions. Our main statistical results, Theorems 2, 3 and 4, that provide upper bounds on the risk of estimators of and are given in Sections 4 and 5. Section 6 discusses numerical implementation of the proposed estimators and presents simulation results. Proofs of main results of this paper are given in Section 7.

2 Properties of the queue–length process

In this section we derive some probabilistic results on properties of the queue–length process of the queue; these results provide a basis for construction of our estimators.

Let be arrival epochs constituting a realization of a non–homogeneous Poisson process on the real line with intensity . The service times are positive independent random variables with common distribution , independent of . Assume that the system operates infinite time; then the queue–length process is given by

The next result provides formulas for the Laplace transform of finite dimensional distributions of .

Theorem 1.

Let ,


and for let

Let be fixed points, and let by convention; then for any , one has



stands for the expectation with respect to the probability measure

generated by the queue–length process when the service time distribution is .

The following statement is an immediate consequence of the result of Theorem 1 for specific cases and .

Corollary 1.

For any

i.e., is a Poisson random variable with parameter . Moreover, for any

and therefore


The results in Corollary 1 are well known; they are proved, e.g., in [EMW1993, Section 1] using a completely different technique. We were not able, however, to locate formula (2) in the existing literature.

3 Estimation problems, notation and assumptions

In this section we formulate estimation problems, introduce necessary notation and assumptions and present a general idea for the construction of estimators of linear functionals of service time distribution .

3.1 Formulation of estimation problems

Consider an queueing system with Poisson arrivals of intensity . Assume that independent realizations of the queue–length process are observed on the interval . Using the observations we want to estimate the service time distribution and the expected service time . In the sequel, we will be interested in estimating the value of at a single given point .

It is worth noting that the observation scheme of this paper, where independent copies of the queue–length process are given, is quite standard in statistics of non–stationary processes. We refer, e.g., to [kutoyants] where nonparametric estimation of intensity of a non–homogeneous Poisson process was considered. On the other hand, the accuracy of the estimator also increases as a function of , where scales the arrival rate. In other words, good estimations can be obtained even for , as long as the arrival rate is high enough.

By estimators and of and respectively we mean measurable functions of . Their accuracy will be measured by the worst–case risk over a set of distributions . In particular, for a functional class the risk of is defined by

while the risk of is defined by

Our goal is to construct estimators of and with provable accuracy guarantees over natural classes of service time distributions.

3.2 General idea for estimator construction

It follows from Theorem 1 that expectation of the queue–length process is related to the service time distribution via the convolution equation (1),

Therefore estimating a linear functional of , such as or , from observation of is a statistical inverse problem of the deconvolution type.

We will base construction of our estimators on the linear functional strategy that is frequently used for solving ill–posed inverse problems. The main idea of this strategy is to find a pair of kernels, say, and such that the following two conditions are fulfilled:

  • the integral approximates the value ;

  • the kernel is related to via the equation .

In view of Corollary 1 and by condition (ii), the statistic

is an unbiased estimator for the integral

, which by (i) approximates the value . Thus, is a reasonable estimator of .

3.3 Notation and assumptions

In order to construct estimators in our settings we will use the Laplace transform techniques. For this purpose we require the following notation.


For a generic locally integrable function on the bilateral Laplace transform is defined by

The region of convergence of the integral on the right hand side is a vertical strip in the complex plane; it will be denoted by

for some constants . If is supported on then , and the corresponding region of convergence is a half–plane

The inversion formula for the Laplace transform is

Assumptions on the arrival rate.

As we will show in the sequel, estimation accuracy depends on the growth of at infinity and on the rate of decay of the Laplace transform over vertical lines in the region of convergence . These properties of the arrival rate are quantified in the next two assumptions.

Assumption 1.

The intensity function is a non–negative locally integrable function on with the abscissa of convergence of the Laplace transform .

  • If then there exists real number such that

  • If then there exist real numbers , , and , , such that

Assumption 2.

The Laplace transform does not have zeros in , and


where , and .

Several remarks on these assumptions are in order.

Assumption 1 states growth conditions on : (a) allows exponential growth, while (b) is a polynomial growth condition. The important case of bounded is included in Assumption 1(b) and corresponds to . Here note that while , and a convenient normalization is used. The growth conditions (4) and (5) guarantee that is absolutely convergent in the half–plane with and respectively.

Assumption 2 states that does not have zeros in . By the Riemann–Lebesgue lemma, decreases along vertical lines in (see, e.g., [Doetsch, §23]), and Assumption 2 stipulates the decrease rate. Note that if in (6) then for all , and can be inverted from by integrating over any vertical line . If then for any , and the Laplace inversion formula should be understood in the sense of the –convergence (cf. [Widder46, Chapter 2, §10]).

The next examples demonstrate that Assumptions 1 and 2 hold in many cases of interest.

Example 1 (Constant arrival rate).

Let and . It is obvious that Assumption 1 holds with and . In addition, since

and , Assumption 2 holds with and . Note that has a unique singularity point at .

Example 2 (Polynomial arrival rate).

Let , ; here Assumption 1 holds with , and . Moreover, , ; hence

Thus Assumption 2 is valid with , and . The function has a unique singularity point at .

Example 3 (Sinusoidal arrival rate).

Let for , ; then


It is evident that

Thus Assumption 2 holds with , , . Three singularity points of are located on the convergence axis: , .

Example 4 (Exponential arrival rate).

Let , for some ; then , . Here

Thus Assumption 2 is fulfilled with , and .

4 Estimation of the service time distribution

In this section we consider the problem of estimating the service time distribution.

4.1 Estimator construction

The estimator construction follows the linear functional strategy described in Section 3.2.

Let be a fixed bounded function supported on . For real number we define


Since the convergence region of is , is well defined in . The kernel will be always chosen so that , and if Assumption 2 holds then does not have zeros in . Then function is analytic in , and kernel does not depend on real number provided that .

The next result reveals a relationship between kernels and and provides motivation for construction of our estimator.

Lemma 1.

Let as defined in (1), and assume that




The first condition in (8) ensures that is well defined and finite for each . In view of the second condition, by Fubini’s theorem

Now we show that the definition (7) implies that function solves the equation


Indeed, the bilateral Laplace transform of the left-hand side is

On the other hand,

It follows from the definition of and the inversion formula for the bilateral Laplace transform [cf. [Widder46, Chapter VI, §5]] that for all in a region where is analytic. Then is indeed a solution to (9). ∎

Now we are in a position to define the estimator of based on independent realizations of the queue–length process observed on the interval . Let be an ordered set of the departure and arrival epochs of realization , , so that is constant in between any two sequential epochs. The estimator of is defined as follows


The construction depends on the design parameter that will be specified in the sequel.

4.2 Upper bound on the risk

Now we study the accuracy of the estimator defined in (10). The risk of the estimator

will be analyzed under local smoothness and moment assumptions on the probability distribution

. In particular, we will assume that belongs to a local Hölder class and has bounded second moment.

Definition 1.

Let , and . We say that a function belongs to the class if is times continuously differentiable on and

Definition 2.

Let . We say that a distribution function supported on belongs to the class if

We denote also

Let be a kernel supported on and satisfying the following conditions:

  • For a fixed positive integer

  • For a positive integer number kernel is times continuously differentiable on and

Note that conditions (K1)–(K2) are standard in nonparametric kernel estimation; see, e.g., [Tsybakov].

Theorem 2.

Suppose that Assumptions 1 and 2 hold, and . Let be a kernel satisfying conditions (K1)–(K2) with and . Let be the estimator (10) associated with kernel and parameter

If so that as then


where may depend on , and only, and

Remark 1.

The risk of converges to zero at the nonparametric rate as . The “ill-posedness index” is determined by smoothness of function with appropriate on the entire real line. In particular, if is continuously differentiable on and then , and the resulting rate is . The deconvolution problem is much harder if is smooth on : for instance, under conditions of Example 2 with the resulting rate is .

Remark 2.

Theorem 2 considers asymptotics as . Another natural asymptotic regime is the heavy traffic limit when the scale parameter of the arrival intensity tends to infinity while . An inspection of the proof shows that the result of Theorem 2 remains true if asymptotics with fixed is replaced by with fixed .

Remark 3.

In general the rate of convergence may depend on , and this dependence is primarily determined by behavior of the arrival rate at infinity. In particular, if increases exponentially then the accuracy is proportional to and deteriorates rapidly with growth of . Note however that if the is bounded (here and ) then the upper bound does not depend on .

Remark 4.

In cases and , the statement of the theorem remains intact if boundedness of the second moment of in the definition of is replaced by the boundedness of the first moment. This is not so in the case , : if boundedness of the first moment only is assumed then the dependence on is worse for large as the bound becomes proportional to .

5 Estimation of the expected service time

In this section we consider the problem of estimating the expected service time from observations .

5.1 Estimator construction

For real number (where is chosen arbitrarily) let be an infinitely differentiable function on such that

To define on the intervals and we use standard construction. Let , ; then on the interval where climbs from to we put

while on the interval where descends from to we let

Since is infinitely differentiable, the Laplace transform is an entire function.



where is the abscissa of convergence of the Laplace transform of . The following statement is a key result on properties of the function .

Lemma 2.



The proof of Lemma 2 is omitted as it goes along the same lines as the proof of Lemma 1. Note that the integral on the right hand side of the previous formula for large approximates ; this fact underlies the construction of our estimator.

The estimator of is defined as follows


The estimator depends on the tuning parameter which will be specified in what follows.

5.2 Upper bounds on the risk

The next two statements establish upper bounds on the risk of under different growth conditions on the arrival rate, Assumptions 1(a) and 1(b) respectively.

Theorem 3.

Let Assumptions 1(a) and 2 hold, and let . Let be the estimator (13) associated with parameter

If and as then

and is a positive constant depending on and .

Theorem 4.

Let Assumptions 1(b) and 2 hold, and let . Let be the estimator (13) associated with parameter

If as then

where is a positive constant depending on , and only.

Remark 5.

The results of Theorems 3 and 4 hold without any conditions on smoothness of . In particular, may be a discrete distribution with bounded second moment.

Remark 6.

Theorems 3 and 4 demonstrate that accuracy of is primarily determined by the growth of the arrival rate at infinity. In particular, if the arrival rate increases exponentially, i.e. , then the risk of converges to zero at a slow logarithmic rate. At the same time, if the arrival rate is bounded, i.e. and

, then the risk tends to zero at the parametric rate. A close inspection of the proof shows that growth of the arrival rate manifests itself in the growth of the variance of

which, in its turn, affects the rate of convergence.

6 Implementation and numerical examples

In this section we provide details on implementation of the proposed estimators. In particular, we discuss numerical methods for calculating kernels and and an adaptive scheme for bandwidth selection. We also conduct a small simulation study in order to illustrate practical behavior of the proposed estimators.

6.1 Implementation issues

In our simulations we consider a Gaussian kernel, i.e.

for all . The reason we use this kernel is that in some cases can be computed explicitly. If such analytical expressions are not available, we resort to numerical integration and inversion techniques. For consistency, we still use the Gaussian kernel in the numerical cases, even though this is obviously not necessary.

It is computationally somewhat expensive to implement the theoretical kernel in (12) as covered in Theorems 3 and 4 because the integral of does not have an explicit form, and neither does its transform. From that point of view it is more attractive to work with a slightly different kernel , in particular

i.e.  is a convolution of the indicator function and , where is a standard Gaussian kernel. In some cases we let to be a convolution of the indicator function