# Fluctuation Bounds for the Max-Weight Policy, with Applications to State Space Collapse

We consider a multi-hop switched network operating under a Max-Weight (MW) scheduling policy, and show that the distance between the queue length process and a fluid solution remains bounded by a constant multiple of the deviation of the cumulative arrival process from its average. We then exploit this result to prove matching upper and lower bounds for the time scale over which additive state space collapse (SSC) takes place. This implies, as two special cases, an additive SSC result in diffusion scaling under non-Markovian arrivals and, for the case of i.i.d. arrivals, an additive SSC result over an exponential time scale.

## 1 Introduction

The subject of this paper is a new line of analysis of the Maximum Weight (MW) scheduling policy for single-hop and multi-hop networks. The main ingredient is a purely deterministic qualitative property of the queue dynamics: the trajectory followed by the queue vector under a MW policy tracks the trajectory of an associated deterministic fluid model, within a constant multiple of the cumulative fluctuation of the arrival processes. With this property at hand, it is then a conceptually simple matter to translate concentration properties of the arrival processes to concentration properties for the queue vector. As a consequence, we can obtain:

• New, simple derivations of existing results on the convergence to a fluid solution/trajectory and on state space collapse (SSC).

• Stronger versions of existing SSC results, involving more general arrival processes, and tighter concentration bounds.

• An approach to obtaining new results that would seem rather difficult to establish with existing methods.

The core of our approach is the trajectory tracking result mentioned above. The latter is in turn an adaptation of a similar result established in [29], for a general class of continuous-time hybrid systems that move along the subdifferential of a piecewise linear convex potential function with finitely many pieces; other than an additional restriction to the positive orthant, a continuous-time variant of the MW dynamics turns out to be exactly of this type. However, a fair amount of additional work is needed to translate the general result to the standard, discrete-time, MW setting; cf. Theorem 2 and its proof.

### 1.1 General Background

We consider a multi-hop switched network with fixed routing, such as those arising in wireless networks [13] or switch fabrics [10]. The network operates in discrete time, and is driven by jobs (or packets) that arrive according to a stochastic, deterministic, or adversarial process. There is a scheduler which, at each time step, selects one of finitely many possible service vectors. These service vectors can be fairly arbitrary, reflecting interdependence constraints between different servers, e.g., interference constraints in the context of wireless networks.

We focus on the popular MW scheduling policy [33], which operates as follows. At any time step, a MW policy associates to each queue a weight proportional to its length, and selects a service vector that maximizes the total weighted service. MW policies are known to have a number of attractive properties such as maximal throughput [33, 7, 8]. In addition, under certain conditions, e.g., a resource pooling assumption, they minimize the workload in the heavy traffic regime [31]. On the other hand, the queue size dynamics, under MW policies, are quite complex, and a detailed analysis is difficult.

A common way of reducing the complexity of the analysis involves a fluid approximation, also known as a fluid model. The fluid model relies on two simplifications that lead to a description in terms of a set of differential equations (cf. Subsection 2.3): (a) the dynamics evolve in continuous — rather than discrete — time, and (b) the arrival process is replaced by a constant flow with the same average. The fluid model underlies a general technique for dealing with discrete-time networks: approximate the queue lengths by fluid solutions and then analyze the fluid model. This approach has proved useful in the study of the MW dynamics, leading to results on stability ([5, 1]), SSC ([31, 26, 27, 4]) and delay stability under heavy tailed arrivals ([19, 20]). A key ingredient behind such results is an understanding of the accuracy with which fluid solutions approximate the original queue length processes; this paper contributes to this understanding.

### 1.2 State Space Collapse Literature

A prominent application of fluid models is in establishing state space collapse (SSC), i.e., that in the heavy traffic regime, the queue length process stays close to a low-dimensional set, for a long time, and with high probability.

We note here the important distinction between multiplicative and additive (or strong) state space collapse, which is discussed further in Section 2.4. The literature review here is mostly about multiplicative state space collapse.

Seminal SSC results for communication networks were given in the works of Reiman [22], Bramson [2], and Williams [35]. Subsequently, several works [31, 26, 27, 11] followed the general framework of Bramson [2] to prove SSC under different scheduling policies, including for the case of MW policies. The general approach involves splitting a -long interval into intervals of length , and then showing that the fluid-scaled processes (i.e., ) stay close to the fluid solutions in each one of these smaller intervals. The SSC results then follow from the property that the fluid solutions are attracted to a low-dimensional set, called the set of invariant points.

For single-hop networks with Markovian arrivals operating under a generalization of the MW policy, SSC was proved in [31]. It was also shown, in [31], as a consequence of SSC, that the workload process converges to a reflected Brownian motion, and that every MW- policy222For any given , the MW- policy is an extension of the MW policy in which the “weight” of queue is proportional to , where is the length of the queue at node . with minimizes this workload among all scheduling algorithms. The results of [31] were extended to multi-hop networks in [3], and to another generalization of MW policies in [28]. For multi-hop networks with non-Markovian arrivals operating under MW-, SSC under diffusion scaling was studied in [27]. Several works [12, 24] then used the results of [27] to provide diffusion approximations for the MW dynamics. Finally, SSC has also facilitated the study of the steady-state expectation of the number of jobs in a network [6, 16, 17, 14, 36, 34].

### 1.3 Preview of Results

Our approach to the analysis of MW policies relies on a bound on the distance of the queue length processes from the fluid solutions, in terms of the fluctuations of the cumulative arrival processes. In more detail, we consider a queue length process , driven by an arrival process with average rate , and compare with a fluid solution driven by a steady arrival stream with the same rate , under the same initial conditions .

We already know that, under suitable scaling, the trajectories of the original discrete-time process remain close to the fluid solutions. Furthermore, the fluid model is well known to be non-expansive [32]; that is, for any two trajectories $x(\cdot)$ and $y(\cdot)$, we have $\|x(t)-y(t)\|\le\|x(0)-y(0)\|$ for all $t\ge 0$. By combining these facts, it is quite plausible that one should be able to derive bounds of the form

$$\|Q(t)-q(t)\|\le c+\sum_{\tau=0}^{t-1}\|A(\tau)-\lambda\|, \tag{1}$$

where $A(\tau)$ is the vector of arrivals at each one of the queues at time $\tau$, and $c$ is a constant which is independent of $t$. However, our goal is to derive a stronger bound, of the form

$$\|Q(t)-q(t)\|\le c+C\max_{k<t}\Big\|\sum_{\tau=0}^{k}\big(A(\tau)-\lambda\big)\Big\|, \tag{2}$$

for some constants and , independent of and . The bounds in Eqs. (1) and (2) are qualitatively different. Under common probabilistic assumptions, and with high probability, grows at a rate of , whereas only grows as (roughly) .
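The difference between the two fluctuation measures can be seen numerically: the sum of per-slot deviations that appears in (1), versus the largest deviation of the cumulative arrivals from their mean, which is the quantity a bound of the stronger form controls. The following is a minimal sketch, assuming scalar Bernoulli arrivals with rate 0.5; the function name `fluctuation_measures` and all parameter values are illustrative and not taken from the paper.

```python
import random


def fluctuation_measures(arrivals, lam):
    """Return (sum of |A(tau) - lam|, max over k of |sum_{tau<=k} (A(tau) - lam)|)."""
    sum_abs = 0.0      # per-slot deviations added up, as in a bound of the form (1)
    partial = 0.0      # running cumulative deviation
    max_partial = 0.0  # largest cumulative deviation seen so far
    for a in arrivals:
        sum_abs += abs(a - lam)
        partial += a - lam
        max_partial = max(max_partial, abs(partial))
    return sum_abs, max_partial


rng = random.Random(0)  # fixed seed so the run is reproducible
lam = 0.5               # assumed arrival rate (illustrative)
for t in (100, 10_000, 100_000):
    arrivals = [1.0 if rng.random() < lam else 0.0 for _ in range(t)]
    sum_abs, max_partial = fluctuation_measures(arrivals, lam)
    print(t, sum_abs, max_partial)
```

In this example every per-slot deviation has magnitude 0.5, so the first measure equals 0.5 times the horizon, while the second is a random-walk maximum and is typically much smaller.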

The sensitivity bound (2) allows us to make several contributions to the study of the MW policy.

• We obtain a very simple proof of the convergence of fluid-scaled processes to fluid solutions; cf. Corollary 1.

• We establish a strong SSC result for the MW policy. In particular, we derive an upper bound and a matching lower bound on the time scale over which additive SSC takes place; cf. Theorem 3. As a corollary, when the arrivals are i.i.d, we establish SSC for the exponentially scaled queue length process, , for some constant ; cf. Corollary 2.

• In another corollary, we establish an additive SSC result in diffusion scaling and under non-Markovian arrivals, which strengthens the currently available diffusion scaling results under the MW policy in several respects; see Section 2.4 for more details.

• As will be reported elsewhere, the sensitivity bound (2) provides tools that allow us to resolve an open problem from [20], on the delay stability in the presence of heavy-tailed traffic.

On the technical side, the proof of the sensitivity bound (2) exploits a similar bound from our earlier work [29] on the sensitivity of a class of hybrid subgradient dynamical systems to fluctuations of external inputs or disturbances. The main challenges here concern the transition from discrete to continuous time, as well as the presence of boundary conditions, as queue sizes are naturally constrained to be non-negative. For the proof of our SSC results, we follow the general framework of Bramson [2], while also taking advantage of the sensitivity bound (2). We believe that our tight characterization of the time scale over which SSC holds would have been very difficult without the strong sensitivity bound (2).

### 1.4 Outline

The rest of the paper is organized as follows. In the next section, we describe the network model and our conventions, along with some background on fluid models and SSC. In Section 3, we present our central result, which is an inequality of the form (2); cf. Theorem 2. Then, in Section 4, we present our SSC results. We provide the proofs of our results in Sections 5 and 6, while relegating some of the details to appendices, for improved readability. Finally, in Section 7, we offer some concluding remarks and discuss possible extensions.

## 2 System Model and Preliminaries

In this section, we list our notational conventions, define the network model that we will study, and go over the necessary background on fluid models and State Space Collapse (SSC).

### 2.1 Notation and Conventions

We denote by , , and the sets of real numbers, non-negative reals, positive reals, integers, non-negative integers, and positive integers, respectively.

A vector , will always be treated as a column vector, with components , for . We use and , to denote the transpose and the Euclidean norm of , respectively. For any two vectors and in , the relation indicates that , for all . Furthermore, we use to denote the componentwise minimum, i.e., the vector with components . For a vector and a set of indices, we use to denote the vector whose th entry is equal to the th entry of if , and is equal to zero if . Finally, we let be the -dimensional vector with all components equal to 1.

The notation stands for the convex hull of a set of vectors in . Given a vector and a set , we let . We use to denote the Euclidean distance of from the set . Furthermore, if is an matrix, we let be the image of the set

under the linear transformation associated with

. Given a vector , denotes the diagonal matrix with the entries of on its main diagonal.

Finally, for a function , and with a slight departure from standard conventions, we use either or to denote the right derivative of at , assuming that it exists.

### 2.2 The Network Model and the MW Policy

A discrete-time multi-hop network with fixed deterministic routing is specified by $n$ queues, a non-negative $n\times n$ routing matrix $R$, and a finite set $\mathcal{S}$ of actions (or service vectors) that correspond to the different schedules that can be applied at any time.

Note that routing is pre-specified, and is not affected by the queue sizes or the scheduler. Furthermore, single-hop networks correspond to the special case where $R$ is the zero matrix. More generally, the most common case (single-path routing) is one where the routing matrix has entries in $\{0,1\}$, with at most one nonzero entry in each column, and where the $(i,j)$th entry being one indicates that any work completed at queue $j$ is transferred to queue $i$ for further processing. However, we allow for more general matrices because this additional freedom does not affect the main proofs, and also allows for a simpler treatment of weighted MW policies; see Lemma 4, in the proof of Theorem 2.

The following assumption will be in effect throughout the paper, and is naturally valid in typical application contexts.

###### Assumption 1.

For any , and any , the set also contains the vector , i.e., the vector obtained by setting the th component of to zero.

According to Assumption 1, if a certain service vector $\mu$ is allowed, it is also possible to follow $\mu$ at all queues other than queue $i$, while providing no service to queue $i$. In particular, the zero vector is always an element of $\mathcal{S}$. On the technical side, Assumption 1 appears innocuous; however, it is indispensable for the proof technique used in this paper, and has also been made in earlier work (cf. Assumption 2.3 of [27]).
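Assumption 1 is a closure property of the action set, so it can be checked mechanically, and a set that violates it can be extended until it holds. The sketch below is illustrative only: the helper names `satisfies_assumption_1` and `close_under_zeroing` are hypothetical, and service vectors are represented as plain tuples.

```python
def _zero_component(mu, i):
    """Copy of the service vector mu with its i-th component set to zero."""
    return mu[:i] + (0,) + mu[i + 1:]


def satisfies_assumption_1(actions):
    """True if zeroing any single component of any action yields another action."""
    action_set = {tuple(mu) for mu in actions}
    return all(
        _zero_component(mu, i) in action_set
        for mu in action_set
        for i in range(len(mu))
    )


def close_under_zeroing(actions):
    """Smallest superset of `actions` that satisfies the closure property."""
    closed = {tuple(mu) for mu in actions}
    frontier = list(closed)
    while frontier:
        mu = frontier.pop()
        for i in range(len(mu)):
            zeroed = _zero_component(mu, i)
            if zeroed not in closed:
                closed.add(zeroed)
                frontier.append(zeroed)
    return closed


# Two queues that cannot be served simultaneously; the zero vector is missing.
actions = [(1, 0), (0, 1)]
print(satisfies_assumption_1(actions))                       # closure fails
print(satisfies_assumption_1(close_under_zeroing(actions)))  # closure holds
```

Adding the zeroed vectors does not let the scheduler do anything physically new: it only allows a server to idle.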

We consider networks that operate in discrete time. The input to a network is a collection of discrete-time, non-negative arrival processes, described by functions , where stands for the workload that arrives to queue during the th time slot. Whenever the arrival processes are ergodic stochastic processes, we define the arrival rate vector as the vector whose th component is the average of the process . We will use to denote the (always non-negative) workload at queue at time , and to denote the corresponding workload vector. In the sequel, we will use the terms workload, queue size, and queue length, interchangeably. The evolution of is determined by the particular policy used to operate the network.

We now proceed to define weighted Max-Weight (WMW) policies, which can be viewed as either a generalization of MW policies or as a special case of the broader class of MW- policies444A MW- policy is obtained by replacing in (3) by , where is a function in an appropriate class. considered in [7]. We are given a multihop network with queues, as described above, along with a positive vector , and the associated diagonal matrix . For any , we let be the set of maximizers of :

$$\mathcal{S}_w(Q)\triangleq\arg\max_{\mu\in\mathcal{S}}\ Q^TW(I-R)\mu. \tag{3}$$

A WMW policy associated with $w$ (or $w$-WMW, for short) chooses, at each time $t$, an arbitrary service vector $\mu(t)\in\mathcal{S}_w(Q(t))$. (For a concrete example, if $\mu$ corresponds to serving only queue $i$, with unit service rate, and if work completed at queue $i$ is routed to queue $j$, the term $Q^TW(I-R)\mu$ is of the form $w_iQ_i-w_jQ_j$.) A Max-Weight (MW) policy is a special case of a WMW policy, in which $w$ is the vector with all components equal to 1. When dealing with MW policies, we drop the subscript $w$, and write $\mathcal{S}(Q)$ instead of $\mathcal{S}_w(Q)$.
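The selection rule in Eq. (3) amounts to scoring each service vector by the weighted net drain it produces and keeping the maximizers. The following is a minimal sketch under illustrative assumptions: the function name `wmw_schedules`, the tie tolerance `tol`, and the two-queue tandem example (queue 0 feeds queue 1) are not from the paper.

```python
def wmw_schedules(Q, w, R, actions, tol=1e-12):
    """Return all service vectors maximizing Q^T W (I - R) mu over `actions`."""
    n = len(Q)

    def score(mu):
        # Net drain (I - R) mu: service at queue i minus work routed into queue i.
        drain = [mu[i] - sum(R[i][j] * mu[j] for j in range(n)) for i in range(n)]
        return sum(w[i] * Q[i] * drain[i] for i in range(n))

    scores = [score(mu) for mu in actions]
    best = max(scores)
    # Ties are kept: a WMW policy may choose any of the maximizers.
    return [mu for mu, s in zip(actions, scores) if s >= best - tol]


# Tandem network: R[1][0] = 1 means work completed at queue 0 moves to queue 1.
R_tandem = [[0, 0], [1, 0]]
actions = [(0, 0), (1, 0), (0, 1)]
print(wmw_schedules((5, 1), (1, 1), R_tandem, actions))
print(wmw_schedules((3, 5), (1, 1), R_tandem, actions))
```

With queue lengths (5, 1), serving queue 0 scores 5 - 1 = 4 and serving queue 1 scores 1, so the upstream queue is served; with (3, 5), serving queue 0 scores 3 - 5 = -2, so the downstream queue is served instead.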

Given a network and an arrival process , the evolution of the queue lengths is given by:

$$Q(t+1)=Q(t)+A(t)+(R-I)\min\big(\mu(t),Q(t)\big),\qquad\forall\,t\in\mathbb{Z}_+, \tag{4}$$

where $\mu(t)$ is the service vector chosen by the policy at time $t$, and as mentioned earlier, the minimum is to be interpreted componentwise. Equation (4) corresponds to the situation where a time slot begins with a queue vector $Q(t)$, and then a service vector $\mu(t)$ is chosen and applied. Finally, the new arrivals $A(t)$ are recorded at the end of the time slot and contribute to the new queue vector $Q(t+1)$.
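The update in Eq. (4) can be written as a one-slot step function. The sketch below is illustrative: the name `queue_step` and the tandem routing matrix are assumptions, and vectors are plain Python lists.

```python
def queue_step(Q, A, mu, R):
    """One slot of Eq. (4): Q + A + (R - I) min(mu, Q), componentwise."""
    n = len(Q)
    # A queue cannot serve more work than it currently holds.
    served = [min(mu[i], Q[i]) for i in range(n)]
    # Work completed at queue j is routed to queue i according to R[i][j].
    routed = [sum(R[i][j] * served[j] for j in range(n)) for i in range(n)]
    return [Q[i] + A[i] + routed[i] - served[i] for i in range(n)]


# Tandem network: work completed at queue 0 is forwarded to queue 1.
R_tandem = [[0, 0], [1, 0]]
print(queue_step([3, 5], [1, 0], [1, 0], R_tandem))
```

In the example, one unit leaves queue 0 and joins queue 1, and one new unit arrives at queue 0, so the queue vector moves from (3, 5) to (3, 6).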

Consider an ergodic and Markovian arrival process with arrival rate vector , for which there exists some scheduling policy that stabilizes the network, i.e., results in a positive recurrent process. The closure of the set of all such vectors is called the capacity region and is denoted by .

We now record a fact that will be used later, in the proofs of Lemma 5 and Claim 4. Fix some $\lambda$ in the capacity region and consider a stabilizing policy. We define $d_i$ as the average departure rate from queue $i$. Then, the flow conservation property implies that $\lambda=(I-R)d$. Moreover, following an argument similar to the one in Section 3.C of [33], there exists a vector $\mu\in\mathrm{Conv}(\mathcal{S})$ such that $d\le\mu$. Assumption 1 then implies that $d\in\mathrm{Conv}(\mathcal{S})$, and as a result $\lambda\in(I-R)\,\mathrm{Conv}(\mathcal{S})$. In conclusion,

$$\mathcal{C}\subseteq(I-R)\,\mathrm{Conv}(\mathcal{S}). \tag{5}$$

A remarkable property of MW and WMW policies is that they are throughput optimal in the sense that for any in the interior of , and any ergodic Markovian arrival process with average arrival rate vector , the resulting process is positive recurrent [33]. Similar throughput optimality results are available for extensions of MW, e.g., for the so-called -MW policies [7].

### 2.3 The Fluid Model

The fluid model associated with the MW policy is a deterministic dynamical system that runs in continuous time, and in which the arrival stream is replaced by a steady “fluid” arrival stream with rate vector . We will be working with the following definition of the fluid model; somewhat different but equivalent definitions can be found in [27] and [20].

###### Definition 1 (Fluid Solutions).

We are given an arrival rate vector and an initial queue length vector . A fluid model solution (or, simply, fluid solution) is an absolutely continuous function that together with a collection of functions , for , and another function , satisfies the following relations, almost everywhere:

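$$\dot q(t)=\lambda+(R-I)\Big(\sum_{\mu\in\mathcal{S}}s_\mu(t)\,\mu-y(t)\Big), \tag{6}$$

$$\sum_{\mu\in\mathcal{S}}s_\mu(t)=1, \tag{7}$$

$$y_i(t)\le\sum_{\mu\in\mathcal{S}}s_\mu(t)\,\mu_i,\qquad i=1,\ldots,n, \tag{8}$$

$$\text{if } q_i(t)>0,\ \text{then } y_i(t)=0,\qquad i=1,\ldots,n, \tag{9}$$

$$\text{if } \mu\notin\mathcal{S}_w\big(q(t)\big),\ \text{then } s_\mu(t)=0,\qquad\forall\,\mu\in\mathcal{S}. \tag{10}$$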

It is known that for any multi-hop network and any initial condition, a fluid solution always exists (cf. Appendix A of [4] and Lemma 9 of [20]), and is unique (cf. Lemma 10 of [20]), even though the corresponding and need not be unique. Moreover, for , (6)–(10) imply that remains non-negative for all subsequent times . Later on, in Proposition 2, we will show that fluid solutions admit an alternative description, as the trajectories of a related subgradient dynamical system.

We will be particularly interested in the set of invariant states of the fluid model, which, for any $\lambda$ in the capacity region, is defined by (cf. Theorem 5.4(iv) of [27])

$$\mathcal{I}(\lambda)\triangleq\big\{q_0\in\mathbb{R}^n_+\ \big|\ q(t)=q_0,\ \forall t,\ \text{is a fluid solution}\big\}. \tag{11}$$

Our notation is chosen to emphasize the dependence on $\lambda$ of the set of invariant states. We note that if $\lambda$ belongs to the interior of the capacity region, then $\mathcal{I}(\lambda)$ is a singleton, equal to $\{0\}$. Thus, $\mathcal{I}(\lambda)$ can be non-trivial only if $\lambda$ lies on the boundary of $\mathcal{C}$.

We now record a scaling property of the set of fluid solutions.

###### Lemma 1.

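Consider a fluid solution $q(\cdot)$ and a constant $r>0$. Let $\hat q(t)=q(rt)/r$, for all $t\ge 0$. Then, $\hat q(\cdot)$ is also a fluid solution.

###### Proof.

Note that the set of maximizing schedules in Eq. (3) does not change when we scale the queue vector by a positive constant. Therefore, for any $t\ge 0$,

$$\mathcal{S}\big(\hat q(t)\big)=\mathcal{S}\big(q(rt)/r\big)=\mathcal{S}\big(q(rt)\big). \tag{12}$$

Consider the functions $s_\mu(\cdot)$ and $y(\cdot)$ that together with $q(\cdot)$ satisfy the fluid model relations (6)–(10). Let $\hat s_\mu(t)=s_\mu(rt)$ and $\hat y(t)=y(rt)$, for all $t\ge 0$ and all $\mu\in\mathcal{S}$. Then, it is easy to verify that $\hat q(\cdot)$, $\hat s_\mu(\cdot)$, and $\hat y(\cdot)$ also satisfy (6)–(10). Therefore, $\hat q(\cdot)$ is also a fluid solution. ∎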
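The verification left to the reader at the end of the proof can be spelled out for relation (6). The following is a sketch, writing $\hat q(t)=q(rt)/r$ and taking the time-rescaled functions $\hat s_\mu(t)=s_\mu(rt)$ and $\hat y(t)=y(rt)$ (this notation is assumed here for illustration). By the chain rule, wherever $q$ is differentiable at $rt$,

$$\dot{\hat q}(t)=\frac{1}{r}\cdot r\,\dot q(rt)=\lambda+(R-I)\Big(\sum_{\mu\in\mathcal{S}}s_\mu(rt)\,\mu-y(rt)\Big)=\lambda+(R-I)\Big(\sum_{\mu\in\mathcal{S}}\hat s_\mu(t)\,\mu-\hat y(t)\Big).$$

Relations (7) and (8) hold at time $t$ for the rescaled functions because they hold at time $rt$ for the original ones. Relation (9) carries over because $\hat q_i(t)>0$ if and only if $q_i(rt)>0$, and relation (10) carries over because the set of maximizing schedules is unchanged by positive scaling of the queue vector.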

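Suppose that $\lambda$ is in the capacity region and that $q_0\in\mathcal{I}(\lambda)$, so that $q(t)=q_0$ is a fluid solution. Then, Lemma 1 implies that for any scalar $r>0$, $\hat q(t)=q_0/r$ is also a fluid solution, and therefore $q_0/r\in\mathcal{I}(\lambda)$. Furthermore, it is not hard to see that the identically zero function is also a fluid solution, so that $0\in\mathcal{I}(\lambda)$. We conclude that $\mathcal{I}(\lambda)$ is a cone, i.e.,

$$\alpha\,\mathcal{I}(\lambda)=\mathcal{I}(\lambda),\qquad\forall\,\alpha>0. \tag{13}$$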

The interest in fluid solutions stems from the fact that they provide approximations to suitably scaled versions (i.e., under “fluid scaling”) of the original process. We summarize here one such result, which is a special case of Theorem 4.3 in [27]; similar results are given in [4] (Lemmas 4 and 5).

###### Proposition 1.

Fix some , , and . Letting range over the positive integers, consider a sequence of arrival processes that satisfies

$$\frac{1}{r}\max_{t\le rT}\Big\|\sum_{\tau=0}^{t}\big(A^r(\tau)-\lambda\big)\Big\|\xrightarrow[r\to\infty]{}0, \tag{14}$$

almost surely. Let be the process generated according to Eq. (4) when the arrival process is and the initial condition is . We define the continuous time scaled processes , and note that , for all . Finally, let be a fluid solution, under that particular vector , initialized with . Then,

$$\sup_{t\le T}\big\|\hat q^r(t)-q(t)\big\|\xrightarrow[r\to\infty]{}0, \tag{15}$$

almost surely.

Condition (14) is typically satisfied under common probabilistic assumptions, e.g., when is an i.i.d. process with mean and bounded domain, or more generally of exponential type. Thus, loosely speaking, convergence of the arrival processes leads to convergence of the queue processes.

As we shall see in Section 3, our results will allow for stronger statements; namely, we will show that the rate of convergence in Eq. (14) provides bounds on the rate of convergence to the fluid solution, in Eq. (15); cf. Corollary 1.

### 2.4 State Space Collapse

In this section, we discuss known results about State Space Collapse (SSC) under a MW policy, thus setting the stage for a comparison with the results we will present in Section 4.

We consider the heavy traffic regime, where the arrival rate vector gets arbitrarily close to some point on the outer boundary of the capacity region. In this regime, the average queue lengths typically tend to infinity, yet it is often the case that the queue length vector stays close to the set of invariant states, . This phenomenon is called SSC, and has been studied extensively, mostly under the so-called diffusion scaling. In this scaling, we start with a sequence of stochastic processes, indexed by , and then proceed to study a sequence of scaled processes , referred to as diffusion-scaled processes, defined by

$$\hat q^r(t)=\frac{1}{r}\,Q^r\big(\lfloor r^2t\rfloor\big),\qquad t\ge 0. \tag{16}$$

The extent to which the queue length process stays close to the set of invariant states is in general determined by the magnitude of the fluctuations of the arrival process. It is therefore natural to start the analysis with some assumptions on these fluctuations. General SSC results, under the MW policy and some of its extensions, were provided in [27], under the following assumption.666In our statement of the assumption, we modify the notation of [27], interchanging the roles of and , to preserve consistency with the rest of this paper.

###### Assumption 2 (Assumption 2.5 of [27]).

Let be a sequence of arrival processes indexed by . We assume that for each , is stationary,777“Stationary” means that the have the same distribution for all , but without necessarily being independent. with mean , and that as . We furthermore assume that there exists a sequence converging to as , such that

 (17)

Note that Assumption 2 is quite general, not requiring the arrival processes to be i.i.d. or Markovian. Theorem 7.1 of [27], slightly rephrased,888Our rephrasing consists of replacing the term denoted by in [27] by . This is legitimate, because (cf. Theorem 5.4 (iv) in [27]) and therefore . establishes that for a network operating under a MW- policy (a generalization of WMW policies, and under certain conditions on ), for any , and under Assumption 2, the diffusion-scaled queue length processes satisfy, for any ,

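$$\mathbb{P}\left(\frac{\sup_{t\in[0,T]}d\big(\hat q^r(t),\mathcal{I}(\lambda)\big)}{\max\big(1,\ \sup_{t\in[0,T]}\hat q^r(t)\big)}>\delta\right)\xrightarrow[r\to\infty]{}0, \tag{18}$$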

when , for some .

The bound in (18) is referred to as multiplicative SSC. Yet, there is a stronger notion, called additive SSC, which involves a bound similar to (18), but with the normalizing term absent from the denominator, and which is known to hold under i.i.d. arrivals.

###### Theorem 1 ([24] Theorem 7.7).

Consider a network operating under a MW- policy, with , with i.i.d. and uniformly bounded arrivals with rate , for some , and the associated diffusion-scaled queue length processes . Assume that , for some . Then,999The result in [24] assumed that ; however, the proof extends to the case of general . The authors are grateful to Y. Zhong and D. Shah for discussions about the scope of the results in [24]. for any ,

$$\mathbb{P}\Big(\sup_{t\in[0,T]}d\big(\hat q^r(t),\mathcal{I}(\lambda)\big)>\delta\Big)\xrightarrow[r\to\infty]{}0. \tag{19}$$

Compared to the above literature, our results only apply to the case where (i.e., the MW policy), but allow for queue-dependent weights, so that the weight of queue is . More crucially, our results (cf. Section 4 and Theorem 3, in particular):

• remain valid as long as , which is a weaker condition than , for a fixed ;

• do not require, unlike [24], the arrival process to be i.i.d. or bounded, as long as it has certain concentration properties. Furthermore, the concentration properties that we require (cf. Definition 2) are weaker than Assumption 2, for the case of diffusion scaling (cf. Corollary 3);

• apply to scalings other than diffusion scaling, and include a converse result that characterizes the possible scalings for which additive SSC holds.

We finally note another related line of work which studies a property similar to SSC, namely, the extent to which the steady-state distribution is concentrated in a neighbourhood of the set of invariant points. In particular, [15] and [14] have characterized the tail of the steady-state distribution of the distance from the set of invariant points for the case of an input-queued switch.

## 3 Main Result: Sensitivity

The backbone of all our results is the following theorem.

###### Theorem 2 (Sensitivity of WMW policy).

For a network operating under a WMW policy, there exists a constant , to be referred to as the sensitivity constant, that satisfies the following. Consider an arrival process and the corresponding queue length process . Let be a fluid solution corresponding to some , and initialized with . Then, for any ,

 (20)

Note that the result holds without having to assume that $\lambda$ lies inside the capacity region. The proof is given in Section 5, and the key steps are as follows. We show that the study of WMW policies can be reduced to the study of MW policies. Furthermore, given a network operating in discrete time under the MW policy, we introduce an associated continuous-time dynamical system, which we call the induced dynamical system. Next, we show that the fluid solutions and the queue length processes of the network can be viewed as unperturbed and perturbed trajectories of the induced dynamical system, respectively. We finally argue that the induced dynamical system falls within the class of subgradient systems that were studied in [29], and apply the main result in that reference to prove (20). The reductions that are developed in the course of the proof may be of independent interest.
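The flavor of the sensitivity bound can be illustrated on the simplest possible network: a single queue with unit service rate and no routing. The following is a toy sketch under stated assumptions, not the construction used in the proof. The arrival distribution, the parameter values, and all function names are illustrative. For this one-queue case, a standard reflection argument shows that the deviation from the fluid solution is at most twice the largest cumulative arrival fluctuation plus the largest single arrival; that constant is specific to this example and is not the sensitivity constant of Theorem 2.

```python
import random


def simulate_single_queue(arrivals, q0=0.0):
    """Eq. (4) with one queue, R = 0, and service vector mu = 1 in every slot."""
    Q = [q0]
    for a in arrivals:
        Q.append(Q[-1] + a - min(1.0, Q[-1]))
    return Q


def fluid_single_queue(lam, q0, horizon):
    """Fluid solution for lam <= 1, sampled at integer times:
    drain at rate 1 - lam until empty, then stay at zero."""
    return [max(q0 - (1.0 - lam) * t, 0.0) for t in range(horizon + 1)]


def max_cumulative_fluctuation(arrivals, lam):
    """max over k of |sum_{tau <= k} (A(tau) - lam)|."""
    partial, worst = 0.0, 0.0
    for a in arrivals:
        partial += a - lam
        worst = max(worst, abs(partial))
    return worst


rng = random.Random(1)
lam, q0, horizon, a_max = 0.8, 20.0, 5000, 2.0  # illustrative values
# i.i.d. arrivals taking values in {0, a_max}, with mean lam (assumed distribution).
arrivals = [a_max if rng.random() < lam / a_max else 0.0 for _ in range(horizon)]

Q = simulate_single_queue(arrivals, q0)
q = fluid_single_queue(lam, q0, horizon)
deviation = max(abs(x - y) for x, y in zip(Q, q))
fluctuation = max_cumulative_fluctuation(arrivals, lam)
print(deviation, fluctuation)
```

Serving in every slot is consistent with the MW rule here, since with a single queue the service vector 1 maximizes the weight whenever the queue is non-empty, and the choice is immaterial when it is empty.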

### 3.1 Convergence to Fluid Model Solutions

An immediate consequence of Theorem 2, together with Lemma 1, is a bound on the distance of the fluid scaled process from a fluid solution .

###### Corollary 1.

Consider a network operating under the WMW policy and let be the constant in Theorem 2. Fix an arrival function and some . Let be the process generated according to Eq. (4) when the arrival process is and the initial condition is . Let . Let be a fluid solution corresponding to some and initialized at . Then, for any ,

 (21)

Corollary 1 strengthens (15) significantly. Any statistical assumptions on the fluctuations of the arrival process readily yield concrete upper bounds on the distance of the original process from its fluid counterpart.

## 4 State Space Collapse

In this section, we apply Theorem 2 to establish a general additive SSC result; cf. Theorem 3. We then continue with some corollaries on exponential scaling or diffusion scaling. Our approach can also be used to obtain results that apply in steady-state. However, we do not go into that latter topic because such results can also be proved using simpler, more direct methods, as in [15] and [14].

### 4.1 Definitions and Preliminaries

At the core of our proofs lies the following lemma, which asserts that fluid solutions are attracted to the set of invariant states, which was defined in Eq. (11). The proof of the lemma is given in Appendix A.

###### Lemma 2 (Attraction to the Set of Invariant States).

Consider a network operating under the MW policy and a vector in its capacity region. There exists a constant such that for any fluid solution associated with , and any time ,

$$q(t)\notin\mathcal{I}(\lambda)\quad\Longrightarrow\quad\frac{d^+}{dt}\,d\big(q(t),\mathcal{I}(\lambda)\big)\le-\alpha(\lambda),$$

with this right-derivative being guaranteed to exist.

We continue with a definition that quantifies the rate at which a family of processes concentrates on its mean.

###### Definition 2 (f-Tailed Sequence of Random Processes).

Consider a function and a vector . Let be a sequence of random processes indexed by . Assume that for each , is stationary, has expected value , and that . Suppose that for every ,

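$$f(r,\delta)\,\mathbb{P}\Big(\frac{1}{r}\sup_{t\le r}\Big\|\sum_{\tau=0}^{t}\big(A^r(\tau)-\lambda^r\big)\Big\|>\delta\Big)\xrightarrow[r\to\infty]{}0. \tag{22}$$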

Then, is said to be an -tailed sequence of random processes with limit mean , and we refer to as the concentration rate function.

Later, we will show that the time scale over which SSC holds is almost proportional to the best possible concentration rate function . We observe that any sequence of random processes that satisfies Assumption 2 is a sequence of -tailed processes, with . However, the reverse is not true: Assumption 2 involves an additional requirement of uniform convergence over all values of an additional indexing parameter , whereas Definition 2 essentially only considers the case . Thus, Definition 2 is less restrictive, easier to check, and also seems more natural.

There are many processes whose concentration properties are well understood, and which translate to the requirements in Definition 2, for a suitable concentration rate function . We record one such fact in Lemma 3 below, which deals with bounded i.i.d. arrival processes, and which is proved in Appendix B.

###### Lemma 3 (Bounded I.I.D. Processes are Exponential-Tailed).

Fix a vector and a constant . Consider a sequence of random processes indexed by . Suppose that for every

, the random variables

are i.i.d., and that , for all . Denote the mean of by , and suppose that . Take any constant , and let . Then, is an -tailed sequence of random processes with limit mean .

Similar results are possible for arrival processes that are modulated by a finite and ergodic Markov chain. The boundedness assumption can also be removed under standard conditions on the moment generating function of the arrivals.
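The probability that appears in Definition 2 can be estimated by simulation for a bounded i.i.d. arrival process, which gives a feel for how quickly it decays as the index grows. The sketch below is a Monte Carlo illustration only: it assumes scalar Bernoulli arrivals with a fixed mean, and the function name `tail_probability`, the number of trials, and the seed are arbitrary choices.

```python
import random


def tail_probability(r, delta, lam=0.5, trials=2000, seed=0):
    """Estimate P( (1/r) * sup_{t<=r} |sum_{tau=0}^{t} (A(tau) - lam)| > delta )
    for i.i.d. Bernoulli(lam) arrivals."""
    rng = random.Random(seed)
    exceed = 0
    for _ in range(trials):
        partial = 0.0  # cumulative deviation of the arrivals from their mean
        worst = 0.0    # running supremum of |partial|
        for _ in range(r + 1):  # tau = 0, 1, ..., r
            a = 1.0 if rng.random() < lam else 0.0
            partial += a - lam
            worst = max(worst, abs(partial))
        if worst / r > delta:
            exceed += 1
    return exceed / trials


for r in (25, 100, 400):
    print(r, tail_probability(r, delta=0.1))
```

Multiplying such estimates by a candidate concentration rate function and checking whether the product shrinks as the index grows is an informal way to probe condition (22); it is not a substitute for the proof in Appendix B.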

We now define scaled processes, as a generalization of the fluid and diffusion scaled processes.

###### Definition 3 (g-Scaled Processes).

Consider an increasing function and a sequence of random processes. Then, the corresponding sequence of -scaled processes is defined as

$$\hat q^r(t)=\frac{1}{r}\,Q^r\big(\lfloor g(r)\,t\rfloor\big), \tag{23}$$

for all and all .

The fluid scaling and the diffusion scaling of a random process are particular $g$-scaled processes, corresponding to $g(r)=r$ and $g(r)=r^2$, respectively. Definition 3 allows for a more general scaling of time.
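Definition 3 translates directly into a higher-order function that wraps a discrete-time path. The sketch below is illustrative: the name `g_scaled`, the representation of a path as a function of an integer time index, and the example path are assumptions made here.

```python
import math


def g_scaled(Q, r, g):
    """Return the g-scaled process t -> Q(floor(g(r) * t)) / r, as in Eq. (23).

    Q is a function from non-negative integer times to a (scalar) queue length.
    """
    def q_hat(t):
        return Q(math.floor(g(r) * t)) / r
    return q_hat


def fluid_time_scale(r):
    return r          # time is sped up by a factor r


def diffusion_time_scale(r):
    return r * r      # time is sped up by a factor r^2, as in Eq. (16)


# Hypothetical discrete-time path: a queue that grows by one unit per slot.
def path(k):
    return k


r = 10
print(g_scaled(path, r, fluid_time_scale)(0.5))      # looks at slot 5
print(g_scaled(path, r, diffusion_time_scale)(0.5))  # looks at slot 50
```

Other time scalings, such as an exponential one, are obtained by passing a different function for the time scale.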

### 4.2 Main SSC Result

We now present our main SSC result.

###### Theorem 3 (Strong State Space Collapse).

Consider a network operating under a WMW policy, and a vector in its capacity region, with a corresponding set of invariant states . Fix some , and let be a sequence that converges to . Consider two functions and , with . Let be an -tailed sequence of arrival processes with limit mean , and let be a corresponding sequence of -scaled queue length processes. Suppose that , as .

1. Suppose that for every , we have . Then, for any ,

   $$\mathbb{P}\Big(\sup_{t\in[0,T]} d\big(\hat{q}^r(t),\, I(\lambda)\big) > \delta\Big) \xrightarrow[r\to\infty]{} 0. \tag{24}$$
2. Under the same assumptions as in Part (a), we can also bound the rate of convergence in (24): for any , there exists an such that

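   $$\frac{r\, f(r,\epsilon)}{g(r)}\, \mathbb{P}\Big(\sup_{t\in[0,T]} d\big(\hat{q}^r(t),\, I(\lambda)\big) > \delta\Big) \xrightarrow[r\to\infty]{} 0. \tag{25}$$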

Moreover, for the case of a MW policy, (25) holds for every , where is the sensitivity constant of the network (cf. Theorem 2) and is the constant in Lemma 2.

3. Conversely, suppose that and , are such that , for every , and . Then, for any network operating under a MW policy, any arrival rate in its capacity region (excluding its extreme points), and any , there exists an -tailed sequence of arrival processes satisfying (22) and a corresponding sequence of -scaled processes , , initialized at , such that

   $$\mathbb{P}\Big(\sup_{t\in[0,T]} d\big(\hat{q}^r(t),\, I(\lambda)\big) > \delta\Big) \xrightarrow[r\to\infty]{} 1, \tag{26}$$

for all .

The proof of Theorem 3 is given in Section 6. The first part relies on the facts that the queue length process stays close to a fluid solution (Theorem 2), and that a fluid solution is attracted to the invariant set (Lemma 2). The proof of the converse relies on an explicit construction.

We note that Part (a) is a straightforward corollary of Part (b). Nevertheless, we have included the statement of Part (a) because it is in a form comparable to SSC results in the literature, and also because it facilitates a comparison with the converse result in Part (c).

Theorem 3 ties together the time scaling over which SSC occurs and the concentration rate function, , of the arrival processes. The underlying intuition is that if the queue length process is initialized sufficiently close to , then it will stay in a -neighbourhood of , with high probability, for a period of time proportional to . This enables us to prove additive SSC over time scales much longer than those underlying the diffusion scaling, as in the next subsection.

### 4.3 Special Cases of SSC

In this section, we apply Theorem 3 to obtain more concrete SSC results. The first result concerns SSC over an exponentially large time scale. While it refers to bounded i.i.d. processes, it admits straightforward extensions to arrival processes with a concentration rate function that grows exponentially with , as is the case whenever a suitable Large Deviations Principle holds.

###### Corollary 2 (Bounded I.I.D. Arrivals: SSC over an Exponential Time Scale).

Consider a network operating under a MW policy, a vector in its capacity region, a , and a sequence of arrival processes that satisfy the assumptions of Lemma 3. Consider a , where is the input sensitivity constant of the network, is the constant in Lemma 2, and is an upper bound on the size of arriving jobs (cf. Lemma 3). Consider the -scaling of the queue length processes,

 (27)

and suppose that , as . Then, for any ,

$$e^{\gamma r}\, \mathbb{P}\Big(\sup_{t\in[0,T]} d\big(\hat{q}^r(t),\, I(\lambda)\big) > \delta\Big) \xrightarrow[r\to\infty]{} 0. \tag{28}$$

###### Proof.

Let . Then, , and Lemma 3 implies that is an -tailed sequence of processes for . Let . Then, . Let be the time scaling in the definition (27) of . Then,

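$$\frac{r\, f(r,\epsilon)}{g(r)} = \frac{r \exp\!\big(2\beta r \epsilon / n a^2\big)}{\exp\!\big(\beta r \epsilon / n a^2\big)} = r \exp\!\big(\beta r \epsilon / n a^2\big) > \exp(\gamma r). \tag{29}$$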

Therefore, the assumptions in Part (b) of Theorem 3 are satisfied, and

 (30)

Thus, (28) holds, which is the desired result. ∎

We note that Part (c) of Theorem 3 provides a partial converse to Corollary 2: under i.i.d. arrivals with nonzero variance, additive SSC does not hold over a super-exponential time scale.

The next corollary of Theorem 3(a) concerns additive SSC under diffusion scaling.

###### Corollary 3 (State Space Collapse in Diffusion Scaling).

Consider a network operating under a WMW policy, and a function such that , for all . Consider a in the capacity region, an -tailed sequence of arrivals with limit mean , and a corresponding sequence of diffusion-scaled queue length processes (cf. (16)). Suppose that , as . Then, for any ,

$$\mathbb{P}\Big(\sup_{t\in[0,T]} d\big(\hat{q}^r(t),\, I(\lambda)\big) > \delta\Big) \xrightarrow[r\to\infty]{} 0. \tag{31}$$

Corollary 3 strengthens Theorem 1, for the case of WMW policies, in that the assumption of i.i.d. arrivals is removed. We only require a concentration property for the arrival process, such as

 (32)

which is even weaker than Assumption 2. Moreover, under a MW policy and i.i.d. arrivals, our Corollary 2 extends Theorem 1 by establishing SSC over an exponential time scale (as opposed to the diffusion scaling). For further perspective with respect to existing results, please refer to the discussion following the statement of Theorem 1, in Section 2.4.

## 5 Proof of Theorem 2

In this section, we present the proof of Theorem 2, organized in a sequence of subsections. We first show in Subsection 5.1 that for any network operating under a WMW policy, there is another network operating under a MW policy whose queue length process is a linear transformation of the queue length process of the original network. Thus, we can focus on the MW policy. In Subsection 5.2, we review a general sensitivity result for a class of dynamical systems with piecewise constant drift. Next, in Subsection 5.3, we introduce an induced continuous-time dynamical system that provides the bridge between the original discrete-time process under a MW policy and the fluid model. The proof concludes in Subsection 5.4 by applying the general sensitivity result to the induced system.

### 5.1 From WMW to MW

In order to leverage the tools that we will develop for MW policies and apply them to the more general WMW policies, we start with a reduction from WMW policies to a MW policy. This is accomplished through the following lemma, which shows that the queue lengths and fluid solutions under a WMW policy are linear transformations of queue lengths and fluid solutions under a MW policy, in a transformed network.

###### Lemma 4 (Reduction of WMW Dynamics to MW Dynamics).

Consider a network with action set and a routing matrix . Fix a weight vector , an arrival function , and an arrival rate vector . Let be a queue length process of corresponding to the arrival , under a -WMW policy. Let , , and , for all . Let be a network with action set and routing matrix . Then,

1. is a queue length process of corresponding to the arrival , under a MW policy.

2. is a fluid solution of corresponding to arrival rate and unit weights (as in MW) if and only if is a fluid solution of corresponding to arrival rate and WMW weights .

###### Proof.

Given some and , we let and . Then,

$$\tilde{Q}^T (I-\tilde{R})\tilde{\mu} = \big(W^{1/2}Q\big)^T \big(I - W^{1/2} R W^{-1/2}\big) W^{1/2}\mu = Q^T W^{1/2}\, W^{1/2} (I-R) W^{-1/2}\, W^{1/2}\mu = Q^T W (I-R)\mu.$$

Therefore, is a maximizer of if and only if is a maximizer of , i.e., .

For Part (a), for any ,

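$$\begin{aligned}
\tilde{Q}(t+1) &= W^{1/2} Q(t+1) \\
&= W^{1/2}\Big(Q(t) + A(t) + (R-I)\min\big(\mu(t), Q(t)\big)\Big) \\
&= \tilde{Q}(t) + \tilde{A}(t) + W^{1/2}(R-I)W^{-1/2} \min\big(W^{1/2}\mu(t),\, W^{1/2}Q(t)\big) \\
&= \tilde{Q}(t) + \tilde{A}(t) + (\tilde{R}-I)\min\big(\tilde{\mu}(t), \tilde{Q}(t)\big).
\end{aligned}$$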

Therefore, satisfies the evolution rule (4) of , and is a queue length process corresponding to the arrival function . Since evolves according to a -WMW policy, we have . As shown earlier, this implies that , and thus indeed follows a MW policy.

For Part (b), consider a set of functions and for , that together with satisfy (6)–(10). It is not difficult to see that all equations remain valid when , , , , , and are replaced with , , , , , and , respectively. The reverse direction is also true. Therefore, is a fluid solution of corresponding to the arrival rate vector , with unit weights, if and only if is a fluid solution of corresponding to the arrival rate vector , with weight vector . ∎
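The identity at the heart of the proof of Lemma 4 can be checked numerically. The sketch below assumes a diagonal weight matrix `W = diag(w)` with positive entries, and uses the transformation that appears in the displayed computation (scaling queue and action vectors by `W^{1/2}` and conjugating the routing matrix). The function names and the two-queue example are hypothetical.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def matvec(M, v):
    return [dot(row, v) for row in M]

def wmw_weight(Q, w, R, mu):
    """Q^T W (I - R) mu, the quantity maximized by a w-WMW policy."""
    n = len(Q)
    i_minus_r = [[(1.0 if i == j else 0.0) - R[i][j] for j in range(n)]
                 for i in range(n)]
    return dot([w[i] * Q[i] for i in range(n)], matvec(i_minus_r, mu))

def transformed_mw_weight(Q, w, R, mu):
    """Q~^T (I - R~) mu~ in the transformed network (unit weights)."""
    n = len(Q)
    s = [math.sqrt(x) for x in w]                  # diagonal of W^{1/2}
    q_t = [s[i] * Q[i] for i in range(n)]          # Q~  = W^{1/2} Q
    mu_t = [s[i] * mu[i] for i in range(n)]        # mu~ = W^{1/2} mu
    i_minus_rt = [[(1.0 if i == j else 0.0) - s[i] * R[i][j] / s[j]
                   for j in range(n)] for i in range(n)]  # I - W^{1/2} R W^{-1/2}
    return dot(q_t, matvec(i_minus_rt, mu_t))

# Hypothetical two-queue example.
Q = [3.0, 1.0]
w = [4.0, 9.0]
R = [[0.0, 0.0], [1.0, 0.0]]
S = [[1.0, 0.0], [0.0, 1.0]]

original = [wmw_weight(Q, w, R, mu) for mu in S]
transformed = [transformed_mw_weight(Q, w, R, mu) for mu in S]
print(original, transformed)
```

Since the two objective values agree action by action, the maximizing actions agree as well, which is the step used to conclude that the transformed process follows a MW policy.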

### 5.2 FPCS Dynamical Systems

In this subsection, we review some definitions and results from [29]. A dynamical system is identified with a set-valued function and the associated differential inclusion . We start with a formal definition, which allows for the presence of perturbations.

###### Definition 4 (Trajectories of a Dynamical System).

Consider a dynamical system , and let be a right-continuous function, which we refer to as the perturbation. Suppose that and are measurable functions of time that satisfy

$$X(t) = \int_0^t \zeta(\tau)\, d\tau + U(t), \qquad \forall\, t \ge 0, \tag{33}$$

$$\zeta(t) \in F\big(X(t)\big), \qquad \forall\, t \ge 0. \tag{34}$$

We then call a perturbed trajectory corresponding to . In the special case where is identically zero, we also refer to as an unperturbed trajectory.

For a convex function , we denote its subdifferential by . We say that is a subgradient dynamical system if there exists a convex function , such that for any , . Furthermore, if is of the form

$$\Phi(x) = \max_i \big(-\mu_i^T x + b_i\big),$$

for some , , and with ranging over a finite set, we say that is a Finitely Piecewise Constant Subgradient (FPCS, for short) system. Note that for such systems, is always equal to the convex hull of the vectors that maximize .

FPCS systems admit a very special sensitivity bound.

###### Theorem 4 ([29] Theorem 1).

Consider an FPCS system . Then, there exists a constant such that for any unperturbed trajectory , and for any perturbed trajectory with corresponding perturbation and the same initial conditions , we have

$$\big\| X(t) - x(t) \big\| \le C \sup_{\tau \le t} \big\| U(\tau) \big\|, \qquad \forall\, t \in \mathbb{R}_+. \tag{35}$$

Moreover, for any , the bound (35) applies to the (necessarily FPCS) system with the same constant .

### 5.3 Reduction of the MW Dynamics to an FPCS System

Throughout this subsection, we restrict attention to a network operated under an (unweighted) MW policy. In order to take advantage of Theorem 4, we show that a discrete-time network can also be represented as an associated (“induced”) FPCS dynamical system.

###### Definition 5 (Induced FPCS system).

For a network with action set and routing matrix , the induced FPCS system is the subgradient dynamical system associated with the convex function

$$\Phi(x) = \max_{\mu \in S} \big((I-R)\mu\big)^T x. \tag{36}$$

In particular, is the convex hull of the image of under the linear transformation , where is the set of vectors that maximize .
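To make Definition 5 concrete, the following sketch evaluates the convex function in (36) for a finite action set and returns the actions that attain the maximum. The function name `induced_phi`, the tolerance parameter, and the single-hop example (zero routing matrix) are assumptions made for illustration. The sketch does not construct the set-valued map of the induced system itself, only the maximizers from which it is built.

```python
def induced_phi(x, S, R, tol=1e-12):
    """Evaluate Phi(x) = max over mu in S of ((I - R) mu)^T x, cf. (36).

    Returns the value of Phi(x) together with the list of maximizing actions.
    """
    n = len(x)

    def slope(mu):
        # The vector (I - R) mu associated with action mu.
        return [mu[i] - sum(R[i][j] * mu[j] for j in range(n)) for i in range(n)]

    values = [sum(g * xi for g, xi in zip(slope(mu), x)) for mu in S]
    phi = max(values)
    # Keep every action whose value is within tol of the maximum.
    maximizers = [mu for mu, v in zip(S, values) if v >= phi - tol]
    return phi, maximizers

# Hypothetical single-hop example: two queues, one served at a time.
S = [[1, 0], [0, 1]]
R = [[0, 0], [0, 0]]

print(induced_phi([2, 5], S, R))  # the longer queue determines the maximizer
print(induced_phi([3, 3], S, R))  # a tie: both actions are maximizers
```

At a tie, several actions are maximizers at once, which is exactly where the induced system is set-valued rather than single-valued.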

We start with the observation that fluid solutions of a network are trajectories of the induced FPCS system.

###### Proposition 2 (Fluid Model Solutions as Trajectories of the Induced FPCS System).

Consider a network and its induced FPCS system . Let be a fluid solution of the network corresponding to arrival rate . Then, is an unperturbed trajectory of the dynamical system . Conversely, any unperturbed trajectory of , with , is a fluid solution corresponding to .

###### Proof.

For a vector and a set of indices, we let

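$$D_J(\mu) \triangleq \Big\{ \xi \in \mathbb{R}^n_+ \;\Big|\; \xi_i = \mu_i \text{ for all } i \notin J, \text{ and } 0 \le \xi_j \le \mu_j \text{ for all } j \in J \Big\}. \tag{37}$$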

Equivalently,

$$D_J(\mu) = \mathrm{Conv}\Big(\big\{ \sigma_{-K}(\mu) \;\big|\; K \subseteq J \big\}\Big), \tag{38}$$

where is a vector whose th entry is equal to the th entry of if , and equal to zero if . Recall that is defined as the set of all that maximize ; cf. (3).
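The equivalence between (37) and (38) can be illustrated with a small sketch: a membership test for the set in (37), and an enumeration of the generating vectors in (38), each obtained from the given vector by zeroing the entries indexed by a subset of the index set. The function names and the example are hypothetical, and indices are 0-based here, unlike in the text.

```python
from itertools import combinations

def in_D(xi, mu, J):
    """Membership test for D_J(mu) as defined in (37)."""
    return all(
        (0 <= xi[i] <= mu[i]) if i in J else (xi[i] == mu[i])
        for i in range(len(mu))
    )

def generators_D(mu, J):
    """Vectors generating D_J(mu) in (38): zero out the entries in K, for K a subset of J."""
    J = sorted(J)
    gens = []
    for size in range(len(J) + 1):
        for K in combinations(J, size):
            gens.append([0 if i in K else mu[i] for i in range(len(mu))])
    return gens

mu = [2, 3, 1]
J = {0, 2}

gens = generators_D(mu, J)
print(gens)                          # 2^|J| = 4 vectors
print(in_D([1, 3, 0.5], mu, J))      # average of the generators: inside
print(in_D([1, 2, 0.5], mu, J))      # entry 1 is not in J and differs from mu: outside
```

Every generator passes the membership test, and so does any convex combination of them, since the constraints in (37) are linear equalities and box constraints.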

###### Claim 1.

Fix a and a . Let