Some Properties of Length Rate Quotient Shapers

07/11/2021 ∙ by Yuming Jiang, et al.

Length Rate Quotient (LRQ) is the first algorithm of interleaved shaping – a novel concept proposed to provide per-flow shaping for a flow aggregate without per-flow queuing. This concept has been adopted by Time-Sensitive Networking (TSN) and Deterministic Networking (DetNet). An appealing property of interleaved shaping is that, when an interleaved shaper is appended to a FIFO system, it does not increase the worst-case delay of the system. Based on this "shaping-for-free" property, an approach has been introduced to deliver bounded end-to-end latency. Specifically, at each output link of a node, class-based aggregate scheduling is used together with one interleaved shaper per input link and per class, and the interleaved shaper re-shapes every flow to its initial traffic constraint. In this paper, we investigate other properties of interleaved LRQ shapers, particularly as stand-alone elements. In addition, under the per-flow setting, we also investigate per-flow LRQ based flow aggregation and derive its properties. The analysis focuses directly on the timing of operations, such as shaping and scheduling, in the network. This timing based method can be found in the Guaranteed Rate (GR) server model and, more generally, the max-plus branch of network calculus. With the derived properties, we not only show that an improved end-to-end latency bound can be obtained for the current approach, but also demonstrate with two examples that new approaches may be devised. End-to-end delay bounds for the three approaches are derived and compared. As a highlight, the two new approaches do not require different node architectures in allocating (shaping / scheduling) queues, which implies that they can be readily adapted for use in TSN and DetNet. This, together with the derived properties of LRQ, sheds new light on providing the TSN / DetNet qualities of service.


1 Introduction

Interleaved shaping is a novel concept for traffic shaping, originally proposed by Specht and Samii in [24]. Conceptually, the idea is to perform per-flow traffic shaping within a flow aggregate using only one FIFO queue. An appealing property of interleaved shaping is the so-called “shaping-for-free” property: when an interleaved shaper is appended to a FIFO system and shapes flows to their initial traffic constraints, it does not increase the worst-case delay of the system. Based on this property, Specht and Samii also proposed in [24] an approach to achieve bounded worst-case end-to-end (e2e) delay in the network. The approach combines a specific way of allocating shaping and scheduling queues in switches with re-shaping flows to their initial traffic constraints using the corresponding interleaved shaping algorithms.

The concept of interleaved shaping, together with the approach of allocating queues and reshaping traffic, has been adopted and extended by IEEE Time-Sensitive Networking (TSN) [12] and IETF Deterministic Networking (DetNet) [7] to deliver bounded e2e latency. The concept is called Asynchronous Traffic Shaping (ATS) in the former [13] and Interleaved Regulation in the latter [6].

In [24], two algorithms for interleaved shaping are introduced, namely Length Rate Quotient (LRQ) and Token Bucket Emulation (TBE), together with a timing-based analysis on the worst-case e2e delay achieved by them. While LRQ is for traffic constraints where the gap between consecutive packets satisfies a length rate quotient condition, TBE is for the well-known token bucket (TB) or leaky bucket (LB) traffic constraints. In [21], more types of traffic constraints are investigated under a unified traffic constraint concept called “Pi-regularity”, and the resultant interleaved shapers are called Interleaved Regulators (IRs). The “shaping-for-free” property is also proved for IRs in [21].

Surprisingly, other than the “shaping-for-free” property, few other properties of interleaved shapers have been reported. This paper is intended as a step towards filling that gap. Specifically, we focus on LRQ, the first interleaved shaping algorithm, and derive its properties under both the interleaved setting and the per-flow setting. For interleaved LRQ shapers, in addition to “shaping-for-free”, the proved properties include conformance, output characterization, a sufficient and necessary condition for the existence of bounded delay, service characterization, and delay bounds. For per-flow LRQ shapers, in addition to the properties inherited as a special case of the interleaved version, we particularly investigate properties of a per-flow LRQ based flow aggregator.

Similar to the analysis in [24], ours also employs timing based analysis, which directly investigates the timing of various operations, such as shaping and scheduling, in the network and the time relationships between them. Generally, this timing based analysis method can be found in the max-plus branch of network calculus (NC) [3] [20]. In this paper, for server modeling, rather than taking the (min-plus) service curve model [20] or its max-plus NC counterpart [3], we particularly adopt the Guaranteed Rate (GR) server model [9, 11], based on which the various properties are derived. An underlying motivation is the known fact that, without additional treatment, the delay bound obtained directly from service curve models is looser than that from GR; see, e.g., [3] [20] for discussion about the treatment and [19] for a timing analysis based discussion of the underlying reason and its impacts.

With the derived properties, we show that an improved e2e delay bound can be obtained for the approach proposed in [24], in comparison with related bounds found in the literature, e.g., [24] for TSN ATS [13] and [23] for DetNet [6]. This improvement is due to the adopted GR-based timing analysis. In [23], and indeed in most TSN / DetNet delay bound analysis literature as reviewed and discussed in [28], the analysis is based on the service curve server model. To illustrate the difference, strict priority is specifically used as an example, whose delay bounds, obtained using the service curve model [23], the timing method in [24] and the adopted GR model, are compared.

In addition, we demonstrate with two examples that new approaches, based on the derived properties, may be devised which can also deliver bounded e2e latency. A comparison of the e2e delay bounds from the three approaches suggests that, by employing specific information of the network, the accordingly devised approaches may be able to offer better e2e delay bounds, in comparison to the universal approach [24].

The rest of this paper is organized as follows. In the next section, i.e., Section 2, the LRQ interleaved shaping algorithm and its modeling are first introduced, followed by some other preliminaries. These include the traffic and server models that are used in the analysis and/or comparison; our focal server model is GR. In Section 3, the focus is on properties of interleaved LRQ. In Section 4, properties of per-flow LRQ are investigated. In Section 5, the node structure suggested by the universal approach is introduced. Following that, an improved delay bound for the current e2e delay approach is presented, with strict priority as an example scheduling discipline to discuss the improvement. Then, to demonstrate how the derived properties may be exploited, two new approaches with their e2e delay bounds are presented. Moreover, a discussion comparing the three approaches and their delay bounds is also provided in Section 5. Finally, concluding remarks are given in Section 6.

2 The LRQ Algorithm and Preliminaries

2.1 Notation

We consider FIFO systems serving flows that belong to the same class. A flow is a sequence of packets. Each system has one or multiple flows as inputs and outputs. In the case of multiple flows, we sometimes treat these flows as one aggregate flow. By convention, a packet is said to have arrived to a system at the input (respectively departed the system at the output) when and only when its last bit has arrived to (respectively departed) the system. If multiple packets arrive at the same time, their original order, if it exists, is preserved; otherwise, the tie is broken arbitrarily. When a packet arrives to find the system busy, the packet will be queued, and the buffer size for the queue is assumed to be large enough to ensure no packet loss. All queues are FIFO and are initially empty.

For a system, let $\mathcal{F}$ denote the set of flows. For each flow $f \in \mathcal{F}$, let $p_f^j$ denote the $j$-th packet in the sequence, where $j \ge 1$, and $l_f^{\max}$ its maximum packet length. For every packet $p_f^j$, we denote by $a_f^j$ its arrival time to the system, $d_f^j$ its departure time from the system, and $l_f^j$ its length. The maximum packet length of the system is denoted by $l^{\max}$. In addition, we use $A_f(s,t)$ and $D_f(s,t)$ to respectively denote the amount of traffic of flow $f$ which arrives to and departs from the system within time period $(s,t]$, with $A_f(t) \equiv A_f(0,t)$ and $D_f(t) \equiv D_f(0,t)$, and $A_f(0) = 0$ and $D_f(0) = 0$.

Sometimes, reference time functions are used to characterize how the flow is treated by the system. Specifically, we use $E_f^j$ and $F_f^j$ to respectively refer to the times when the packets have reached the head of the queue and become eligible for receiving service, and the times when packets are expected to depart. They define the reference eligible time and expected finish time for each packet of the flow $f$.

For a composite system consisting of multiple systems, subscripts will be added. For instance, for a node $h$ in a network, and for a link $o$ at the node $h$, $x_h$ and $x_{h,o}$ will be respectively used, where $x$ may be any of the parameters introduced above.

As a summary, the notation uses the form $x_{h,o;f}^j$, where $h$ represents a system, e.g., a node, and $o$ a subsystem in $h$, e.g., a link; $f$ represents a flow, and $j$ the packet number in the flow. Here $x$ may be a packet level parameter, e.g., packet arrival time $a$, departure time $d$, length $l$; a flow level parameter, e.g., rate $r$ and burstiness $b$ for the traffic constraint and reserved or guaranteed service rate $R$, and $A$ and $D$ for cumulative traffic amount; or a reference time, e.g., $E$ for eligibility time / virtual start time and $F$ for virtual finish time. When it is clear from the context, some of the superscripts or subscripts may be omitted.

2.2 The LRQ Interleaved Shaping Algorithm

2.2.1 The LRQ algorithm

Length Rate Quotient (LRQ) is the first algorithm of interleaved shaping [24]. Consider an LRQ shaper, whose FIFO queue is shared by an aggregate of flows. The LRQ shaper performs per-flow interleaved shaping on the aggregate, according to the algorithm shown in Algorithm 1 [24].

Initialization: $E_f := 0$ for all $f \in \mathcal{F}$
Shaping:

1: while (true) {
2:   wait until $Q \neq \emptyset$;
3:   $p :=$ dequeue($Q$);
4:   $f :=$ flow($p$);
5:
6:   wait until $now \ge E_f$; output $p$;
7:
8:   $l :=$ length($p$);
9:   $E_f := now + l / r_f$;
10: }
Algorithm 1 Pseudo code of the LRQ algorithm

The LRQ algorithm shown in Algorithm 1 takes the original form in [24]. As is clear from Algorithm 1, there is only one FIFO queue, where per-flow shaping is conducted. In Algorithm 1, $Q$ denotes the queue of the shaper, where packets join in the order of their arrival times. After reaching the head of the queue, the head packet $p$ is checked for its eligibility for output from the queue, which depends on the flow $f$ that it belongs to. The time stamp $E_f$ stores the eligibility time of flow $f$ for its next packet. At the output time of packet $p$, the time stamp is updated to equal the present / output time plus the quotient $l / r_f$, where $l$ is the length of $p$. In this way, the next packet of flow $f$ after this packet is delayed at least until the time reaches $E_f$.
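To make the loop concrete, the algorithm can be sketched as an offline simulator in Python. This is an illustrative sketch, not from the paper: the function name and the `(arrival_time, flow_id, length)` trace format are our own conventions.

```python
def interleaved_lrq(packets, rates):
    """Offline simulation of an interleaved LRQ shaper with one FIFO queue.

    packets: list of (arrival_time, flow_id, length), sorted by arrival time.
    rates:   dict mapping flow_id -> shaping rate r_f.
    Returns the departure times, in FIFO (arrival) order.
    """
    eligible = {f: 0.0 for f in rates}   # per-flow eligibility time E_f
    now = 0.0                            # time of the previous output (FIFO)
    departures = []
    for arrival, flow, length in packets:
        # the head packet leaves at the latest of: its arrival time, the
        # previous departure (FIFO order), and its flow's eligibility time
        now = max(now, arrival, eligible[flow])
        departures.append(now)
        # hold this flow's next packet for the length rate quotient l / r_f
        eligible[flow] = now + length / rates[flow]
    return departures
```

Note how a packet of one flow that is held back also delays later-arriving packets of other flows behind it in the FIFO queue, which is exactly the head-of-line coupling that makes interleaved shaping non-trivial to analyze.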

2.2.2 A model for LRQ

To model the LRQ algorithm, let $n$ denote the packet number of $p$ in Algorithm 1, i.e., $p^n$ is the $n$-th packet of the aggregate flow coming out of the queue in Line 3. In addition, let $a^n$ and $d^n$ denote the arrival time and output / departure time of the packet. Furthermore, let $f(n)$ denote the flow where the packet is from, $s(n)$ its packet number in this flow $f(n)$, and $E_{f(n)}^{s(n)}$ the eligibility time of packet $p_{f(n)}^{s(n)}$, i.e., the $s(n)$-th packet of flow $f(n)$.

Line 6 tells that, under the condition implied by Line 2, LRQ outputs the packet immediately when the present time reaches the eligibility time of the packet $p^n$. In other words, the output time is not earlier than the eligibility time, i.e., $d^n \ge E_{f(n)}^{s(n)}$. The condition of Line 2 is that the packet must have already arrived, i.e., $d^n \ge a^n$. In addition, the loop, particularly the two highlighted lines, Lines 2 and 6, implies that the FIFO order is preserved when outputting packets, or in other words, $d^n \ge d^{n-1}$. Combining these, we have

$d^n = \max\left( a^n,\; d^{n-1},\; E_{f(n)}^{s(n)} \right) \qquad (1)$

with the initialization condition $d^0 = 0$ and $E_f^1 = 0$ for all $f$, since the queue is initially empty, where the eligibility time function is updated according to Lines 8 and 9 as: $\forall j \ge 1$,

$E_f^{j+1} = d_f^j + \frac{l_f^j}{r_f} \qquad (2)$

2.2.3 Remark on model difference

The concept of interleaved shaping has been extended to consider other shaping constraints, such as the token-bucket constraint [24] [13] and the “Pi-regularity” constraint [21], and has been adopted by IEEE TSN [13] and IETF DetNet [7]. In these standards, as well as in the modeling work [21], the interleaved shaping algorithms directly take (1) as the form, where the eligibility time function (2) is adapted according to the targeted shaping constraint. Specifically, IEEE Standard 802.1Qcr [13] defines corresponding per-flow time functions for (1) and (2).

In the modeling work [21], the introduced Pi-function is indeed the function (2) here. For interleaved LRQ, the Pi-function has the following expression:

(3)

As a highlight, the initial condition for the Pi-function in [21] is different from the initial condition used by the original LRQ algorithm [24] and the model (1) above. Also in [21], this initial condition is argued to be necessary for its proposed “Pi-regularity” traffic constraint model.

2.3 Flow and Server Models

2.3.1 Flow models

For flows, two specific traffic models are considered. One is the $g$-regularity model [3], also known as the max-plus arrival curve model [25, 22]:

Definition 1

A flow is said to be $g$-regular for some non-negative non-decreasing function $g$ iff for all $1 \le m \le n$, there holds

$a^n \ge a^m + g\left( L^n - L^m \right),$

or equivalently, $\forall n \ge 1$,

$a^n \ge \max_{1 \le m \le n}\left\{ a^m + g\left( L^n - L^m \right) \right\} \qquad (4)$

where $L^n \equiv \sum_{k=1}^{n} l^k$ with $L^0 = 0$, and the right-hand side of (4) is a max-plus convolution of $a$ and $g$; the corresponding operator is called the max-plus convolution operator.

In the case with a constant rate $r$, which is equivalent to $g(x) = x/r$, $\forall x \ge 0$, we also say the flow is LRQ($r$)-constrained.

Another traffic model that will be used is the following (min-plus) arrival curve model.

Definition 2

A flow is said to have a (min-plus) arrival curve $\alpha$, which is a non-negative non-decreasing function, iff the traffic of the flow is constrained by [20], $\forall\, 0 \le s \le t$,

$A(s, t) \le \alpha(t - s),$

or equivalently, $\forall t \ge 0$,

$A(t) \le (A \otimes \alpha)(t) \qquad (5)$

where $(A \otimes \alpha)(t) \equiv \inf_{0 \le s \le t}\{ A(s) + \alpha(t - s) \}$ defines $\otimes$, the min-plus convolution operator.

A special type of arrival curve, which will often be used in the paper, has the form $\alpha(t) = b + r t$. In this case, we will also say that the flow is leaky-bucket or token-bucket $(b, r)$-constrained. The model was first introduced by Cruz in his seminal work [5] that triggered the development of the network calculus theory.
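The token-bucket constraint can be checked directly on a packet trace by testing every window of packets against $b + r \cdot \text{window}$. The following is a minimal sketch with hypothetical names; the small tolerance guards against floating-point error.

```python
def conforms_token_bucket(times, lengths, b, r, eps=1e-9):
    """Check that a packet trace is (b, r)-constrained: over every window
    [times[i], times[j]], the traffic that arrives is at most b + r * window.

    times:   arrival times, non-decreasing.
    lengths: packet lengths (same units as b, e.g. bits).
    """
    # cumulative lengths: cum[k] = total length of the first k packets
    cum = [0.0]
    for l in lengths:
        cum.append(cum[-1] + l)
    n = len(times)
    for i in range(n):
        for j in range(i, n):
            amount = cum[j + 1] - cum[i]          # packets i..j inclusive
            if amount > b + r * (times[j] - times[i]) + eps:
                return False
    return True
```

The quadratic scan over all windows is only for clarity; an online checker would instead track the bucket level incrementally.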

It can be verified that if a flow is LRQ($r$)-constrained, it is also $(b, r)$-constrained with $b = l^{\max}$, its maximum packet length, i.e., having a (min-plus) arrival curve $\alpha(t) = l^{\max} + r t$.

As shown by the two definitions, while the $g$-regularity or max-plus arrival curve model characterizes a flow based on packet arrival times, the (min-plus) arrival curve model does so based on the cumulative traffic amount function. In the literature, e.g., [3, 22], the relationship between the min-plus and max-plus arrival curves has been investigated. Particularly, it has been shown [22] that they can be converted to each other and are dual of each other.

As a highlight, the (min-plus) arrival curve model has a straightforward property which, however, is notoriously hard for the max-plus counterpart: the superposition property. Consider an aggregate flow. If every constituent flow $f$ of the aggregate has an arrival curve $\alpha_f$, the aggregate has an arrival curve $\sum_f \alpha_f$.
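The superposition property is literally a pointwise sum of curves, which the following sketch illustrates (function names are ours, not the paper's):

```python
def leaky_bucket(b, r):
    """(min-plus) arrival curve alpha(t) = b + r*t for t > 0, with alpha(0) = 0."""
    return lambda t: b + r * t if t > 0 else 0.0

def superpose(curves):
    """Superposition property: an aggregate of flows with arrival curves
    alpha_1, ..., alpha_N has the arrival curve sum_i alpha_i."""
    return lambda t: sum(c(t) for c in curves)
```

For example, aggregating a (1000, 1e6) flow and a (500, 2e6) flow yields a (1500, 3e6) leaky-bucket curve: the bursts and rates simply add.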

2.3.2 Server models

For server modeling, define two reference time functions $S$ and $F$ iteratively as: $\forall n \ge 1$,

$S^n = \max\left( a^n,\; F^{n-1} \right) \qquad (6)$
$F^n = S^n + \frac{l^n}{R} \qquad (7)$

with $F^0 = 0$, and where $R$ denotes the reference service rate. Later, $S$ will also be referred to as the eligibility time or virtual start time (VST) function, and $F$ the virtual finish time (VFT) function.

Consider a physical link of rate $R$ serving a flow that inputs at one end of the link. Observe the output at the other end and ignore the propagation delay. Then, $S^n$ is the time that the first bit of packet $p^n$ starts to exit the link, and $F^n$ the time that its last bit finishes departing the link.

The following relationship between the functions $S$ and $F$ can be easily verified, e.g., see [15]: $\forall n \ge 1$,

$S^n \;\le\; F^n \;\le\; S^n + \frac{l^{\max}}{R} \qquad (8)$

These functions, VST and VFT, have been used as the basis in designing scheduling algorithms and in modeling the service provided by a system. The designed scheduling algorithms include Virtual Clock, which uses the VFT [27], and Start-time Fair Queueing, which uses the VST [10]. The models include the Guaranteed Rate (GR) server [9] and its generalized version [11], and the Start-Time (ST) server [15], which are respectively based on VFT and VST.
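The VST/VFT recursions of (6) and (7) are simple to compute for a given trace; the following sketch (our own naming) makes the iteration explicit:

```python
def virtual_times(arrivals, lengths, rate):
    """Compute virtual start times (VST) and virtual finish times (VFT)
    for a packet sequence served at reference rate `rate`:
        S(n) = max(a(n), F(n-1)),  F(n) = S(n) + l(n)/rate,  F(0) = 0.
    """
    S, F = [], []
    prev_finish = 0.0
    for a, l in zip(arrivals, lengths):
        s = max(a, prev_finish)   # packet cannot start before it arrives,
        f = s + l / rate          # ... nor before the previous one finishes
        S.append(s)
        F.append(f)
        prev_finish = f
    return S, F
```

On a busy period the VFTs advance by exactly one transmission time per packet; after an idle gap they restart from the arrival time, as the test below shows.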

In this paper, the Guaranteed Rate (GR) server model is adopted.

Definition 3

A system is said to be a Guaranteed Rate (GR) server with guaranteed rate $R$ and error term $E$ to a flow, written as $GR(R, E)$, iff it guarantees that for any packet $p^n$ of the flow, its departure time satisfies [9, 11]:

$d^n \le F^n + E \qquad (9)$

or equivalently

$d^n \le \max_{1 \le m \le n}\left\{ a^m + \sum_{k=m}^{n} \frac{l^k}{R} \right\} + E \qquad (10)$

with $F^0 = 0$, where the maximization in (10) is a max-plus convolution.

It has been shown that a wide range of scheduling algorithms, including priority, weighted fair queueing and its various variations, round robin and its variations, hierarchical fair queueing, Earliest Due Date (EDD) and rate-controlled scheduling disciplines (RCSDs), can be modeled using GR [9, 11]. For this reason and to simplify the representation, instead of presenting results for schedulers implementing specific scheduling algorithms, we use the GR model to represent them. A summary of the corresponding GR parameters of various scheduling algorithms can be found, e.g., in [15].

Considering the relationship (8), a server model may similarly be defined based on $S$, which is called the Start-Time (ST) server model, written as $ST(R, E)$: a system is an $ST(R, E)$ server iff for any packet $p^n$ of the flow, the system guarantees its departure time [15]:

$d^n \le S^n + E \qquad (11)$

or equivalently

$d^n \le \max_{1 \le m \le n}\left\{ a^m + \sum_{k=m}^{n-1} \frac{l^k}{R} \right\} + E \qquad (12)$

with $S^0 = 0$ and $l^0 = 0$.

As indicated by the max-plus convolution operator used in (10) and (12), these models are server models for the max-plus part of network calculus [3]. In the min-plus part of network calculus, the (min-plus) service curve model is well-known. The latency-rate type (min-plus) service curve is defined as follows.

Definition 4

A system is said to offer to a flow a latency-rate service curve $\beta_{R,T}$ iff for all $t \ge 0$ [20],

$D(t) \ge (A \otimes \beta_{R,T})(t) \qquad (13)$

where $\beta_{R,T}(t) = R \, [t - T]^+$ and $[x]^+ \equiv \max(x, 0)$.

In [20, 15], the relationship between the GR model, the ST model, the latency-rate server model and the (min-plus) latency-rate service curve has been investigated. Particularly, it is shown [15] that the latency-rate server model is equivalent to the start-time (ST) server model. With the relation (8), it can be verified that if a system is a $GR(R, E)$ server to a flow, it is also an $ST\!\left(R, E + \frac{l^{\max}}{R}\right)$ server and provides a latency-rate service curve to the flow [20, 15]:

$\beta_{R,\,T} \text{ with } T = E + \frac{l^{\max}}{R} \qquad (14)$

Conversely, if the system is an $ST(R, E)$ or latency-rate $\beta_{R,E}$ server to the flow, it is also a $GR(R, E)$ server to the flow [15].

2.3.3 Delay and backlog bounds

With the flow and server models introduced above, the following delay and backlog bounds can be found or proved from literature results, e.g., [11, 20].

Proposition 1

Consider a flow served by a system. The flow has an arrival curve $\alpha$, and the system is a $GR(R, E)$ server to the flow. If $\lim_{t \to \infty} \alpha(t)/t \le R$, the delay of any packet $p^n$, i.e., $d^n - a^n$, is upper-bounded by, $\forall n \ge 1$,

$d^n - a^n \le \sup_{t \ge 0}\left\{ \frac{\alpha(t)}{R} - t \right\} + E,$

and the backlog of the system at any time, i.e., $B(t) \equiv A(t) - D(t)$, is upper-bounded by, $\forall t \ge 0$,

$B(t) \le \sup_{s \ge 0}\left\{ \alpha(s) - R \, [s - E]^+ \right\}.$

As a special case, suppose the flow is $(b, r)$-constrained, i.e., $\alpha(t) = b + r t$. If $r \le R$, the bounds in Proposition 1 can be written more explicitly as, $\forall n \ge 1$,

$d^n - a^n \le \frac{b}{R} + E \qquad (15)$

for delay and, $\forall t \ge 0$,

$B(t) \le b + r \cdot E \qquad (16)$

for backlog.

In the TSN / DetNet literature, delay and backlog bounds are commonly derived based on the assumption that the flow has a (min-plus) arrival curve and the server has a latency-rate (min-plus) service curve [28], except in the initial interleaved shaping paper [24], which adopts a timing analysis technique directly on the reference time functions, similar to our analysis in this paper. It has also been noticed that the delay bounds from the service curve analysis are more pessimistic than those from the timing based analysis [28]. This difference is also seen here, as discussed in the following.

Specifically, service curve based analysis can result in a delay bound that is larger than the bound from GR-based analysis shown in Proposition 1. The difference is due to the extra $\frac{l^{\max}}{R}$ term in the service curve characterization, as shown in (14). By exploiting an advanced property of network calculus (NC), namely that “the last packetizer can be ignored for delay computation” (see e.g. [20]), the packetizer delay can be deducted from the service curve based delay bound. However, considering that the delay bound must hold for all packets, only $\frac{l^{\min}}{R}$, with $l^{\min}$ the minimum packet length, may thus be extracted. Consequently, the “improved” service curve based delay bound becomes:

$\frac{b}{R} + E + \frac{l^{\max} - l^{\min}}{R}.$

Its difference from the GR-based bound is thus reduced to $\frac{l^{\max} - l^{\min}}{R}$.

As a remark, the discussion on the delay bound difference is only based on the server models themselves. When delay bound analysis is conducted on a specific scheduling discipline, the GR-based analysis may benefit additionally. As an example, strict priority will be considered and the bounds derived from different approaches be compared in Section 5.3.1.
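The gap between the two bounds can be made concrete with a small sketch, assuming the $(b, r)$ / $GR(R, E)$ parameterization used above (the function names are ours):

```python
def gr_delay_bound(b, R, E):
    """GR-based delay bound for a (b, r)-constrained flow served by a
    GR(R, E) server, assuming r <= R: b/R + E (the form of (15))."""
    return b / R + E

def sc_delay_bound(b, R, E, l_max, l_min):
    """Service-curve-based bound: the latency gains l_max/R via (14), of
    which only l_min/R can be removed by ignoring the last packetizer."""
    return b / R + E + (l_max - l_min) / R
```

With, say, $b = 1000$ bits, $R = 10^6$ bit/s, $E = 1$ ms, $l^{\max} = 1500$ and $l^{\min} = 64$ bytes-worth of bits, the service-curve bound exceeds the GR bound by exactly $(l^{\max} - l^{\min})/R$, independent of the burst and error term.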

3 Properties of Interleaved LRQ Shapers

In this section, we first review the “shaping-for-free” property of interleaved shaping and prove it for LRQ without altering the initial condition introduced for the original LRQ algorithm. Then, we prove properties of interleaved LRQ shapers as stand-alone elements, including delay and backlog bounds. In the next section, i.e., Section 4, properties of per-flow LRQ, including per-flow LRQ based flow aggregation, are investigated.

3.1 The “Shaping-for-Free” Property

As introduced in Section 2, functions (1) and (2) capture the essence of the LRQ algorithm. In addition, by adapting (2), interleaved shaping of flows with other traffic constraints can be implemented, for which, a systematic investigation has been conducted in [21].

Applying (2) to (1), we can rewrite and obtain the following model for LRQ: $\forall n \ge 1$,

$d^n = \max\left( a^n,\; d^{n-1},\; d_{f(n)}^{s(n)-1} + \frac{l_{f(n)}^{s(n)-1}}{r_{f(n)}} \right) \qquad (17)$

with the initial condition $d^0 = 0$ and $d_f^0 = l_f^0 = 0$ for all $f$, which is equivalent to the initial condition for (1), since the three involved parameters in (2) are non-negative in nature and every rate $r_f$ is non-zero.

In the literature, “shaping-for-free” is a well known property of per-flow shapers. Specifically, if a shaper is greedy and the initial traffic constraint of the flow is used as the shaping curve, the worst-case delay of the flow in a system composed of the shaper and a server is not increased in comparison with a system of the server only, regardless of the order of the shaper and the server in the combined system. Earlier works include [26] [8], and a more systematic investigation is summarized in [3] and [20].

Under interleaved shaping, the shaping-for-free property was first studied in [24]. In [21], a generalized treatment is provided, where the property is proved for a wide range of traffic constraints, including both Chang’s $g$-regularity and (min-plus) arrival curve constraints.

For LRQ, the shaping-for-free property is summarized in Theorem 1. Figure 1 illustrates a typical setup when studying the shaping-for-free property. As highlighted in Section 2.2.3, the initial condition (3) used in [21], which is considered necessary there, is different from the initial condition of (2) used by the original LRQ algorithm [24]. In this paper, we keep the initial condition of (2) and re-prove the shaping-for-free property for LRQ. To account for the impact of the initial condition, the proof uses strong induction.

Figure 1: The shaping-for-free property setup
Theorem 1

Consider a set of flows $\mathcal{F}$, where every flow $f$ is LRQ($r_f$)-regulated. These flows pass through a system composed of a FIFO server and an interleaved LRQ shaper with rate $r_f$ for each $f \in \mathcal{F}$. Regardless of the order of the server and the shaper, a delay upper bound for the FIFO server is also a delay upper bound for the composite system.

3.2 Properties of LRQ as Stand-alone Elements

In this subsection, a number of properties of LRQ as a stand-alone element are proved. Among them, while Lemma 1 and Lemma 2 have counterparts among the properties of per-flow shapers, the other properties are unique to interleaved shaping.

Lemma 1

(Conformance) Consider an interleaved LRQ shaper with a set of input flows $\mathcal{F}$, where for every flow $f$, rate $r_f$ is applied. If at the input, every flow is LRQ($r_f$)-regulated, then the shaper introduces no delay, i.e., for every packet $p^n$, there holds $d^n = a^n$.

An implication of Lemma 1 is that at any time, there is at most one packet in the LRQ system from each flow. This information may be used for conformance checking. For instance, from each flow, at most one packet is allowed in the system, and additional non-conformant packets are dropped. This prevents a flow that is non-conformant to its constraint from delaying other flows’ packets.

The following output characterization result follows immediately from (17).

Lemma 2

(Output Characterization) Consider an interleaved LRQ shaper with a set of flows $\mathcal{F}$, where for every flow $f$, rate $r_f$ is applied. Regardless of the traffic constraint for each flow at the input, the output of the flow is LRQ($r_f$)-constrained, i.e., the gap between two consecutive output packets of flow $f$ is at least the length rate quotient of the earlier packet.

Having proved Lemma 1 and Lemma 2, we now focus on delay. Unfortunately, its worst-case analysis is notoriously challenging. In the rest of this section, we approach it step by step. First, the following result provides a sufficient and necessary condition for an LRQ shaper system to have bounded delay.

Lemma 3

(Sufficient and Necessary Condition) For an interleaved LRQ system with rates $r_f$ for its flow set $\mathcal{F}$, the delay of any packet is upper-bounded if and only if there exists a non-negative constant $\Delta$ such that, $\forall n$,

(18)

and if the condition is satisfied, $\Delta$ is also an upper bound on the delay.

Note that, in Lemma 3, the condition does not assume how each flow is regulated at the input. If the flow is LRQ($r_f$)-regulated at the input, applying this traffic condition together with $d^n = a^n$ from Lemma 1 shows that the sufficient and necessary condition is satisfied with $\Delta = 0$. This also confirms Lemma 1.

When the flow is not LRQ($r_f$)-regulated, the constant in the condition is not as easily found, and additional approaches are needed to help find delay bounds. For this, in Lemma 4, we relate the departure time with a generalized version of the virtual start time and virtual finish time functions defined in (6) and (7). Specifically, their generalized counterparts are: $\forall n \ge 1$,

$\hat{S}^n = \max\left( a^n,\; \hat{F}^{n-1} \right) \qquad (19)$
$\hat{F}^n = \hat{S}^n + \frac{l^n}{r^n} \qquad (20)$

with $\hat{F}^0 = 0$, where, for ease of expression, we use $r^n$ to denote the rate of the flow that packet $p^n$ belongs to, i.e., $r^n = r_{f(n)}$.

The difference between (19) and (6), and between (20) and (7), is that while the rate in the functions is the same for every packet in the latter, it may differ from packet to packet in the former. These generalized virtual start time and virtual finish time functions (19) and (20) are similarly defined in the generalized Guaranteed Rate server model [11].

Lemma 4

(GR Characterization) Consider an interleaved LRQ shaper with a set of input flows $\mathcal{F}$, where for every flow $f$, rate $r_f$ is applied. The departure time of any packet $p^n$ is bounded by: $\forall n \ge 1$,

$d^n \le \hat{F}^n \qquad (21)$

where $\hat{F}$ is defined through (19) and (20).

With Lemma 4, the following corollary follows immediately from the definition of the generalized GR server model, the corresponding delay bound analysis [11] and Proposition 1.

Corollary 1

The LRQ regulator is (i) a generalized GR server with guaranteed rate and error term and (ii) provides a service curve . (iii) If every flow is -constrained and , then the delay of any packet is bounded by, ,

(22)

and (iv) the backlog of the system at any time is bounded by: ,

(23)

While it is encouraging to have the delay bound (22) for interleaved LRQ shapers as a first step, the condition and the additional term in (22) make the bound conservative. We improve it in the following result.

Theorem 2

Consider an interleaved LRQ shaper with rates for its flow set . If every flow is -constrained, and , the delay of any packet is bounded by, ,

(24)

which implies the following delay bound for all packets:

4 Properties of Per-Flow LRQ

In this section, the focus is on per-flow LRQ. We first discuss the relation of per-flow LRQ with two existing concepts / models and briefly summarize its properties corresponding to those of interleaved LRQ. Then, we investigate applying per-flow LRQ in flow aggregation.

4.1 Per-flow LRQ

Unlike interleaved LRQ, whose properties were previously little investigated, many properties of per-flow LRQ can be readily obtained from existing results, due to its relationship with two existing concepts: the $g$-regulator [3] and the smoothing Leaky Bucket (sLB) [17].

First, for per-flow LRQ, since there is only one flow, $f(n) = f$ and packet $p^n$ is $p_f^n$. So (17) reduces to

$d^n = \max\left( a^n,\; d^{n-1} + \frac{l^{n-1}}{r} \right) \qquad (25)$

which, after applied iteratively, leads to

$d^n = \max_{1 \le m \le n}\left\{ a^m + \sum_{k=m}^{n-1} \frac{l^k}{r} \right\} \qquad (26)$

with $d^0 = l^0 = 0$. Equation (26) is exactly the same as how a minimal $g$-regulator with $g(x) = x/r$ is constructed (see Theorem 6.2.2, [3]). Hence, all related results for minimal $g$-regulators in [3] also apply to per-flow LRQ.

Second, there is another shaping concept equivalent to per-flow LRQ, which is smoothing Leaky Bucket (sLB) [17]:

  • A smoothing Leaky Bucket (sLB) is a shaper that consists of a bucket and a buffer. The bucket has two states, EMPTY and FULL, and is initially set to be EMPTY. When the bucket becomes EMPTY, the sLB sends out instantaneously the head of queue packet if the buffer is not empty, and at the same time places into the bucket a number of tokens equal to the size of this packet and changes the bucket state to FULL. The bucket leaks at a constant leaking rate. Whenever the bucket becomes empty, its state is set to be EMPTY.

Note that the key idea of LRQ is to “hold” the next packet until the intended time gap from the previous packet is reached. The specific mechanism of sLB can equivalently implement such holding. With this equivalence, results for sLB, e.g. in [17], also carry over to per-flow LRQ.

It is worth highlighting that an sLB shaper differs from a normal leaky bucket (LB) or token bucket (TB) even when the bucket size of LB / TB is set to be the maximum packet length [17]. The reason is that sLB ensures the spacing between two consecutive packets to be equal to the length rate quotient (LRQ), while LB / TB may output more than one packet at once, or output packets whose spacing is closer than under sLB, unless all packets have the same length.
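The difference can be seen in a small simulation of both shapers on the same trace; this is an illustrative sketch with our own naming, where the token bucket uses bucket size equal to the maximum packet length as in the comparison above.

```python
def per_flow_lrq(arrivals, lengths, rate):
    """Per-flow LRQ (equivalently, a smoothing Leaky Bucket): each packet
    is held until l_prev / rate after the previous departure."""
    deps, prev_d, prev_l = [], None, None
    for a, l in zip(arrivals, lengths):
        d = a if prev_d is None else max(a, prev_d + prev_l / rate)
        deps.append(d)
        prev_d, prev_l = d, l
    return deps

def token_bucket(arrivals, lengths, rate):
    """FIFO token bucket with bucket size = max packet length; a packet
    departs as soon as it is at the head and enough tokens are present."""
    bucket = max(lengths)
    tokens, last, prev_d = bucket, 0.0, 0.0
    deps = []
    for a, l in zip(arrivals, lengths):
        t = max(a, prev_d)                          # head-of-line time
        tokens = min(bucket, tokens + (t - last) * rate)
        if tokens < l:                              # wait for missing tokens
            t += (l - tokens) / rate
            tokens = l
        tokens -= l
        last, prev_d = t, t
        deps.append(t)
    return deps
```

After a 100-unit packet at rate 100, LRQ forces a full 1.0 s gap before the next (shorter) packet, while the token bucket lets a 50-unit packet out after only 0.5 s, illustrating why sLB output is LRQ-spaced but TB output is not.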

Below we summarize the properties of per-flow LRQ in accordance with what has been reported for interleaved LRQ. As discussed above, more results can be found following the $g$-regulator and sLB concepts, see, e.g., [3] [17].

Corollary 2

(Conformance) A per-flow LRQ shaper with rate $r$ is a minimal $g$-regulator with $g(x) = x/r$.

Since per-flow LRQ is a special case of interleaved LRQ with only one flow, all properties discussed in the previous section also hold for per-flow LRQ. As an example, we have Corollary 3 and Corollary 4, which will be used in later analysis, respectively from Lemma 2 and Lemma 4:

Corollary 3

(Output) For per-flow LRQ with rate $r$, the output has an arrival curve $\alpha(t) = l^{\max} + r t$. In addition, if the input is token-bucket constrained with rate not exceeding $r$, the output remains so constrained (for interleaved LRQ, a counterpart of this is yet to be found), which, in combination with the former, gives that the output has an arrival curve equal to the minimum of the two.

Corollary 4

(GR Characterization) For per-flow LRQ with rate $r$, the departure time of any packet $p^n$ is bounded by:

$d^n \le F^n \qquad (27)$

where $F$ is defined in (7) with reference rate $r$.

Corollary 4 implies that the per-flow LRQ shaper is a guaranteed rate server and provides a latency-rate service curve, as summarized below.

Corollary 5

A per-flow LRQ shaper with rate is (i) a GR server with the same rate and error term , and provides (ii) a latency-rate service curve .

With Corollary 5, the related results for GR and service curve models can also be applied to per-flow LRQ. Particularly, we present delay and backlog bounds for per-flow LRQ. As a highlight, while the backlog bound is the same as what would be found from existing GR analysis, e.g. Proposition 1, or from service curve analysis [20], an improved delay bound is presented in Corollary 6.

Corollary 6

(Delay and Backlog Bounds) For a per-flow LRQ shaper with rate , whose input has an arrival curve , if , then the maximum delay of any packet is upper-bounded by and the maximum backlog of the shaper at any time is bounded by
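Since Corollary 5 gives per-flow LRQ a latency-rate service curve, the textbook network-calculus bounds for a token-bucket constrained input apply: delay <= T + sigma/R and backlog <= sigma + rho*T when rho <= R. The sketch below evaluates these generic bounds with hypothetical numbers; Corollary 6's delay bound improves on the generic delay bound.

```python
def latency_rate_bounds(sigma, rho, R, T):
    """Standard network-calculus bounds for a (sigma, rho) token-bucket
    constrained input served under a latency-rate curve beta_{R,T}.
    Requires rho <= R for stability."""
    assert rho <= R, "unstable: arrival rate exceeds service rate"
    delay_bound = T + sigma / R        # horizontal deviation of curves
    backlog_bound = sigma + rho * T    # vertical deviation of curves
    return delay_bound, backlog_bound
```

For example, a burst of 1000 bits, a sustained rate of 100 bit/s, a service rate of 500 bit/s, and a latency of 2 s yield a 4 s delay bound and a 1200-bit backlog bound.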

4.2 Aggregation based on per-flow LRQ

In interleaved LRQ, flows are first treated in FIFO, i.e. their packets are ordered in the FIFO queue according to their arrival times, and then per-flow LRQ shaping is conducted in an interleaved manner, preserving the packet order. For this “FIFO-aggregation (interleaved) per-flow shaping” setup, as illustrated in Figure 1, the “shaping-for-free” property of interleaved LRQ has been proved in the previous section.

Figure 2: LRQ-controlled aggregation

We now consider the setup “per-flow shaping FIFO-aggregation”, where the order of shaping and aggregation is changed. More specifically, each flow is first shaped with a per-flow shaper and the outputs from these shapers are then FIFO-aggregated based on packets’ departure times from the shapers. Figure 2 illustrates the setup, in contrast to the setup “FIFO-aggregation (interleaved) per-flow shaping” shown in Figure 1.

For a system of per-flow LRQ shapers FIFO server shown in Figure 2, we have the following delay and backlog bounds.
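The “per-flow shaping then FIFO-aggregation” pipeline of Figure 2 can be sketched as a simulation. This is a sketch under simplifying assumptions (LRQ read as a hold-based recursion, a store-and-forward link of constant capacity); all names and numbers are hypothetical.

```python
def shape_then_fifo(flows, shaper_rates, C):
    """flows: list of (arrivals, lengths) per flow. Each flow is first
    LRQ-shaped at its own rate; the shaper outputs are then merged in
    order of shaper departure time and served by a FIFO link of rate C."""
    merged = []
    for (arrs, lens), r in zip(flows, shaper_rates):
        dep = 0.0
        for a, l in zip(arrs, lens):
            d = max(a, dep)            # LRQ hold: wait out the l_prev / r gap
            merged.append((d, l))
            dep = d + l / r
    merged.sort()                      # FIFO order = shaper departure order
    out, t = [], 0.0
    for a, l in merged:
        t = max(a, t) + l / C          # store-and-forward transmission at rate C
        out.append(t)
    return out
```

Tracing the maximum of (FIFO departure - original arrival) over such runs gives empirical delays that can be compared against the bound in Theorem 3.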

Theorem 3

(Delay and Backlog Bounds) Consider a set of flows passing through a system, where each flow is shaped by a per-flow shaper, and the outputs from the shapers join a FIFO queue that is served by a server. Every flow is -constrained. If for and , then for every packet of , its delay is bounded by

(28)

and the total backlog of all queues in the system is bounded by

(29)

As a comparison, for the same set of flows directly served by the FIFO server, the following delay and backlog bounds are from Proposition 1:

(30)

for delay, and the total backlog of all queues in the system is bounded by

(31)

For backlog, it is easily seen that the backlog bound (29) is higher, but with practical setting and , its difference from (31) is only .

For delay, the difference of (28) from (30) is , which highly depends on : (28) may be smaller than (30) and vice versa. Note that the delay bound (30) applies uniformly to all flows, which may be preferred when all flows have the same delay guarantee requirement. However, when the requirements are diverse, (30) implies that configuration and control have to check it against the most stringent requirement. In contrast, (28) offers a tuning knob, , which may be set differently according to each flow’s own, possibly diverse, delay requirement.

At first glance, the “per-flow shaping FIFO-aggregation” setup in Figure 2 clearly requires more LRQ shapers than the “FIFO-aggregation (interleaved) per-flow shaping” setup in Figure 1. However, when both are applied to deliver bounded e2e latency in a network, as introduced in the next section, the total number of needed shapers may be the same.

5 Achieving Bounded End-to-End Latency

5.1 Per-flow Scheduling or Aggregate Scheduling

A central objective of TSN and DetNet is to deliver bounded end-to-end latency to flows [12, 7]. Two Internet quality-of-service architectures with similar or related objectives can be found: Integrated Services (IntServ) [2] and Differentiated Services (DiffServ) [1]. Their approaches to the delivery of e2e quality of service are fundamentally different. While IntServ mainly relies on per-flow scheduling to ensure isolation among flows and reserve resources along the e2e path, DiffServ only needs class-based aggregate scheduling at each node to provide service differentiation among flows.

In per-flow scheduling, each flow has a dedicated queue through which it shares the service of a server, e.g., an output link, with other flows. An advantage of this per-flow-queue treatment is that it can effectively provide isolation among flows and subsequently deliver bounded e2e latency [2]. In contrast, in aggregate scheduling, flows of the same class typically share one queue, which shares the service of a server with queues of other classes. In delivering bounded e2e latency, per-flow scheduling has an advantage over aggregate scheduling. This is because, under aggregate scheduling, the burstiness of a flow can be significantly affected by other flows sharing the same queue in the aggregate, and this influence can cascade across nodes. As a consequence, with FIFO aggregation, e2e delay bounds for general-topology networks are only available under sometimes very restrictive utilization levels [4, 14].

There is a vast literature related to IntServ and DiffServ. One example is the network calculus theory, initially developed for performance guarantee analysis of IntServ and DiffServ networks [5, 3, 20, 18, 22], which has to date been heavily applied to such analysis of TSN and DetNet networks [28].

Compared with IntServ and DiffServ, TSN and DetNet also recommend class-based aggregate scheduling, but with interleaved shaping used to re-shape flows in the class aggregates. Surprisingly, interleaved shaping has the same shaping-for-free property as per-flow shaping. This has enabled an approach, which properly allocates queues at each node and reshapes flows to their initial traffic constraints, for the delivery of bounded e2e latency in TSN and DetNet [24]. This approach is universal: its effectiveness does not depend on network topology.

In this paper, a set of other properties of LRQ have been proved. This raises the following question: can they be used as a basis to design new approaches for delivering bounded e2e latency? To this end, two example approaches are introduced in this section. Both keep the same queue allocation at the nodes.

In the remainder of this section, the node structure is first introduced in Section 5.2. In Section 5.3, an improved e2e delay bound for the universal approach, based on the analysis in this paper, is presented. Then in Sections 5.4 and 5.5, two new approaches are introduced together with their e2e delay bounds. Finally, a discussion on the three approaches and their bounds is given in Section 5.6.

5.2 Node Structure

The node structure as shown in Figure 3 is adopted, which was initially proposed in [24] for TSN asynchronous traffic and has also been adopted for DetNet [6] to deliver bounded e2e latency. This structure was also considered earlier with the same aim but for DiffServ [16].

Figure 3: Node structure and timing model

Specifically, for an output link at the node, there are a number of queues, where each queue corresponds to one service class and is shared by the traffic of that class in FIFO manner. A scheduling discipline is employed to schedule packets from these queues onto the output link. As input to each queue, there is a set of shapers, where each shaper shapes the traffic of the same class arriving from one input link and destined to the targeted output link [24, 16].

Figure 3 also illustrates six conceptual stages that a packet goes through at a node. Upon arrival of a packet at an input link (1), it is processed and forwarded (2), via some internal mechanism or switch fabric (3), to the output queuing part of the corresponding output link, first going through a shaper (4) and then the class queue (5), and finally being served and transmitted on the output link (6). Between two consecutive stages, e.g., , the packet experiences some delay. The delay between the first and the last stages, i.e., , is the delay of the packet at the node. In addition, between two adjacent nodes on the path of the packet, there is propagation delay from the last stage in the previous node to the first stage in the next node, i.e., .

Without loss of generality, we assume that the service provided to each traffic class queue can be characterized using the Guaranteed Rate (GR) server model. In [11, 15], the GR rate and error terms for a large number of scheduling disciplines can be found.

5.2.1 Additional notation

In this section, additional notation will be used in e2e delay analysis. We use to denote the node of the -th output link on the path of the considered flow , and an input link on node . Let denote the set of flows at the node , which share the same output link of the flow, and denote the set of flows in which are from input link of the node. By definition, , where denotes the set of input links of the node . Let denote the number of input links at node , and the total number of nodes on the path of the considered flow .

Let and respectively denote the guaranteed rate and error term of the GR characterization of the class-based aggregate scheduler for at node . In Section 5.4 and Section 5.5 for Approach 2 and Approach 3, when a per-flow LRQ shaper is used for link at node , we additionally use to represent the rate of the corresponding per-flow LRQ shaper.

Two traffic constraints are considered, namely -regulated and -regulated. Accordingly, the delay bounds presented in this section are for two cases. In one case, all flows are initially -regulated, while in the other case, they are initially -regulated. To simplify representation, the results for the two cases are combined: Specifically, we use the same for the -regulated case and for the LRQ-regulated case.

Corresponding to the six stages, we use to denote the time when reaches the -th stage, , at node . Thus, the delay of the packet at the node between two stages and is . Let denote an upper bound on the delay of any packet in the considered flow from at node to at node . By this definition, an upper bound on the end-to-end delay of the flow can be written as:

(32)

In the investigation in Section 5.3 to Section 5.5, we focus on the output queuing part, i.e., in Figure 3, simply assuming constant delays on for all packets and ignoring their variations. Later in Section 5.6, the impact of delay variations on on the obtained delay bounds as well as on the backlogs will be discussed.
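Since (32) expresses the e2e bound as a sum of per-node output-queuing bounds plus the delays assumed constant, it can be evaluated mechanically once the per-node terms are known. A minimal sketch, with all inputs hypothetical:

```python
def e2e_delay_bound(nodal_bounds, const_stage_delays, prop_delays):
    """End-to-end bound in the spirit of (32): the sum of per-node
    queuing/shaping bounds (stages 4 -> 6), the per-node delays assumed
    constant (stages 1 -> 4), and per-link propagation delays.
    All three lists hold hypothetical numbers supplied by the caller."""
    return sum(nodal_bounds) + sum(const_stage_delays) + sum(prop_delays)
```

Each of the three approaches in Sections 5.3 to 5.5 differs only in how the per-node terms are derived; the summation structure stays the same.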

5.3 Approach 1: Reshaping to the Initial Traffic Specification

This is the approach that was initially proposed in [24] for TSN asynchronous traffic. In this approach, the shaper at each node, cf. Figure 3, is an interleaved shaper, which uses the initial traffic specification of each flow, i.e. either or -regulated, to reshape the flow’s traffic at the node.

With Theorem 1 and Lemma 1, we have for all , and since the flow is already assumed to comply with the traffic constraint when entering the network. With these, (32) can be re-written as:

(33)

which indicates that if a nodal delay bound for is found, an e2e delay bound can be readily obtained, since the second term on the right hand side is assumed to be bounded.

Thanks to interleaved shaping, the traffic of every flow on is shaped to its initial traffic constraint. Then a nodal delay bound for can be immediately obtained from Proposition 1, which is,

for , if . Applying the nodal delay bound to (33) gives an e2e delay bound summarized in Corollary 7.

Corollary 7

The maximum end-to-end latency of any packet of the considered flow , denoted as , is upper-bounded by:

if .

In Section 2.3.3, when introducing the GR model, we discussed that the obtained nodal delay bound is generally better than bounds from service curve analysis. A more concrete example is given below, focusing on strict priority, which has commonly been assumed as the algorithm for aggregate scheduling in e2e delay analysis for TSN and DetNet [28].

5.3.1 Strict priority

The following result introduces how a strict priority server can be characterized by the GR model.

Lemma 5

Consider a flow , which may be the aggregate flow of a traffic class, that shares with other flows a work-conserving server of constant capacity. The server adopts non-preemptive strict priority when serving packets. The capacity of the server is . Every flow is -constrained. Then, the service provided by the server to flow can be characterized by with

and if , the delay of any packet of flow is bounded by , i.e.,

(34)

where and , with / denoting the set of flows having higher / lower priority than flow , denoting the maximum packet length of flows in , and denoting the minimum packet length of flow .
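For intuition, the structure of Lemma 5 matches the classic network-calculus treatment of non-preemptive strict priority: the class of interest sees a rate-latency service whose rate is the capacity minus the higher-priority rates and whose latency covers the higher-priority burst plus one maximum-size lower-priority packet (the blocking term). The sketch below computes that generic characterization with illustrative numbers; the paper's GR-based bound (34) is tighter, so treat this only as a looser reference point.

```python
def sp_class_bounds(C, sigma_hp, rho_hp, l_max_lp, sigma_i, rho_i):
    """Generic rate-latency view of non-preemptive strict priority:
    R = C - rho_hp,  T = (sigma_hp + l_max_lp) / R,
    where (sigma_hp, rho_hp) aggregate all higher-priority traffic and
    l_max_lp is the largest lower-priority packet. Returns (R, T, delay),
    with delay = T + sigma_i / R for a (sigma_i, rho_i) class, valid
    when rho_hp + rho_i <= C."""
    R = C - rho_hp
    assert R > 0 and rho_i <= R, "unstable configuration"
    T = (sigma_hp + l_max_lp) / R      # blocking + higher-priority backlog
    return R, T, T + sigma_i / R       # token-bucket-through-rate-latency delay
```

For instance, capacity 10, higher-priority burst 4 and rate 2, a 4-unit lower-priority blocking packet, and a class burst of 8 at rate 2 give a residual rate of 8, a latency of 1, and a delay bound of 2 (in consistent units).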

As a comparison, in the original LRQ work, using a timing analysis method, the following bound has been found [24]:

(35)

In addition, the following delay bound is obtained by using the (min-plus) service curve model [28] (in [23], there is an effort to improve the bound; however, the improvement does not close the gap to (35), since the service curve characterization of the server remains the same, which cannot be avoided under the service curve model):

(36)

Clearly, the timing-analysis based bound (35) is better than the service curve analysis based bound (36). The difference is per node. In addition, the GR analysis based bound (34) further improves over (35) by . For e2e delay, these differences multiply with the number of nodes, so the e2e delay bound shown in Corollary 7 with nodal bound (34) may be preferred.

5.4 Approach 2: Reshaping to LRQ-Regularity

In this and the next subsection, we investigate new approaches that exploit the derived properties to achieve bounded e2e latency. Note that Approach 1 is universal, i.e., independent of the network topology. Here, we additionally take network topology information into account in the approach design. A simple setup is considered, where the network has a tree topology, e.g., an aggregation network, and the focus is on flows directed from the leaves to the root. In this setup, traffic from child nodes is aggregated at their parent node, which further forwards the aggregated traffic upwards to its own parent node. There is no traffic segregation.

For this setup, the total number of nodes on the path of a flow is simply the number of node generations from the entrance node of the flow to the root node, and the number of links at a node , denoted as , is the number of child nodes of .

Approach 2 is similar to Approach 1, but differs in three respects. First, while the shapers at all other nodes are still interleaved shapers, those at the ingress nodes are per-flow shapers. Second, all flows from the same ingress input link and belonging to the same traffic class are treated as one (aggregate) flow, and interleaved shaping is applied to such (aggregate) flows in the rest of the network. Third, all shapers are based on , even when the initial traffic constraints are in the form of .

Essentially, an aggregate flow represents the aggregate of flows sharing the same leaf-to-root path. As a remark, if each flow is already treated as the e2e path-sharing aggregate flow in Approach 1, Approach 2 is the same as its LRQ-regulated version; the only difference is that when the traffic constraint is changed to , Approach 2 still uses while Approach 1 uses interleaved token bucket shapers, e.g., TBE shapers [24].

Without loss of generality, for a considered flow at the first node, we refer to its input link as the first link of the node, and denote by the rate of the corresponding per-flow LRQ shaper. To simplify the expression, we also denote by the constraint rate of the aggregate, which is used by the interleaved LRQ shapers. Note that for the aggregate with per-flow shaping rate , .

With Lemma 2, we know the output from the first shaper is -regulated. Thus, using interleaved shapers with the same rates at later nodes will not affect the corresponding delay bounds. Specifically, based on Theorem 1 and Lemma 1 for -regulated traffic, we have for all , which is the same as for Approach 1. However, we can no longer ignore . With these and denoting , (32) can be re-written as:

(37)

Let denote the set of flows in the aggregate from the -th link at node , and denote the set of such aggregate flows at node .

With the delay bounds introduced in Proposition 1 for and in Theorem 2 for , an upper bound on the e2e delay for Approach 2 can be readily obtained, which is summarized in Corollary 8.

Corollary 8

The maximum delay of any packet of the considered flow under Approach 2, denoted as , is bounded by:

if and .

5.5 Approach 3: Per-Input-Link Shaping with Per-Flow LRQ

In Approach 2, interleaved shaping is still performed inside the network. In Approach 3, we relax this requirement such that only per-flow LRQ shapers are used, in both the ingress nodes and other nodes. Other than this difference, the same setup described for Approach 2 is adopted.

Specifically, every shaper, cf. Figure 3, treats the flows from the corresponding input link at a node as a FIFO aggregate and shapes the aggregate using per-flow LRQ with rate