PI^2 Parameters

06/02/2021
by   Bob Briscoe, et al.

This report gives the reasoning for the parameter settings of the reference Linux implementation of the PI^2 AQM, focusing initially on the target queue delay.


1 Introduction

This report explains the reasoning behind the parameter settings of the reference Linux implementation of the PI AQM (see https://github.com/L4STeam/sch_dualpi2_upstream for the code). These settings are documented as a pseudocode example in Appendix A of the IETF specification of the Coupled DualQ AQM [DSBEW21]. In both cases, the PI AQM is used for the Classic queue within the dual-queue structure called DualPI2, but the parameter settings for PI discussed here apply irrespective of whether a PI AQM stands alone or sits within a dual-queue structure. The discussion of the target parameter also applies to a PIE AQM [PPP13].

Similar reasoning for the parameter settings was given in the technical report produced in 2015 [dSBTB15] to support standardization of the Coupled DualQ AQM. The present report comes to the same conclusion, but spells out all the details that were glossed over at that time.

This report focuses on the PI2 parameter settings in the following lines of Figure 2 in [DSBEW21]:

 7:  % PI2 AQM parameters
 8:  RTT_max = 100 ms                    % Worst case RTT expected
 9:  RTT_typ = 34 ms                     % Typical RTT

11:  % PI2 constants derived from above PI2 parameters
12:  target = 0.22 * 2 * RTT_typ         % qDelay target
13:  Tupdate = min(RTT_typ, RTT_max/3)   % sampling interval
14:  alpha = 0.1 * Tupdate / RTT_max^2   % integral gain [Hz]
15:  beta = 0.3 / RTT_max                % proportional gain [Hz]
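
To make the derived constants concrete, the following short calculation (a sketch in Python using plain floats, not the kernel's fixed-point arithmetic) evaluates lines 12-15 for the default RTT_typ = 34 ms and RTT_max = 100 ms:

    # Evaluate the PI2 constants from lines 12-15 above (values in seconds / Hz).
    # Illustrative only; the reference implementation uses integer arithmetic.
    RTT_typ = 0.034   # typical RTT [s]
    RTT_max = 0.100   # worst-case RTT expected [s]

    target  = 0.22 * 2 * RTT_typ              # ~0.015 s  (15 ms qDelay target)
    Tupdate = min(RTT_typ, RTT_max / 3)       # ~0.0333 s (sampling interval)
    alpha   = 0.1 * Tupdate / RTT_max ** 2    # ~0.333 Hz (integral gain)
    beta    = 0.3 / RTT_max                   # 3.0 Hz    (proportional gain)

    print(f"target  = {target*1000:.1f} ms")
    print(f"Tupdate = {Tupdate*1000:.1f} ms")
    print(f"alpha   = {alpha:.3f} Hz")
    print(f"beta    = {beta:.3f} Hz")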

It may be noticed that lines 9 & 12 differ from those in draft-ietf-tsvwg-aqm-dualq-coupled-15, but it is planned to correct that in future versions of the draft. The resulting value of target is still 15 ms.

The main body of this report focuses on line 12 above, which in turn depends on line 9:

9:    RTT_typ = 34 ms
12:   target = 0.22 * 2 * RTT_typ

The default for target has to be chosen as a compromise to minimize queue delay without causing under-utilization for paths of different RTTs.

2 Terminology

Figure 1: Definition of terms

The schematic plots of one cycle of queue delay against time for two different congestion controllers (Reno and Cubic) in Figure 1 define some terminology for this report. We will use target to denote the target queue delay of a particular AQM and A to denote the amplitude of the cycles (in units of time). In steady state the congestion controller and AQM together keep the average queue delay at target, so the fraction, f, of the amplitude that sits below the average depends solely on the geometry of the sawtooth curve.

The total RTT, R, consists of the constant base delay of the path, R_b, and the queue delay, q; that is, R = R_b + q.

The min RTT is related to the max RTT by the multiplicative decrease factor, β, of the congestion controller: R_min = β · R_max.

The gain parameters for the PI AQM are denoted α_PI and β_PI to distinguish them from the additive increase and multiplicative decrease factors, α and β, used in congestion controllers.

3 Scaling of Queue Variation

The target is intended to be the operating point that the queue cycles around under stable conditions, so we consider only long-running flows and fixed-capacity links. Within this stable environment, we consider a single long-running flow as the worst case for queue variability (and a fairly common case in access link bottlenecks).

The schematics in Figure 2 show how different Linux congestion controls vary the queue delay of a single flow around the target and how the variation scales with RTT and link capacity. The scales of the plots are all the same, but actual numerical values of queue delay and time are irrelevant for this visualization.

Figure 2: Scaling of queue delay variability with RTT and link capacity

The following two subsections consider how queue variation with a single flow scales with RTT and with link capacity. Then subsection 3.3 discusses the geometry and prevalence of different sawtooth shapes.

3.1 Scaling of Queue Variation with RTT

As long as the queue doesn't completely drain, the delivered packet rate, r, remains constant, because the link capacity is constant. Then, given that the window W = r · R, the round trip time (RTT, R) varies in direct proportion to the window. The sender induces the RTT to vary by building appropriate variations in the queue.

As a first-order approximation, we assume that queue delay tracks the window instantly. This is reasonable while the window varies slowly and less accurate during rapid changes, but it becomes sufficiently correct within a round trip or two (whether Proportional Rate Reduction (PRR) is used or the window is allowed to stall while the queue drains). So the approximation is sufficient as long as the duration of each cycle is significantly greater than one round trip.

Consider two flows with the same congestion controller, but where the second flow has twice the total RTT of the first. By definition, a multiplicative decrease of the second window will be twice that of the first. Therefore the decrease in the queue will also be twice that of the first (because the base delay element remains constant). Put algebraically, the amplitude of the RTT (and hence queue delay) variation is

    A = R_max − R_min = (1 − β) R_max,

so doubling the RTT doubles the amplitude: A_2 / A_1 = R_max,2 / R_max,1 = 2.

This is why, starting at the top left and working down the schematics for each congestion control in Figure 2, it can be seen that the amplitude of the queue variation grows linearly with RTT (at least, it does until the troughs of the sawteeth drain the queue completely and some underutilization starts to set in; this can be seen in the doubled-RTT plots for a) Reno and c) Cubic mode, where, for visualization, light grey traces extrapolate where the plots would be if the queue could go negative). This linear scaling of queue variability with RTT is just as true for either mode of Cubic as it is for Reno.

3.2 Scaling of Queue Variation with Link Capacity

More link capacity allows either more flows or more throughput per flow. But in the edge links giving access to the Internet, which tend to be the bottleneck links, the number of simultaneous flows is still low, and lone flows remain common.

If link capacity doubles, the delivered packet rate (and the average window) of a single flow doubles too. Nonetheless, the AQM keeps the average RTT, R_b + target, at the same operating point. For a particular congestion control, the max and min RTT are related to the average RTT by constant factors, as confirmed algebraically below, so they also remain unchanged.

Therefore, the amplitude, A, and hence the whole queue-delay sawtooth also remain unchanged as link capacity scales (shown in the right-hand column of Figure 2).
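
To make this scaling concrete, the following minimal sketch (in Python, with purely illustrative numbers not taken from the report) evaluates the amplitude relation A = (1 − β) R_max via the peak window, doubling first the RTT and then the link rate:

    # Queue-delay amplitude A = (1 - beta) * R_max, expressed via the peak
    # window W_max = rate * R_max.  Numbers are purely illustrative.
    beta = 0.7                      # CReno-style multiplicative decrease factor

    def amplitude(rate_pps, r_max):
        w_max = rate_pps * r_max              # peak window [packets]
        return (1 - beta) * w_max / rate_pps  # amplitude in seconds

    base = amplitude(10_000, 0.050)                   # 10k pkt/s, 50 ms max RTT
    print(round(amplitude(10_000, 0.100) / base, 3))  # doubled RTT  -> 2.0 (amplitude doubles)
    print(round(amplitude(20_000, 0.050) / base, 3))  # doubled rate -> 1.0 (amplitude unchanged)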

Incidentally, the scaling of the cycle duration (along the horizontal time axis in Figure 2) is not directly relevant to queue variation, but Appendix C briefly explains why it is relevant to utilization.

3.3 Sawtooth Geometry

Figure 3: Scatter-plot per country of average user to CDN RTT under load against average fixed access bandwidth. RTT is taken as if it is under load over the PI AQM under study, so RTT = base RTT plus 15 ms. Only the 43 countries with the most Internet users are plotted, representing 90% of Internet users. The top ten are labelled as well as those at the extremes. The curve overlaid on the plot is where the Cubic congestion control in Linux switches over from Reno mode to pure Cubic mode

The fraction, f, of the sawtooth amplitude that lies below the average is important when determining the target queue delay. For a Reno-style linear sawtooth, obviously f = 1/2. And Appendix A proves that f = 3/4 for a Cubic sawtooth, whatever the value of the multiplicative decrease factor, β.

The “Great Internet TCP Congestion Control Census” [MSJ19] conducted by Mishra et al. in Jul–Oct 2019 found that Cubic was the most widely used congestion control, deployed by nearly 31% of the Alexa top 20k web sites, but BBR was approaching 18% and already had a larger share of the Alexa top 250, as well as contributing 40% by downstream traffic share (the census did not investigate the congestion controls used by QUIC). Of the 51% of the Alexa top 20k sites that were not using either Cubic or BBR, 19% were split between eight other known controllers, the greatest shares being for YeAH and for CTCP or Illinois, at under 6% each. The remaining 32% were unidentifiable, including sites that were unresponsive or did not serve anything large enough to be testable. Within that 32%, nearly 17% of the total were using an unknown congestion controller, and further investigation found that nearly 6% of the total were using an undocumented Akamai controller.

BBRv2 supports L4S when it detects ECN marking, so it is unlikely to use the Classic queue. This leaves 67% of sites that use some form of Classic congestion control, of which 46% use Cubic and the remainder is split across a dozen or so other algorithms, many of which, like Cubic, attempt to be friendly to Reno at low BDP.

Figure 3 illustrates how the Cubic congestion control would invariably run in its Reno mode for Content Distribution Network (CDN) traffic over a PI AQM. The figure visualizes the average CDN RTT under load against the average fixed bandwidth per household for the top 43 countries ranked by number of Internet users (see Appendix B for the detailed data and sources). The underlying study did not measure fixed and mobile access separately; RIPE Atlas probes are generally connected to fixed access links, although some are connected via Ethernet to mobile broadband. The base RTT from Appendix B is inflated by 15 ms to model the target queue delay of the PI AQM. It can be seen that nearly all the points are below the upper limit of Cubic's Reno-friendly mode for a single flow (the two unlabelled points furthest from the curve are Uzbekistan and Iraq, with Iran, Venezuela, Colombia and Thailand just above it). However, there is a question mark over the CDN RTT for China (see Appendix B).

As link rates continue to scale, the points are expected to shift inexorably to the right. However, Cubic is likely to remain largely in Reno mode for some considerable time to come because, as CDN deployment continues, the points are also expected to shift downwards, with base RTTs reducing below 20 ms and then below 10 ms, as they have done in the more mature deployments in Europe, North America and the Pacific rim. Also remember that we have chosen to examine the worst case of a single flow; whenever there are more simultaneous flows, the points would shift back to the left, into the Reno region.

The switchover curve between Reno and Cubic modes assumes the Linux implementation of Cubic with a packet size of 1500 B; aggressiveness constant C = 0.6; multiplicative decrease factor β = 0.7; and additive increase factor α = 1 (as opposed to α = 3(1 − β)/(1 + β) ≈ 0.53, which is recommended in RFC 8312 for friendliness to Reno). The packet rate converges to Equation 1 in Cubic's Reno mode, or to Equation 2 in pure Cubic mode, as given below, where p is the loss (or ECN-marking) probability and R is the RTT.

(1)    r_reno = sqrt( α (1 + β) / (2 (1 − β) p) ) / R

(2)    r_cubic = ( C (3 + β) / (4 (1 − β)) )^(1/4) / ( R^(1/4) p^(3/4) )

At the same loss probability, p, the packet rate r_reno equals r_cubic when the switchover RTT, R*, is

(3)    R* = ( ( α (1 + β) / (2 (1 − β)) )^2 · 4 (1 − β) / ( C (3 + β) ) )^(1/3) · p^(1/3)

In summary, the prevalent sawtooth geometry of Classic traffic is likely to be dominated for some time by the Reno mode of Cubic, with β = 0.7 and f = 1/2.
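
As a cross-check on Equations 1–3, the following sketch (under the parameter assumptions stated above, i.e. α = 1, β = 0.7 and C = 0.6; not the script used to produce Figure 3) evaluates the link rate at which a lone Cubic flow of a given RTT would leave Reno mode:

    # Sketch of the Reno/Cubic switchover curve implied by equations (1)-(3),
    # using the Cubic parameters assumed in the text above and 1500 B packets.
    from math import sqrt

    alpha, beta, C = 1.0, 0.7, 0.6
    PKT_BITS = 1500 * 8

    A2 = alpha * (1 + beta) / (2 * (1 - beta))   # term under the square root in (1)
    B4 = C * (3 + beta) / (4 * (1 - beta))       # term under the fourth root in (2)

    def switchover_bandwidth(rtt_s):
        """Link rate (Mb/s) at which a single Cubic flow with this RTT leaves Reno mode."""
        p = rtt_s ** 3 * B4 / A2 ** 2            # invert eq. (3) for the loss probability
        r = sqrt(A2 / p) / rtt_s                 # eq. (1): packet rate at that point
        return r * PKT_BITS / 1e6

    for rtt_ms in (20, 30, 50, 100):
        print(f"RTT {rtt_ms} ms: switchover ≈ {switchover_bandwidth(rtt_ms / 1000):.0f} Mb/s")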

4 Typical Base RTT

The globally typical RTT for CDN traffic is calculated in Appendix B. The average RTT for each country is weighted by the Internet user population in that country, as collated on Wikipedia [Wik20]. The countries are ranked in order of user population until 90% of the total Internet users in the world is covered. The CDN RTT per country is based on measurements by Beganović using RIPE Atlas probes deployed by volunteers in what is claimed to be the largest Internet measurement infrastructure in the world [Beg19].

The resulting weighted average RTT to CDNs is 34 ms. However, there is a question mark over the latency figures, given that the measurements were all taken to 7 CDNs with global coverage, which might not be representative of the CDN market in certain countries, particularly China (see Appendix B for details).

As a sanity check, 34 ms compares reasonably well with the global averages given on Ookla’s Speedtest Global Index page:

  • 20 ms fixed and 37 ms mobile (Apr 2021 data);

  • 24 ms fixed and 42 ms mobile (Apr 2020 data).

Ookla’s data is collected from self-selecting users who use speedtest’s CDN-based servers [Ook21]. The page gives a single global figure without details of the method used.

5 Default target

When selecting a global default for target, the aim is to ensure that the AQM keeps queue delay reasonably low while not compromising utilization for the majority of users. Ideally a latency figure for say the 75th or 90th percentile of users would be used to derive target, but that data is not available globally.

Therefore, a ’safety factor’ is applied to the average RTT between users and CDNs. It has to allow for the statistical distribution of RTTs to CDNs, particularly for users in rural areas [KKFR15], who will be further from the nearest CDN and who are also likely to have the least bandwidth and therefore be least willing to see it eaten by under-utilization. The safety factor also has to allow for flows between clients and servers other than those in CDNs. As an interim measure, we apply a safety factor of 2.

Next we draw together all the strands of the analysis of sawtooth scaling and geometry in section 3, in order to derive a default target. We want the minimum of the saw-teeth to sit just at the base RTT, that is R_min = R_b, so that the queue only just avoids draining completely. This minimum is related to the target queue delay as follows:

    R_b + target = R_min + f · A = R_min ( 1 + f (1 − β) / β ),

because the amplitude A = R_max − R_min = R_min (1 − β) / β. Therefore:

(4)    target = R_b · f (1 − β) / β

We call f (1 − β) / β the geometry factor. The geometry factors of a selection of congestion controls (CCs) are tabulated below (Cubic in Reno mode is abbreviated to CReno). The geometry parameters are taken from the current Linux implementation, but they are also as recommended in the RFCs.

    CC      β      f      geometry factor f(1−β)/β
    Reno    0.5    1/2    0.50
    CReno   0.7    1/2    0.21
    Cubic   0.7    3/4    0.32

Taking account of the mix of congestion controls discussed in subsection 3.3, but without modelling all the minor players, we use a weighted average of about 90% CReno and 10% Cubic, which gives a geometry factor of about 0.22. Thus, for PI we suggest setting the default to:

    target = 0.22 * 2 * RTT_typ = 0.22 * 2 * 34 ms ≈ 15 ms
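
The arithmetic above can be reproduced in a few lines; the following sketch (illustrative only) recomputes the geometry factors, the 90%/10% mix and the resulting default target:

    # Recompute the geometry factor g = f*(1-beta)/beta per congestion control,
    # the 90%/10% CReno/Cubic mix, and the resulting default target (eq. 4 with
    # the safety factor of 2 applied to RTT_typ = 34 ms).
    def geometry_factor(f, beta):
        return f * (1 - beta) / beta

    g_reno  = geometry_factor(1/2, 0.5)   # 0.50
    g_creno = geometry_factor(1/2, 0.7)   # ~0.21
    g_cubic = geometry_factor(3/4, 0.7)   # ~0.32

    g_mix  = 0.9 * g_creno + 0.1 * g_cubic   # ~0.22
    target = g_mix * 2 * 0.034               # safety factor 2, RTT_typ = 34 ms

    print(f"g_creno={g_creno:.3f} g_cubic={g_cubic:.3f} g_mix={g_mix:.3f}")
    print(f"target ≈ {target*1000:.1f} ms")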

Over time, as CDN deployment continues, the typical base RTT will continue to reduce, so the default target could be reduced in future. That in turn will reduce the total RTT further, with the knock-on effect of keeping more Cubic flows in Reno mode, thus reinforcing the applicability of a lower target for AQMs.

Other implementations intended for particular link technologies might use a different default today. For instance, the Low Latency DOCSIS specification [DOC19] uses a target of 10 ms, which makes sense because cable technology is less likely to extend to rural areas, so the distribution around the average RTT is likely to be considerably tighter. By a similar argument, the default target for mobile networks might need to be greater than 15 ms, depending on how well 5G meets its aspirations to reduce RTT.

Of course, operators are free not to use the default target for out-of-the-ordinary environments. For instance, they could configure a higher target for satellite links and remote rural locations; or a lower target for highly concentrated urban deployments.

References

  • [Beg19] Emir Beganović. Analysing Global CDN Performance. Blog, RIPE Labs, August 2019. Online: https://labs.ripe.net/author/emirb/analysing-global-cdn-performance/.
  • [DOC19] Data-Over-Cable Service Interface Specifications DOCSIS® 3.1; MAC and Upper Layer Protocols Interface Specification. Specification CM-SP-MULPIv3.1-I17-190121, CableLabs, January 2019.
  • [DSBEW21] Koen De Schepper, Bob Briscoe (Ed.), and Greg White. DualQ Coupled AQMs for Low Latency, Low Loss and Scalable Throughput (L4S). Internet Draft draft-ietf-tsvwg-aqm-dualq-coupled-15, Internet Engineering Task Force, May 2021. (Work in Progress).
  • [dSBTB15] Koen de Schepper, Olga Bondarenko, Ing-Jyh Tsang, and Bob Briscoe. ‘Data Center to the Home’: Ultra-Low Latency for All. Technical report, RITE Project, June 2015. http://riteproject.eu/publications/.
  • [Jac88] Van Jacobson. Congestion Avoidance and Control. Proc. ACM SIGCOMM’88 Symposium, Computer Communication Review, 18(4):314–329, August 1988.
  • [JK88] Van Jacobson and Michael J. Karels. Congestion Avoidance and Control. Technical report, Lawrence Berkeley Labs, November 1988. (A slightly modified version of the original published at SIGCOMM in Aug’88 [Jac88].)
  • [KKFR15] Chamil Kulatunga, Nicolas Kuhn, Gorry Fairhurst, and David Ros. Tackling Bufferbloat in capacity-limited networks. In 2015 European Conference on Networks and Communications (EuCNC), pages 381–385, 2015.
  • [MSJ19] Ayush Mishra, Xiangpeng Sun, Atishya Jain, Sameer Pande, Raj Joshi, and Ben Leong. The Great Internet TCP Congestion Control Census. Proc. ACM on Measurement and Analysis of Computing Systems, 3(3), December 2019.
  • [Ook21] Ookla. Speedtest Global Index. http://www.speedtest.net/global-index, April 2021.
  • [PPP13] Rong Pan, Preethi Natarajan, Chiara Piglione, Mythili Prabhu, Vijay Subramanian, Fred Baker, and Bill Ver Steeg. PIE: A Lightweight Control Scheme To Address the Bufferbloat Problem. In High Performance Switching and Routing (HPSR’13). IEEE, 2013.
  • [RXH18] I. Rhee, L. Xu, S. Ha, A. Zimmermann, L. Eggert, and R. Scheffenegger. CUBIC for Fast Long-Distance Networks. Request for Comments 8312, RFC Editor, February 2018.
  • [Wik20] List of countries by number of Internet users. Online: https://en.wikipedia.org/wiki/List_of_countries_by_number_of_Internet_users, 2019–2020.

Appendix A Average Queue Over a Cubic Sawtooth

The following analysis determines the fraction of the amplitude of a single Cubic sawtooth that sits below the average. Terminology and assumptions are defined in the body of the paper (section 2 & section 3).

The formula for the congestion window of a Cubic sawtooth is defined in IETF RFC 8312 [RXH18] as

    W(t) = C (t − K)^3 + W_max,

where C is a constant (recommended as 0.4 in RFC 8312 but currently 0.6 in Linux) and:

    K = ( W_max (1 − β) / C )^(1/3),

where β is the multiplicative decrease factor already defined in section 2 (recommended as 0.7).

By the same reasoning as in section 3, while the link is not underutilized, Cubic's RTT is directly proportional to its congestion window:

    R(t) = R_max · W(t) / W_max.

The average RTT over a cycle, R_avg, is then

    R_avg = (1/K) ∫_0^K R(t) dt = (R_max / W_max) ( W_max − C K^3 / 4 ).

Substituting for K:
(5)    R_avg = R_max ( 1 − (1 − β)/4 ) = R_max (3 + β) / 4

Then, for a single Cubic sawtooth, the fraction of the amplitude that sits below the average is

    f = (R_avg − R_min) / (R_max − R_min)
      = ( (3 + β)/4 − β ) / (1 − β)
      = 3/4.

Thus, f = 3/4 is constant for any β.
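
As a quick numerical cross-check of this result (not part of the original analysis; the W_max and C values below are arbitrary), the following sketch approximates the average over a sampled cycle:

    # Numerically verify that the fraction of a Cubic sawtooth below its average
    # is 3/4, independent of beta.  W_max and C are arbitrary illustrative values.
    def fraction_below_average(beta, w_max=1000.0, c=0.6, steps=100_000):
        k = (w_max * (1 - beta) / c) ** (1 / 3)
        ts = [i * k / steps for i in range(steps)]
        ws = [c * (t - k) ** 3 + w_max for t in ts]
        w_avg = sum(ws) / len(ws)
        w_min = beta * w_max
        return (w_avg - w_min) / (w_max - w_min)

    for beta in (0.5, 0.7, 0.85):
        print(beta, round(fraction_below_average(beta), 4))   # all ~0.75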

Appendix B Typical User to CDN RTT

Beganović [Beg19] provides the average RTT measured using ICMP ping from probes in each country to sites known to be served by CDNs. The data was collected from RIPE Atlas probes deployed by volunteers around the world, and was last updated on 17 Apr 2019.

The data is tabulated below and visualized in Figure 4. At the bottom of the table, an average is derived, weighted by the population of Internet users in each country (taking the countries with the highest Internet user populations until 90% of the world's total Internet users are covered). The per-country data on numbers of Internet users was taken from Wikipedia [Wik20], which in turn used population figures for each country, usually from the US Census Bureau, and various estimates of the percentage of Internet users in each country, mostly provided by the ITU.

The measurements were taken to the following seven global CDNs:

  • Akamai

  • AWS Cloudfront

  • Microsoft Azure

  • Cloudflare

  • Google Cloud CDN

  • Fastly

  • Cachefly

It is possible that the latency figure for China is suspect, because measurements to large Chinese CDN providers such as the following were not included in the RIPE Atlas study:

  • Alibaba Cloud

  • Baidu Cloud

  • BaishanCloud

  • ChinaCache

  • Tencent Cloud

Given that users in China make up nearly a quarter of the global total, the weighted average would be sensitive to any large error in the CDN latency for users in China. For instance, if the latency figure just for China were reduced from 66 ms to 20 ms (bringing it in line with India), the global weighted average would drop from 34 ms to 24 ms.
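
That sensitivity can be checked with a one-line calculation (a sketch using the Internet user totals from the table below):

    # Shift in the user-weighted average CDN RTT if only China's latency changes.
    china_share = 988_990_000 / 4_292_105_058   # China's share of the weighted users (~23%)
    old_avg = 34.0                              # ms, weighted average with China at 66 ms
    new_avg = old_avg + china_share * (20 - 66) # replace 66 ms with 20 ms
    print(round(new_avg, 1))                    # ~23.4 ms, i.e. roughly 24 ms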

Country   Population   % of popul'n   Internet users [Wik20]   Fixed bandwidth (Mb/s) [Ook21]   CDN latency (ms) [Beg19]
China 1,427,647,786 69.27% 988,990,000 172.95 66
India 1,366,417,754 55.31% 755,820,000 55.76 20
United States 324,459,463 96.26% 312,320,000 191.97 14
Indonesia 266,911,900 79.56% 212,354,070 26.31 21
Brazil 213,300,278 75.02% 160,010,801 90.3 21
Nigeria 205,886,311 66.15% 136,203,231 16.33 4
Russia 143,989,754 82.39% 118,630,000 87.01 30
Japan 127,484,450 91.27% 116,350,000 167.18 8
Bangladesh 164,945,471 70.41% 116,140,000 36.02 39
Pakistan 213,756,286 47.10% 100,679,752 11.74 41
Mexico 128,972,439 69.01% 89,000,000 48.35 30
Iran 83,020,323 94.06% 78,086,663 19.17 76
Germany 82,114,224 94.74% 77,794,405 120.93 14
Philippines 104,918,090 69.58% 73,003,313 49.31 25
Vietnam 97,338,579 70.04% 68,172,134 66.38 23
United Kingdom 66,181,585 98.22% 65,001,016 92.63 11
Turkey 80,745,020 76.88% 62,075,879 34.95 41
France 64,979,548 89.32% 58,038,536 192.25 14
Egypt 101,545,209 53.91% 54,740,141 39.66 81
Italy 60,416,000 83.65% 50,540,000 90.93 19
South Korea 50,982,212 96.94% 49,421,084 241.58 4
Spain 46,750,321 90.70% 42,400,756 186.4 11
Thailand 69,037,513 52.89% 36,513,941 206.81 24
Poland 38,382,576 90.40% 34,697,848 130.98 12
Canada 36,624,199 92.70% 33,950,632 167.61 19
Argentina 44,271,041 75.81% 33,561,876 51.51 19
South Africa 56,717,156 56.17% 31,858,027 43.91 20
Colombia 49,065,615 62.26% 30,548,252 53.73 48
Ukraine 44,222,947 66.64% 29,470,000 67.52 23
Saudi Arabia 32,938,213 82.12% 27,048,861 90.24 76
Malaysia 31,624,264 80.14% 25,343,685 103.34 9
Morocco 35,739,580 61.76% 22,072,765 25.37 44
Taiwan 23,626,456 92.78% 21,920,626 163.85 7
Australia 24,450,561 86.54% 21,159,515 77.88 14
Venezuela 31,977,065 64.31% 20,564,451 17.9 77
Algeria 41,318,142 47.69% 19,704,622 6.78 67
Ethiopia 104,957,438 18.62% 19,543,075 12.39 55
Iraq 38,274,618 49.36% 18,892,351 29.88 77
Uzbekistan 31,910,641 52.31% 16,692,456 39.2 78
Myanmar 53,370,609 30.68% 16,374,103 22.75 50
Netherlands 17,035,938 93.20% 15,877,494 152.94 9
Peru 32,165,485 48.73% 15,674,241 51.81 33
Chile 18,054,726 82.33% 14,864,456 176.48 18
Averages weighted by Internet users:
Above countries   (90.15% of world)    4,292,105,058   103.32   34
World             (100.00% of world)   4,761,334,541
Figure 4: Scatter-plot per country of average user to CDN RTT and average fixed access bandwidth. Only the 43 countries with the most Internet users are plotted, representing 90% of Internet users. The top 10 are labelled as well as those at the extremes

Appendix C Scaling of Cycle Duration

Scaling of the cycle duration is not directly relevant to the setting of PI parameters, but it does affect utilization in an indirect but important way. A Classic congestion control responds to a single loss or ECN mark, so losses and ECN marks have to be completely absent during a cycle for a flow to maintain full utilization. The longer the duration of each cycle, the more likely that some extraneous event will occur, e.g. the arrival of a brief flow or a loss due to a transmission error. This noise sensitivity of Classic flows becomes the dominant determinant of utilization the more flow rates scale (see footnote 6 of Jacobson & Karels [JK88]).

This scaling of cycle duration is important to understand, as follows:

  • Additive increase of a constant amount of data per round trip causes the duration of a single flow’s sawtooth cycle to double for every doubling of link rate. This can be seen for a) Reno and b) Cubic in Reno mode in the right-hand column of Figure 2.

  • In contrast, the cycle duration of a purely Cubic congestion control scales with the cube root of the bandwidth-delay product (BDP). So, as link capacity or RTT doubles, the duration of the cycles of a single flow grows by a factor of 2^(1/3) ≈ 1.26.

Note, though, that the amplitude of Cubic’s queue-delay variation still scales like Reno, i.e. linearly with RTT and invariant with link capacity, because it is determined by the multiplicative decrease.
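
The two scaling laws can be illustrated with a short sketch (purely illustrative numbers; α = 1 and β = 0.7 for the Reno-mode case and C = 0.6 for the pure Cubic case, as assumed earlier in the report):

    # Compare how the sawtooth cycle duration of a single flow scales when the
    # peak window (e.g. the link rate) doubles: linearly for an AIMD/Reno-mode
    # flow, with the cube root for a pure Cubic flow.
    alpha, beta, C = 1.0, 0.7, 0.6   # AIMD increase/decrease and Cubic constant
    rtt = 0.03                       # seconds, purely illustrative

    def reno_cycle(w_max):
        # (1 - beta) * w_max packets to recover, alpha packets per RTT
        return (1 - beta) * w_max / alpha * rtt

    def cubic_cycle(w_max):
        # K from RFC 8312: time to return to w_max after a decrease
        return (w_max * (1 - beta) / C) ** (1 / 3)

    for w in (1000, 2000):           # doubling the peak window
        print(f"W_max={w}: Reno-mode cycle {reno_cycle(w):.1f} s, "
              f"Cubic cycle {cubic_cycle(w):.1f} s")
    # Reno-mode duration doubles (9.0 s -> 18.0 s);
    # Cubic duration grows only by 2^(1/3) ≈ 1.26 (~7.9 s -> 10.0 s).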

Document history

Version Date Author Details of change
00A 01-Jun-2021 Bob Briscoe First draft
01 02-Jun-2021 Bob Briscoe Changed Figure 3 to RTT under load. Numerous minor corrections.