Singletons for Simpletons: Revisiting Windowed Backoff using Chernoff Bounds

For the well-known problem of balls dropped uniformly at random into bins, the number of singletons — those bins with a single ball — is important to the analysis of backoff algorithms. Existing arguments employ advanced tools to obtain concentration bounds. Here we show that standard Chernoff bounds can be used instead, and the simplicity of this approach is illustrated by re-analyzing several fundamental backoff algorithms.



1 Introduction

Backoff algorithms address the general problem of how to share a resource among multiple devices. A ubiquitous application is WiFi networks, where the resource is a wireless channel, and multiple devices may contend for access. Any single packet sent uninterrupted over the channel is likely to be received, but if the sending times of two or more packets overlap, communication often fails due to destructive interference at the receiver (i.e., a collision). An important performance metric is the time required for all packets to be sent, which is known as the makespan.

Model. The network model is as follows. Time is discretized into slots, and each packet can be transmitted within a single slot. Starting from the first slot, a batch of packets is ready to be transmitted on a shared channel. (Footnote 2: Packets can be viewed as originating from different devices, and going forward we speak only of packets rather than devices.) For any fixed slot, if a single packet sends, then the packet succeeds; however, if two or more packets send, then all corresponding packets fail. A packet that attempts to send in a slot learns whether it succeeded and, if so, the packet takes no further actions; otherwise, the packet learns that it failed in that slot, and must try again at a later time.

Problem. Measured in the number of slots, what is the smallest possible makespan? This question is examined by Bender et al. [5] who analyze several backoff algorithms that execute over disjoint, consecutive sets of slots called windows. In every window, each packet that has not been sent successfully selects a single slot uniformly at random in which to send. (Footnote 3: A subtly different algorithm, Bernoulli backoff, has each packet send with some probability in each slot, where the probability depends on the window size. While this admits the use of Chernoff bounds, it yields worse makespan results. See Section 2.1 for more discussion.) Bender et al. [5] analyze several algorithms where windows monotonically increase in size.

There is a close relationship between the execution of such algorithms in a window, and the popular balls-in-bins scenario, where balls (corresponding to packets) are dropped uniformly at random into bins (corresponding to slots). In this context, we are interested in the number of bins containing a single ball.
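The correspondence between a window and a balls-in-bins experiment can be sketched in a few lines of Python. This is a minimal illustration, not code from the paper; the function name `count_singletons` and its parameters are hypothetical.

```python
import random
from collections import Counter


def count_singletons(num_balls, num_bins, rng):
    """Drop num_balls balls uniformly at random into num_bins bins and
    return how many bins end up holding exactly one ball."""
    # Each ball (packet) independently picks one bin (slot) in the window.
    choices = [rng.randrange(num_bins) for _ in range(num_balls)]
    occupancy = Counter(choices)
    # A singleton bin corresponds to a slot in which exactly one packet sends.
    return sum(1 for load in occupancy.values() if load == 1)


rng = random.Random(0)  # fixed seed, used only for reproducibility
print(count_singletons(num_balls=100, num_bins=100, rng=rng))
```

Each singleton counted here corresponds to one packet that succeeds in the window, so the remaining packets are the balls that did not land alone in a bin.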

Despite their simple specification, windowed backoff algorithms are surprisingly intricate in their analysis. In particular, obtaining concentration bounds on the number of slots (or bins) that contain a single packet (or ball), so-called singletons [49], is complicated by dependencies that rule out a naive application of Chernoff bounds (see Section 2.1). This is unfortunate given that Chernoff bounds are often one of the first powerful probabilistic tools that researchers learn (for example, Dubhashi and Panconesi [21] derive them starting on page ), and they are standard material in a randomized algorithms course.

In contrast, the makespan results in Bender et al. [5] are derived via delay sequences [33, 46]. Alternative tools for handling dependencies include Poisson-based approaches by Mitzenmacher [38] and Mitzenmacher and Upfal [37], and the Doob martingale [21], but to the best of our knowledge, these have not been applied to this problem.

1.1 Our Goal

The above-mentioned tools are powerful, but are they necessary here, or is there a more streamlined route to arrive at the makespan results of Bender et al. [5]? Apart from being an intriguing theoretical question, an affirmative answer might improve accessibility to the area of backoff algorithms for researchers. (Footnote 4: We note that claims of simplicity are partly a matter of taste. It is not our intention to be dismissive of these well-known methods.) More narrowly, this might benefit students embarking on research, many of whom cannot fully appreciate the very algorithms that enable, for example, their Instagram posts or access to online course notes. (Footnote 5: For example, randomized binary exponential backoff is a key component of the distributed coordination function (DCF) in the IEEE 802.11 (WiFi) standards. However, in our experience, the makespan analysis is accessible to few students in a senior-level course on computer networks, or even to those in a graduate-level course on wireless networks.)

What if we could apply standard Chernoff bounds to analyze singletons? Then, the analysis distills to proving the correctness of a “guess” regarding a recursive formula (a well-known procedure for students) for the number of packets remaining after each window, and that guess would be accurate to within a tunable, multiplicative factor with small error probability. (Footnote 6: We do not claim it is easy to show Chernoff bounds can be used. But if one accepts this as true, then the analysis via Chernoff bounds simplifies in this way.)


In this paper, we show that this is possible. Our approach involves an argument that the indicator random variables for counting singletons satisfy the following property from [21]:


Property 1.

Given a set of indicator random variables , for all subsets the following is true:


We prove the following:

Theorem 1.

Consider balls and bins. Let if bin contains exactly ball, and otherwise, for . If or , then satisfy Property 1.

Property 1 permits the use of standard Chernoff bounds; this implication is posed as an exercise by Dubhashi and Panconesi [21] (Problem 1.8), and we provide the argument in our appendix (Section B). We then use Chernoff bounds to re-derive known makespan results for several algorithms analyzed in [5], in particular: Binary Exponential Backoff (BEB), Fixed Backoff (FB), and Log-Log Backoff (LLB). Additionally, we analyze the asymptotically-optimal (non-monotonic) Sawtooth Backoff (STB) from [30, 25]. These algorithms are specified in Section 5, but our derived makespan results are stated below.

Theorem 2.

For a batch of packets, the following holds with probability at least :

  • FB has makespan at most and at least .

  • BEB has makespan at most and at least .

  • LLB has makespan .

  • STB has makespan .

We highlight three aspects of this work. First, both cases and of Theorem 1 are useful. Specifically, the argument for LLB uses the first case, while BEB, FB, and STB use the second.

Second, our approach seems to yield reasonably tight results. Notably, we match the first-order term in the analysis of FB, something that is highlighted in [5]. We suspect that tighter results are possible with a more careful (and perhaps messier) analysis.

Third, we omit trivial steps in our analysis, with the goal of conveying how this approach may apply to other windowed backoff algorithms. Additional proof details are given in the appendix.

1.2 Related Work

Several prior results address dependencies and their relevance to Chernoff bounds and load-balancing in various balls-in-bins scenarios. In terms of backoff, the literature is vast. In both cases, we summarize only closely-related works.

Dependencies, Chernoff Bounds, & Balls-in-Bins. Backoff is closely related to balls-and-bins problems [4, 18, 45, 47], where balls and bins correspond to packets and slots, respectively. Balls-in-bins analysis often arises in problems of load balancing (for examples, see [9, 10, 11]).

Dubhashi and Ranjan [22] prove that the occupancy numbers (random variables denoting the number of balls that fall into each bin) are negatively associated. Lenzen and Wattenhofer [34] use this result to prove negative association for the random variables that correspond to at most balls.

Czumaj and Stemann [19] examine the maximum load in bins under an adaptive process where each ball is placed into a bin with minimum load of those sampled prior to placement. Negative association of the occupancy numbers is important to this analysis.

Finally, Dubhashi and Ranjan [22] also show that Chernoff bounds remain applicable when the corresponding indicator random variables are negatively associated. The same result is presented in Dubhashi and Panconesi [21].

Backoff Algorithms. Many early results on backoff are given in the context of statistical queuing theory (see [31, 29, 41, 27, 31, 28]) where a common assumption is that packet-arrival times are Poisson distributed.

In contrast, the batched-arrival (or static) model assumes all packets arrive at the same time. The makespan of backoff algorithms with monotonically-increasing window sizes has been analyzed in [5], and with packets of different sizes in [6]. A windowed, but non-monotonic backoff algorithm which is asymptotically optimal in the batched-arrival setting is provided in [26, 30, 2].

A related problem is contention resolution, which addresses the time until the first packet succeeds [48, 39, 24, 23]. This has close ties to the well-known problem of leader election (for examples, see [13, 12]).

Several results examine the dynamic case where packets arrive over time as scheduled in a worst-case fashion [35, 20, 8]. A similar problem is that of wake-up [16, 15, 17, 14, 36, 32], which addresses how long it takes for a single transmission to succeed when packets arrive under the dynamic scenario.

Finally, several results address the case where the shared communication channel is unavailable due to malicious interference [3, 42, 43, 44, 40, 1, 7].

2 Analysis for Property 1

We present our results on Property 1. Since we believe this result may be useful outside of backoff, our presentation is given in terms of the well-known balls-in-bins terminology, where we have balls and bins.

2.1 Preliminaries

Throughout, we often employ the following inequalities (see Lemma 3.3 in [44]):

Fact 1.

For any , .

Knowing that indicator random variables (i.r.v.s) satisfy Property 1 is useful since the following Chernoff bounds can then be applied [21].

Theorem 3.

(Dubhashi and Panconesi [21]) (Footnote 7: Again, this is stated in Problem 1.8 in [21]; see our appendix.) Let where are i.r.v.s that satisfy Property 1. For , the following holds:


We are interested in the i.r.v.s , where:

Unfortunately, there are cases where the s fail to satisfy Property 1. For example, consider balls and bins. Then, , so , but .
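A failure of this kind can be confirmed by brute force on a tiny instance. The sketch below assumes Property 1 has the form given in Problem 1.8 of [21]: for every subset S of bins, the probability that all bins in S are singletons is at most the product of the individual singleton probabilities. The function name `property1_holds` is hypothetical. The instance with 2 balls and 2 bins is chosen here for illustration and is not necessarily the example in the text above.

```python
from fractions import Fraction
from itertools import combinations, product


def property1_holds(num_balls, num_bins):
    """Exhaustively check Pr[all bins in S are singletons] <= product of the
    individual singleton probabilities, for every non-empty subset S of bins.
    Assumes this inequality is the intended form of Property 1."""
    # Every placement of labeled balls into bins is equally likely.
    placements = list(product(range(num_bins), repeat=num_balls))
    total = len(placements)

    def prob_all_singletons(bins):
        hits = sum(1 for p in placements if all(p.count(b) == 1 for b in bins))
        return Fraction(hits, total)

    # By symmetry, every bin has the same probability of being a singleton.
    p_single = prob_all_singletons((0,))
    for size in range(1, num_bins + 1):
        for subset in combinations(range(num_bins), size):
            if prob_all_singletons(subset) > p_single ** size:
                return False
    return True


print(property1_holds(2, 2))  # 2 balls, 2 bins
print(property1_holds(2, 4))  # 2 balls, 4 bins
```

With 2 balls and 2 bins, each bin is a singleton with probability 1/2, yet both are singletons together with probability 1/2, which exceeds 1/4. The enumeration is exponential in the number of balls, so it is only suitable for very small instances.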

A naive approach (although, we have not seen it in the literature) is to leverage the result in [34], that the variables used to count the number of bins with at most balls are negatively associated. We may bound the number of bins that have at most ball, and the number of bins that have (at most) balls, and then take the difference. However, this is a cumbersome approach, and our result is more direct and yields tighter results.

Another idea is to consider a subtly-different algorithm where a packet sends with probability in each slot of a window with slots, rather than selecting only a single slot to send in; this is referred to as Bernoulli backoff. However, as the authors of [5] point out, when is within a constant factor of the window size, there is a constant probability that the packet will not send in any slot under Bernoulli backoff. Consequently, the number of windows required for all packets to succeed increases by a -factor, whereas only windows are required under the model used here.

2.2 Property 1 and Bounds on Singletons

To prove Theorem 1, we establish the following Lemma 1. For , define:

which is the conditional probability that bin contains exactly 1 ball given each of the bins contains exactly 1 ball. Note that is the same for any , and let:

Lemma 1.

If or , the conditional probability is a monotonically non-increasing function of , i.e., , for .


First, for , the conditional probability can be expressed as


Note that in (4) is equal to (9) with .

For , we note that beyond the range , it must be that . In other words, for since all balls have already been placed. Thus, we need to prove , for .

On the other hand, if , we need to prove , for . Thus, proving this lemma is equivalent to proving that if or , the ratio , for .

Using Equation (9), the ratio can be expressed as:

Let , then ; and let . Thus, the ratio becomes:

By the Binomial theorem, we have:

Thus, the ratio can be written as:


Note that because , then . Thus, the third term in (10) is always non-negative. If or , then for any . Consequently, the ratio . ∎

Proof of Theorem 1.

Let denote the size of the subset , i.e. the number of bins in . First, note that if , when , the probability on the left hand side (LHS) of (1) is 0, thus, the inequality (1) holds. In addition, shown above for any . Thus, the right hand side of (1) becomes . Thus, we need to prove for any subset, denoted as with

The LHS can be written as:

Lemma 1 shows that if or , is a decreasing function of . Consequently, , for . Thus:

and so the bound in Equation (1) holds. ∎

The standard Chernoff bounds of Theorem 3 now apply, and we use them to obtain bounds on the number of singletons. For ease of presentation, we occasionally use to denote .

Lemma 2.

For balls that are dropped into bins where or , the following is true for any .

  • The number of singletons is at least with probability at least .

  • The number of singletons is at most with probability at least .


We begin by calculating the expected number of singletons. Let be an indicator random variable such that if bin contains a single ball; otherwise, . Note that:


Let be the number of singletons. We have:

Next, we derive a concentration result around this expected value. Since or , Theorem 1 guarantees that these indicator random variables satisfy Property 1, and we may apply the Chernoff bound in Equation 3 to obtain:

which completes the lower-bound argument. The upper bound is nearly identical (see the appendix). ∎
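The expectation used in this proof can be checked numerically. The sketch below assumes the standard calculation: with N balls and M bins (symbol names assumed here), a fixed bin is a singleton with probability N(1/M)(1 - 1/M)^(N-1), so by linearity of expectation the expected number of singletons is N(1 - 1/M)^(N-1). The function names are hypothetical.

```python
from fractions import Fraction
from itertools import product


def expected_singletons(num_balls, num_bins):
    """Closed form N * (1 - 1/M)^(N - 1), from linearity of expectation."""
    if num_balls == 0:
        return Fraction(0)
    return num_balls * (1 - Fraction(1, num_bins)) ** (num_balls - 1)


def expected_singletons_bruteforce(num_balls, num_bins):
    """Average the singleton count over all M^N equally likely placements."""
    total_singletons = 0
    total_placements = 0
    for placement in product(range(num_bins), repeat=num_balls):
        total_placements += 1
        total_singletons += sum(
            1 for b in range(num_bins) if placement.count(b) == 1
        )
    return Fraction(total_singletons, total_placements)


# Compare the closed form against exhaustive enumeration on small instances.
for n, m in [(2, 2), (3, 3), (4, 2), (3, 5)]:
    print(n, m, expected_singletons(n, m), expected_singletons_bruteforce(n, m))
```

Exact rational arithmetic via `Fraction` avoids floating-point noise, so the two columns can be compared for equality rather than approximate agreement.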

3 Analyzing Remaining Packets

We derive tools to analyze the number of packets over windows indexed from . This indexing is for the purposes of analysis, and it does not necessarily indicate the initial window executed by a backoff algorithm. For example, BEB’s initial window consists of a single slot, and does not impact the makespan analysis; instead, the first window of size is window . In contrast, FB’s windows each consist of slots, and this is treated as window . This will be addressed further when analyzing makespan in Section 5.

Let be the number of packets at the start of window . Let since some packets may have succeeded prior to window . Let denote the number of slots in window .

For the cases of and , we upper bound . These two cases are useful for upper-bounding the makespan. (Footnote 8: Note that Case 1 below is not very useful when , but the result is sufficient for our inductive argument later in Section 4.) Conversely, for , we show that . This is useful for lower-bounding the makespan.

The bounds used in Corollary 1 below, and in other arguments, are chosen for ease of presentation; they may be tightened.

Corollary 1.

For , the following is true with probability at least :

  • Case 1. If , then .

  • Case 2. If , then .

  • Case 3. If and for any constant , then .


For Case 1, we apply Lemma 2 with , which implies with probability at least :

For Case 2, note that by Equation 3, when , .

To obtain the lower bound in Case 3, we apply Lemma 2 with , which implies with probability at least :

The following lemma is useful for achieving a with-high-probability guarantee when the number of balls is small relative to the number of bins.

Lemma 3.

Assume . With probability at least , all packets succeed in window .


Consider placements of packets in the window that yield at most one packet per slot. Note that once a packet is placed in a slot, there is one less slot available for each remaining packet yet to be placed. Therefore, there are such placements.

Since there are ways to place packets in slots, it follows that the probability that each of the packets chooses a different slot is:

We can lower bound this probability:

as claimed. ∎
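The counting argument above can be evaluated exactly for concrete values. In the sketch below, `num_packets` and `window_size` are hypothetical parameter names. The probability that all packets choose distinct slots is computed as the product of (w - i)/w for i = 0, ..., n - 1, where w is the window size and n the number of packets. The comparison bound 1 - n(n - 1)/(2w) is the standard union bound over pairs of packets; it is shown only for illustration and is not necessarily the bound derived in the proof.

```python
from fractions import Fraction


def all_distinct_probability(num_packets, window_size):
    """Exact probability that every packet picks a different slot."""
    prob = Fraction(1)
    for already_placed in range(num_packets):
        # The next packet must avoid the slots taken by earlier packets.
        prob *= Fraction(window_size - already_placed, window_size)
    return prob


def union_bound(num_packets, window_size):
    """Lower bound 1 - C(n, 2) / w from a union bound over packet pairs."""
    return 1 - Fraction(num_packets * (num_packets - 1), 2 * window_size)


for n, w in [(2, 4), (5, 100), (10, 1000)]:
    exact = all_distinct_probability(n, w)
    print(n, w, float(exact), float(union_bound(n, w)))
```

When the number of packets is small relative to the window size, both quantities are close to 1, which is the regime the lemma addresses.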

Lemma 4.

Assume a batch of packets that execute over a window of size , where for all . Then, with probability at least , any monotonic backoff algorithm requires additional windows to complete all packets.


By Case 2 of Corollary 1, . By Lemma 3, the probability that any packets remain by the end of the next window is ; refer to this as the probability of failure. Subsequent windows increase in size monotonically, while the number of remaining packets decreases monotonically. Therefore, the probability of failure is in any subsequent window, and the probability of failing over all of the next windows is less than . ∎

4 Inductive Arguments

We present inductive arguments on using Chernoff bounds, as discussed in Section 1.1. All results hold for sufficiently large .

There are two inductive arguments concerning upper bounds. The first applies to FB, BEB, and LLB, while the second applies to STB. Notably, a single inductive argument would suffice except that we wish to obtain a tight bound on the first-order term of FB, which is one of the contributions in [5].

Lemma 5.

Consider a batch of packets that execute over windows for all . If , then with error probability at most .


We argue by induction on .

Base Case. Let . Lemma 2 implies:

where the last line follows by setting , and assuming is sufficiently large to satisfy the inequality; this gives an error probability of at most . The base case is satisfied since .

Induction Hypothesis (IH). For , assume with error probability at most .

Induction Step. For window , we wish to show that with an error bound of . Addressing the number of packets, we have:

The first line follows from Case 1 of Corollary 1, which we may invoke since for all . This yields an error of at most , and so the total error is at most as desired. The second line follows from the IH. ∎

A nearly identical lemma is useful for upper-bounding the makespan of STB. The main difference arises from addressing the decreasing window sizes in a run, and this necessitates the condition that rather than for all . Later in Section 5, we start analyzing STB when the window size reaches ; this motivates the condition in our next lemma.

Lemma 6.

Consider a batch of packets that execute over windows of size and for all . If , then with error probability at most .


We argue by induction on .

Base Case. Nearly identical to the base case in proof of Lemma 5 (details in appendix).

Induction Hypothesis (IH). For , assume with error probability at most .

Induction Step. For window , we wish to show that with an error bound of . Addressing the number of packets, we have:

Again, the first line follows from Case 1 of Corollary 1, which gives the desired error bound of . The second line follows from the IH. ∎

The third and final lemma in this section is useful in obtaining lower bounds on the makespan.

Lemma 7.

Consider a batch of packets, for any constant , that executes over windows of size for any constant . If , then , with error probability at most .


We argue the following claim by induction on .

Base Case. Let . Lemma 2 implies:

where setting , and assuming is sufficiently large, satisfies the last inequality and gives the associated error probability of at most . The base case is satisfied since for all .

Induction Hypothesis (IH). For , assume with error probability at most .

Induction Step. For window , we wish to show that with an error bound of . Addressing the number of packets, we have:

The first line follows from Case 3 of Corollary 1, which gives the desired error bound of . ∎

5 Makespan Analysis

We begin by describing the windowed backoff algorithms Fixed Backoff (FB), Binary Exponential Backoff (BEB), and Log-Log Backoff (LLB) analyzed in [5]. Recall that, in each window, a packet selects a single slot uniformly at random to send in. Therefore, we need only specify how the size of successive windows change.

The simplest is FB, where all windows have size . In contrast, BEB has an initial window size of , and each successive window doubles in size. LLB has an initial window size of , and for a current window size of , it executes windows of that size before doubling; we call this sequence of same-sized windows a plateau. (Footnote 9: As stated by Bender et al. [5], an equivalent (in terms of makespan) specification of LLB is that .)

STB is non-monotonic and executes over a doubly-nested loop. The outer loop sets the current window size to be double that used in the preceding outer loop and each packet selects a single slot to send in; this is like BEB. Additionally, for each such , the inner loop executes over windows of decreasing size: ; this sequence of windows is referred to as a run. For each window in a run, a packet chooses a slot to send in uniformly at random.
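The window schedules described above can be sketched as Python generators, together with a small simulator for the batched model of Section 1. This is an illustrative sketch under stated assumptions, not the authors' code. All function names are hypothetical. BEB is assumed to start from a single-slot window, as noted in Section 3. The STB run is assumed to halve the window size down to 1, and STB is assumed to start from a window of size 1. The FB window size is left as a caller-supplied parameter. LLB is omitted because its plateau length is not recoverable from the text.

```python
import random
from collections import Counter


def fb_windows(size):
    """Fixed Backoff: every window has the same caller-supplied size."""
    while True:
        yield size


def beb_windows():
    """Binary Exponential Backoff: window sizes 1, 2, 4, 8, ..."""
    size = 1
    while True:
        yield size
        size *= 2


def stb_windows():
    """Sawtooth Backoff (assumed form): runs 1; 2, 1; 4, 2, 1; ..."""
    outer = 1
    while True:
        size = outer
        while size >= 1:
            yield size
            size //= 2
        outer *= 2


def simulate_makespan(num_packets, windows, rng):
    """Return the 1-based index of the slot in which the last packet succeeds."""
    remaining = num_packets
    elapsed = 0        # slots in all completed windows
    last_success = 0   # slot index of the most recent success
    for size in windows:
        if remaining == 0:
            break
        # Each remaining packet picks one slot uniformly at random.
        choices = Counter(rng.randrange(size) for _ in range(remaining))
        singles = [slot for slot, load in choices.items() if load == 1]
        if singles:
            remaining -= len(singles)
            last_success = elapsed + max(singles) + 1
        elapsed += size
    return last_success


rng = random.Random(0)  # fixed seed, used only for reproducibility
n = 200
print("FB :", simulate_makespan(n, fb_windows(n), rng))
print("BEB:", simulate_makespan(n, beb_windows(), rng))
print("STB:", simulate_makespan(n, stb_windows(), rng))
```

The simulator does not terminate if the schedule can never separate the packets, for example FB with a single-slot window and two or more packets, so the window size passed to `fb_windows` should be chosen with that in mind.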

5.1 Makespan Analysis

The following results employ tools from the prior sections a constant number of times, and each tool has error probability either or . Therefore, all following theorems hold with probability at least , and we omit further discussion of error.

Theorem 4.

The makespan of FB for a window of size is at most and at least .


Since for all , by Lemma 5 less than packets remain after windows. By Lemma 4, all remaining packets succeed within more windows. The corresponding number of slots is .

For the lower bound, Lemma 7 with and implies that after windows, at least packets remain. The corresponding number of slots is . ∎

The above lower bound can be derived for any so long as . For example, in [5], the authors consider FB with a window between and ; that is, . Here, we chose because it matches the window size used in our corresponding upper bound.

Theorem 5.

The makespan of BEB is less than and at least .


Let be the first window with at least slots. Assume no packets finish before the start of ; otherwise, this can only improve the makespan. By Lemma 5 less than packets remain after windows. By Lemma 4 all remaining packets succeed within more windows. Since has size less than , the number of slots until the end of , plus those for the subsequent windows, is less than:

by the sum of a geometric series.

The probability that any packets finish prior to a window of size is at most . From the start of a window of size to the end of a window of size , there are at most slots over these windows. Therefore, at most packets finish over these slots.

At most more windows occur prior to reaching a window of size at least . Applying Lemma 2, at least packets remain before the start of this window. By Lemma 7 with and , at least packets remain after additional windows, which corresponds to slots. ∎

Theorem 6.

The makespan of STB is .


Let be the first window of size at least . Assume no packets finish before the start of , that is ; else, this can only improve the makespan.

Our analysis examines the windows in the run starting with window , and so , etc. To invoke Lemma 6, we must ensure that the condition holds in each window of this run. For , is true. Applying Lemma 2, , while , the condition is again true. By Case 1 of Corollary 1, , while . In general, Case 2 guarantees while .

Lemma 6 implies that after windows, less than packets remain. Pessimistically, assume no other packets finish in the run. The next run starts with a window of size at least , and by Lemma 4, all remaining packets succeed within the first windows of this run, since the fifth (smallest) window has size at least .

The run that starts with window size contains slots, for . The number of slots is by the sum of a geometric series. ∎

Note that STB has asymptotically-optimal makespan since we cannot hope to finish packets in slots. In contrast, Bender et al. [5] show that the optimal makespan for any monotonic windowed backoff algorithm is . Here, we use the case for in Theorem 1 to re-derive the result in [5] that Log-Log Backoff is asymptotically optimal.

Theorem 7.

The makespan of Log-Log Backoff is .


For the first part of our analysis, assume at least packets remain. Consider the first window with size for . By Lemma 2, each window finishes at least the following number of packets: