# Approximate Neighbor Counting in Radio Networks

For many distributed algorithms, neighborhood size is an important parameter. In radio networks, however, obtaining this information can be difficult due to ad hoc deployments and communication that occurs on a collision-prone shared channel. This paper conducts a comprehensive survey of the approximate neighbor counting problem, which requires nodes to obtain a constant factor approximation of the size of their network neighborhood. We produce new lower and upper bounds for three main variations of this problem in the radio network model: (a) the network is single-hop and every node must obtain an estimate of its neighborhood size; (b) the network is multi-hop and only a designated node must obtain an estimate of its neighborhood size; and (c) the network is multi-hop and every node must obtain an estimate of its neighborhood size. In studying these problem variations, we consider solutions with and without collision detection, and with both constant and high success probability. Some of our results are extensions of existing strategies, while others require technical innovations. We argue this collection of results provides insight into the nature of this well-motivated problem (including how it differs from related symmetry breaking tasks in radio networks), and provides a useful toolbox for algorithm designers tackling higher level problems that might benefit from neighborhood size estimates.


## 1 Introduction

Many distributed algorithms assume nodes have advance knowledge of their neighborhood, allowing them to take steps that depend, for example, on gathering information from every neighbor (e.g., [17]), or flipping a coin weighted with their neighborhood size (e.g., [2]).

In standard wired network models, where nodes are connected by static point-to-point links, obtaining this neighbor information is often trivial (e.g., as in the LOCAL or CONGEST models). In radio networks, by contrast, this information might be harder to obtain. Specifically, because nodes in these networks are often deployed in an ad hoc manner, and subsequently communicate only on a contended shared channel, we cannot expect that they possess advance knowledge of their neighborhood. In fact, learning this information might require non-trivial feats of contention management.

Some distributed algorithms for radio networks depend on nodes possessing an estimate of their neighborhood size (e.g., [10, 11]), while other algorithms could be significantly simplified if this information was available (e.g., [22, 19, 14]). Though it is generally assumed that calculating these size estimates should not take too long in most settings, this problem has escaped the more systematic scrutiny applied to related tasks like contention resolution.

In this paper, we work toward filling in more of this knowledge gap. We conduct a comprehensive survey of lower and upper bounds for the approximate neighbor counting problem in the radio network model under different combinations of common assumptions for this setting. Some of our results require only extensions of existing strategies, while many others require non-trivial technical innovations.

Combined, this collection of results provides two important contributions to the study of distributed algorithms for radio networks. First, it supports a deeper understanding of the well-motivated neighbor counting problem, highlighting both its similarities and differences to related low-level radio network tasks. Second, the collection acts as a useful toolbox for algorithm designers tackling higher level problems.

#### Result summary.

The radio network model we study describes the underlying network topology with an undirected connected graph $G = (V, E)$, with the vertices in $V$ corresponding to the radio devices (usually called nodes in this paper), and the edges in $E$ describing which node pairs are within communication range. For every node $u$, $d(u)$ describes the number of neighbors of $u$ in $G$. We sometimes call this parameter the neighbor count of $u$. In single-hop networks (i.e., $G$ is a clique), all nodes have the same neighbor count, while in multi-hop networks these counts can differ.

The approximate neighbor counting problem requires nodes to calculate constant factor estimates of their neighbor counts. We study the variant where every node must obtain this estimate (e.g., during network initialization), and the variant where only a designated node must obtain this estimate (e.g., when the neighborhood of a node changes). We study these variants in single-hop and multi-hop networks, and consider solutions with and without collision detection. We study both lower and upper bounds for randomized solutions. When relevant, we consider results that hold with constant probability and results that hold with high probability.

Our results are summarized in Figure 1. Notice that we do not separately study designated node and all nodes counting in single-hop networks, as in this setting all nodes have the same neighbor count, making these two cases essentially identical (e.g., a designated node in a single-hop network can simply announce its count, transforming the solution to an all nodes counting solution). We also do not study constant probability solutions for all nodes counting in multi-hop networks. This follows because in the multi-hop setting the success probability applies to each individual node. A constant success probability, therefore, implies that a constant fraction of the nodes are expected to generate inaccurate neighbor counts—a result that is too weak in most scenarios. In the single-hop setting, by contrast, the success probability refers to the probability that all nodes generate good counts.

Also notice that two upper bounds are given for multi-hop all nodes counting without collision detection. The first bound describes an algorithm that generates good neighbor counts but never terminates (specifically, each node must keep participating to help neighbors that are still counting). The second algorithm does terminate, but requires an upper bound on the maximum possible network size. This is the only algorithm we study that requires this information to work properly.

#### Discussion.

For all but one of the cases in which we have lower bounds, our lower bounds match our upper bounds. For the single-hop results, these bounds also match the relevant bounds from the related single-hop contention resolution problem (c.f., [20]). In fact, most of the lower bounds in this single-hop setting follow by reduction from contention resolution. That is, we show that if you can solve approximate neighbor counting fast, then you can also solve contention resolution fast—allowing existing lower bounds from the latter to carry over to the former.

The single-hop upper bounds, however, required more than the simple application of existing contention resolution strategies. In contention resolution, for example, if you get lucky with your coin flips, and a node broadcasts alone earlier than expected, this is good news—you have solved the problem even faster! In neighbor counting, however, this “luck” might lead you to output an inaccurate size estimate. The analysis used for neighbor counting must bound the probabilities of these precocious symmetry breaking events.

Another complexity of neighbor counting (in single-hop networks) as compared to contention resolution is that all nodes must learn an estimate. This requires extra mechanisms to ensure that once some nodes learn a good estimate, this information is spread to all others. The most difficult single-hop case is the combination of high probability correctness and collision detection. Achieving an accurate estimate in an optimal number of rounds required the adaptation of a technique based on one-dimensional random walks [19, 4].

Obtaining lower bounds for the multi-hop designated node setting required technical innovations. In the single-hop setting, our lower bounds used reduction arguments that applied the contention resolution bounds from [20] as a black box. In the multi-hop designated node setting, by contrast, we were forced to open the black boxes and modify them to handle the issues specific to multi-hop topologies. For the particular case of collision detection and high probability, substantial new arguments were needed to transform the bound.

In the multi-hop all nodes setting, obtaining upper bounds also required techniques beyond standard symmetry breaking strategies, as each node may simultaneously participate in multiple estimation processes. Our collision detection algorithm for this case has nodes use detectable noise to notify neighbors that they are still counting. When collision detection is not available, we consider two different approaches and hence present two algorithms. The first returns an estimate that is correct with high probability in the local neighbor count. The second algorithm uses a “double counting” trick and takes longer, but its estimate is correct with high probability in the network size.

Last but not least, we would like to clarify a point about our lower bound statements. As shown in Figure 1, our lower bounds are expressed with respect to the maximum possible neighbor counts (e.g., $N$ and $\Delta$), whereas, to obtain the strongest possible results, our upper bounds are expressed with respect to the actual neighbor counts in the analyzed execution (e.g., $n$ and $d$). The right way to interpret our lower bounds is that they claim that in a setting where the number of participants comes from a set of $N$ (or $\Delta$) possible participants, there exists a subset of these participants for which the stated bound holds.

Our lower bound technique does not directly tell us anything about the size of the participant set that induces the slow performance. Given our matching upper bounds, however, we can conclude that the worst case participant sets for these algorithms must have a size close to the maximum bounds. Consider, for example, single-hop counting with no collision detection. The lower bound says that for each algorithm there exists a collection of no more than $N$ participants that requires $\Omega(\log N)$ rounds to generate a good count with constant probability. Our upper bound, on the other hand, guarantees a good count in $O(\log d)$ rounds with constant probability, where $d$ is the size of the participant set. It follows that when the lower bound is applied to our algorithm, the bad participant set must have a size that is polynomial in $N$ (i.e., at least $N^{\epsilon}$ for some constant $\epsilon > 0$), as otherwise the existence of both bounds is a logical contradiction.
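
The closing inference can be made concrete. Suppose, purely for illustration, that the lower bound charges some worst-case participant set of size $d^*$ at least $c_1 \log N$ rounds, while the upper bound runs in at most $c_2 \log d$ rounds on any participant set of size $d$ (the constants $c_1, c_2 > 0$ are assumed, not taken from the paper). Then:

```latex
% Worst-case set of size d^* must satisfy both bounds simultaneously:
c_1 \log N \;\le\; T(d^*) \;\le\; c_2 \log d^*
\;\Longrightarrow\;
\log d^* \;\ge\; \frac{c_1}{c_2} \log N
\;\Longrightarrow\;
d^* \;\ge\; N^{c_1/c_2}.
```

That is, the bad participant set is polynomial in $N$, exactly as claimed above.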

## 2 Related Work

Algorithms to reduce contention and enable communication on shared channels date back to the early days of networking (c.f., [12, 7]), and remain an active area of study today. In the study of distributed algorithms for shared radio channels, many strategies explicitly execute approximate neighbor counting as a subroutine. For example, in their study of energy-efficient initialization with collision detection, Bordim et al. [3] propose a protocol that returns a constant factor estimate of the network size while requiring each node to be awake for only a small number of rounds. Similarly, Gilbert et al. [11] use approximate neighbor counting as part of a neighbor discovery protocol in cognitive radio networks. It is also common for algorithms in this setting to simply assume these estimates are provided in advance. For example, the often-used decay strategy introduced by Bar-Yehuda et al. [2] requires a bound on local neighborhood size to limit the estimates it tests.

As mentioned throughout this paper, neighbor counting is often closely related to contention resolution, which requires a single node to broadcast alone on the channel. Some common contention resolution strategies implicitly provide this approximation as a side-effect of their operation (e.g., [22, 19, 14]). At the same time, under some assumptions, a good estimate simplifies the problem of contention resolution. As we detail throughout this paper, however, this relationship is not exact. Lower bounds for neighbor counting often require more intricate arguments than contention resolution, and in some cases, contention resolution algorithms require nontrivial extra analysis and mechanisms to provide counts. Teasing apart this intertwined relationship is one of the main contributions of this paper.

Others have directly studied approximate neighbor counting in radio networks. Jurdzinski et al. [13] develop an algorithm that provides a constant factor approximation of the network size without collision detection, with the guarantee that no node participates in more than a small number of rounds. (Our relevant algorithm runs faster, but consumes more energy.) Caragiannis et al. [5] devise two constant-factor approximation algorithms: the first one requires collision detection, while the second one works without collision detection and takes longer. (Our relevant algorithm with collision detection is faster than theirs, and our algorithm without collision detection performs as well as theirs.) In [15, 16], the authors discuss how to approximate network size when adversaries are present.

Approximate neighbor counting has also been studied in the beeping model [8], which is similar to, but somewhat weaker than, the standard radio network model. In a related setting, Chen et al. [6] conduct an excellent mini survey of recent works on RFID counting (e.g., [24, 21, 23, 6]). They conclude that a two-phase approach is the key to achieving efficient and accurate RFID counting. They also prove several lower bounds, one of which establishes the number of rounds needed to obtain a constant factor approximation with constant probability. More recently, Brandes et al. [4] study how to efficiently estimate the size of a single-hop beeping network: they provide both lower and upper bounds for a parameterized approximation accuracy. Notice, the main objective of [6] and [4] differs from ours not just in the model, but in that they seek a $(1+\epsilon)$ approximation of the count for any $\epsilon > 0$ ($\epsilon$ can be non-constant). Nonetheless, they both use constant factor approximation as a key subroutine.

## 3 Model and Problem

We consider a synchronous radio network. We model the topology of this network with a connected undirected graph $G = (V, E)$, with the vertices in $V$ corresponding to the radio devices (usually called nodes in this paper), and the edges in $E$ describing which node pairs are within communication range.

For each node $u$, we use $N(u)$ to denote the set of neighbors of $u$, and use $d(u) = |N(u)|$ to denote the number of neighbors of $u$. Let $n = |V|$. Our algorithms assume $d(u) \geq 1$ for every node $u$. That is, we do not confront the possibility of a node isolated from the rest of a multi-hop network, or a single-hop network consisting of only a single node (we see the so-called loneliness detection problem as an interesting but somewhat orthogonal challenge; e.g., [9]). For ease of presentation, we assume $n$ and $d(u)$ are always powers of two. This assumption does not affect the correctness or asymptotic time complexities of our results. We define $N$ and $\Delta$ to be upper bounds on the maximum possible values of $n$ and $d(u)$, respectively. To obtain the strongest and most general possible results, our algorithms are not provided with knowledge of $N$ and $\Delta$, with the exception of one algorithm for multi-hop all nodes counting without collision detection.
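
The communication semantics used throughout the paper can be captured in a short simulation harness. The following Python sketch (class and method names are our own, not from the paper) encodes the standard radio network assumption that a listener receives a message iff exactly one of its neighbors broadcasts; with collision detection, silence and collision are distinguishable channel states:

```python
class RadioNetwork:
    """Minimal sketch of the synchronous radio network model: one shared
    channel, a listener hears a message iff exactly one neighbor broadcasts;
    with collision detection, 'silence' and 'collision' are distinguishable."""

    def __init__(self, adjacency):
        # adjacency: dict mapping node id -> set of neighbor ids
        self.adj = adjacency

    def degree(self, u):
        # d(u): the neighbor count of node u
        return len(self.adj[u])

    def round(self, broadcasters):
        """Run one synchronous slot. `broadcasters` maps node -> message.
        Returns channel feedback for every listening node: ('message', m)
        if exactly one neighbor broadcast, ('collision', None) if two or
        more did, and ('silence', None) otherwise."""
        feedback = {}
        for u in self.adj:
            if u in broadcasters:
                continue  # broadcasters get no channel feedback in this sketch
            msgs = [broadcasters[v] for v in self.adj[u] if v in broadcasters]
            if len(msgs) == 1:
                feedback[u] = ('message', msgs[0])
            elif len(msgs) > 1:
                feedback[u] = ('collision', None)
            else:
                feedback[u] = ('silence', None)
        return feedback

# Example: a 4-node single-hop network (a clique).
clique = {u: {v for v in range(4) if v != u} for u in range(4)}
net = RadioNetwork(clique)
print(net.round({0: 'beacon'}))     # every listener hears 'beacon'
print(net.round({0: 'a', 1: 'b'}))  # every listener detects a collision
```

Without collision detection, a node would simply be unable to tell the `'collision'` and `'silence'` outcomes apart.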

In this paper, we are interested in the approximate neighbor counting problem. This problem requires selected node(s) to obtain a constant factor approximation of their neighborhood size(s). In more detail, let constant $\alpha > 1$ be the fixed approximation threshold for this problem. Each node $u$ that produces an estimate $\tilde{d}(u)$ must satisfy $d(u)/\alpha \leq \tilde{d}(u) \leq \alpha \cdot d(u)$. We consider three variations of this problem that differ with respect to the allowable network topologies and requirements on which nodes produce an estimate. The first variant assumes $G$ is single-hop and all nodes must produce an identical estimate. The second variant assumes $G$ is multi-hop, but only a single designated node must produce an estimate. The third variant is the same as the second, except that now every node must produce an estimate. We study randomized algorithms that are proved to be correct with a given probability $p$. In the single-hop variant, $p$ describes the probability of the event in which all nodes generate a single good approximation. In the multi-hop variants, by contrast, $p$ is the probability that an individual counting node generates a good approximation.

Throughout this paper, we cite the related contention resolution problem. In single-hop networks, the contention resolution problem is solved once some node broadcasts alone. Later in the paper, we consider a version of multi-hop contention resolution in which a single designated node must receive a message from a neighbor to solve the problem.

Finally, in the following, we say an event occurs with high probability in parameter $k$ (or “w.h.p. in $k$”) if it occurs with probability at least $1 - 1/k^{c}$, for some constant $c \geq 1$.

## 4 Lower Bounds

In this section, we present our lower bounds for the approximate neighbor counting problem. We begin, in Section 4.1, by looking at lower bounds that can be proved by reducing from the contention resolution problem. That is, in that subsection, we prove lower bounds by arguing that solving neighbor counting fast implies an efficient algorithm for contention resolution, allowing the relevant contention resolution lower bounds to apply.

We employ this approach to derive bounds for constant probability and high probability counting with no collision detection in both single-hop and designated node multi-hop settings. We also apply this approach to derive bounds for constant probability counting with collision detection in these settings. We cannot, however, apply this approach to high probability counting with collision detection, as the reduction itself is too slow compared to the desired bounds. We note that for the single-hop arguments, we leverage existing contention resolution bounds from [20]. For the multi-hop arguments, however, we must first generalize the results from [20] to hold for the considered network topology.

In Section 4.2, we look at lower bounds for high probability approximate neighbor counting with collision detection in both single-hop and designated node multi-hop settings. Unlike in Section 4.1, we cannot deploy a reduction-based argument. We instead prove a new lower bound that directly argues that a sufficiently accurate estimate requires the stated number of rounds.

Finally, in Section 4.3 we look at lower bounds for the remaining case of multi-hop all nodes counting. We establish these bounds by reduction from designated node multi-hop bounds, as solving all nodes counting trivially also solves designated node counting.

### 4.1 Lower Bounds via Reduction from Contention Resolution

We begin with our lower bound arguments that rely on reductions from contention resolution. For the single-hop scenario, we can reduce from single-hop contention resolution and apply existing lower bounds from [20]. (Due to space constraints, see Appendix A.1 for details on contention resolution lower bounds in single-hop networks.) For multi-hop designated node counting, however, we must first prove new contention resolution lower bounds.

In particular, consider the definition of multi-hop contention resolution in which there is a well-defined designated node $w$, and the goal is for exactly one of $w$’s neighbors—a subset of fixed size drawn from a larger universe of possible participants—to broadcast alone in some time slot. At first glance, this problem might seem easier than single-hop contention resolution, as we are provided with a designated node that could coordinate its neighbors in their quest to break symmetry among themselves. We prove, however, that this is not the case: the lower bounds are the same as their single-hop counterparts. In more detail, we prove the following two lemmas by adapting the techniques from [20] to this new set of assumptions (see Appendix A.3 for the omitted proofs of this section):

Let $\mathcal{A}$ be an algorithm that solves contention resolution in $t$ time slots with probability $p$ in multi-hop networks with no collision detection. It follows that: (a) if $p$ is some constant, then $t = \Omega(\log \Delta)$; and (b) if $p \geq 1 - 1/\Delta$, then $t = \Omega(\log^2 \Delta)$.

Let $\mathcal{A}$ be an algorithm that solves contention resolution in $t$ time slots with probability $p$ in multi-hop networks with collision detection. It follows that if $p$ is some constant, then $t = \Omega(\log\log \Delta)$.

With the needed contention resolution lower bounds in hand, we turn our attention to reducing this problem to approximate neighbor counting. Take the single-hop scenario as an example: the basic idea behind the reduction is that once nodes have an estimate $\tilde{d}$ of $d$, they can simply broadcast with probability $1/\tilde{d}$ in each time slot. If this estimate is good, then in each time slot, they have a constant probability of isolating a broadcaster, thus solving contention resolution. Moreover, repeating this step multiple times increases the chance of success proportionally. Building on these basic observations, we prove the following:

Assume there exists an algorithm that solves approximate neighbor counting in $t$ (or, in the multi-hop scenario, $t'$) time slots with probability $p$. Then, for any integer $k \geq 1$, there exists an algorithm that solves contention resolution in $t + O(k)$ (resp., $t' + O(k)$ in the multi-hop scenario) time slots with probability at least $p \cdot (1 - (1 - q)^{k})$, where $q$ is a positive constant determined by the approximation threshold $\alpha$ defined in Section 3.
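
The “constant probability of isolating a broadcaster” claim driving this reduction is easy to sanity-check numerically. The sketch below (parameter values are illustrative, and the helper name is ours) computes the probability that exactly one of $d$ nodes broadcasts when each does so independently with probability equal to the reciprocal of an accurate estimate, and shows how a handful of repetitions drives the failure probability down:

```python
import math

def lone_broadcast_prob(d, estimate):
    """Probability that exactly one of d nodes broadcasts, when each node
    broadcasts independently with probability 1/estimate."""
    p = 1.0 / estimate
    return d * p * (1.0 - p) ** (d - 1)

# With an accurate estimate, the per-slot success probability is a constant
# (approaching 1/e as d grows), so k repetitions all fail with probability
# at most (1 - q)^k, i.e., exponentially small in k.
d = 256
q = lone_broadcast_prob(d, d)
print(round(q, 3))          # ≈ 0.369, close to 1/e
k = 20
print((1 - q) ** k < 1e-3)  # 20 repetitions already fail only rarely
```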

Combining the reduction described in Lemma 4.1 with the single-hop lower bounds for contention resolution from [20] and the new multi-hop lower bounds proved above, we get the following lower bounds for approximate neighbor counting:

In a single-hop radio network containing at most $N$ nodes:


• When collision detection is not available, solving approximate neighbor counting with constant probability requires $\Omega(\log N)$ time in the worst case; solving approximate neighbor counting with high probability in $N$ requires $\Omega(\log^2 N)$ time in the worst case.

• When collision detection is available, solving approximate neighbor counting with constant probability requires $\Omega(\log\log N)$ time in the worst case.

In a multi-hop radio network in which the designated node has at most $\Delta$ neighbors:


• When collision detection is not available, solving approximate neighbor counting with constant probability requires $\Omega(\log \Delta)$ time in the worst case; solving approximate neighbor counting with high probability in $\Delta$ requires $\Omega(\log^2 \Delta)$ time in the worst case.

• When collision detection is available, solving approximate neighbor counting with constant probability requires $\Omega(\log\log \Delta)$ time in the worst case.

### 4.2 Custom Lower Bounds for High Probability and Collision Detection

At this point, for the single-hop and designated node multi-hop variants of the approximate neighbor counting problem, the only lower bounds missing are those for ensuring high success probability with collision detection. As we detail in Appendix A.2, our previous reduction-based approach no longer works in these scenarios. (Roughly speaking, the reduction itself takes at least as long as the lower bound we intend to prove.) Therefore, we must construct custom lower bounds for this problem and exact set of assumptions.

We start by proving the following combinatorial result:

Let $k$ and $m$ be two positive integers such that $k \leq m$. Let $\mathcal{U}$ be the set containing all size-$k$ subsets of $[m] = \{1, 2, \ldots, m\}$. Let $\mathcal{S}$ be an arbitrary set of size less than some threshold $t(k, m)$ such that each element in $\mathcal{S}$ is a subset of $[m]$. Then, there exists some $U \in \mathcal{U}$ such that for each $S \in \mathcal{S}$, either $U \cap S = \emptyset$ or $U \subseteq S$.

Intuitively, a set $S \in \mathcal{S}$ can be interpreted as a broadcast schedule generated by an algorithm $\mathcal{A}$: a node labeled $j$ broadcasts in slot $i$ if and only if it is activated, and $j$ is in the $i$th set in $\mathcal{S}$. Given this interpretation, Lemma 4.2 suggests: for both the single-hop and the multi-hop designated node scenario, for any approximate neighbor counting algorithm $\mathcal{A}$, and for any broadcast schedule generated by $\mathcal{A}$ of length below the threshold, there exists a set of $k$ nodes (or a set of $k$ neighbors of the designated node in the multi-hop scenario) such that if these nodes are activated and execute $\mathcal{A}$, then during each of the scheduled time slots, either none of them broadcast or all of them broadcast. This further implies that if only two of these nodes are activated, then their view (of these time slots of the execution) is indistinguishable from the case in which all $k$ of these nodes are activated.

Now, imagine an adversary who samples a size-$k$ subset from $\mathcal{U}$ uniformly at random, and then flips a fair coin to decide whether to activate all of these nodes, or just two of them. If the adversary happens to have chosen the set proved to exist in Lemma 4.2, then by the end of the schedule, algorithm $\mathcal{A}$ cannot distinguish between two and $k$ activated nodes. Notice, if $k$ is large compared to the approximation threshold $\alpha$, then this difference matters: outputting two when the real count is $k$ (or vice versa) is unacceptable. Thus, in such a case, the algorithm gets the right answer with probability only $1/2$—not enough for high success probability.
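
The all-or-nothing property asserted by the lemma can be checked by brute force on toy instances. The sketch below (helper name and example instance are our own) searches for a subset that every schedule set either fully contains or entirely misses, and also shows that such a subset need not exist once the schedule collection gets large relative to the universe:

```python
from itertools import combinations

def find_uniform_subset(universe, schedules, k):
    """Search for a size-k subset U of `universe` such that every schedule
    set either contains all of U or none of U (the all-or-nothing property
    from the lemma). Returns the first such U found, or None."""
    for U in combinations(sorted(universe), k):
        Uset = set(U)
        if all(Uset <= S or not (Uset & S) for S in schedules):
            return Uset
    return None

# A short broadcast schedule over 8 node labels: the set for slot i lists
# which labels would broadcast in slot i if activated.
universe = set(range(8))
schedules = [{0, 1, 2, 3}, {0, 1}, {4, 5, 6, 7}]
print(find_uniform_subset(universe, schedules, k=2))
# → {0, 1}: contained in the first two slots' sets, disjoint from the third
```

With too many schedules over a small universe, no such subset survives, which is why the lemma restricts the size of the schedule collection.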

A complete and rigorous proof of the above intuition is actually quite involved; again, see Appendix A.3 for more details. In the end, we obtain the following lower bounds:

Assume collision detection is available. Then:


• In a single-hop radio network containing at most $N$ nodes, solving approximate neighbor counting with high probability in $N$ requires $\Omega(\log N)$ time in the worst case.

• In a multi-hop radio network in which the designated node has at most $\Delta$ neighbors, solving approximate neighbor counting with high probability in $\Delta$ requires $\Omega(\log \Delta)$ time in the worst case.

### 4.3 All Nodes Multi-Hop Lower Bounds

If an algorithm can solve multi-hop all nodes approximate neighbor counting, then clearly the same algorithm can be used to solve multi-hop designated node approximate neighbor counting, with the same time complexity and success probability. Therefore, the lower bounds we previously proved for the latter variant naturally carry over to the former variant:

In a multi-hop radio network containing at most $N$ nodes:


• When collision detection is not available, solving approximate neighbor counting with high probability in $N$ requires $\Omega(\log^2 N)$ time in the worst case.

• When collision detection is available, solving approximate neighbor counting with high probability in $N$ requires $\Omega(\log N)$ time in the worst case.

## 5 Upper Bounds

In this section, we describe and analyze several randomized algorithms that solve the approximate neighbor counting problem. We will begin with single-hop all nodes counting. Specifically, four algorithms are presented for this variant, each based on a different approach. Though most of the strategies used are previously known, extensions to their design and analysis are often needed. We then introduce three algorithms for multi-hop all nodes counting, including one which is particularly interesting, as it uses a “double counting” trick that is not related to contention resolution at all to obtain high success probability. Finally, we briefly discuss solutions for multi-hop designated node counting, as most of these algorithms are simple variations of their counterparts for single-hop all nodes counting.

Due to space constraints, if not otherwise stated, complete proofs for lemma and theorem statements are provided in the appendix. Nonetheless, we usually discuss the intuition or high-level strategy for proving them.

### 5.1 Single-Hop Networks: No Collision Detection

Our algorithms often adopt a classical technique inspired by the contention resolution literature: “guess and verify”. In more detail, take a guess about the count, and then verify its accuracy; if the guess is good enough then we are done, otherwise take another guess and repeat. This simple approach is versatile: depending on how the guesses are made and verified, many variations exist, resulting in efficient algorithms suitable for different settings.

A standard approach to this guessing is to use a geometric sequence with common ratio two, which is usually called (exponential) decay [2]. This sequence leverages the fact that we only need a constant factor estimate to speed things up. Particularly, if the real count is $d$, then only $O(\log d)$ iterations are needed before reaching an accurate estimate.

Once a guess is made, we need to verify its accuracy. To accomplish this, it is sufficient to let each participating node broadcast with a probability proportional to the reciprocal of the guess, and then observe the status of the channel. The intuition is simple: an underestimate will result in collisions and an overestimate will result in silence; and we expect one node to broadcast alone—a distinguishable event—iff the estimate is accurate enough. The algorithms described below adapt this general approach to their specific constraints.
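
A quick Monte Carlo experiment illustrates this verification signal: with a large underestimate the channel is dominated by collisions, with a large overestimate by silence, and near the true count lone broadcasts become common. A sketch with illustrative parameters (function names are ours):

```python
import random

def channel_outcome(d, estimate, rng):
    """One verification slot: each of d nodes broadcasts w.p. 1/estimate.
    Returns 'silence', 'success' (lone broadcaster), or 'collision'."""
    broadcasters = sum(rng.random() < 1.0 / estimate for _ in range(d))
    if broadcasters == 0:
        return 'silence'
    if broadcasters == 1:
        return 'success'
    return 'collision'

def outcome_rates(d, estimate, trials=20000, seed=0):
    """Empirical distribution of channel outcomes over many slots."""
    rng = random.Random(seed)
    counts = {'silence': 0, 'success': 0, 'collision': 0}
    for _ in range(trials):
        counts[channel_outcome(d, estimate, rng)] += 1
    return {k: v / trials for k, v in counts.items()}

d = 64
print(outcome_rates(d, 2))      # big underestimate: almost all collisions
print(outcome_rates(d, 4096))   # big overestimate: almost all silence
print(outcome_rates(d, 64))     # accurate guess: lone broadcasts are common
```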

#### Constant probability of success.

We now present Count-SH-noCD-Const. (In the algorithm’s name, SH means “single-hop”, noCD means “no collision detection”, and Const means “success with constant probability”.) This algorithm applies the most basic form of the “guess and verify” strategy. It provides a correct estimate with constant probability in $O(\log d)$ time for single-hop radio networks, when collision detection is not available.

Count-SH-noCD-Const contains multiple iterations, each of which has two time slots. In the $i$th iteration, nodes assume $d = 2^i$, and verify whether this estimate is accurate or not. More specifically, in the first time slot within the $i$th iteration, each node will broadcast a beacon message with probability $2^{-i}$ and listen otherwise. If a node decides to listen and hears a beacon message in the first time slot, then it will set its estimate to $2^i$ and terminate after this iteration. That is, if a single node $u$ broadcasts alone in the first time slot of iteration $i$, then all listening nodes—which is all nodes except $u$—will terminate by the end of this iteration, with $2^i$ being their estimate. Notice, we still need to inform $u$ about this estimate, which is the very purpose of the second time slot within each iteration. More specifically, in the second time slot within the $i$th iteration, for each node $v$, if it has heard a beacon message in the first time slot of this iteration, then it will broadcast a stop message with probability $2^{-i}$. Otherwise, if $v$ has broadcast in the first time slot of this iteration, then it will listen in this second time slot. Moreover, it will terminate with its estimate set to $2^i$, if it hears a stop message in this second time slot.
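
A minimal simulation of this two-slot iteration structure can make the mechanics concrete. The sketch below assumes (as an illustration, not as the paper's exact parameters) that each active node uses broadcast probability $2^{-i}$ in iteration $i$ for both the beacon and the stop message, and that the network is single-hop:

```python
import random

def count_sh_nocd_const(d, rng, max_iter=40):
    """Sketch of the Count-SH-noCD-Const iteration structure. In iteration
    i, slot 1: each active node beacons w.p. 2^-i; a lone beacon makes all
    listeners adopt 2^i and stop. Slot 2: informed listeners send a stop
    message w.p. 2^-i so the lone beaconer can adopt 2^i too."""
    active = set(range(d))
    estimate = {u: None for u in range(d)}
    for i in range(1, max_iter + 1):
        p = 2.0 ** -i
        # Slot 1: beacon broadcasts.
        beaconers = {u for u in active if rng.random() < p}
        lone = next(iter(beaconers)) if len(beaconers) == 1 else None
        done = set()
        if lone is not None:
            for u in active - beaconers:  # in a single-hop network, every
                estimate[u] = 2 ** i      # listener hears the lone beacon
                done.add(u)
        # Slot 2: a lone stop message informs the beaconer of the estimate.
        if lone is not None:
            stoppers = {u for u in done if rng.random() < p}
            if len(stoppers) == 1:
                estimate[lone] = 2 ** i
                done.add(lone)
        active -= done
        if not active:
            break
    return estimate

est = count_sh_nocd_const(64, random.Random(1))
print(sorted({e for e in est.values() if e is not None}))
```

As the analysis below explains, runs in which different nodes terminate in different iterations (and thus hold different estimates) are exactly the constant-probability failure events the proof must bound.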

Despite its simplicity, proving the correctness of Count-SH-noCD-Const requires effort beyond what would suffice for basic contention resolution. First, by carefully calculating and summing up the failure probabilities, we show that, with at least constant probability, no node terminates during the early iterations in which the estimate is still far below $d$. Then, we show that during each iteration in which the estimate is within a constant factor of $d$, either no node terminates or all nodes terminate, again with at least constant probability. Finally, we prove that if all nodes are still active once the estimate grows a constant factor beyond $d$, then all of them will terminate by the end of that iteration, again with at least constant probability. The full analysis can be found in Appendix B.1. Here we present only the main theorem:

The Count-SH-noCD-Const approximate neighbor counting algorithm guarantees the following properties with constant probability when executed in a single-hop network with no collision detection: (a) all nodes terminate simultaneously within $O(\log n)$ slots; and (b) all nodes obtain the same estimate of $n$, which is within a constant factor of $n$.

#### High probability of success.

Observe that in the aforementioned simplest form of “guess and verify”, as the estimate $\hat{n}$ increases, the probability that multiple nodes broadcast decreases, and the probability that no node broadcasts increases. A more interesting metric is the probability that a single node broadcasts alone: it first increases, and then decreases; not surprisingly, the peak value is reached when the estimate equals the real count. These facts suggest that, for each estimate $\hat{n}$, we could repeat the procedure of broadcasting with probability $1/\hat{n}$ multiple times, and use the fraction of noisy/silent/clear-message slots to determine the accuracy of the estimate. This method can provide stronger correctness guarantees, but complicates termination detection (i.e., when should a node stop), as different nodes may observe different fraction values. For example, if some nodes have already obtained a correct estimate but terminate too early, then the remaining nodes might never get correct estimates, since fewer nodes remain. The situation becomes more challenging when an upper bound on the real count is not available. To resolve this issue, sometimes, we have to carefully craft and embed a “consensus” mechanism.

Count-SH-noCD-High highlights our above discussion. This algorithm contains multiple iterations, each of which has three phases. In the $i$th iteration, nodes assume $\hat{n} = 2^i$. The first phase of each iteration $i$—whose length grows with $i$—is used to verify the accuracy of the current estimate. In particular, in each slot within the first phase, each node will broadcast a beacon message with probability $1/2^i$, and listen otherwise. By the end of the first phase, each node will calculate the fraction of time slots (among all its listening slots in this phase) in which it has heard a beacon message. For each node, if at the end of the first phase of some iteration $i$, this fraction value has reached a fixed constant threshold for the first time since the start of execution, the node will set $2^i$ as its private estimate for $n$. Recall nodes might not obtain private estimates simultaneously, thus they cannot simply terminate and output private estimates as the final estimate. This is where the latter two phases come into play. More specifically, in the second phase, nodes that have already obtained private estimates will try to broadcast informed messages to signal other nodes to stop. In fact, hearing an informed message is the only situation in which a node can safely terminate, even if the node has already obtained its private estimate. On the other hand, the third phase is used to deal with the case in which one single “unlucky” node successfully broadcasts an informed message during phase two (thus terminating all other nodes), but never gets the chance to successfully receive an informed message (and thus cannot terminate along with the other nodes). The complete description of Count-SH-noCD-High is given in Appendix B.2.

To prove the correctness of Count-SH-noCD-High, we need to show: (a) nodes can correctly determine the accuracy of their estimates; and (b) all nodes terminate simultaneously and output identical estimates. Part (b) follows from our careful protocol design, as phases two and three in each iteration act like a mini “consensus” protocol, allowing nodes to agree on when to stop. Proving part (a), on the other hand, needs more effort. Recall we use the fraction of clear-message slots to determine the accuracy of an estimate, and the expected fraction value should be identical for all nodes. However, due to random chance, the actual fraction value observed by each node might deviate from expectation. If we had an upper bound $N$ of $n$, then by making the first phase contain $\Theta(\log N)$ slots, Chernoff bounds [18] would force the observed fraction value to be tightly concentrated around its expectation. In our case, $N$ is not available, and we rely on a more careful analysis. Specifically, during iterations one through $\log n - b$, where $b$ is some sufficiently large constant, in each time slot in phase one, at least two nodes will broadcast (since the estimate is too small), thus no node will obtain a private estimate in these iterations. Starting from iteration $\log n - b$, the length of phase one is long enough that concentration inequalities ensure the observed fraction value is close to its expectation. Building on these observations, we can eventually conclude the following theorem (the full analysis is in Appendix B.2):

The Count-SH-noCD-High approximate neighbor counting algorithm guarantees the following properties with high probability in $n$ when executed in a single-hop network with no collision detection: (a) all nodes terminate simultaneously within $O(\log^2 n)$ slots; and (b) all nodes obtain the same estimate of $n$, which is within a constant factor of $n$.

### 5.2 Single-Hop Networks: Collision Detection

Without collision detection, the feedback to a listening node is either silence or a message. Failing to receive a message, therefore, does not reveal the nature of the failure: either no node is sending, or multiple nodes are sending. As a result, in the two previous algorithms, when nodes “guess” the count, they have to do it in a linear manner: start with a small estimate, and double it if the guess is incorrect. With collision detection, by contrast, listening nodes can distinguish whether too few (i.e., zero) or too many (i.e., at least two) nodes are broadcasting. As first pointed out back in the 1980’s [22], this extra power enables an exponential improvement over linear searching, since nodes can now perform a binary search over the space of candidate estimates.
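The silence/message/noise trichotomy, and the binary search it enables, can be sketched as follows. This is our own simplified illustration, not the paper's protocol (which uses extra slots to share feedback with broadcasters, as described below); the repetition count and majority thresholds are arbitrary choices:

```python
import random

def channel(n, est, rng):
    """One slot with collision detection: each of n nodes broadcasts with
    probability 1/est; listeners observe silence, a message, or noise."""
    k = sum(rng.random() < 1.0 / est for _ in range(n))
    return 'silence' if k == 0 else ('message' if k == 1 else 'noise')

def binary_search_count(n, log_upper, rng, trials=9):
    """Binary search over the exponent of the estimate."""
    lo, hi = 0, log_upper
    while lo <= hi:
        mid = (lo + hi) // 2
        votes = [channel(n, 2 ** mid, rng) for _ in range(trials)]
        if votes.count('noise') > trials // 2:
            lo = mid + 1        # too many broadcasters: estimate too small
        elif votes.count('silence') > trials // 2:
            hi = mid - 1        # too few broadcasters: estimate too large
        else:
            return 2 ** mid     # lone messages are common: estimate is close
    return 2 ** lo
```

Because each probe halves the search space of exponents, only $O(\log \log)$ of the candidate powers of two are ever tested, which is the source of the exponential speedup over linear searching.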

#### Constant probability of success.

Here we leverage the aforementioned binary search strategy to return a constant-factor estimate of $n$ in $O(\log\log n)$ time, with at least some constant probability. Recall that efficient binary search requires a rough upper bound of $n$ as input. To this end, we first introduce an algorithm called EstUpper-SH: it can provide a polynomial upper bound of $n$ within $O(\log\log n)$ time. At a high level, EstUpper-SH performs a linear “guess and verify” search to estimate $\log n$. (Notice, it is not estimating $n$.) This strategy, to the best of our knowledge, was first discussed by Willard in the seminal paper [22], and has later been used in other works (see, e.g., [6, 4]). Due to space constraints, the detailed description and analysis of EstUpper-SH are provided in Appendix B.3.

Once this upper bound is obtained, we switch to the main logic of Count-SH-CD-Const. This algorithm contains multiple iterations, each of which has four time slots. In each iteration $i$, all nodes maintain a lower bound and an upper bound on the estimate, and will test whether the median estimate $m$ between these bounds is close to $n$ or not. More specifically, in the first time slot in iteration $i$, each node will broadcast a beacon message with probability $1/m$, and listen otherwise. Listening nodes will use the channel status they observe to adjust the lower (or upper) bound, or terminate and output the final estimate. The other three time slots in each iteration allow the nodes that have chosen to broadcast in the first time slot to learn the channel status too, with the help of the nodes that have chosen to listen in the first time slot. (See Appendix B.4 for the complete description of Count-SH-CD-Const.)

To prove Count-SH-CD-Const can provide a correct estimate, we demonstrate that during one execution of Count-SH-CD-Const: (a) whenever $m$ is too large or too small, all nodes can correctly detect this and adjust the upper or lower bound accordingly; and (b) when $m$ is a good estimate, all nodes can correctly detect this as well and stop execution. The full analysis is presented in Appendix B.4; here we state only the main theorem:

The Count-SH-CD-Const approximate neighbor counting algorithm guarantees the following properties when executed in a single-hop network with collision detection: (a) all nodes terminate simultaneously; and (b) with at least constant probability, all nodes obtain the same estimate of $n$, which is within a constant factor of $n$, within $O(\log\log n)$ time slots.

#### High probability of success.

Our last algorithm for the single-hop scenario is called Count-SH-CD-High. It differs significantly from the other algorithms studied so far in that it does not use a “guess and verify” strategy. Instead, it deploys a random walk to derive an estimate. The use of random walks for contention resolution was introduced by Nakano and Olariu [19], in the context of leader election in radio networks. It was later adopted by Brandes et al. [4] for solving approximate counting in beeping networks.

Prior to executing the main logic of Count-SH-CD-High, nodes first run EstUpper-SH, using $O(\log\log n)$ time slots, to obtain a polynomial upper bound of $n$. Call this upper bound $N$. All nodes then perform a random walk, the state space of which consists of potential estimates of $n$. More specifically, Count-SH-CD-High contains $\Theta(\log n)$ iterations, each of which has three time slots. In each iteration, all nodes maintain a current estimate of $n$, which is denoted by $\hat{n}$. (Initially, $\hat{n}$ is set to $N$.) In the first slot in an iteration, each node will broadcast a beacon message with probability $1/\hat{n}$, and listen otherwise. If a node hears silence, it will decrease $\hat{n}$ by a factor of four; if a node hears noise, it will increase $\hat{n}$ by a factor of four; and if a node hears a beacon message, it will keep $\hat{n}$ unchanged. Similar to what we have done in Count-SH-CD-Const, in each iteration, the nodes that have chosen to listen in the first time slot will use the latter two slots to help the nodes that have chosen to broadcast in the first time slot learn the channel status of the first time slot. After these iterations, all nodes will use $\hat{n}^*$ as the final estimate of $n$, where $\hat{n}^*$ is the most frequent estimate used by the nodes during the iterations.
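The random-walk update rule can be sketched directly. This is a hedged simulation of the walk only: the three-slot bookkeeping that shares channel feedback with broadcasters is omitted, and the starting upper bound, iteration count, and seed are our own choices.

```python
import random
from collections import Counter

def random_walk_estimate(n, upper, iters, rng):
    """Sketch of the random-walk estimator with collision detection."""
    est = upper
    used = Counter()                # how often each estimate was used
    for _ in range(iters):
        used[est] += 1
        # One slot: each node broadcasts a beacon with probability 1/est.
        k = sum(rng.random() < 1.0 / est for _ in range(n))
        if k == 0:
            est = max(1, est // 4)       # silence: estimate too large
        elif k >= 2:
            est = min(upper, est * 4)    # noise: estimate too small
        # exactly one broadcaster: a beacon is heard, keep est unchanged
    return used.most_common(1)[0][0]     # most frequent estimate wins
```

Starting from a polynomial upper bound, the walk descends quickly (each silence divides the estimate by four), then hovers around the true count, so the most frequent state is almost always within a constant factor of $n$.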

The high-level intuition behind Count-SH-CD-High is: when the estimate is too large or too small, it will quickly shift towards correct estimates; and when the estimate is correct, it will tend to remain unchanged. Therefore, the most frequent estimate is likely to be a correct one. The full analysis is deferred to Appendix B.5; here we provide only the main theorem:

The Count-SH-CD-High approximate neighbor counting algorithm guarantees the following properties when executed in a single-hop network with collision detection: (a) all nodes terminate simultaneously; and (b) with high probability in $n$, all nodes obtain the same estimate of $n$, which is within a constant factor of $n$, within $O(\log n)$ time slots.

### 5.3 Multi-Hop with All Nodes Counting: No Collision Detection

Counting at all nodes in a multi-hop network is challenging, as different nodes may have significantly different numbers of neighbors. In this part, we present two algorithms that attempt to overcome this obstacle, the second of which is particularly interesting.

We begin with the first algorithm—called Count-All-noCD—which still relies on the linear “guess and verify” approach. However, it requires $\Delta$, an upper bound on the maximum neighborhood size, as an input parameter to enforce termination. Nonetheless, for each node, Count-All-noCD always returns an accurate estimate, even when knowledge of $\Delta$ is absent.

In more detail, Count-All-noCD contains multiple iterations, and the $i$th iteration contains a number of time slots that grows with $i$. In each slot in iteration $i$, each node will choose to be a broadcaster or a listener, each with probability $1/2$. If a node chooses to be a listener in a time slot, it will simply listen. Otherwise, if a node chooses to be a broadcaster, it will broadcast a beacon message with probability $1/2^i$, and do nothing otherwise. After an iteration $i$, for a node $v$: if, for the first time since the beginning of the protocol execution, it has heard beacon messages in at least a fixed constant fraction of its listening slots (within this iteration), then $v$ will use $2^i$ as its estimate for its neighborhood size $d_v$. Proving the correctness of Count-All-noCD borrows heavily from our analysis of Count-SH-noCD-High (see Appendix B.6 for more details); here we only state the main theorem:

For each node $v$, the Count-All-noCD approximate neighbor counting algorithm guarantees the following with high probability in $d_v$ when executed in a multi-hop network with no collision detection: $v$ will obtain an estimate of $d_v$ that is within a constant factor of $d_v$, within time polylogarithmic in $d_v$. Moreover, $v$ will terminate after time polylogarithmic in $\Delta$ when $\Delta$ is known.

Notice that in Count-All-noCD, for a node $v$, the high correctness guarantee is with respect to $d_v$. This implies that when $d_v$ is some constant, the probability that the obtained estimate is desirable is also only a constant. Sometimes, we may want identical and high correctness guarantees for all estimates, such as high probability in $n$. Our second algorithm—which is called Count-All-noCD-2—achieves this goal, at the cost of accessing $n$ and demanding a longer execution time. (Count-All-noCD only needs $\Delta$ to enforce termination, while Count-All-noCD-2 needs $n$ to work properly.)

Careful readers might suspect that Count-All-noCD-2 simply extends the length of each iteration of Count-All-noCD to $\Theta(\log n)$ slots. Unfortunately, this modification alone is not sufficient: we still cannot change the fact that when a node listens, the number of broadcasters among its neighbors is concentrated around its expectation only with high probability in $d_v$, not in $n$.

Instead, Count-All-noCD-2 takes a different approach, the core of which is a “double counting” trick. Specifically, Count-All-noCD-2 contains $\Theta(\log n)$ iterations, each of which contains enough time slots for the embedded “guess and verify” procedure to succeed with high probability in $n$. At the beginning of each iteration, each node chooses to be a broadcaster or a listener, each with probability $1/2$. Then, by applying the “guess and verify” strategy, each listener will spend the time slots in this iteration to obtain a constant-factor estimate of the number of neighboring broadcasters. When all iterations are done, each node will sum the estimates it has obtained, divide the sum by a quarter of the total number of iterations, and output the result as its estimate for the neighborhood size.
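The “double counting” arithmetic can be sketched in Python. This is an idealized sketch: we let each listener be handed the exact number of its broadcasting neighbors (the real protocol only gets a constant-factor estimate via the embedded guess-and-verify step), and the graph, round count, and normalization are our own illustrative choices.

```python
import random

def double_count(adj, rounds, rng):
    """Sketch of the 'double counting' trick on a graph given as an
    adjacency dict {node: [neighbors]}."""
    n = len(adj)
    totals = [0.0] * n
    for _ in range(rounds):
        # Each node flips a fair coin: broadcaster (True) or listener (False).
        is_bcast = [rng.random() < 0.5 for _ in range(n)]
        for v in range(n):
            if not is_bcast[v]:
                # Idealized inner step: exact count of broadcasting neighbors.
                totals[v] += sum(1 for u in adj[v] if is_bcast[u])
    # A node listens in about rounds/2 iterations and then sees about
    # deg(v)/2 broadcasting neighbors, so totals[v] ~ rounds * deg(v) / 4.
    return [4.0 * t / rounds for t in totals]

# Example: a 40-node cycle, where every node has degree 2.
n = 40
adj = {v: [(v - 1) % n, (v + 1) % n] for v in range(n)}
degree_estimates = double_count(adj, 2000, random.Random(7))
```

Summing over many independent coin flips is what upgrades the per-iteration guarantee to a guarantee that holds simultaneously, and tightly, for every node.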

Due to space constraints, the detailed description of each iteration is deferred to Appendix B.7. We only note here that the estimates obtained by the listeners are quite accurate:

Consider an arbitrary iteration during the execution of Count-All-noCD-2, and assume node $v$ is a listener with $b_v$ neighboring broadcasters. By the end of this iteration, node $v$ will obtain an estimate of $b_v$ that is within a constant factor of $b_v$, with high probability in $n$.

We can now state and prove the guarantees provided by Count-All-noCD-2:

The Count-All-noCD-2 approximate neighbor counting algorithm guarantees the following properties when executed in a multi-hop network with no collision detection: (a) all nodes terminate simultaneously after a fixed number of time slots determined by $n$; and (b) with high probability in $n$, each node $v$ will obtain an estimate of $d_v$ that is within a constant factor of $d_v$.

###### Proof sketch.

Consider a node $u$ and one of its neighbors $v$. Assume Count-All-noCD-2 contains $c\log n$ iterations, where $c$ is a sufficiently large constant. In expectation, in $(c/4)\log n$ iterations, $u$ will be a listener and $v$ will be a broadcaster. Applying a Chernoff bound, we know $u$ will be a listener and $v$ will be a broadcaster in at least $(1-\epsilon)(c/4)\log n$ iterations, and at most $(1+\epsilon)(c/4)\log n$ iterations, w.h.p. in $n$. Here, $\epsilon$ is a small constant determined by $c$. Taking a union bound over all neighbors of $u$, we know this claim holds true for them as well. Therefore, if $u$ were able to accurately count the number of broadcasting neighbors without any error in each listening iteration, the sum it obtains would be in the range $[(1-\epsilon)(c/4)d_u\log n, (1+\epsilon)(c/4)d_u\log n]$, w.h.p. in $n$.

Now, due to Lemma 5.3, we know the actual sum of counts $u$ obtains is within a constant factor of the above range, w.h.p. in $n$. As a result, according to our algorithm description, the final estimate $u$ obtains is within a constant factor of $d_u$, w.h.p. in $n$. Taking a union bound over all nodes, the theorem is proved. ∎

### 5.4 Multi-Hop with All Nodes Counting: Collision Detection

In Count-All-noCD, we resolve the termination detection problem by accessing $\Delta$. This allows nodes to run long enough that they can be sure everyone has had a chance to learn what it needed to learn. With the addition of collision detection, however, the assumption that nodes know $\Delta$ can be removed for many network topologies. In particular, we can leverage the idea that neighbors of a node $v$ that have not obtained estimates yet can use noise to reliably inform $v$ that they wish to continue. We call this algorithm Count-All-CD.

Count-All-CD contains multiple iterations, each of which has two parts. The first part of iteration $i$ is identical to iteration $i$ of Count-All-noCD. The second part, on the other hand, helps nodes determine when to stop. In particular, the second part of iteration $i$ contains a single slot. For a node $v$, if it has not yet obtained an estimate of $d_v$ by the end of the first part of iteration $i$, then in the second part, it will broadcast a continue message. On the other hand, if $v$ has already obtained an estimate of $d_v$ by the end of the first part of iteration $i$, it will simply listen in part two. Moreover, $v$ will continue into the next iteration iff it hears continue or noise during part two. The guarantees provided by Count-All-CD are stated below, and the proof is provided in Appendix B.8.
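The single-slot termination check can be sketched deterministically: with collision detection, a finished node cannot decode overlapping continue messages, but silence versus non-silence is all it needs. Function and variable names below are our own.

```python
def continue_flags(done, neighbors):
    """Sketch of one 'part two' slot of Count-All-CD. done[v] is True if v
    already holds an estimate; neighbors[v] lists v's neighbors.
    Returns which nodes proceed to the next iteration."""
    keep = {}
    for v, nbrs in neighbors.items():
        if not done[v]:
            keep[v] = True   # unfinished nodes broadcast a continue message
        else:
            # Finished nodes listen: a lone continue message, or noise from
            # several unfinished neighbors, both signal "keep going".
            keep[v] = any(not done[u] for u in nbrs)
    return keep

# A 3-node path: the middle node is still counting, so everyone continues.
path = {0: [1], 1: [0, 2], 2: [1]}
flags = continue_flags({0: True, 1: False, 2: True}, path)
```

Once every node in a neighborhood is done, a finished node hears pure silence in part two and can safely stop, without anyone knowing $\Delta$.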

The Count-All-CD approximate neighbor counting algorithm guarantees the following properties for each node $v$ when executed in a multi-hop network with collision detection: (a) $v$ will obtain an estimate of $d_v$ that is within a constant factor of $d_v$, within time polylogarithmic in $d_v$, with high probability in $d_v$; and (b) if the neighborhood sizes of $v$’s neighbors are at most polynomial in $d_v$, then $v$ will terminate within time polylogarithmic in $d_v$, with high probability in $d_v$.

A key point about the above termination bound is that it requires the neighborhood sizes of $v$’s neighbors to be bounded relative to $d_v$. (In fact, this constraint can be relaxed to allowing neighborhood sizes up to $d_v^\gamma$, for an arbitrarily chosen constant $\gamma$.) If this is not the case (e.g., in a dense star network), the termination detection mechanism might not work properly. In that situation, the default dependence on $\Delta$ from the no-collision-detection case can be applied as a back-up.

### 5.5 Multi-Hop with Designated Node Counting

Compared to the all nodes counting variant, multi-hop neighbor counting with only a designated node is easier: the strategies we previously used for single-hop counting are still applicable, and the introduction of the designated node can actually make coordination easier. (In particular, this node can greatly simplify termination detection.) Due to space constraints, we defer the upper bounds for this variant to Appendix C.

We note that one interesting algorithm in this variant is Count-Desig-noCD-Const, which achieves constant success probability without collision detection. Count-Desig-noCD-Const differs from its single-hop counterpart (i.e., Count-SH-noCD-Const) in that it uses the fraction of clear-message slots to determine the accuracy of nodes’ estimates. The primary reason we develop this algorithm is that the success probability of Count-SH-noCD-Const is fixed. In contrast, in Count-Desig-noCD-Const, by tweaking the running time (up to some constant factor), the success probability—despite being a constant—can be adjusted accordingly. More details of this algorithm are presented in Appendix C.1.

## 6 Discussion

We see at least two problems that are worth further exploration. First, how do termination requirements affect the complexity of the problem? This is particularly interesting in the multi-hop all nodes counting scenario, when knowledge of $n$ or $\Delta$ is not available. Count-All-noCD shows the lower bound can be matched at the cost of forgoing termination, but how much time must we spend if termination needs to be enforced, or is termination simply impossible without knowing $n$ or $\Delta$? Another open problem concerns the gap between the lower and upper bounds in the multi-hop all nodes counting scenario with collision detection. On the one hand, the lower bound might be loose, as it is a simple carry-over that ignores the possibility that all nodes counting could be fundamentally harder than designated node counting. Yet on the other hand, we have not found a way to leverage collision detection to reduce algorithm runtime (e.g., it seems hard to run multiple instances of binary search in parallel). Currently, our best guess is that both the lower bound and the upper bound are not tight.

## References

• [1] Noga Alon, Amotz Bar-Noy, Nathan Linial, and David Peleg. A lower bound for radio broadcast. Journal of Computer and System Sciences, 43(2):290–298, 1991.
• [2] Reuven Bar-Yehuda, Oded Goldreich, and Alon Itai. On the time-complexity of broadcast in radio networks: An exponential gap between determinism and randomization. In Proceedings of the 6th Annual ACM Symposium on Principles of Distributed Computing, PODC ’87, pages 98–108, New York, NY, USA, 1987. ACM.
• [3] Jacir Luiz Bordim, JiangTao Cui, Tatsuya Hayashi, Koji Nakano, and Stephan Olariu. Energy-efficient initialization protocols for ad-hoc radio networks. In International Symposium on Algorithms and Computation, ISAAC ’99, pages 215–224, Berlin, Heidelberg, 1999. Springer Berlin Heidelberg.
• [4] Philipp Brandes, Marcin Kardas, Marek Klonowski, Dominik Pajak, and Roger Wattenhofer. Fast size approximation of a radio network in beeping model. Theoretical Computer Science, In Press, 2017.
• [5] Ioannis Caragiannis, Clemente Galdi, and Christos Kaklamanis. Basic computations in wireless networks. In International Symposium on Algorithms and Computation, ISAAC ’05, pages 533–542, Berlin, Heidelberg, 2005. Springer Berlin Heidelberg.
• [6] Binbin Chen, Ziling Zhou, and Haifeng Yu. Understanding rfid counting protocols. In Proceedings of the 19th Annual International Conference on Mobile Computing and Networking, MobiCom ’13, pages 291–302, New York, NY, USA, 2013. ACM.
• [7] Israel Cidon and Moshe Sidi. Conflict multiplicity estimation and batch resolution algorithms. IEEE Transactions on Information Theory, 34(1):101–110, 1988.
• [8] Alejandro Cornejo and Fabian Kuhn. Deploying wireless networks with beeps. In International Symposium on Distributed Computing, DISC ’10, pages 148–162, Berlin, Heidelberg, 2010. Springer Berlin Heidelberg.
• [9] Mohsen Ghaffari, Nancy Lynch, and Srikanth Sastry. Leader election using loneliness detection. Distributed Computing, 25(6):427–450, 2012.
• [10] Seth Gilbert, Valerie King, Seth Pettie, Ely Porat, Jared Saia, and Maxwell Young. (near) optimal resource-competitive broadcast with jamming. In Proceedings of the 26th ACM Symposium on Parallelism in Algorithms and Architectures, SPAA ’14, pages 257–266, New York, NY, USA, 2014. ACM.
• [11] Seth Gilbert, Fabian Kuhn, and Chaodong Zheng. Communication primitives in cognitive radio networks. In Proceedings of the 2017 ACM Symposium on Principles of Distributed Computing, PODC ’17, pages 23–32, New York, NY, USA, 2017. ACM.
• [12] Albert G. Greenberg, Philippe Flajolet, and Richard E. Ladner. Estimating the multiplicities of conflicts to speed their resolution in multiple access channels. Journal of the ACM, 34(2):289–325, 1987.
• [13] Tomasz Jurdziński, Mirosław Kutyłowski, and Jan Zatopiański. Energy-efficient size approximation of radio networks with no collision detection. In International Computing and Combinatorics Conference, COCOON ’02, pages 279–289, Berlin, Heidelberg, 2002. Springer Berlin Heidelberg.
• [14] Tomasz Jurdzinski and Grzegorz Stachowiak. Probabilistic algorithms for the wake-up problem in single-hop radio networks. Theory of Computing Systems, 38(3):347–367, 2005.
• [15] Jedrzej Kabarowski, Mirosław Kutyłowski, and Wojciech Rutkowski. Adversary immune size approximation of single-hop radio networks. In International Conference on Theory and Applications of Models of Computation, pages 148–158, Berlin, Heidelberg, 2006. Springer Berlin Heidelberg.
• [16] Marek Klonowski and Kamil Wolny. Immune size approximation algorithms in ad hoc radio network. In European Conference on Wireless Sensor Networks, pages 33–48, Berlin, Heidelberg, 2012. Springer Berlin Heidelberg.
• [17] Michael Luby. A simple parallel algorithm for the maximal independent set problem. In Proceedings of the 17th Annual ACM Symposium on Theory of Computing, STOC ’85, pages 1–10, New York, NY, USA, 1985. ACM.
• [18] Michael Mitzenmacher and Eli Upfal. Probability and Computing: Randomization and Probabilistic Techniques in Algorithms and Data Analysis. Cambridge University Press, 2005.
• [19] Koji Nakano and Stephan Olariu. Uniform leader election protocols for radio networks. IEEE Transactions on Parallel and Distributed Systems, 13(5):516–526, 2002.
• [20] Calvin Newport. Radio network lower bounds made easy. In Proceedings of the 28th International Symposium on Distributed Computing, DISC ’14, pages 258–272, Berlin, Heidelberg, 2014. Springer Berlin Heidelberg.
• [21] Muhammad Shahzad and Alex X. Liu. Every bit counts: Fast and scalable rfid estimation. In Proceedings of the 18th Annual International Conference on Mobile Computing and Networking, MobiCom ’12, pages 365–376, New York, NY, USA, 2012. ACM.
• [22] Dan E. Willard. Log-logarithmic selection resolution protocols in a multiple access channel. SIAM Journal on Computing, 15(2):468–477, 1986.
• [23] Yuanqing Zheng and Mo Li. Zoe: Fast cardinality estimation for large-scale rfid systems. In Proceedings of the 32nd IEEE International Conference on Computer Communications, INFOCOM ’13, pages 908–916. IEEE, 2013.
• [24] Yuanqing Zheng, Mo Li, and Chen Qian. Pet: Probabilistic estimating tree for large-scale rfid estimation. In Proceedings of the 31st International Conference on Distributed Computing Systems, ICDCS ’11, pages 37–46, Washington, DC, USA, 2011. IEEE.

## Appendix A Omitted Description and Analysis of Lower Bound Results

### A.1 The k-hitting game and lower bounds for contention resolution in single-hop networks

Newport [20] introduced the simple combinatorial $k$-hitting game, which acts as a flexible and generic “wrapper” for the notion of “hitting sets” proposed by Alon et al. in their seminal paper on centralized broadcast lower bounds [1]. Through carefully-crafted reduction arguments, one can reduce the $k$-hitting game to many different variations of contention resolution—allowing a fixed lower bound on hitting to carry over to many different contention resolution assumptions. Because we utilize and modify these existing bounds to study neighbor counting, we review the relevant definitions and results here.

In the $k$-hitting game, where $k$ is an integer, there is one player and one referee. Before the game starts, the referee privately selects a target set $T \subseteq \{1, 2, \dots, k\}$. The game then proceeds in rounds. In each round, the player submits a proposal $P \subseteq \{1, 2, \dots, k\}$ to the referee. If $|P \cap T| = 1$, the player wins. Otherwise, the referee simply tells the player that $P$ is incorrect, and the game proceeds into the next round.
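A tiny simulation makes the game concrete. The winning condition ($|P \cap T| = 1$) follows our reconstruction above; the doubling player below is our own illustration of a contention-resolution-style strategy, and the universe size, target size, round cap, and seed are arbitrary choices.

```python
import random

def play_hitting_game(target, proposals):
    """Referee loop: the player wins in the first round whose proposal
    intersects the hidden target set in exactly one element."""
    for rounds, proposal in enumerate(proposals, start=1):
        if len(proposal & target) == 1:
            return rounds
    return None   # player ran out of proposals

def doubling_player(universe, rng, max_rounds=40):
    """In round i, propose each element independently with probability 1/2**i,
    mimicking nodes that halve their broadcast probability each round."""
    for i in range(1, max_rounds + 1):
        yield {x for x in universe if rng.random() < 0.5 ** i}

rng = random.Random(11)
target = set(rng.sample(range(2000), 64))
rounds_used = play_hitting_game(target, doubling_player(range(2000), rng))
```

As the reduction below formalizes, each round of a contention resolution algorithm induces one proposal, so a fast algorithm would yield a fast player—which the hitting-game lower bounds forbid.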

Intuitively, in a radio network, $T$ denotes the nodes that are activated from “the universe of possible nodes”. The set of nodes that would broadcast in a given round if all nodes were running the given algorithm implicitly defines a proposal $P$. In this case, $|P \cap T| = 1$ is equivalent to isolating a single broadcaster among the nodes in $T$, solving contention resolution.

In [20], the author proves that winning the $k$-hitting game with constant probability requires $\Omega(\log k)$ rounds, while winning it with high probability in $k$ requires $\Omega(\log^2 k)$ rounds.

[[20]] Fix some player that guarantees, for all target sets, to win the $k$-hitting game in $t$ rounds with probability at least $p$. It follows that: (a) if $p$ is some constant, then $t = \Omega(\log k)$; and (b) if $p \geq 1 - 1/k$, then $t = \Omega(\log^2 k)$.

Then, by reduction arguments, the author is able to extend these lower bounds to handle contention resolution in a single-hop radio network with and without collision detection.

[[20]] Let $\mathcal{A}$ be an algorithm that solves the contention resolution problem in $t$ time slots with probability $p$ in the single-hop network model with no collision detection. It follows that: (a) if $p$ is some constant, then $t = \Omega(\log n)$; and (b) if $p \geq 1 - 1/n$, then $t = \Omega(\log^2 n)$.

[[20]] Let $\mathcal{A}$ be an algorithm that solves the contention resolution problem in $t$ time slots with probability $p$ in the single-hop network model with collision detection. It follows that: if $p$ is some constant, then $t = \Omega(\log\log n)$.

### A.2 Discussion of why the reduction approach fails in certain cases

Recall the two remaining cases are: all nodes counting in single-hop radio networks and designated node counting in multi-hop radio networks, when collision detection is available and success with high probability is required. Careful readers might have already realized that the reduction approach can still be applied here. Unfortunately, however, this approach would no longer give the desired results.

More specifically, a lower bound for contention resolution with high success probability in the single-hop scenario with collision detection is already given in [20], and we believe a corresponding lower bound for contention resolution (of the designated node’s neighbors) with high probability in the multi-hop scenario with collision detection could also be derived. Moreover, with Lemma 4.1, in these two cases, we can still link the complexities of contention resolution and approximate neighbor counting together. However, the critical issue is that, given an efficient algorithm for approximate neighbor counting, the overhead incurred by utilizing this algorithm to solve contention resolution is too high. In particular, in these two cases, the overhead incurred by the reduction process (see Lemma 4.1) already matches or exceeds the lower bounds we intend to prove for approximate neighbor counting.

### a.3 Omitted proofs

###### Proof of Lemma 4.1.

We claim that if there exists an algorithm $\mathcal{A}$ that solves the contention resolution problem in $t$ time slots with probability $p$, then there exists a strategy that allows a player to win the $k$-hitting game in $t$ rounds with probability at least $p$. By Lemma A.1, this claim implies our lemma.

We now prove the above claim. Assume the target set chosen by the referee is $T$; further assume $T$ is of size $m$ and contains elements $t_1, t_2, \dots, t_m$. Imagine a star network $S$ in which the designated node is $u$, and $u$ has $m$ neighbors whose identities are $t_1, t_2, \dots, t_m$. (These identities are generated and used by the player, and nodes in the network do not have access to them.) Now, consider the following strategy. The player simulates running $\mathcal{A}$ in $S$. In each round, if any neighbors of $u$ decide to broadcast, then the player generates the proposal according to the identities of these nodes. If the proposal is correct, then we are done. If the proposal is incorrect and $u$ is listening, the player simulates $u$ hearing nothing. If no neighbor of $u$ broadcasts and $u$ is listening, then the player again simulates $u$ hearing nothing. On the other hand, if in a round $u$ broadcasts, then the player simulates the listening neighbors hearing the message. Otherwise, if in a round $u$ remains silent, then the player simulates the listening neighbors hearing nothing. Clearly, this simulation correctly reflects how $\mathcal{A}$ would proceed if $S$ were a real radio network.

Now, consider the proposals generated by the player. If in a round $\mathcal{A}$ solves the contention resolution problem, then the proposal $P$ in that round must be of size one, which implies $|P \cap T| = 1$. That is, this proposal will also let the player win the $k$-hitting game.

By now, we know that if the player could construct $S$ and knows an algorithm $\mathcal{A}$ that solves the contention resolution problem in $t$ time slots with probability $p$, then he also has a strategy to win the $k$-hitting game in $t$ rounds with probability at least $p$.

However, a critical issue is that the player cannot construct $S$ directly! In particular, he does not know the size of the target set; he also does not know which elements it contains. In fact, if he knew the elements of $T$, he could simply propose a singleton subset of $T$ and win the game in a single round.

To overcome this difficulty, the player will simulate running $\mathcal{A}$ on a different star network $S'$—one that he can construct directly. In $S'$, the designated node $u$ has $k$ neighbors named $1$ to $k$. In each round, if any neighbor of $u$ decides to broadcast, the player generates the proposal according to the identities of these nodes. Moreover, for each round, the simulation rules are the same as the ones described above for $S$.

Now, the crucial observation is: until the proposal the player submits contains exactly one element in $T$ (i.e., the point at which the player wins), for each node in $T \cup \{u\}$, the two execution histories it sees in $S$ and $S'$ are identical, assuming the node uses the same random bits in each of these two executions. (An interesting point worth noting is that the simulation of $S'$ might be inconsistent with how $\mathcal{A}$ would actually proceed in $S'$. For example, when a node not in $T$ broadcasts alone in $S'$, we still simulate $u$ as receiving nothing. Nonetheless, such inconsistency is fine, so long as we ensure that for each node in $T \cup \{u\}$, its views in $S$ and $S'$ are identical.)

To prove the above crucial observation, we do an induction on the simulated time slots. Prior to the first simulated slot, the claim trivially holds. Assume by the end of slot $i-1$ the claim still holds; we now consider slot $i$. First, focus on the designated node $u$. According to the induction hypothesis, $u$ has identical execution histories in $G$ and $G'$ up to the end of slot $i-1$. Thus, in slot $i$, $u$ will perform the same action (i.e., broadcast or listen) in both networks. In particular, if $u$ broadcasts, then it sends identical messages in $G$ and $G'$. On the other hand, if $u$ listens, since $i$ is a slot prior to $\mathcal{P}$ winning the game, we know either no neighbor of $u$ with identity in $S$ broadcasts, or at least two such neighbors broadcast. In both cases $u$ hears nothing, in both $G$ and $G'$. Next, consider an arbitrary neighbor $v$ of $u$ in $G$. According to the induction hypothesis, $v$ has identical execution histories in $G$ and $G'$ up to the end of slot $i-1$. Thus, in slot $i$, $v$ will perform the same action. In particular, if $v$ broadcasts, then it sends identical messages in $G$ and $G'$. On the other hand, if $v$ listens, then the situation depends on whether the designated node $u$ broadcasts in slot $i$. In case $u$ remains silent, $v$ hears silence in both $G$ and $G'$. If $u$ broadcasts, then according to our previous analysis, $u$ will send identical messages in $G$ and $G'$. This implies $v$ will receive identical messages in $G$ and $G'$. The proof of the inductive step is thus complete.

Since the execution histories in $G$ and $G'$ are identical for nodes in $G$, and since $\mathcal{A}$ solves the contention resolution problem in $t$ time slots with probability $p$ in $G$, we know by the end of round $t$, with probability at least $p$, the player must have generated a proposal $P$ where exactly one element of $P$ is in $S$, i.e., $|P \cap S| = 1$. This proposal will let $\mathcal{P}$ win the $k$-hitting game. ∎

###### Proof sketch of Lemma 4.1..

We claim if there exists an algorithm $\mathcal{A}$ that solves the contention resolution problem with collision detection in $t$ time slots with probability $p$, then there exists an algorithm that allows a player $\mathcal{P}$ to win the $k$-hitting game in $O(2^t)$ rounds with probability at least $p$. Together with the lower bound for the $k$-hitting game shown in Lemma A.1, our lemma is immediate. The remainder of this proof is dedicated to proving this claim.

Assume the target set chosen by the referee is $S = \{s_1, s_2, \dots, s_k\}$, where $S \subseteq [n]$. Imagine the following two star networks: the first one is called $G$, in which the designated node $u$ has $k$ neighbors with identities $s_1, \dots, s_k$; and the second one is called $G'$, in which the designated node $u$ has $n$ neighbors with identities $1$ to $n$.

To win the $k$-hitting game, the player first builds a complete binary tree $\mathcal{T}$ of depth $t$, where the root is at level zero. For each non-root node in the tree, we attach a binary label to it according to the following rule: if the node is the left child of its parent then it has label zero, otherwise it has label one. For ease of presentation, for a node $x$ in the tree, we use $d_x$ to denote its depth, and $b_x$ to denote the binary string generated by concatenating the labels of the nodes on the path from the root to $x$. For the root node, this binary string is simply the empty string.
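As a quick sanity check on the labeling scheme, this sketch (depth chosen arbitrarily) enumerates one label string per node of a complete binary tree of depth $t$, with the empty string for the root, and confirms that such a tree contains $2^{t+1}-1$ nodes in total.

```python
from itertools import product

def tree_labels(t):
    """Label strings b_x of a complete binary tree of depth t: the root
    contributes the empty string; a node at depth d contributes the d-bit
    string read off the root-to-node path."""
    labels = []
    for depth in range(t + 1):
        labels.extend("".join(bits) for bits in product("01", repeat=depth))
    return labels

t = 5  # arbitrary depth for illustration
labels = tree_labels(t)
```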

Now, for each node $x$ in $\mathcal{T}$, the player simulates running algorithm $\mathcal{A}$ in $G'$ for $d_x$ time slots, with the same sequence of random bits (for each node in $G'$), according to the following rules. In the $i$-th slot, where $1 \le i \le d_x$, if the $i$-th bit in $b_x$ is zero, then $\mathcal{P}$ simulates $u$ in $G'$ hearing silence in case it is listening; and if the $i$-th bit in $b_x$ is one, then $\mathcal{P}$ simulates $u$ in $G'$ hearing a collision in case it is listening. On the other hand, for each listening neighbor of $u$ in $G'$, $\mathcal{P}$ simulates it hearing whatever $u$ broadcasts, or silence in case $u$ does not broadcast anything. In the $(d_x+1)$-st time slot, $\mathcal{P}$ proposes the identities of the neighbors of $u$ in $G'$ that decide to broadcast to the referee.

Recall that if we run $\mathcal{A}$ in a real radio network with topology identical to $G$, we can solve contention resolution in $t$ time slots with probability $p$. Since the network is fixed, the only uncertainty comes from the random choices made by the nodes during the execution. Assume the sequence of random bits used by the nodes is $R$. If this execution indeed solves contention resolution within $t$ time slots, then without loss of generality, assume it solves contention resolution at slot $t' \le t$. This implies, for each time slot prior to $t'$, the designated node $u$ hears either silence or noise, if it is listening. As a result, this further implies, there must exist a node $x$ in tree $\mathcal{T}$ such that, for slots up to (and including) $t'$, for nodes in $G$, the simulation of $\mathcal{A}$ (in $G'$) according to $b_x$ is consistent with running $\mathcal{A}$ in $G$ in reality (when using $R$ as the source of randomness). (A more rigorous proof of this claim can be obtained via induction on simulated time slots, which is very similar to the one we have done in the proof of Lemma 4.1.) Hence, when processing $x$ in tree $\mathcal{T}$, the proposal generated by $\mathcal{P}$ will win the $k$-hitting game.

(In essence, to correctly simulate running $\mathcal{A}$ in $G$ while the actual topology is $G'$, the hard scenario is when multiple neighbors of $u$ broadcast yet the resulting proposal does not win the game, since the simulator cannot distinguish between the case where none of the broadcasters was in $S$ (in which case, the correct thing is to simulate $u$ hearing silence) and the case where multiple broadcasters were in $S$ (in which case, the correct thing is to simulate $u$ hearing noise). The binary tree enables the simulator to essentially guess the sequence of $u$'s collision detection information. One of these guesses must be right for the particular definition of $S$ and the random bits used in the execution.)

Since a complete binary tree of depth $t$ contains $2^{t+1} - 1$ nodes, we know $\mathcal{P}$ can win the $k$-hitting game with probability at least $p$ in $O(2^t)$ rounds. This completes the proof of the lemma. ∎

###### Proof sketch of Lemma 4.1..

We first construct $\mathcal{B}$ in the single-hop scenario.

In this situation, $\mathcal{B}$ contains multiple steps, each of which has two time slots. In odd slots, all nodes simply run $\mathcal{A}$. In even slots, if a node has already obtained the estimate $\tilde{n}$ of $n$, it will broadcast with probability $1/\tilde{n}$; otherwise, it will remain silent. Since $\mathcal{A}$ solves approximate neighbor counting in $t$ time with probability $p$, we know starting from step $t+1$, with probability $p$, in each even slot, each node will broadcast with probability $1/\tilde{n}$, where $\tilde{n} = \Theta(n)$.

Assume starting from step $t+1$, in each even slot, each node indeed broadcasts with probability $1/\tilde{n}$. Thus, in the second slot of each such step, the probability that some node will broadcast alone is $n \cdot (1/\tilde{n}) \cdot (1 - 1/\tilde{n})^{n-1}$. Since $\tilde{n} = \Theta(n)$, in each such slot, the probability that some node will broadcast alone (thus solving the contention resolution problem) is a constant. Since each such slot is independent, our lemma follows in the single-hop scenario.
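The claimed constant can be verified numerically. A minimal sketch, assuming the estimate is within a constant factor $c$ of $n$ (the particular values of $n$ and $c$ below are arbitrary):

```python
def lone_broadcast_prob(n, n_est):
    """Probability that exactly one of n nodes broadcasts, when each node
    broadcasts independently with probability 1/n_est."""
    return n * (1.0 / n_est) * (1.0 - 1.0 / n_est) ** (n - 1)

# For a constant-factor estimate n_est = c * n, the lone-broadcaster
# probability stays bounded below by a constant (about e**(-1/c) / c).
for n in (10, 100, 10_000):
    for c in (0.5, 1.0, 2.0):  # assumed constant-factor errors
        assert lone_broadcast_prob(n, c * n) > 0.25
```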

Next, we turn our attention to the multi-hop scenario.

In this situation, $\mathcal{B}$ contains two parts. In the first part, there are multiple steps, each of which has two slots. In odd slots, all nodes simply run $\mathcal{A}$; and in even slots, all non-designated nodes will listen while the designated node $u$ does nothing. According to the assumption, after $t$ steps, with probability $p$, the designated node $u$ will obtain a constant factor estimate $\tilde{d}$ of its neighborhood size $d$, such that $\tilde{d} = \Theta(d)$. Once $u$ obtains $\tilde{d}$, in the next even slot, it will broadcast this estimate. Clearly, $\tilde{d}$ will be successfully received by all neighbors of $u$. This marks the end of part one of $\mathcal{B}$. The second part of $\mathcal{B}$ is simple: in each slot, each non-designated node which knows $\tilde{d}$ broadcasts with probability $1/\tilde{d}$, and $u$ simply listens.

By a similar analysis as in the single-hop scenario, we know in each slot in part two of $\mathcal{B}$, the probability that some neighbor of $u$ broadcasts alone (thus solving the contention resolution problem) is at least some constant. Since each slot is independent, our lemma follows in the multi-hop scenario as well. ∎

###### Proof sketch of Theorem 4.1..

As an example, we consider the single-hop scenario without collision detection. The proofs for the other scenarios are very similar.

First, consider the success with constant probability case.

For the sake of contradiction, assume there exists an algorithm $\mathcal{A}$ that solves approximate neighbor counting in $t$ time in this scenario, with constant probability $p$. Due to Lemma 4.1, this implies we can devise an algorithm $\mathcal{B}$ that solves contention resolution in $O(t)$ time, with probability at least $p'$. So long as $p$ is some constant, we know $p'$ will be a constant as well. However, this contradicts the lower bounds shown in Lemma A.1.

We next consider the success with high probability case.

For the sake of contradiction, assume there exists an algorithm $\mathcal{A}$ that solves approximate neighbor counting in $t$ time in this scenario, with high probability in $n$; i.e., with probability at least $1 - 1/n^c$ for some constant $c \ge 1$. Due to Lemma 4.1, this implies we can devise an algorithm $\mathcal{B}$ that solves contention resolution in $O(t)$ time, with high probability in $n$. Due to the proof of Theorem 4 shown in [20], we know $\mathcal{B}$ can be used to devise an algorithm that wins the $k$-hitting game in $O(t)$ time, also with high probability in $n$. However, this contradicts the lower bounds shown in Lemma A.1. ∎

###### Proof of Lemma 4.2..

Order the sets in $\mathcal{S}$ in some arbitrary manner. For every $v \in [n]$, we associate a binary string $b_v$ of length $t$ with $v$ in the following manner: the $i$-th bit of $b_v$ is one iff $v$ is in the $i$-th set of $\mathcal{S}$. Since we know there are at most $2^t$ distinct binary strings of length $t$, and since we use these binary strings to label $n$ distinct values, we know there must exist some binary string that is associated with at least $n/2^t$ distinct values in $[n]$. Let $U$ be a set containing $n/2^t$ of these values; we know $|U| \ge n/2^t$. Since the values in $U$ are associated with the same binary string, by the definition of this binary string, we know for each $B \in \mathcal{S}$, either $U \subseteq B$ (in which case the corresponding bit in the binary string is one) or $U \cap B = \emptyset$ (in which case the corresponding bit in the binary string is zero). ∎
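The argument is easy to run mechanically. A minimal sketch (the universe size and set family below are arbitrary) that buckets values by their binary strings and returns the largest class, which is then homogeneous with respect to every set in the family:

```python
from collections import defaultdict

def homogeneous_subset(n, family):
    """Bucket each value v in {1, ..., n} by the binary string recording
    which sets of `family` contain v, and return the largest bucket."""
    buckets = defaultdict(list)
    for v in range(1, n + 1):
        label = tuple(v in s for s in family)  # the binary string of v
        buckets[label].append(v)
    return max(buckets.values(), key=len)

# Arbitrary example: n = 16, a family of t = 3 sets.
n, family = 16, [{1, 2, 3, 4}, {3, 4, 9, 10}, set(range(1, 9))]
U = set(homogeneous_subset(n, family))
# Every set in the family either contains U entirely or misses it entirely.
assert len(U) >= n // 2 ** len(family)
assert all(U <= s or not (U & s) for s in family)
```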

###### Proof of Theorem 4.2..

We first focus on the single-hop scenario.

Let $\mathcal{A}$ be an arbitrary (and potentially randomized) distributed algorithm for approximate neighbor counting. Assume $N_1$ is a single-hop radio network in which $n$ nodes are activated and each node has a unique identity in $[n]$. (These identities are for ease of presentation; the nodes themselves cannot access them.) Simulate each of the first $t$ time slots according to the following rules: if a node chooses to broadcast, then we simulate it broadcasting the content specified by $\mathcal{A}$; if a node chooses to listen, then we simulate it hearing silence. We call this execution segment of $\mathcal{A}$ as $\alpha$. Let $B_i$ denote the identities of the nodes that broadcast in slot $i$, and define the set family $\mathcal{S} = \{B_1, B_2, \dots, B_t\}$.

According to Lemma 4.2, there exists a size-$(n/2^t)$ subset $U$ of $[n]$ such that for each $B_i \in \mathcal{S}$, either $U \subseteq B_i$ or $U \cap B_i = \emptyset$. Assume $N_2$ is a single-hop radio network in which the nodes with identities in $U$ are activated. Run $\mathcal{A}$ in $N_2$ for $t$ time slots, and call the resulting execution segment $\beta$. Notice, if $\mathcal{A}$ is randomized, then for each node, assume the random bits generated for that node are identical in $\alpha$ and $\beta$.

The third single-hop radio network $N_3$ contains two arbitrary nodes with identities in $U$. We run $\mathcal{A}$ in $N_3$ for $t$ time slots and call the resulting execution segment $\gamma$. Again, if $\mathcal{A}$ is randomized, then for each node, assume the random bits generated for that node are identical in $\alpha$, $\beta$, and $\gamma$.

Now, the critical claim is, for each node in $N_2$, its views in $\alpha$ and $\beta$ are identical. I.e., for each node with an identity in $U$, execution segments $\alpha$ and $\beta$ are indistinguishable. We prove this by a slot-by-slot induction.

In the first time slot, if a node with identity $j \in U$ chooses to broadcast in $\alpha$, then clearly it will also choose to broadcast in $\beta$. Moreover, $j$ will broadcast the same content in $\alpha$ and $\beta$. On the other hand, in the first time slot, if $j$ chooses to listen in $\alpha$, then $j$ will also choose to listen in $\beta$. This means $j \notin B_1$, implying $U \cap B_1 = \emptyset$. That is, all nodes with identities in $U$ will choose to listen in the first time slot in $\alpha$, which in turn means all nodes in $N_2$ will choose to listen in the first time slot. Thus, $j$ will hear silence in the first time slot in $\beta$; and so does $j$ in $\alpha$, according to our simulation rule. By now, we have proved the induction basis: for each node in $N_2$, in the first time slot, its views in $\alpha$ and $\beta$ are identical.

Assume during the first $i-1$ time slots, for each node in $N_2$, its views in $\alpha$ and $\beta$ are identical. We now consider time slot $i$. If a node with identity $j \in U$ chooses to broadcast in $\alpha$, then according to the induction hypothesis, it will also choose to broadcast in $\beta$, with the same content. On the other hand, in time slot $i$, if $j$ chooses to listen in $\alpha$, then according to the induction hypothesis, $j$ will also choose to listen in $\beta$. This means $j \notin B_i$, implying $U \cap B_i = \emptyset$. That is, all nodes with identities in $U$ will choose to listen in time slot $i$ in $\alpha$, which in turn means all nodes in $N_2$ will choose to listen in time slot $i$ (according to the induction hypothesis). Thus, $j$ will hear silence in time slot $i$ in $\beta$; and so does $j$ in $\alpha$, according to our simulation rule. This completes our proof of the claim.

Similarly, we can also prove: for each node in $N_3$, its views in $\alpha$ and $\gamma$ are identical.

Now, imagine an adversary generating a single-hop radio network in the following way: it arbitrarily picks $n$ nodes and arbitrarily gives each of these nodes a unique identity in $[n]$; it then samples a size-$(n/2^t)$ subset of $[n]$ uniformly at random and picks the nodes with the corresponding identities. Lastly, the adversary flips a fair coin to decide whether to activate all these picked nodes, or just two of them (chosen arbitrarily).

Consider the scenario in which the adversary chooses $U$ as the size-$(n/2^t)$ subset of $[n]$, which happens with probability greater than $2^{-n}$ since there are less than $2^n$ size-$(n/2^t)$ subsets of $[n]$. In such case, according to our above analysis, by the end of slot $t$, nodes cannot distinguish whether there are two nodes in the network or $n/2^t$ nodes in the network. If $n/2^t$ is sufficiently large compared to the allowed approximation ratio, then we know by the end of slot $t$, there is a $1/2$ chance that the approximation given by $\mathcal{A}$ is incorrect. (Recall we define the approximation guarantee in Section 3.)
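For concreteness, a worked instance of the counting in this step, with arbitrary small values of $n$ and $t$: uniform sampling picks the specific subset $U$ with probability $1/\binom{n}{n/2^t}$, which is greater than $2^{-n}$, and the fair coin then makes the estimate incorrect with at least half that probability.

```python
import math

n, t = 32, 3                      # arbitrary illustrative values
m = n // 2 ** t                   # size of the subset U from Lemma 4.2
num_subsets = math.comb(n, m)     # number of size-m subsets of [n]
p_pick_U = 1 / num_subsets        # uniform sampling hits U with this probability
p_wrong = p_pick_U / 2            # the fair coin halves it

assert num_subsets < 2 ** n       # hence p_pick_U > 2 ** -n
```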

To sum up, if $\mathcal{A}$ is a (potentially randomized) distributed algorithm for approximate neighbor counting in single-hop radio networks with collision detection, and if $\mathcal{A}$ guarantees to output an estimate by the end of time slot $t$, then this estimate is incorrect with probability at least $2^{-n-1}$, so long as $n/2^t$ is sufficiently large. Let $t = \log{(n/c)}$ where $c$ is some sufficiently large constant; we know $n/2^t = c$. That is, $t$ will be $\log{n} - O(1)$ if $c$ is some constant. By now, we have shown if $\mathcal{A}$ guarantees to output an estimate by the end of time slot $\log{n} - O(1)$, then this estimate is incorrect with probability at least $2^{-n-1}$. This proves the first part of the theorem.

Next, we turn our attention to the multi-hop scenario, which generally follows the same high-level strategy as in the single-hop case, but is more involved due to topology changes.

Let $\mathcal{A}$ be an arbitrary (and potentially randomized) distributed algorithm for approximate neighbor counting (in the multi-hop scenario). Assume $H_1$ is a star network in which the designated node $u$ has $n$ neighbors, and each neighbor has a unique identity in $[n]$. We now define $n$ simulations, each focusing on one neighbor $v$ (of the designated node $u$) along with $u$, and each runs for $t$ time slots. Each simulation follows the following rules. In each time slot, for the designated node $u$, if it chooses to broadcast, then we simulate it broadcasting the content specified by $\mathcal{A}$. If $u$ chooses to listen, then the result depends on whether $v$ broadcasts: $u$ hears a collision if $v$ broadcasts, otherwise $u$ hears silence. On the other hand, for node $v$, if it chooses to broadcast, then we simulate it broadcasting the content specified by $\mathcal{A}$. If $v$ chooses to listen, then the result depends on the behavior of $u$: if $u$ broadcasts in this slot, then we simulate $v$ hearing the message sent by $u$; if $u$ does not broadcast in this slot, then we simulate $v$ hearing silence. For each neighbor $v$ of $u$, we call this execution segment of $\mathcal{A}$ as $\alpha_v$. Let $B_i$ denote the identities of the neighbors of $u$ that broadcast in slot $i$ (among these $n$ simulations), and define the set family $\mathcal{S} = \{B_1, B_2, \dots, B_t\}$.

According to Lemma 4.2, there exists a size-$(n/2^t)$ subset $U$ of $[n]$ such that for each $B_i \in \mathcal{S}$, either $U \subseteq B_i$ or $U \cap B_i = \emptyset$. Assume $H_2$ is a star network in which the neighbors of the designated node $u$ are the nodes with identities in $U$. Run $\mathcal{A}$ in $H_2$ for $t$ time slots, and call the resulting execution segment $\beta$. Notice, if $\mathcal{A}$ is randomized, then for each node in $H_2$, assume the random bits generated for that node are identical in $\alpha_v$ and $\beta$. (I.e., for each neighbor $v$ of $u$ in $H_2$, the same random bits are used in $\alpha_v$ and $\beta$; and for the designated node $u$, the same random bits are used in all $\alpha_v$ and in $\beta$.)

The third star network $H_3$ contains the designated node $u$ and two arbitrary nodes with identities in $U$. We run $\mathcal{A}$ in $H_3$ for $t$ time slots and call the resulting execution segment $\gamma$. Again, if $\mathcal{A}$ is randomized, then for each node in $H_3$, assume the random bits generated for that node are identical in $\alpha_v$, $\beta$, and $\gamma$.

We claim, for each neighbor $v$ of the designated node $u$ in $H_2$, its views in $\alpha_v$ and $\beta$ are identical. Moreover, for the designated node $u$, its views in all such $\alpha_v$ and in $\beta$ are identical. We prove this by a slot-by-slot induction.

We begin with the first time slot. For the designated node $u$, if it chooses to broadcast in $\beta$, then clearly it will also choose to broadcast in all $\alpha_v$, with the same content. If $u$ chooses to listen in $\beta$, since each node with identity in $U$ will take the same action in the first time slot in the (corresponding) $\alpha_v$ and in $\beta$, and since either $U \subseteq B_1$ or $U \cap B_1 = \emptyset$, we know $u$ will hear either silence or a collision in $\beta$, and $u$ will hear the same thing in all $\alpha_v$ and in $\beta$ by our simulation rules. On the other hand, for a neighbor $v$ of the designated node $u$ in $H_2$, if $v$ chooses to broadcast in $\beta$, then clearly it will also choose to broadcast in $\alpha_v$, with the same content. If $v$ chooses to listen in $\beta$, then $v$ will also choose to listen in $\alpha_v$. In such case, what $v$ hears in $\alpha_v$ and $\beta$ depends on the behavior of $u$ in $\alpha_v$ and $\beta$. If $u$ broadcasts, then $v$ will hear this message; and if $u$ remains silent, then $v$ will hear silence. Since we have already shown the behavior of $u$ is identical in $\alpha_v$ and $\beta$ in this case, we know $v$'s view will also be identical in $\alpha_v$ and $\beta$ in this case. By now, we have proved the induction basis: for each node in $H_2$, in the first time slot, its views in $\alpha_v$ and $\beta$ are identical.

Assume during the first $i-1$ time slots, for each node in $H_2$, its views in $\alpha_v$ and $\beta$ are identical. We now consider time slot $i$. For the designated node $u$, if it chooses to broadcast in $\beta$, then by the induction hypothesis it will also choose to broadcast in all $\alpha_v$, with the same content. If $u$ chooses to listen in $\beta$, since (by the induction hypothesis) each node with identity in $U$ will take the same action in time slot $i$ in the (corresponding) $\alpha_v$ and in $\beta$, and since either $U \subseteq B_i$ or $U \cap B_i = \emptyset$, we know $u$ will hear either silence or a collision in $\beta$, and $u$ will hear the same thing in all $\alpha_v$ and in $\beta$ by our simulation rules. On the other hand, for a neighbor $v$ of the designated node $u$ in $H_2$, if $v$ chooses to broadcast in $\beta$, then by the induction hypothesis it will also choose to broadcast in $\alpha_v$, with the same content. If $v$ chooses to listen in $\beta$, then by the induction hypothesis $v$ will also choose to listen in $\alpha_v$. In such case, what $v$ hears in $\alpha_v$ and $\beta$ depends on the behavior of $u$ in $\alpha_v$ and $\beta$. If