Flipping the Perspective in Contact Tracing

10/08/2020 ∙ by Po-Shen Loh, et al.

We introduce a fundamentally different paradigm for contact tracing: for each positive case, do not only ask direct contacts to quarantine; instead, tell everyone how many relationships away the disease just struck (so, "2" is a close physical contact of a close physical contact). This new approach, which has already been deployed in a publicly downloadable app, brings a new tool to bear on pandemic control, powered by network theory. Like a weather satellite providing early warning of incoming hurricanes, it empowers individuals to see transmission approaching from far away, and incites behavior change to directly avoid exposure. This flipped perspective engages natural self-interested instincts of self-preservation, reducing reliance on altruism, and the resulting caution reduces pandemic spread in the social vicinity of each infection. Consequently, our new system solves the behavior coordination problem which has hampered many other app-based interventions to date. We also provide a heuristic mathematical analysis that shows how our system already achieves critical mass from the user perspective at very low adoption thresholds (likely below 10% empirically in the first practical deployment); after that point, the design of our system naturally accelerates further adoption, while also alerting even non-users of the app. This article seeks to lay the theoretical foundation for our approach, and to open the area for further research along many dimensions.

1 Introduction

The COVID-19 pandemic has severely impacted daily life across much of the world. Due to extensive public discussion and large-scale implementation, the paradigm of traditional contact tracing has reached conversational familiarity: test for positive individuals, and then trace their proximal contacts, so as to isolate them from potentially infecting the rest of the community [46].

Relatively early in the pandemic, many projects sprang up to develop apps to digitize and accelerate this process (see, e.g., the book [26], the media reports [33, 48], or the list [17]). Although these apps were very numerous, the entire class of apps sought to deliver essentially the same primary value to the user: after the user became directly exposed to a COVID-positive individual, the app would notify them to take a variety of actions designed to protect the rest of society from the app user (who might now be infected). These apps achieved this goal through a variety of sensing methods, ranging from scanning QR codes [11], to GPS [42], Wi-Fi [16], Bluetooth [6], and ultrasound [30, 34]. A lively debate emerged over whether the infrastructure for delivering this user experience should be centralized or decentralized [18], ultimately tilting in favor of decentralized architectures, consistent with the principle of seeking the least privacy-invasive approach for delivering the post-exposure notification user experience.

Although many prominent governments strongly preferred a centralized approach, the benefits of centralization were primarily explained from the perspective of the central government being able to make better decisions for the common good. However, there was substantial pushback against the collection of a centralized interaction network by a government or large corporation which already inherently had access to personal information (e.g., in other databases within the government or other business units under the same organizational umbrella), as that would compromise anonymity. The centralization vs. decentralization debate was ultimately quelled by a historic collaboration between Apple [4] and Google [22], whereby they integrated a decentralized framework into their mobile operating systems, and strictly regulated access to it. Specifically, their system provided the only way for iPhones to reliably detect nearby iPhones via Bluetooth under all circumstances. At the time, there were significant complaints from the national governments of France, Germany, and the United Kingdom about decentralization [21], but the system remained.

Of the three, Germany was the first to deploy the decentralized approach, but after 25 million euros were invested to develop and promote the resulting app, its impact is in question. One report [37] noted the weakened power of decentralization and the research documenting the inaccuracy of Bluetooth [27, 28, 29]. Unfortunately, the pandemic continues to spread in many regions of the world, with the United States surpassing 9 million cases on October 29 [41].

To estimate the effect of the now-standard exposure notification technique, a recent preprint [1] by researchers from Google and Oxford reported that “in a model in which 15% of the population participated, we found that digital exposure notification systems could reduce infections and deaths by approximately 8% and 6%,” under the assumption that when someone anonymously receives an exposure notification, which is tuned so that “80% of all ‘too close for too long’ interactions are captured between users that have the app,” they voluntarily are “90% likely to begin a quarantine until 14 days from initial exposure with a 2% drop out rate each day for noncompliance.”

Another recent preprint [47], coauthored by people affiliated with a leading app using the Google-Apple system, contextualized the probability of infection associated with exposure notifications, indicating that 14-day quarantines could be requested for individuals when their infection risk exceeded a threshold around 1%, indicating that such a figure was “broadly compatible with the attack rate reported in Taiwan (1.0%, 95% CI: 0.6–1.6%) for those interacting with infected individuals in the first 5 days of symptom onset [13], which is similar to the 1.9% attack rate (95% CI 1.8%–2.0%) reported in South Korea [35].” In light of these figures, it is likely that in order to capture 80% of all interactions that result in infection, the probability of being infected upon receiving an anonymous exposure notification would be under 10%, due to limitations from anonymization and technology accuracy.

It is realistic in the non-anonymous manual contact tracing setting to expect 90% initial quarantine compliance, because each affected contact is directly contacted by the authorities who know their identity (and who may be able to enforce policies), and the manual contact tracer is likely to have taken other factors into account (e.g., indoor/outdoor, usage of personal protective equipment) to significantly reduce false positives. On the other hand, it is less clear that there would be 90% compliance with non-enforceable anonymous exposure notifications corresponding to under-10% infection probability. Indeed, if the voluntary compliance rate upon receipt of low-infection-probability exposure notifications were sharply lower than 90%, then even if there were 100% app adoption, the entire intervention would have minimal impact.

These observations indicate that it is very important to explore other interventions. They also indicate that since human behavior plays a major factor in driving efficacy, it is valuable to consider the behavioral science perspective, as compliance with post-exposure notifications requires altruism. There has been substantial work in the behavioral science domain discussing COVID-19 in particular (see, e.g., the surveys [7, 31]), and it would be beneficial to design a system which naturally aligns with human behavior.

In this article, we introduce a fundamentally different approach to the general problem space of contact tracing, which realigns incentives with self-interest, to boost both the initial app adoption and the response to signals from the app. It can seamlessly coexist with post-exposure notification (manual and/or digital), and we recommend both interventions to be deployed in parallel. We summarize our system in the next section, and provide a detailed specification for a sample full implementation in Section 3. Then, Section 4 presents some empirical observations from actual deployments, together with a heuristic mathematical analysis of how the natural incentive alignment in our approach is likely to very significantly boost effectiveness. We address privacy and security issues in Section 5. The purpose of this article is to introduce our new approach, demonstrate its substantial theoretical advantages, and open the large area of investigation around it, which may be of interest to researchers from disciplines ranging from public health to behavioral science to mathematical modeling.

2 New approach

We use the physical interaction (contact) network from a different perspective. Abstractly, that network represents the potential disease transmission network, and may be enriched and extended by attaching a variety of times, strengths, and other properties to describe the natures of interactions and individuals. In the context of this network, the standard contact tracing paradigm is: for each newly positive node $v$, send post-exposure notifications to the neighboring nodes $N(v)$ which are directly connected to $v$ in the network by sufficiently-strong connections that occurred at times overlapping with $v$'s contagious period. The existing intervention achieves its main impact on disease spread by quarantining (and testing) the nodes in $N(v)$, with the hope of cutting off a significant fraction of the spread of the disease.

Our approach flips the perspective from that of the central planner seeking to reduce spread, to that of the individual seeking to avoid infection. For each newly positive node $v$ in the network, we notify all other nodes connected by paths of network distance¹ up to some value (e.g., 12 in our implementation as of November 2020) of the numerical network distance that they are from $v$, but not the direct identity of $v$. The network used is based on all recent physical interactions across the entire user population, as detailed in Section 3.1. By continually providing this network distance information and animating it over time like a weather radar map, every individual user can visualize the infection approaching or receding in their own network. This is illustrated in Figure 1.

¹ The network distance between two nodes in a network is the minimum number of connections that need to be traversed to go from one node to the other. For example, two nodes that are directly connected are at network distance 1; and, two nodes that are not directly connected to each other, but are both directly connected to the same other node, are at network distance 2.
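The distance computation described above can be illustrated with a bounded breadth-first search. The following is a minimal sketch, not the deployed implementation: the function name `distances_from`, the dictionary-of-sets graph representation, and the example names are all assumptions made here for illustration. Only the cap of 12 comes from the text.

```python
from collections import deque


def distances_from(graph, source, max_distance=12):
    """Breadth-first search from `source`.

    Returns {node: network distance} for every node within
    `max_distance` hops of `source` (including `source` at distance 0).
    """
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if dist[node] == max_distance:
            continue  # do not expand beyond the notification radius
        for neighbor in graph.get(node, ()):
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist


# Hypothetical contact chain: you - housemate - coworker - bob
graph = {
    "you": {"housemate"},
    "housemate": {"you", "coworker"},
    "coworker": {"housemate", "bob"},
    "bob": {"coworker"},
}

# When "bob" is marked positive, every other reachable node learns only
# its numerical distance, never the identity of the positive node.
notifications = {n: d for n, d in distances_from(graph, "bob").items() if d > 0}
print(notifications["you"])  # 3
```

Because breadth-first search visits nodes in order of increasing distance, the first time a node is reached gives its minimum hop count, which matches the definition of network distance used in this section.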

Figure 1: Two representative animation frames, indicating the approach of the infection over a multi-day period. The height of each bar represents the number of users who have tested positive (dark red) or been confirmed as contacts of positive cases (light red), and the horizontal axis is the network distance between the positive case and you, where the network is constructed based on the last 14 days. Positive cases disappear from the chart after a specified period.

Our “pre-exposure notification” system provides lead time as compared to traditional exposure notification (which only notifies nodes at network distance 1), akin to how a hurricane satellite video empowers people to take precaution before the storm is directly overhead. By seeing transmission approaching from far off, the individual has time to protect themself. Since there is lead time, the user’s context is fundamentally different. Instead of being asked to take actions that protect others from themselves (as would be necessary if they already had direct contact with a confirmed-infected individual), they have the opportunity to protect themselves from others. This important asymmetry enables our intervention to achieve its main impact on disease spread by dynamically engaging natural self-interested instincts of self-preservation, as people near each infection hotspot take actions to avoid infection, such as using stronger protective equipment, practicing more vigilant distancing, etc.

By focusing on the individual’s perspective, our approach aims to deliver actionable information to individuals in a way where they clearly perceive direct value. We are definitely not the only ones to propose tools that let individuals visualize their risk (e.g., [10]) or visualize the spread in geographical regions (e.g., [25]). However, it is significantly more actionable and relevant to learn of a new case at network distance 3 from you (maybe your housemate’s coworker’s husband Bob), rather than to learn that there were 5 new cases in your postal code. Indeed, Bob might live on the other side of town.

Conveniently, the evolution of positive cases is relatively easy to plot as a bar chart, to animate against network distance on a smartphone display. Figure 1 displays two animation frames of this simple interface, which even to a non-technical audience communicates the alarming approach of infection from far away. This animated visual provides a naturally easy-to-use tool for any individual to avoid infection, regardless of their technical background. There is no abstract risk score to interpret, nor an infection probability to estimate after direct contact. That is because our objective is not to quarantine the user to prevent them from infecting others, but rather to inspire the user to increase caution to avoid getting infected. All they need to do is to watch the bars animating day-by-day, so they can take extra precaution when the bars seem to be on track to reaching the user. Such is the value of notifying all users of their numerical network distance from each new positive case, instead of only notifying the users at network distance 1 that they have already been exposed.

Our approach thus effectively becomes an automatic way to dynamically modulate social distancing in sub-regions of a community, as people close to hotspots (in network distance) see more red on their charts. Network distance is the correct metric, and is more appropriate than geographic distance because it accounts for people traveling to workplaces in different parts of town. If people just took greater precaution (e.g., more vigilant use of personal protective equipment, more intense distancing, etc., but not even a rigorous quarantine) whenever the virus struck within close network distance (e.g., 3) of them according to their chart, then this temporary behavior change would tend to reduce the person-to-person transmission by a significant factor in the vicinity of every infection cluster. We hypothesize that this would already have a nontrivial impact on the infection’s basic reproduction number ($R_0$) in the population.

3 System specification

Our framework is currently deployed as an app, which has been publicly available for free download from the official Apple and Google app stores since May 2020 [32]. To provide the reader with a concrete example of an implementation, we sketch that particular app’s construction in this section. That app also provides post-exposure notifications, but this article will concentrate on the pre-exposure notification component. The purpose of this article is not to focus on one particular implementation of our approach, but this section seeks to include sufficient detail to demonstrate that our approach is practically implementable in today’s environment. We describe the different sensing mechanisms because their practical idiosyncrasies profoundly impact the feasibility of our framework. It is interesting that this system only recently became possible to implement at wide scale, thanks to the proliferation of smartphones possessing short-range communication capabilities.

3.1 Pseudonymous network construction

Strictly speaking, our system is pseudonymous rather than anonymous, in the sense that each user has a persistent Version 4 Universally Unique Identifier (UUID), which is randomly generated at install time and never revealed to the user, nor associated with personally identifiable information. Importantly, it is impossible to recover any hardware identifiers or phone numbers corresponding to the user’s smartphone from their UUID. The app then periodically scans its vicinity to identify proximal app-running devices using Bluetooth Low Energy, ultrasound, and Wi-Fi.

Bluetooth Low Energy (BLE).

Similarly to most other apps in this space, this app wakes up every few minutes to scan the vicinity for several seconds, searching for other BLE devices running the same app. If it finds any, the devices communicate over BLE to exchange temporary random identifiers (not their UUIDs), and they each independently send our central server their UUIDs, the temporary identifiers they sent and received, the current time, and the received signal strength indicator (RSSI). Due to iOS limitations, a backgrounded iOS app² cannot detect other backgrounded iOS apps when scanning, but it can be detected by foregrounded iOS apps and Android apps in both backgrounded and foregrounded states, whereby it will respond to scans from those devices. Android devices have no significant limitations.

² Generally speaking, an app is running in the foreground when it is on screen; it is running in the background when it is not on screen or the phone screen is off.

Ultrasound.

During the BLE communication, if neither device is a backgrounded iOS app, then the devices use a near-ultrasound communication protocol in the 18.5–19.5 kHz range, to estimate their relative distance with a significantly higher level of accuracy than can be deduced from BLE RSSI (see, e.g., [34]). Since ultrasound does not penetrate walls, this additional sensing technique verifies whether the devices are in the same airspace. Custom sonic waveforms are used to optimize robustness in noisy environments with obstacles. Importantly, all Fourier-analytic signal processing is handled on-device, and no audio recordings are stored on the device or transmitted to a server for processing. Only the estimated distance is transmitted to the server, augmenting the BLE data record mentioned above. The implementation of this ultrasound capability is non-trivial, and falls beyond the scope of this article. A future technical publication will address it in greater detail.

Wi-Fi.

Every few minutes, wake-up signals are sent from our own remote server to phones, as well as via Bluetooth from Android phones or foregrounded iOS apps. Upon receiving a signal, the app checks if it is connected to a Wi-Fi access point. If it is, it sends a hashed version of the specific access point’s fingerprint (technically, its BSSID) to a separate server without including the user’s UUID. That server returns a temporary and randomly generated identifier for that BSSID which is only stable for a fraction of an hour. All associations between BSSID’s and temporary identifiers are rapidly expired and expunged. The app works only with that temporary Wi-Fi identifier, and sends it to our main server together with the time at which this check occurred. This provides the ability to detect whether two devices were connected to the same Wi-Fi access point at around the same time, while making it difficult to use the interaction database to reverse-engineer the original BSSID’s from the stored Wi-Fi information.
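The hashed-fingerprint exchange described above can be sketched as follows. This is only an illustration under stated assumptions: the class name `TemporaryIdServer`, the use of unsalted SHA-256, the 15-minute rotation period, and the 128-bit random identifiers are all choices made here, since the text specifies only that the identifier is random, temporary, and stable for a fraction of an hour.

```python
import hashlib
import secrets
import time

ROTATION_SECONDS = 15 * 60  # assumption: one possible "fraction of an hour"


def hash_bssid(bssid: str) -> str:
    """Client side: hash the access point's BSSID before it leaves the device."""
    return hashlib.sha256(bssid.lower().encode("utf-8")).hexdigest()


class TemporaryIdServer:
    """Hypothetical separate server; it never receives a user's UUID."""

    def __init__(self, clock=time.time):
        self._clock = clock
        self._table = {}  # hashed BSSID -> (temporary identifier, expiry time)

    def lookup(self, hashed_bssid: str) -> str:
        now = self._clock()
        # Expire and expunge stale associations on every request.
        self._table = {k: v for k, v in self._table.items() if v[1] > now}
        if hashed_bssid not in self._table:
            self._table[hashed_bssid] = (
                secrets.token_hex(16),
                now + ROTATION_SECONDS,
            )
        return self._table[hashed_bssid][0]
```

Two devices on the same access point within the same rotation period receive the same temporary identifier, which is all the main server needs in order to detect the overlap. After the period ends, the mapping is discarded and a fresh random identifier is issued.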

No GPS.

Very importantly, GPS is never used, because a device’s GPS coordinates constitute personally identifiable information.

No constant Internet connectivity required.

Among some populations in the world, many people only have intermittent Internet connectivity. Those populations tend to correlate with heavier Android use, at which point our BLE/ultrasound sensing fully operates even without the Internet. Whenever the device reaches Internet connectivity, detection records created since the last instance of connectivity are uploaded for server processing, and the latest notification information is downloaded. Even iPhone users without cellular data can use a “Standby Mode” which we created to keep the app in the foreground to use BLE/ultrasound while dimming the screen to conserve battery.

The Wi-Fi sensing component is the unique element that enables our system to operate on iPhones even when the app is in the background. Its resolution is insufficient for the purpose of post-exposure notification, and so it was not an option for apps using that existing paradigm. Background iOS operation was a fatal issue for other apps. Unfortunately, the United Kingdom’s initial attempt to build its own Bluetooth app outside of the specially-regulated Google-Apple Bluetooth system only detected 4% of iPhones [38], understandably leading them to abandon the project.

The power of our alternative paradigm is that it makes Wi-Fi extremely useful, because our method only needs to construct an approximate physical interaction network. Our method of impact is to provide anonymized insights which confer the same degree of risk as “your housemate’s coworker’s husband Bob became positive,” but without identifying that the relationships in question are those specific co-living or co-working ones, instead stating that someone at network distance 3 became positive. In many situations, these types of strong relationships (co-living and co-working) are reliably captured by Wi-Fi access point overlaps of several hours. These are the types of relationships that form much of the backbone of the network along which the infection spreads. Indeed, one of the most-referenced COVID-19 agent-based modeling software packages, OpenABM-Covid19 [24], models the infection network by overlaying “household networks” and “occupation networks,” together with some random interactions. Our Wi-Fi system effectively captures a large fraction of the household and occupation networks, supplemented by Bluetooth and ultrasound to provide additional accuracy and valuable redundancy. To capture the random interactions, we tune our Bluetooth and ultrasound parameters to be fairly generous, picking up contacts if they have been within around 10 meters for at least 15 minutes.

In summary, the app’s 3-sensor system provides multiple overlapping technologies with different strengths and weaknesses. We use 14-day windows of data to construct approximations of the physical interaction network, with 14 days chosen in order to capture two weekends for redundancy, because people often have different interaction patterns on weekends. This suffices to construct an approximation of the network which serves the purpose of our new paradigm, and unlocks its potential.
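As a rough illustration of how detection records might be turned into an approximate interaction network, the sketch below aggregates pairwise observations over a 14-day window and keeps a connection when a threshold is met. The record layout (`Observation`), the function name `build_network`, the use of cumulative rather than contiguous minutes, and the 3-hour Wi-Fi threshold are assumptions made here for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass

WINDOW_DAYS = 14
# Illustrative thresholds in minutes; treat the exact values as tunable.
THRESHOLD_MINUTES = {"proximity": 15, "wifi": 180}


@dataclass(frozen=True)
class Observation:
    user_a: str     # pseudonymous identifier of one device
    user_b: str     # pseudonymous identifier of the other device
    kind: str       # "proximity" (BLE/ultrasound) or "wifi" (same access point)
    day: int        # day index of the observation
    minutes: float  # duration attributed to this observation


def build_network(observations, today):
    """Return {user: set of connected users} for the window ending at `today`."""
    totals = defaultdict(float)
    for obs in observations:
        in_window = today - WINDOW_DAYS < obs.day <= today
        if in_window and obs.user_a != obs.user_b:
            pair = frozenset((obs.user_a, obs.user_b))
            totals[(pair, obs.kind)] += obs.minutes

    graph = defaultdict(set)
    for (pair, kind), minutes in totals.items():
        if minutes >= THRESHOLD_MINUTES.get(kind, float("inf")):
            a, b = tuple(pair)
            graph[a].add(b)
            graph[b].add(a)
    return dict(graph)


observations = [
    Observation("u1", "u2", "proximity", day=10, minutes=20),
    Observation("u2", "u3", "wifi", day=9, minutes=120),
    Observation("u2", "u3", "wifi", day=10, minutes=90),
    Observation("u1", "u3", "proximity", day=-10, minutes=60),  # outside window
]
graph = build_network(observations, today=10)
print(sorted(graph["u2"]))  # ['u1', 'u3']
```

Keeping either sensor sufficient on its own reflects the redundancy argument above: a co-living or co-working relationship missed by Bluetooth can still be captured through Wi-Fi overlap.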

3.2 Anonymous positive case labeling

In order to reliably label some nodes as positive, we use a one-time token system which has a very important improvement over other apps’ systems that vastly amplifies our approach’s power: in our system, tokens can be entered³ not only by confirmed positive cases, but also by confirmed contacts of positive cases, as confirmed by a contact tracer. The latter type of signal is not useful for the traditional quarantine-driven paradigm, because the infection risk upon being a contact of a contact of a positive case is far too low to recommend quarantine. It is unique to our alternative distance-based paradigm.

³ In some alternative implementations of our system, users may be able to pre-authorize a trusted authority to enter signals on their behalf, by submitting a token of their own.

Specifically, a trusted authority (e.g., government department of public health, university health center, etc.) is able to securely generate tokens from a separate system we operate, to which they input no personal data to generate each token. They then distribute the tokens to individuals in their jurisdiction who they have confirmed to be positive. When a user enters a recognized token into their app, together with the date symptoms started, we mark their UUID as positive. Our databases do not store the association between which UUID used which token. For evaluation purposes, it is also possible to submit a positive report in the general app without a one-time token, but official community deployments typically disable unauthenticated reporting.
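A one-time token flow of this kind could look roughly like the sketch below. The class names `TokenAuthority` and `CaseRegistry`, the token format, and the in-memory storage are hypothetical; the sketch only tries to reflect three properties stated above: tokens are generated without personal data, each token works once, and the link between a UUID and the token it used is not retained.

```python
import secrets
from datetime import date


class TokenAuthority:
    """Hypothetical token service operated separately from the main database."""

    def __init__(self):
        self._unused = {}  # token -> kind ("positive" or "contact")

    def issue(self, kind: str) -> str:
        if kind not in ("positive", "contact"):
            raise ValueError("kind must be 'positive' or 'contact'")
        token = secrets.token_urlsafe(16)  # no personal data is involved
        self._unused[token] = kind
        return token

    def redeem(self, token: str):
        """Return the token's kind and invalidate it, or None if unrecognized."""
        return self._unused.pop(token, None)


class CaseRegistry:
    """Hypothetical main-server component that labels pseudonymous UUIDs."""

    def __init__(self, authority: TokenAuthority):
        self._authority = authority
        self.labels = {}  # uuid -> (kind, symptom start date)

    def submit(self, user_uuid: str, token: str, symptom_start: date) -> bool:
        kind = self._authority.redeem(token)
        if kind is None:
            return False
        # Only the resulting label is stored; the token itself is discarded.
        self.labels[user_uuid] = (kind, symptom_start)
        return True
```

A health department would call `issue("positive")` for each confirmed case and hand the resulting string to that person, who then enters it in the app together with their symptom start date.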

In addition, the trusted authority can generate a different type of tokens to distribute to contacts of confirmed positive cases during their (possibly manual) contact tracing process.⁴ Users can enter these confirmed contact tokens into their app, so that other users can see how many relationships away they were from people who were identified as confirmed contacts of some positive case during the contact-tracing process. This is still useful in our paradigm because if a user sees that there is such a person $d$ relationships away, then they are less than or equal to $d+1$ relationships away from a confirmed positive case. Yet this significantly amplifies the probability that each positive case generates some signal in the system. Indeed, if a positive case had 10 contact-traced contacts, then even if each of the 11 tokens independently only had 20% chance of being entered, there is now over 90% chance that at least one of the 11 people enters their tokens.

⁴ In our deployments, we do not store any linkage between the tokens that correspond to each individual case, to preserve privacy. In settings where the resulting lack of privacy is acceptable, it is possible to store such linkages. Then, the system could even estimate the distance from Person $X$ to a positive case for which only contact tokens were submitted, by adding 1 to the minimum of all network distances from Person $X$ to people who submitted tokens linked to that positive case. Other natural variations would also be possible.
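The amplification figure quoted above follows from the complement rule for independent events: the chance that none of the 11 tokens is entered is 0.8 raised to the 11th power. A short check (the function name is chosen here for illustration):

```python
def prob_at_least_one_signal(num_tokens: int, entry_prob: float) -> float:
    """Probability that at least one of `num_tokens` independent tokens is entered."""
    return 1 - (1 - entry_prob) ** num_tokens


# One positive case plus 10 contact-traced contacts, each entering with 20% chance.
with_contacts = prob_at_least_one_signal(11, 0.20)
# The positive case alone, with no contact tokens.
case_only = prob_at_least_one_signal(1, 0.20)

print(f"{with_contacts:.3f}")  # 0.914
print(f"{case_only:.3f}")      # 0.200
```

The independence assumption is the one made in the text; if token entry were positively correlated within a contact group, the combined probability would be lower than this figure.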

3.3 Chart construction

To minimize privacy concerns, we only send new signals to other users at the time each new positive test token is entered into the mobile app. In particular, we do not continue to send new signals that would reveal whether the positive person chooses to continue walking around in public. In order to make the positive signals more visible on the animated chart from each particular user’s perspective (Figure 1), when our server discovers that there is a newly positive person, that person contributes to the chart at the network distance to the user based on the network constructed from the last 14 days of population-wide interaction data. Then, the contribution from this positive case remains on the user’s chart at this fixed distance until 10 days after the reported symptom start date (duration constantly evaluated based upon latest guidance), similarly to how an old-fashioned radar screen displays glowing blips that take some time to fade away. Confirmed contacts of positive cases (as entered via the contact codes in the previous section) are overlaid on this chart in a lighter color.
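One way to picture the chart logic is as a list of fixed-distance signals that are counted per distance until they expire. The sketch below is an illustration only: the `Signal` record and `chart_frame` function are hypothetical names, and applying the same 10-day display period to contact signals (using a reference date in place of a symptom start date) is an assumption, since the text specifies the period only for positive cases.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import date, timedelta

DISPLAY_DAYS = 10  # per the text, re-evaluated against the latest guidance


@dataclass(frozen=True)
class Signal:
    distance: int         # network distance, fixed when the token was entered
    kind: str             # "positive" (dark red) or "contact" (light red)
    reference_date: date  # reported symptom start date (assumed analog for contacts)


def chart_frame(signals, today: date):
    """Count still-visible signals per network distance for one animation frame."""
    frame = {"positive": Counter(), "contact": Counter()}
    for signal in signals:
        if today <= signal.reference_date + timedelta(days=DISPLAY_DAYS):
            frame[signal.kind][signal.distance] += 1
    return frame


signals = [
    Signal(distance=5, kind="positive", reference_date=date(2020, 11, 1)),
    Signal(distance=3, kind="positive", reference_date=date(2020, 11, 6)),
    Signal(distance=3, kind="contact", reference_date=date(2020, 11, 6)),
]
early = chart_frame(signals, today=date(2020, 11, 8))
late = chart_frame(signals, today=date(2020, 11, 14))
print(dict(early["positive"]))  # {5: 1, 3: 1}
print(dict(late["positive"]))   # {3: 1}
```

Rendering one such frame per day, with distance on the horizontal axis, gives the radar-like animation: older signals drop off while newer ones appear at whatever distance they had when reported.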

4 Heuristic analysis and empirical observations

An essential question for all digital interventions is to determine what adoption level is required for impact. Other approaches which primarily protect other people from the app user, or more generally do not provide a direct benefit to the user, tend to have greater difficulty intrinsically motivating adoption. In contrast, our approach takes inspiration from apps that enjoy organic growth, by delivering the direct value of helping the user proactively protect themself. Therefore, our approach raises two adoption-related questions. First, since our approach leverages network effects, what is the critical adoption threshold beyond which users conclude that a critical mass of people in their region have adopted this solution, to facilitate organic growth? Second, at what (possibly higher) threshold does the app deliver enough value to have a meaningful impact on infection spread? It is worth noting that the existence of the first question is a major benefit, because it provides a roadmap to high adoption from a low initial threshold. It turns out that the physical interaction network very significantly amplifies the power of our approach.

4.1 Low threshold for critical mass

In this subsection, we include some empirical observations from the first community deployment of an earlier version of the app based on our approach, at Georgia Tech [44], where the community was encouraged to participate on a voluntary basis. It is important to note that although the particular app version provided significant function, it did not yet have the Wi-Fi-based capability to operate fully in the background on iOS devices, and so required more user intervention in order to operate at all times. No confidential data collected from any deployment is referenced in this section.

User feedback during the rollout period pointed to the value our network-based approach provided in indicating personally-relevant critical mass. As reported by an independent student newspaper at the unaffiliated University of Georgia [20], when the app was at around a 10% level of adoption at Georgia Tech, one interviewed user noted they had nearly 2,000 connections at various distances from them in particular. The app displayed this to them in a chart analogous to Figure 2, which is itself a screenshot shared in a Reddit post [19] by a different student who said: “I checked in yesterday, and I seem to have my room mates on there, and the charts are unbelievable. I can’t believe it linked our little enclave to much of campus residents.” This critical mass effect also inspired installation by users outside of the Georgia Tech campus community, as a parent independently communicated that he had installed the app, noting the 1,000+ connection counts that he observed on his app due to connecting with the student during regular home visits. The fact that people highlighted these individually relevant connection counts indicates that our network-based approach already helps people perceive critical mass at low adoption rates.

Figure 2: This chart, publicly shared by a Reddit user [19], plots the number of other users at each network distance from the user in an active deployment. Even though the particular user only had 2 direct connections, there is an exponential growth in the number of users until the perimeter of the connected cluster is reached.

There is heuristic mathematical support for the observation that in dense environments such as universities and schools, a low adoption threshold suffices for users to have many other users connected to them in the system. This corresponds to the property that our collected network contains large connected clusters, where within each large cluster, every pair of nodes is connected by at least one path (possibly via other nodes). This is not the same as effectiveness, but it does answer the first question above, of helping people in a region conclude that critical mass has been reached for the app user network to connect them to many participants. Our system (as of November 2020) builds its network by looking over 14-day windows and identifying app users who have been detected within about 10 meters of each other for at least 15 minutes, or concurrently on the same Wi-Fi access point for at least 3 hours. (Since our alternative approach is not focused on sending exposure notifications to trigger quarantine, it is more useful to have a wider radius for establishing connections that indicate relationships, as opposed to the commonly used value of 2 meters that is used for transmission.) Charts with large numbers like Figure 2 start to appear when a typical user is connected to at least 2 other app users in this network, and each of them is connected to at least 2 other users, and so on. In some sense, we are building a viral app which spreads along a similar network to the virus itself, and this corresponds to the app having its own $R_0$. This type of phenomenon roughly emerges when the adoption rate in a region of the actual network (e.g., a university or school) exceeds about $3/d$, where $d$ estimates a typical person’s number of contacts as defined with the generous parameters above over a 14-day period.
For schools and universities with in-person classes or students living in residence halls, $d$ may be over 30; then, this critical threshold is below 10%, which is highly favorable. Furthermore, adoption is far from independently distributed, as the conditional probability that an app user’s contact joins the system is likely higher than that for a random individual. So, even for lower values of $d$, the same critical threshold may be sufficient due to positive correlation amidst the heterogeneity of adoption.
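The connection rule quoted above (14-day window, about 10 meters for at least 15 minutes, or the same Wi-Fi access point for at least 3 hours) can be sketched as follows. This is a minimal illustration, not the deployed implementation: the `Observation` record format and the `is_connected` function are hypothetical, and the sketch assumes that durations are summed across the window, since the text does not say whether the minutes must be contiguous.

```python
from dataclasses import dataclass
from typing import Optional

# Thresholds quoted in the text (as of November 2020).
WINDOW_DAYS = 14
PROXIMITY_RADIUS_M = 10.0
PROXIMITY_MINUTES = 15
WIFI_MINUTES = 3 * 60


@dataclass(frozen=True)
class Observation:
    """One detected co-presence of two app users (hypothetical record format)."""
    days_ago: float                     # age of the observation, in days
    minutes: float                      # duration of the co-presence
    distance_m: Optional[float] = None  # estimated distance, if proximity-sensed
    same_wifi_ap: bool = False          # True if both devices shared an access point


def is_connected(observations):
    """Return True if a pair of users counts as connected.

    Assumption: durations are cumulative over the window.
    """
    recent = [o for o in observations if o.days_ago <= WINDOW_DAYS]
    near_minutes = sum(
        o.minutes
        for o in recent
        if o.distance_m is not None and o.distance_m <= PROXIMITY_RADIUS_M
    )
    wifi_minutes = sum(o.minutes for o in recent if o.same_wifi_ap)
    return near_minutes >= PROXIMITY_MINUTES or wifi_minutes >= WIFI_MINUTES
```

Under this sketch, two short nearby encounters inside the window can add up to a connection, while a long encounter at 25 meters or one older than 14 days does not count.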

As soon as that level of connectivity is reached, the amplification power of our approach is unlocked. Each positive signal which is entered into the system notifies thousands of people. Conversely, each user is able to receive signals from thousands of other people, which facilitates user acquisition and retention. The remaining subsections can then operate with higher adoption thresholds.
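The jump in connectivity described above can be explored with a toy experiment. The sketch below is not the deployed system: it assumes an Erdős–Rényi random graph as a stand-in for the real contact network and independent adoption, and all function names and parameters are illustrative. It counts how many app users sit at each network distance from a given user, which is the quantity plotted in charts like Figure 2.

```python
import random
from collections import Counter, deque


def random_contact_graph(n, mean_contacts, rng):
    """Random graph on n people with the given expected number of contacts.

    Assumption: an Erdos-Renyi graph, chosen only for simplicity.
    """
    p = mean_contacts / (n - 1)
    adj = {v: set() for v in range(n)}
    for u in range(n):
        for v in range(u + 1, n):
            if rng.random() < p:
                adj[u].add(v)
                adj[v].add(u)
    return adj


def sample_users(adj, adoption_rate, rng):
    """Each person independently adopts the app with the given probability."""
    return {v for v in adj if rng.random() < adoption_rate}


def users_by_distance(adj, users, source):
    """Count app users at each network distance from `source`.

    Only edges between two app users are visible to the system, so the
    breadth-first search never steps onto a non-user.
    """
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v in users and v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return Counter(k for k in dist.values() if k > 0)
```

Calling `users_by_distance` on a graph from `random_contact_graph` while sweeping the adoption rate passed to `sample_users` shows how the per-distance counts change as adoption grows. Because real adoption is likely positively correlated among contacts, independent sampling is a pessimistic simplification.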

4.2 Accuracy of signals

Next, we turn to analyze the potential discrepancy between the reports of network distances to new signals based on the network of app users, and actual network distances to those signals if they were calculated against the full network of people (whether they are app users or not). One failure mode corresponds to the situation where the actual network distance to a new positive case is finite, but the user is at infinite distance according to the network of app users (not connected, possibly because the positive case is a non-user, or because of a non-user along the path). Another failure mode corresponds to the app-reported distance being significantly longer than the actual network distance, again due to non-users along the shortest paths. The root cause therefore comes from two categories: non-users along the path, and non-users who turn positive. The presence of these sources of error makes it imperative that in any deployment of our approach, careful communication must be sent to inform the community that the app is a tool to help increase caution, but a null signal should not provide a false sense of security. Regardless, it is still a useful tool, analogous to how convex passenger-side automobile mirrors are useful even though they have blind spots and bear the legend “objects in mirror are closer than they appear.”

For non-users along the path, this becomes an interesting random network structure question in itself: what is the effect on network distance of missing nodes? We point to several influential factors which work to our advantage, and which theoretically indicate that our approach would likely yield favorable results. First, the typical structure of the physical interaction network has “small-world” character, in that neighbors of a given node tend to be more likely to be neighbors of each other. As mentioned above, in the OpenABM-Covid19 model [24], the network is composed of three parts: a recurring household network, a recurring occupation network, and some random interactions that change daily. Digging deeper, their model of the household network is such that each household has all pairs of interactions occurring every day, and each occupation network is a Watts-Strogatz small-world network with a fixed set of connections between individuals, where each day a random subset of half of these connections is chosen as the interactions between individuals. They perform additional refinements to account for networks of children.

In this type of model, a missing node $v$ does introduce significant distortion in reported network distances between other nodes in the same household as $v$ and other nodes in the same occupation network as $v$, because $v$ was the main point of connection between those two worlds. However, due to the small-world nature of the occupation network, that sub-network is significantly more resilient to $v$’s absence, because there are many alternative paths to consider that go around $v$. This indicates that within networks corresponding to schools, universities, or large workplaces, the network distance should be somewhat robust to missing nodes, making our approach useful there even before widespread general adoption. Therefore, the most fragile paths are those of the form $w$–$x$–$y$–$z$ between nodes $w$ and $z$ from different households, where the connection $w$–$x$ is between two nodes in one occupation network, the connection $x$–$y$ is between two nodes in another household network, and the connection $y$–$z$ is between two nodes in yet another occupation network. If either $x$ or $y$ is not participating, then the app-reported distance between $w$ and $z$ could be too high if there are no other short paths between $w$ and $z$.
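The two failure modes for non-users along the path can be made concrete with a small breadth-first search. The sketch below is illustrative only: the node labels and the tiny example network are hypothetical, and `network_distance` is not a function of the deployed system. It computes the distance in the full network and then again when only app users may be traversed.

```python
import math
from collections import deque


def build_adjacency(edges):
    """Turn a list of undirected edges into an adjacency map."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def network_distance(adj, source, target, users=None):
    """Shortest-path length from source to target.

    If `users` is given, only nodes in that set may be visited, which
    models the network as seen through the app. Returns math.inf when
    no path exists.
    """
    if users is not None and (source not in users or target not in users):
        return math.inf
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        if u == target:
            return dist[u]
        for v in adj[u]:
            if v not in dist and (users is None or v in users):
                dist[v] = dist[u] + 1
                queue.append(v)
    return math.inf


# Hypothetical network: a short path through a household link (b-c),
# plus one longer detour through other contacts (e, f, g).
edges = [("a", "b"), ("b", "c"), ("c", "d"),
         ("a", "e"), ("e", "f"), ("f", "g"), ("g", "d")]
adj = build_adjacency(edges)
everyone = set(adj)

actual = network_distance(adj, "a", "d")                                    # full network
inflated = network_distance(adj, "a", "d", users=everyone - {"b"})          # b is a non-user
disconnected = network_distance(adj, "a", "d", users=everyone - {"b", "f"}) # b and f are non-users
```

Here the actual distance is 3, the app-reported distance grows to 4 when the household link is missing, and it becomes infinite once the detour is also broken. Denser sub-networks offer more detours, which is the redundancy argument made in the text.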

Fortunately, the combination of these observations points to an example pathway to impact: a city’s high schools could adopt our approach as part of its safe reopening plan, with an emphasis on all members of each household joining the system (not only the student). Since many occupation networks will have multiple connections to the high school network via households, this will effectively leverage the structures of both the high school network and the small-world nature of occupation networks to provide the redundancy required to make the system resilient to non-participation.

The issue of positive cases occurring among non-users is easier to resolve. The nature of infection spread is such that positive cases appear in contiguous clusters when viewed in the context of the actual interaction network. Consequently, even if only some fraction of the positive cases occurs among app users, those positive case reports will still appear at roughly the correct distances from every other user. Since the success of our approach is not based on identifying and quarantining every exposure, but rather providing situational awareness that inspires self-protective caution, it is enough for us as long as the fraction of signal that appears corresponds to nonzero actual signal appearing around the appropriate distance.

4.3 Impact of signal

The previous subsections provided heuristic analysis of how partial adoption affects the generation of signals of positive infection at various network distances. Finally, we discuss the impact of these signals once received. Our impact mechanism is based on engaging the natural self-preservation instincts of people within short network distance of positive cases. The nature of the communication does not emphasize quarantine, but rather offers an opportunity to protect oneself from others. It is not yet definitively known what the precise impact of various precautions is, but an early research study on coronaviruses by the World Health Organization [15] provided some figures which we may use to estimate the order of magnitude of impact. Specifically, they were moderately confident that “a physical distance of more than 1 m probably results in a large reduction in virus infection; for every 1 m further away in distancing, the relative effect might increase 2.02 times.” There has been quantitative uncertainty around the effect of mask use, with “relative risk (RR) reductions for infection ranging from 6–80%” stated by Schünemann et al. in The Lancet Respiratory Medicine [39], while a recent study by Asadi et al. in Scientific Reports [5] found that in terms of particle emission, surgical masks blocked 90% of particles while speaking, while cloth masks increased the number of particles. That said, if people choose to reduce their duration or frequency of in-person interaction, that certainly reduces their risk by a corresponding rate.

Even though these quantitative figures are not definite, it appears likely that if people in the vicinity of the cluster actively seek to protect themselves from infection, they can reduce the infection transmission probability by a factor that is large relative to . And, as further research emerges regarding self-protection techniques, people will be able to adopt the most effective methods, whatever they may be. Furthermore, the network distance information transcends the app userbase, because if a household member is an app user who sees positive case(s) approaching, they have a likelihood of informally alerting others in their household, even if others are not app users. This information is still relevant and useful, because if the original app user had a case at network distance $k$, then the case is at distance at most $k+1$ from their household member. The same phenomenon holds within occupational networks. Therefore, in terms of the impact delivered by our system, once there is a signal of nearby positivity, its impact is likely to affect the behavior of not only the app user, but also non-app users in the vicinity. This represents a fundamental distinction from the situation with anonymous post-exposure notification apps. Indeed, the primary mechanism for impact of anonymous post-exposure notification is via quarantine, but the anonymity inherently makes it difficult to know which non-app users should quarantine, and it would be highly inefficient for everyone nearby to quarantine as well, not to mention the stigma of asking others to quarantine based on one’s own exposure.

In order to model the impact of a signal delivered by our approach, one would need to estimate the probability $p_1$ that the recipient of a signal of definitive nearby infection takes action to avoid infection, and the probability $p_2$ that their action, if taken, interrupts a would-be transmission. It is an added bonus that there is also some probability $p_3$ that they tell people nearby. Since the purpose of this article is to introduce our alternative approach, we leave it to future behavioral science research to precisely estimate the actual values of these parameters. We instead reason about their order of magnitude relative to the existing post-exposure approach from a theoretical perspective. Because of its alignment with self-protection, our first probability $p_1$ is likely to be an order of magnitude higher than the probability of a user complying with an anonymous multi-day quarantine request, especially when quarantine requests are sent to people whose current infection probability is 10% or below. And, with proper communication and education, our $p_1$ can be increased because it is often directly in the user’s self-interest to protect themselves; on the other hand, even with more education, it will be extremely difficult to convince people to substantially inconvenience themselves at low infection risk, unless there are no alternatives. It is likely that our $p_2$ achieved by actively increasing vigilance is comparable in order of magnitude to that from the post-exposure system because although quarantine stops further transmission, post-exposure notification systems lack the lead time of our approach, and due to delays in testing, they may start the quarantine too late. Finally, because there is relatively little stigma attached to communicating the approach of positive cases from afar (as compared to disclosing an actual exposure), our $p_3$ is also substantial.
Note, however, that the corresponding $p_3$ for post-exposure notification (inspiring multi-day voluntary quarantines for nearby non-users of a post-exposure notification app after receiving a low-infection-probability exposure notification) would be near-zero, and so any reasonable value here already represents a significant advance. In conclusion, the net impact of our signal is likely to be at a higher order of magnitude than that of the signals sent by existing approaches. It still is sensible to apply all approaches, however, because they control the infection spread in different ways.

5 Privacy and security

Many articles on digital interventions in the context of contact tracing have discussed issues of privacy and security at great length [2, 8, 9, 12, 14, 23, 36, 43, 45]. This section is written in a somewhat more technical form, so as to more accurately address such issues. In the context of the existing literature, our approach can be summarized as a pseudonymous, centralized system which directly collects no personally identifiable information and no GPS location information, but does collect timestamped relative proximity information and pseudonymized Wi-Fi information. As the purpose of this article is to introduce our approach of informing individuals of their network distance from positive reports, as opposed to detailing a specific app implementation, we have written this section so that it serves as a preliminary privacy analysis of our general concept. We do not claim to enumerate the entire attack space here, nor to provide proofs of security.

Significant concerns about centralized systems revolve around what could be done with access to the central database. An important initial observation is that our central database does not strictly need to collect any more information than Singapore’s TraceTogether system [6], and therefore our system represents a significantly more effective intervention for a comparable privacy loss to the central authority, as compared to TraceTogether or any similar centralized system. Indeed, even though we have an additional Wi-Fi sensing layer which could be abused by a dishonest central authority to deduce location data, any such centralized Bluetooth-only system could similarly be compromised by positioning identified Bluetooth beacons in known locations. Alternatively, since we only use Wi-Fi to affirm relative proximity, if one believes it is possible to have non-colluding entities, it is possible to use a different implementation of the Wi-Fi system in Section 3.1 as follows. A completely separate and non-colluding Wi-Fi Matching Entity could take the responsibility for determining Wi-Fi matches, whereby each mobile app sends a single-use random identifier representing its identity together with its Wi-Fi information to the separate Wi-Fi Matching Entity, while informing the main central server of which single-use random identifier it just sent. Then, the Wi-Fi Matching Entity informs the main central server of which pairs of single-use random identifiers correspond to users who were proximal, without disclosing any actual Wi-Fi information, and it permanently destroys all of its data within hours of collection. The main central server then looks up which real user UUIDs correspond to the proximal single-use random identifiers. Importantly, the main central server finds out which devices were proximal, without ever receiving any Wi-Fi information, and the Wi-Fi Matching Entity is unable to deduce any user identities from the single-use random identifiers. 
(It is worth noting that although our approach avoids the use of GPS coordinates due to the general perception of GPS as being privacy-invasive, a similar non-colluding-entity technique could even be applied to anonymously determine relative proximity relationships using GPS coordinates or other absolute location information instead of Wi-Fi information, so that neither entity is independently able to associate absolute location information with users.)
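The separation of knowledge between the two non-colluding entities can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the deployed protocol: the class and function names are hypothetical, the access point is represented by a plain string, the 3-hour concurrency requirement is omitted, and data destruction is modeled by clearing the list immediately after matching.

```python
import secrets
from itertools import combinations


class WifiMatchingEntity:
    """Sees Wi-Fi information and single-use identifiers, never user UUIDs."""

    def __init__(self):
        self._reports = []  # (single_use_id, access_point) pairs

    def submit(self, single_use_id, access_point):
        self._reports.append((single_use_id, access_point))

    def match_and_destroy(self):
        """Return pairs of single-use identifiers seen on the same access
        point, then discard everything that was collected."""
        by_access_point = {}
        for single_use_id, access_point in self._reports:
            by_access_point.setdefault(access_point, []).append(single_use_id)
        pairs = [
            pair
            for ids in by_access_point.values()
            for pair in combinations(ids, 2)
        ]
        self._reports.clear()
        return pairs  # contains no Wi-Fi information


class CentralServer:
    """Sees user UUIDs and single-use identifiers, never Wi-Fi information."""

    def __init__(self):
        self._owner = {}          # single_use_id -> user UUID
        self.connections = set()  # frozensets of two user UUIDs

    def register(self, single_use_id, user_uuid):
        self._owner[single_use_id] = user_uuid

    def ingest_matches(self, pairs):
        for id_a, id_b in pairs:
            user_a = self._owner.get(id_a)
            user_b = self._owner.get(id_b)
            if user_a is not None and user_b is not None and user_a != user_b:
                self.connections.add(frozenset((user_a, user_b)))


def report_wifi(user_uuid, access_point, matcher, server):
    """What a mobile app would do: send a fresh identifier to each entity."""
    single_use_id = secrets.token_hex(16)
    matcher.submit(single_use_id, access_point)
    server.register(single_use_id, user_uuid)
```

In this sketch the matcher never receives a UUID and the server never receives an access point, so neither can link Wi-Fi information to a user without the other's cooperation.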

Due to our system’s centralized operation, individual devices process very little information: only the temporary random identifiers of the other devices they were directly near, with times and relative distances. Consequently, even if an individual device were hacked, it would yield only one node’s local information in the network. Centralization also prevents individual devices from knowing the exact time that they encountered a positive case (an issue for some decentralized approaches), because the central server provides intentionally ambiguous ranges instead of precise values. In particular, since our approach only seeks to trigger increased caution, we do not continue sending proximity-based warnings even if a positively tagged device continues to roam in public during its contagious period. It is sufficient for our purposes to generate signals only at the time of reporting, as detailed in Section 3.3. That said, if a user was only ever around one other person, and they receive a positive signal at network distance 1, then they have certainly learned about the status of that other individual. This type of edge case occurs in many contact tracing methods, both digital and manual, and should be handled via disclosure in the app’s privacy policy.

5.1 Abuse by the central authority itself

It is true that if it were operating the central database, a government could understand the interaction history between Person $A$ and Person $B$ by sending Agents $X$ and $Y$ to follow Person $A$ and Person $B$ around, respectively. Using the central database’s records, it could start from the timestamped interaction record of Agent $X$, identify which UUID corresponds to Person $A$, and then perform the same deduction from Agent $Y$ to Person $B$, after which it could consult the database with the now-known UUIDs of Person $A$ and Person $B$. Therefore, it is beneficial for our approach to be operated by a non-governmental entity. It is worth mentioning that long before contact tracing apps, mobile telecom operators already could deduce information about people’s whereabouts due to cell tower triangulation, and mobile operating system providers already could deduce even more precise location information through their multi-sensor location services. Indeed, the Israeli government acquired location information from mobile telecom operators to control the spread of COVID-19 [3]. Other concerns arise if the organization operating our framework also has control over the integration between our framework and the hardware (e.g., if the framework is integrated with the underlying operating system of the phone), as it is in theory possible for personal information entered via the operating system to mix with data collected by our approach.

More computationally intensive attacks might be possible with sufficient resources, with access to the full interaction network. For example, by analyzing clustering within the interaction network, sub-networks corresponding to major cities can be guessed. After sorting by population and taking relative geography into account, the sub-networks could be guessed to correspond to specific cities. Next, within a city’s sub-network, the further sub-networks containing a particularly high density of frequently changing 2 a.m. interactions could be identified, and those might correspond to universities. Again by comparing sizes of those sub-networks, it might be possible to guess which universities corresponded to each sub-network. Using the relative proximity distances, it might be possible to identify further sub-networks corresponding to individual dormitories, and then individual floors of dormitories. Finally, it is conceivable that in some special cases if a student lives at the very end of a hallway, the interaction network might identify them as an extreme point. (However, the accuracy of the Bluetooth distance measurements would likely be insufficient for that deduction, and ultrasonic distance measurements do not penetrate walls.)

Although all of these attacks are theoretically possible, it is important to note that many commonly used apps collect far more personal information. For example, in theory one could apply facial recognition to the videos uploaded on a social app to deduce a great deal of proximity information, capturing much more additional information as well. Often, those other apps are also explicitly monetizing that personal information by selling advertisements. Yet many people flock to those apps because they perceive direct value in the apps. In the case of our approach, we seek to similarly deliver a direct value proposition, while actively taking measures to limit invasions of privacy.

5.2 Attacks by users outside the central authority

We consider two general classes of attacks by ordinary users who do not have central database access: those which seek to compromise privacy by deducing the network structure, and those which seek to disrupt civil society by injecting false information. It is worth noting that the first class of attacks can be conducted by “semi-honest” participants, who do not actively disrupt the system’s behavior, but record the information they are given, and make more sophisticated deductions based upon it.

We begin by considering attacks aimed at disrupting society, e.g., by sowing unfounded panic. Fortunately, the Bluetooth and ultrasound portions of our protocol involve both parties in each interaction, and so it is difficult for an attacker to unilaterally insert an interaction between themselves and a legitimate user. They can, however, insert an interaction between themselves and other illegitimate users that they control. Even then, in order to affect legitimate users, they still need to invest the effort to physically come close to legitimate users, which is impractical at large scale. Our Wi-Fi interactions can in theory be inserted unilaterally and illegitimately (without actually being in proximity of the Wi-Fi access point) if the app is decompiled and the Wi-Fi interaction recording procedure is reverse-engineered. Fortunately, because the devices should still be constantly communicating via Bluetooth and ultrasound, the structure of the network can be used to identify suspicious devices which are inserting illegitimate Wi-Fi interactions.

In order to maliciously insert positive signals into any proper deployment which uses positive test confirmation tokens as in Section 3.2, the attacker needs to acquire valid one-time-use tokens. This would involve either compromising a token-issuing authority, or acquiring tokens that were distributed to others and entering them into the wrong app. These attacks can be mitigated by expiring tokens shortly after their issuance, and by limiting a token’s usage scope (e.g., only to a certain community of users that has opted into working with a specific token-issuing authority).
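The three mitigations named above (one-time use, short expiry, limited scope) can be combined in a small validation routine. The sketch below is an assumption-laden illustration rather than the scheme of Section 3.2: the `TokenAuthority` class, the one-hour default lifetime, and the string community label are all hypothetical.

```python
import secrets
import time


class TokenAuthority:
    """Issues and redeems positive-test confirmation tokens (hypothetical)."""

    def __init__(self, community, ttl_seconds=3600):
        self.community = community  # the only community this authority serves
        self.ttl_seconds = ttl_seconds
        self._expiry = {}           # outstanding token -> expiry timestamp

    def issue(self, now=None):
        """Create a fresh random token that expires after the lifetime."""
        now = time.time() if now is None else now
        token = secrets.token_urlsafe(16)
        self._expiry[token] = now + self.ttl_seconds
        return token

    def redeem(self, token, user_community, now=None):
        """Accept a token only if it is known, unexpired, and in scope.

        The token is removed on the first redemption attempt, whether or
        not that attempt succeeds, so it can never be used twice.
        """
        now = time.time() if now is None else now
        expiry = self._expiry.pop(token, None)
        if expiry is None:
            return False  # unknown, forged, or already used
        return now <= expiry and user_community == self.community
```

Consuming the token even on a failed attempt is a deliberate choice in this sketch: it prevents a leaked token from being retried in several communities, at the cost of requiring a reissue after an honest mistake.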

We finish by considering attacks which seek to deduce the network structure. By sending only transient identifiers during Bluetooth communication, our system is more robust to “linkage” attacks which seek to identify the same user over time based on their Bluetooth transmission activity. On the other hand, by providing users with charts of the number of positive cases (and the numbers of individuals) at various network distances from them, our system does indeed supply additional information that could be used to deduce some elements of network structure.

For example, in the following situation it is possible to deduce that Person $A$ and Person $B$ did not spend substantial time together in the last 14 days. Agent $X$ is deployed to spend at least 15 minutes around Person $A$, while Agent $Y$ is deployed to do the same with Person $B$. Then, Agent $X$ reports a positive case. If Agent $Y$ does not see a positive case appear in their chart at distance less than or equal to 3, then Person $A$ and Person $B$ did not spend time with each other over the last 14 days.

The converse does not immediately hold. Even if Agent $Y$ sees a positive case appear at distance 3, that is not definitely because Person $A$ and Person $B$ spent time together, because it could be that when Agent $X$ built a connection to Person $A$, Agent $X$ also accidentally built a connection to Person $C$ who happened to be within 10 meters at the time, and Person $C$ was the one who had spent time with Person $B$. Or, it could be that an unrelated positive case was reported at around the same time, and showed up at distance 3 from Agent $Y$. These challenges can be overcome with a more complicated attack in which the agents ensure that there is nobody else within 10 meters of them besides their target, and send multiple positive reports to increase the confidence that Person $A$ and Person $B$ had spent time together. Note, however, that this still does not indicate what time within the 14-day window such an interaction could have occurred, or for exactly how long (beyond 15 minutes). And, if Person $A$ and Person $B$ meet again within 14 days of their last meeting, it will be impossible to periodically repeat the same style of attack to deduce the earlier meeting time, because even the 14-day window immediately following that earlier meeting time will contain an interaction that constitutes a connection (the result will look the same).
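The logic of this deduction, and the reason its converse fails, can be checked on three tiny networks. The sketch below is purely illustrative: the labels are hypothetical (two agents "X" and "Y", two targets "A" and "B", and a bystander "C"), and `distance` is an ordinary breadth-first search rather than any part of the deployed system.

```python
from collections import deque


def distance(edges, source, target):
    """Shortest-path length over undirected edges, or None if unreachable."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj.get(u, ()):
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist.get(target)


# Each agent has deliberately built a connection to one target.
agent_links = [("X", "A"), ("Y", "B")]

# Case 1: the targets did meet, so a report by X reaches Y at distance 3.
targets_met = distance(agent_links + [("A", "B")], "X", "Y")

# Case 2: the targets never met, so Y sees nothing from X's report.
targets_never_met = distance(agent_links, "X", "Y")

# Case 3: the targets never met, but X accidentally connected to a
# bystander C who met B. Y still sees a signal at distance 3.
bystander = distance(agent_links + [("X", "C"), ("C", "B")], "X", "Y")
```

Cases 1 and 3 are indistinguishable from Agent Y's chart, which is why a signal at distance 3 does not by itself prove that the two targets met, while the absence of any signal within distance 3 does rule it out.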

In summary, it is true that some network information can be deduced even without access to the central database, and we leave it to the reader to judge whether access to information such as the above is a worthwhile tradeoff for a system which can provide early warning of an approaching infection.

6 Conclusion

In this article, we have introduced a novel approach which empowers every individual member of society to actively avoid infection. It uses the contact tracing network “in reverse,” where instead of asking everyone within distance 1 of a positive case to quarantine, it tells everyone how far away the new positive cases have struck in their physical interaction network. This reversal changes the nature of the intervention, from one which “protects others from you” to one which “protects you from others.” Through that flip, the incentive structure also reverses, as users are given the opportunity to protect themselves before it is too late. Suddenly, users prefer false positives over false negatives (“better safe than sorry”), which is the opposite of the situation when they use apps that ask them to quarantine (the culturally unfamiliar “guilty until proven innocent”). The final reversal is what enables us to use Wi-Fi as a sensor, as Wi-Fi access point overlap generates too many false positives for post-exposure notifications, yet extended duration on the same Wi-Fi access point fairly accurately captures much of the underlying physical interaction network, which is all we need. As discussed in our heuristic analysis, the potential impact of our approach is very significant, likely reaches critical mass at low adoption rates, and even is likely to spill over to alert non-users. It is unrealistic to expect there to be a perfect solution, but our theoretical heuristics indicate that on balance, this could be a highly beneficial intervention.

It is natural to ask why such a technique was never used before. The answer lies in the current state of technology. It is only in the past decade that the necessary transmitters commonly appeared on smartphones, and those smartphones proliferated. Wi-Fi has certainly been around for some time, but our other Bluetooth and ultrasound sensors augment the system in important ways. Conveniently, in the midst of this devastating COVID-19 pandemic, we just happen to be in a position where we can deploy this new system at global scale.

That said, the author acknowledges that his primary expertise is in mathematics and technology, and the problem of pandemic control is complex, requiring expertise in other disciplines. Therefore, the purpose of this article is not to claim a proof of a solution to the pandemic. On the contrary, the reason for putting much of the logic of this article into words was to produce a written document which could invite comment, discussion, and collaboration. This article opens more questions than it answers, acknowledging that in order to definitively understand the potential of this approach, there are many questions in public health, medicine, biology, modeling, behavioral science, and security that need to be answered. If practice aligns sufficiently with theory, then this new approach could equip everyone with the equivalent of a personal weather satellite, to protect themselves by tracking an otherwise-invisible disease approaching from afar.

Acknowledgments

Thanks to Anna Bershteyn, Timothy Chu, Pete Hoch, Ian McCullough, Janet Mertz, Francesmary Modugno, Philip Welkhoff, Lowell Wood, Shannon Yee, and Yun William Yu for providing valuable feedback on earlier versions of this manuscript.

References