Risk Estimation of SARS-CoV-2 Transmission from Bluetooth Low Energy Measurements

04/22/2020 ∙ by Felix Sattler, et al. ∙ Berlin Institute of Technology (Technische Universität Berlin) Fraunhofer 13

Digital contact tracing approaches based on Bluetooth low energy (BLE) have the potential to efficiently contain and delay outbreaks of infectious diseases such as the ongoing SARS-CoV-2 pandemic. In this work we propose a novel machine learning based approach to reliably detect subjects that have spent enough time in close proximity to be at risk of being infected. Our study is an important proof of concept that will aid the battery of epidemiological policies aiming to slow down the rapid spread of COVID-19.






Code Repositories


Bluetooth Low Energy RSSI measurements with ground-truth distances.



  • Bourouiba [2020] Lydia Bourouiba. Turbulent Gas Clouds and Respiratory Pathogen Emissions: Potential Implications for Reducing Transmission of COVID-19. JAMA, 03 2020. ISSN 0098-7484. doi: 10.1001/jama.2020.4756. URL https://doi.org/10.1001/jama.2020.4756.
  • Chen et al. [2018] Hechang Chen, Bo Yang, Hongbin Pei, and Jiming Liu. Next generation technology for epidemic prevention and control: Data-driven contact tracking. IEEE Access, 7:2633–2642, 2018.
  • [3] DP-3T. https://github.com/DP-3T/documents.
  • European Centre for Disease Prevention and Control [2020] European Centre for Disease Prevention and Control. Contact tracing: public health management of persons, including healthcare workers, having had contact with COVID-19 cases in the European Union – second update, 2020.
  • Ferretti et al. [2020] Luca Ferretti, Chris Wymant, Michelle Kendall, Lele Zhao, Anel Nurtay, Lucie Abeler-Dörner, Michael Parker, David G Bonsall, and Christophe Fraser. Quantifying SARS-CoV-2 transmission suggests epidemic control with digital contact tracing. Science, 2020. doi: TBA. URL TBA.
  • Freunde Liberias, e. V. [2018] Freunde Liberias, e. V. EBOLAPP. https://www.ebolapp.org/, 2018.
  • [7] PEPP-PT. https://www.pepp-pt.org.
  • Salathé et al. [2010] Marcel Salathé, Maria Kazandjieva, Jung Woo Lee, Philip Levis, Marcus W Feldman, and James H Jones. A high-resolution human contact network for infectious disease transmission. Proceedings of the National Academy of Sciences, 107(51):22020–22025, 2010.
  • Singapore Government Technology Agency and Ministry of Health [2020] Singapore Government Technology Agency and Ministry of Health. TraceTogether. https://www.tracetogether.gov.sg/, 2020.
  • Voigt and Von dem Bussche [2017] Paul Voigt and Axel Von dem Bussche. The EU General Data Protection Regulation (GDPR). A Practical Guide, 1st Ed., Cham: Springer International Publishing, 2017.
  • Wrapp et al. [2020] Daniel Wrapp, Nianshuang Wang, Kizzmekia S Corbett, Jory A Goldsmith, Ching-Lin Hsieh, Olubukola Abiona, Barney S Graham, and Jason S McLellan. Cryo-EM structure of the 2019-nCoV spike in the prefusion conformation. Science, 367(6483):1260–1263, 2020.
  • Xie et al. [2007] X Xie, Y Li, AT Chwang, PL Ho, and WH Seto. How far droplets can move in indoor environments – revisiting the Wells evaporation-falling curve. Indoor Air, 17(3):211–225, 2007.
  • Yoneki [2011] Eiko Yoneki. Fluphone study: Virtual disease spread using haggle. In Proceedings of the 6th ACM Workshop on Challenged Networks, pages 65–66, 2011.

Appendix A Supplementary Material

a.1 Epidemiological Models

In our experiments we use three different epidemiological models (linear, box, and sigmoid) to convert the proximity values into infectiousness scores. Each model takes the contact distance, measured in cm, as its argument. All three models are monotonically decreasing functions of the distance, i.e., the infectiousness score decreases with increasing distance.

The main use of epidemiological models in our experiments is to generate ground truth labels for our data, which consists of time series of RSS values and the corresponding distances (the latter are not available in real settings). To generate the labels, we integrate the infectiousness scores over the contact time (equation (5)).
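The labeling step can be sketched in Python as follows. The functional forms and parameter values of the three models below (a cutoff of 200 cm, a sigmoid slope of 20 cm) are illustrative assumptions, not the definitions used in the experiments; only the model names and the monotonic decrease with distance are taken from the text. The integral over the contact time is approximated by a sum over samples, assuming a 1 Hz sampling rate.

```python
import numpy as np

# Hypothetical model shapes: the cutoff and slope values are assumptions.
CUTOFF_CM = 200.0

def linear(d_cm):
    """Score falls linearly from 1 at 0 cm to 0 at the cutoff."""
    return np.clip(1.0 - np.asarray(d_cm, dtype=float) / CUTOFF_CM, 0.0, 1.0)

def box(d_cm):
    """Score is 1 inside the cutoff and 0 outside."""
    return (np.asarray(d_cm, dtype=float) < CUTOFF_CM).astype(float)

def sigmoid(d_cm, slope_cm=20.0):
    """Smooth step centered on the cutoff."""
    d = np.asarray(d_cm, dtype=float)
    return 1.0 / (1.0 + np.exp((d - CUTOFF_CM) / slope_cm))

def integrated_risk(distances_cm, model, dt_s=1.0):
    """Approximate the time integral of the infectiousness score by a Riemann sum."""
    return float(np.sum(model(distances_cm)) * dt_s)

# Example encounter: 60 s at 1 m followed by 60 s at 3 m.
distances = np.concatenate([np.full(60, 100.0), np.full(60, 300.0)])
risk_box = integrated_risk(distances, box)
risk_linear = integrated_risk(distances, linear)
```

Under these assumed shapes, only the first 60 seconds of the example encounter contribute to the box and linear scores, because the second half lies beyond the cutoff.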


a.1.1 Local and Global Risk Thresholds

For every epidemiological model there exists a reference encounter that marks the boundary beyond which no infection is expected. For instance, for COVID-19 it is assumed that a physical proximity between two people of less than 2 meters over a time period of 900 seconds (15 minutes) results in a high risk of being infected [European Centre for Disease Prevention and Control, 2020]. Inserting the corresponding reference sequence of distances into equation (5) results in a local threshold for the integrated infectiousness score.


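By selecting the epidemiological model and the infectiousness threshold, we can determine which time series of distance measurements should be considered dangerous and which should not: an encounter is labeled “high risk” if its integrated infectiousness score exceeds the threshold, and “low risk” otherwise.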
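A minimal sketch of local thresholding, assuming a box-shaped model with a 2 m cutoff, a reference encounter of 2 m held for 900 s, and 1 Hz sampling. The function names are hypothetical, and counting an encounter that exactly reaches the threshold as high risk is an assumption made here so that the reference encounter itself is flagged.

```python
import numpy as np

def box_score(d_cm, cutoff_cm=200.0):
    # Assumed box-shaped model: full score up to the cutoff, zero beyond it.
    return (np.asarray(d_cm, dtype=float) <= cutoff_cm).astype(float)

def integrated_risk(distances_cm, dt_s=1.0):
    # Riemann-sum approximation of the score integrated over the contact time.
    return float(np.sum(box_score(distances_cm)) * dt_s)

# Reference encounter: 2 m held for 900 s (15 min), sampled at 1 Hz.
reference = np.full(900, 200.0)
local_threshold = integrated_risk(reference)

def is_high_risk(distances_cm):
    # Assumption: reaching the threshold already counts as high risk.
    return integrated_risk(distances_cm) >= local_threshold

long_close = is_high_risk(np.full(1200, 100.0))   # 20 min at 1 m
short_close = is_high_risk(np.full(300, 100.0))   # 5 min at 1 m
long_far = is_high_risk(np.full(1200, 400.0))     # 20 min at 4 m
```

With a box model, the label depends only on how long the two people stayed within the cutoff, so the short and the distant encounters both fall below the threshold.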


An alternative approach is to label the data with a global threshold. For that we need an estimate of the expected number of contact persons newly infected by previously infected persons. This number can be computed from the basic reproduction number. One can then choose the threshold so that the number of high-risk encounters among all recorded proximity histories matches the expected number of new infections.
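One way to pick such a global threshold is sketched below. The expected number of new infections is taken as a given input, since its derivation from the basic reproduction number is not reproduced here; the function name and the toy scores are hypothetical.

```python
import numpy as np

def global_threshold(risk_scores, expected_infections):
    """Return a threshold such that `expected_infections` encounters lie at or above it."""
    scores = np.sort(np.asarray(risk_scores, dtype=float))[::-1]  # descending order
    k = min(int(round(expected_infections)), scores.size)
    if k <= 0:
        return float("inf")  # no encounter is labeled high risk
    return float(scores[k - 1])

# Toy integrated risk scores for six recorded proximity histories.
scores = [12.0, 850.0, 40.0, 1300.0, 5.0, 910.0]
threshold = global_threshold(scores, expected_infections=2)
high_risk = [s >= threshold for s in scores]
```

If several encounters share the score at the cut, this rule labels all of them high risk, so the count can slightly exceed the target.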

a.2 Infection Risk Estimation as a Regression Problem

Given an epidemiological model and the true distances, we can label encounters as “high risk” or “low risk”. Since the true distances are not available in real settings, we aim to train a machine learning model to predict these labels from the raw RSS measurements of the BLE signal (for practical reasons, we resampled the RSS values to 1 Hz). To simplify the learning task, we extract features from the RSS data and provide them as input to the ML algorithm. In particular, we tested the following three feature sets:

  1. sum: total sum of the received RSS values, resulting in one-dimensional features.

  2. dur_max_mean: duration, maximum, and mean of the received RSS values, resulting in three-dimensional features.

  3. freq: amplitudes of the first frequency components of the received RSS values, resulting in features with one dimension per retained frequency.
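The three feature sets could be computed along the following lines. This is a sketch under assumptions: the duration is taken as the number of 1 Hz samples times the sampling interval, the frequency amplitudes come from a real-valued FFT, and the number of retained frequencies (`n_freq=4`) is a placeholder, since the value used in the experiments is not stated here.

```python
import numpy as np

def feature_sum(rss):
    """One-dimensional feature: total sum of the RSS values."""
    return np.array([np.sum(rss)], dtype=float)

def feature_dur_max_mean(rss, dt_s=1.0):
    """Three-dimensional feature: duration, maximum, and mean of the RSS values."""
    rss = np.asarray(rss, dtype=float)
    return np.array([rss.size * dt_s, rss.max(), rss.mean()])

def feature_freq(rss, n_freq=4):
    """Amplitudes of the first n_freq frequency components (n_freq is an assumption)."""
    rss = np.asarray(rss, dtype=float)
    # Zero-pad short sequences so that at least n_freq components exist.
    n = max(rss.size, 2 * n_freq)
    return np.abs(np.fft.rfft(rss, n=n))[:n_freq]

rss = [-70.0, -65.0, -60.0, -65.0]
f_sum = feature_sum(rss)
f_dmm = feature_dur_max_mean(rss)
f_freq = feature_freq(rss)
```

The first frequency amplitude is the magnitude of the plain sum, so the freq features contain the information of the sum feature up to its sign.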

We input these features into a linear regression model in order to obtain a predicted “risk” score. The linear regression thus comprises a vector of parameters, a bias term, and a vector of extracted features: the predicted risk score is the inner product of the parameter vector with the feature vector, plus the bias. The resulting predicted risk score is then compared to a threshold, which can be set to the critical risk threshold of the epidemiological model. If the predicted risk exceeds the threshold, the encounter that produced the sequence of RSS measurements is considered “high risk”.
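The regression step can be illustrated with an ordinary least-squares fit. This is a generic sketch, not the training code used in the experiments: the helper names, the toy data, and the threshold value are hypothetical.

```python
import numpy as np

def fit_linear_regression(features, risks):
    """Least-squares fit of risk = features @ weights + bias."""
    X = np.asarray(features, dtype=float)
    y = np.asarray(risks, dtype=float)
    X_aug = np.hstack([X, np.ones((X.shape[0], 1))])  # extra column for the bias
    params, *_ = np.linalg.lstsq(X_aug, y, rcond=None)
    return params[:-1], params[-1]

def predict_risk(features, weights, bias):
    """Predicted risk score for each row of features."""
    return np.asarray(features, dtype=float) @ weights + bias

# Toy data generated from known parameters, so the fit can be checked.
X = np.array([[0.0, 1.0], [1.0, 0.0], [1.0, 1.0], [2.0, 1.0]])
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + 5.0
weights, bias = fit_linear_regression(X, y)

predicted = predict_risk(X, weights, bias)
threshold = 7.0  # placeholder for the critical risk threshold
high_risk = predicted > threshold
```

In practice each row of `X` would hold the features of one encounter and `y` the ground truth risk obtained from the measured distances.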

a.3 Real-World Experiment

a.3.1 Experimental Setup

A measurement campaign was performed to test and validate the proposed infection risk estimation model. This section describes the setup of the experiment.

The measurements on the 1st of April and the 7th of April were each performed using 48 identical Samsung A40 smartphones carried by 48 protected soldiers. Tests were carried out at five different locations within the Julius Leber barracks in Berlin: three rooms within a conference center and two outdoor locations, with ten subjects each. All test subjects were equipped with face masks so that there was no risk of infection.

The floor of the test areas was marked (Fig. 2). These markings consisted of a 5 m x 5 m grid with lines spaced 50 cm apart. From the starting point (box within a box) to the ending point (multiplication sign), the test subjects had to walk through markings and stay on each marker for a predetermined amount of time (2, 4, 6, or 10 min). The markings are numbered on the green path from 1 to 9 and on the black path from 2 to 10 (Fig. 2, right). Two cameras were installed at each location to video record the test so that the exact locations of the test subjects could be checked after the test.

Figure 2: Test pattern on the floor of the five test areas (left with grid, right without grid).

The test was carried out in four runs. During the runs, the test subjects were instructed not to move too much, to hold the positions of the mobile phones relatively stable, and to stand within the square.

a.3.2 Data and Preprocessing

RSS data was collected via a prototype of the PEPP-PT app. The RSS data, which was recorded at a random and potentially varying frequency between 0.1 Hz and 10 Hz, was resampled to 1 Hz. Ground truth distance data was derived from the predefined movement pattern on the grid. The labeling was additionally verified with the help of video footage taken at the test area. For every pair of soldiers we collected multiple data points, where each data point comprised two aligned sequences:

  • A time series of distances (from which the ground truth risk can be derived).

  • A time series of BLE RSS values, recorded by mobile phones held by the soldiers.
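Resampling irregularly spaced RSS readings onto a 1 Hz grid could look like the sketch below. Linear interpolation is an assumption made here for illustration; the resampling method actually used is not specified in the text, and the function name is hypothetical.

```python
import numpy as np

def resample_to_1hz(timestamps_s, rss):
    """Linearly interpolate irregular RSS readings onto whole-second timestamps."""
    t = np.asarray(timestamps_s, dtype=float)
    v = np.asarray(rss, dtype=float)
    order = np.argsort(t)  # np.interp expects increasing sample positions
    t, v = t[order], v[order]
    # Grid of whole seconds covered by the recording.
    grid = np.arange(np.ceil(t[0]), np.floor(t[-1]) + 1.0)
    return grid, np.interp(grid, t, v)

# Four readings at uneven intervals.
grid, rss_1hz = resample_to_1hz([0.0, 0.5, 2.0, 4.0], [-60.0, -62.0, -70.0, -66.0])
```

Because the distance series is defined on the same 1 Hz grid, the two sequences of a data point can then be aligned sample by sample.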

a.3.3 Training and Testing Data

For training and testing, the time series data was separated into two folds according to the room in the test area in which the data was collected. Data collected in rooms 1 and 2 (indoor) and room 4 (outdoor) was combined in the training set. Data collected in rooms 3 (indoor) and 5 (outdoor) was combined in the validation set. In previous tests, multiple combinations of indoor and outdoor rooms were tested to investigate possible covariate shift between indoor and outdoor scenarios. No significant effects could be detected; therefore, the aforementioned mixed split was used.

a.3.4 Results

We trained a machine learning model to predict the ground truth risk using only features extracted from the RSS time series data. Since the labels are not balanced (i.e., there are more negative than positive events), we use the area under the ROC (receiver operating characteristic) curve (AUC) to evaluate the performance of our model. The AUC measures how well the data can be separated using our classifier. An AUC value of 0.5 indicates no predictive power and a value of 1 indicates perfect predictive power.
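For reference, the AUC can be computed directly from its pairwise definition: the fraction of (positive, negative) pairs in which the positive example receives the higher score, with ties counted as one half. The sketch below is a generic implementation with a hypothetical function name, not the evaluation code used in the experiments.

```python
import numpy as np

def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of positives and negatives."""
    labels = np.asarray(labels, dtype=bool)
    scores = np.asarray(scores, dtype=float)
    pos, neg = scores[labels], scores[~labels]
    if pos.size == 0 or neg.size == 0:
        raise ValueError("AUC needs at least one positive and one negative example")
    # Compare every positive score with every negative score.
    greater = (pos[:, None] > neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()
    return float((greater + 0.5 * ties) / (pos.size * neg.size))

auc_mixed = roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
auc_perfect = roc_auc([0, 0, 1], [1.0, 2.0, 3.0])
auc_constant = roc_auc([0, 1, 0, 1], [0.5, 0.5, 0.5, 0.5])
```

Since the value depends only on the ordering of the scores, it is unaffected by the imbalance between negative and positive encounters.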

The obtained results are presented in Fig. 3. The columns correspond to different epidemiological models, namely (linear, box, sigmoid), whereas the rows represent different combinations of features which we feed into the linear regression. Given the critical risk threshold derived by applying the respective risk model to the reference sequence (1), we display the achieved AUC for every combination of risk model and feature combination.

Figure 3: Ground truth risk vs predicted risk for different epidemiological risk models and combinations of features supplied to our machine learning model.

An encounter between two individuals is labeled as “high risk” if its risk value exceeds a predefined critical risk threshold. This threshold can either be set locally, i.e., for each encounter, or globally based on the basic reproduction number.

a.3.5 Follow-up Study

In order to evaluate the reliability of our results, we tested the model on data recorded with the same experimental setup but on different dates (7th April 2020 and 14th April 2020). In the experiments conducted on the 14th of April, participants used different smartphone models and the phone holding positions were varied (“hand”, “ear”, “pocket”). Figure 4 compares the AUC values of the measurement campaigns for the three epidemiological models (linear, box, sigmoid) and three sets of features (sum, dur_max_mean, freq). As can be seen, the performance of the proposed infection risk estimation method is comparable for the experiments conducted on the 1st of April and the 7th of April. For the experiments conducted on the 14th of April, however, the feature set dur_max_mean distinctly outperforms all other tested feature combinations. This suggests that this combination of features approximates the ground truth risk more robustly than the other investigated feature combinations.

Figure 4: Comparison of the results on data recorded on three different days. Every marker corresponds to an epidemiological model (linear - green, box - orange, sigmoid - blue) and a set of features (sum, dur_max_mean, freq). Only the feature set dur_max_mean is robust to the changes in testing environment that occurred during the third measurement campaign on April 14th.