Evaluating shifts in mobility and COVID-19 case rates in U.S. counties: A demonstration of modified treatment policies for causal inference with continuous exposures

by   Joshua R. Nugent, et al.

Previous research has shown mixed evidence on the associations between mobility data and COVID-19 case rates, analysis of which is complicated by differences between places on factors influencing both behavior and health outcomes. We aimed to evaluate the county-level impact of shifting the distribution of mobility on the growth in COVID-19 case rates from June 1 - November 14, 2020. We utilized a modified treatment policy (MTP) approach, which considers the impact of shifting an exposure away from its observed value. The MTP approach facilitates studying the effects of continuous exposures while minimizing parametric modeling assumptions. Ten mobility indices were selected to capture several aspects of behavior expected to influence and be influenced by COVID-19 case rates. The outcome was defined as the number of new cases per 100,000 residents two weeks ahead of each mobility measure. Primary analyses used targeted minimum loss-based estimation (TMLE) with a Super Learner ensemble of machine learning algorithms, considering over 20 potential confounders capturing counties' recent case rates as well as social, economic, health, and demographic variables. For comparison, we also implemented unadjusted analyses. For most weeks considered, unadjusted analyses suggested strong associations between mobility indices and subsequent growth in case rates. However, after confounder adjustment, none of the indices showed consistent associations after hypothetical shifts to reduce mobility. While identifiability concerns limit our ability to make causal claims in this analysis, MTPs are a powerful and underutilized tool for studying the effects of continuous exposures.



There are no comments yet.


page 7


Commuting Network Spillovers and COVID-19 Deaths Across US Counties

This study explored how population mobility flows form commuting network...

Causal Inference in the Time of Covid-19

In this paper we develop statistical methods for causal inference in epi...

Observed mobility behavior data reveal social distancing inertia

The research team has utilized an integrated dataset, consisting of anon...

Causal Impact of Masks, Policies, Behavior on Early Covid-19 Pandemic in the U.S

This paper evaluates the dynamic impact of various policies, such as sch...

On the association between COVID-19 vaccination levels and incidence and lethality rates at a regional scale in Spain

The severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), which ...

The policy is always greener: impact heterogeneity of Covid-19 vaccination lotteries in the US

Covid-19 vaccination has posed crucial challenges to policymakers and he...
This week in AI

Get the week's most popular data science and artificial intelligence research sent straight to your inbox every Saturday.

1 Introduction

During a pandemic caused by a virus novel to the human immune system, such as today’s SARS-nCoV-2 (COVID-19), policymakers have only non-pharmaceutical interventions (NPIs) to rely upon while treatments and vaccines are developed, tested, and distributed. Many NPIs, such as school closings or stay-at-home orders, are designed to reduce the number of close contacts a person has in order to curtail community spread. However, NPIs are only effective if they actually create the desired changes in behavior and if the changes in behavior have a causal impact on disease transmission.

To aid in COVID-19 response, private companies and research groups [huang_twitter_2020, pepe_covid-19_2020, couture_measuring_2020, gao_mapping_2020] developed and maintain data sets aiming to quantify mobility and social distancing in the U.S. and worldwide [google_covid19_nodate] [facebook_movement_nodate]. These mobility indices are generated from anonymized mobile device tracking data aggregated at some geographic level, such as ZIP code or county. Despite privacy concerns, these indices may be a useful tool for researchers and public health officials [buckee_aggregated_2020]. However, it is not clear which of these indices, if any, are meaningful proxies for behaviors that are related to COVID-19 spread, and whether any associations identified persist over time [grantz_use_2020]. Evaluating the potential causal impact of mobility on disease spread is complicated by, among other things, the presence of confounders, varying epidemic arrival time across different areas, and changing governmental policies.

Our goal was to evaluate the effect of mobility, as captured through shifts in the distributions of various indices, on subsequent COVID-19 case rates during Summer and Fall 2020, a period when social distancing was generally encouraged in the U.S. To do so, we employed state-of-the-art methods in causal inference to first define the causal effect using a modified treatment policy (MTP) and then estimate the corresponding statistical parameter using targeted minimum loss-based estimation (TMLE) with Super Learner to adjust for a wide range of confounders and avoid unsubstantiated modeling assumptions [laan_super_2007, diaz_nonparametric_2021, laan_targeted_2011, munoz_population_2012, haneuse_estimation_2013]. We focused our analysis on U.S. counties, the administrative level where the most fine-grained data are available on mobility as well as a rich set of economic and demographic confounders.

The remaining article is organized as follows. We first provide a brief overview of prior work investigating mobility and COVID-19 transmission. We then discuss the definition and use of MTP in the context of our research question, detail our causal identification assumptions and statistical estimation approach, and present our results. We conclude with a discussion how our findings compare to prior research and some limitations of our analysis. Additionally, we highlight software options for researchers interested in using MTP in their own work. Finally, detailed descriptions of the observed data and sensitivity analyses are included in the appendices.

1.1 Brief review of the literature on mobility & COVID-19 transmission

Recent papers investigating the associations between mobility, COVID-19 spread, and NPIs can be roughly categorized into one of three bins: work that uses mobility as an input into a more complex model to forecast COVID-19 cases and deaths (see, e.g., [chang_mobility_2020, chernozhukov_causal_2021, karaivanov_face_2020]); work that assesses the effectiveness of, or compliance with, NPIs, using mobility as an outcome variable (see, e.g., [jay_neighbourhood_2020, gupta_tracking_2020, lai_unsupervised_2020, yabe_non-compulsory_2020, allcott_what_2020, wellenius_impacts_2021]); and finally work that treats mobility as an exposure variable and new COVID-19 case rates or deaths as the outcome. This work falls clearly in the last category.

Most papers focusing on the early-pandemic (Spring 2020) timeframe found strong correlations between mobility and COVID-19 spread [li_association_2020, badr_association_2020, xiong_mobile_2020, persson_monitoring_2021, wellenius_impacts_2021]. However, these early studies are potentially confounded by the spatial dynamics of where the pandemic arrived first and unmeasured variables such as underlying changes in behavior including mask-wearing or the propensity to socialize outdoors rather than in closed indoor environments (see, e.g., the heterogeneity identified by [allcott_what_2020]). In addition, they may be subject to bias from undercounted cases due to lack of testing capacity in many parts of the U.S. [noauthor_covid_nodate]. Given these challenges, it is perhaps unsurprising that the associations between mobility and COVID-19 cases were strongly attenuated in studies extended through mid-Summer of 2020 [badr_limitations_2020, gatalo_associations_2020].

Our work contributes to the existing literature in four critical ways. First, we extend the examination of the relationships between mobility and COVID-19 case rates until just before the U.S. Thanksgiving holiday. Second, we use an MTP approach, described next, to define relevant causal effects and avoid arbitrary categorizations of the continuous exposure. Third, we use TMLE with Super Learner for estimation and inference, allowing us to adjust for a rich set of county-level confounders while relaxing parametric assumptions used in many of the other modeling research. Finally, we examine a wide set of mobility indices to determine which, if any, might be associated with COVID-19 case rates, and hence be possible proxies for the underlying behaviors that drive COVID-19 spread.

1.2 The modified treatment policy approach

Given the extensive literature on causal inference for binary exposures, one approach to examining mobility effects would be to categorize the continuous exposure into “high” and “low” levels and ask what would be the expected difference in counterfactual COVID-19 case rates if all counties had a “high” vs. “low” level of mobility. However, selecting a binary cut point for this categorization is arbitrary and risks losing information. Additionally, counties with a high proportion of essential workers might not be able to reduce their overall mobility to the specified “low” level [bembom_practical_2007]

, violating the positivity assumption that all counties must have nonzero probability of receiving the specified exposure level within all values of measured adjustment variables

[petersen_diagnosing_2012]. As an alternative to categorizing the exposure, we could consider targeting the parameters of a working marginal structural model [robins_marginal_1998, neugebauer_nonparametric_2007] or directly attempt to estimate a causal ‘dose-response’ curve to capture how a continuous measure of mobility impacts the expected counterfactual case rates [westling_causal_2019, kennedy_non-parametric_2017]. However, as before, there may be challenges with data support from structural and practical violations of the positivity assumption.

Instead, we took non-parametric approach to specify a causal effect that minimized parametric assumptions and naturally satisfied the positivity assumption [laan_targeted_2011, laan_targeted_2018]. To do so, we defined our research question using an MTP, which aims to quantify the impact on the counterfactual outcome of an intervention that shifts the exposure for each observation to a new level , which is set by deterministic function of the observed ‘natural’ exposure level and covariates through some shift function [haneuse_estimation_2013]. In our example, we might consider an MTP where the mobility shift amount depended on population density: sparse counties would be assigned a mobility reduction of 10% while dense counties would be assigned a reduction of 20%. However, for simplicity and demonstration, we ignored covariates when determining shift amounts, instead examining the expected counterfactual outcome if the mobility distribution across counties were shifted by an additive or multiplicative constant - for example, if the percentage of people who stayed in their home all day were increased by 5 points from the observed level in each county.

An MTP is similar to a stochastic intervention, where the exposure is a random draw from a shifted distribution of the observed exposures [didelez_direct_2012, munoz_population_2012, stock_nonparametric_1989] and where the shift amount can depend on covariate values. For example, Díaz and van der Laan [munoz_population_2012] considered patients’ pre-existing health conditions when evaluating the effect on mortality of a shift in exercise levels, the exposure of interest. Recent work has extended the MTP approach to longitudinal settings and introduced a less computationally intensive estimation algorithm of the corresponding statistical parameter [diaz_nonparametric_2021].

The MTP approach was attractive in our context because it allows, as per Haneuse and Rotnitzky [haneuse_estimation_2013], “the set of feasible treatments for each subject [to depend] on attributes that are not fully captured by baseline covariates but which are reasonably captured by the treatment actually received.” There is no assumption that, for example, the densely populated New York County would be shifted to the mobility level of a rural area, even if some observed covariate values (say, demographics) were similar. Full details of the assumptions and estimation strategy are given in Section 2. The MTP approach has been used in several biomedical settings [diaz_causal_2020, kamel_relationship_2019, hubbard_time-dependent_2013] and in at least one public policy context [rudolph_when_2021], but to our knowledge this is the first time it has been applied to infectious diseases.

2 Methods

At the county-level, we aimed to evaluate the effect of shifting mobility on new COVID-19 cases per 100,000 residents two weeks later during the period from June 1 - November 14, 2020. Data on confirmed COVID-19 case rates were obtained from the New York Times [the_new_york_times_coronavirus_2020] and USAFacts [noauthor_coronavirus_nodate]. In line with other studies [badr_association_2020, li_association_2020], a two-week “leading” period was selected based on the reported incubation time for COVID-19 [linton_incubation_2020, lauer_incubation_2020, li_early_2020] and to account for delays in case reporting. In sensitivity analyses, we examined the impact on one-week leading case rates.

We considered 10 mobility indices, generated from mobile phone tracking data and selected to represent a wide range of behaviors that potentially impact and are impacted by COVID-19 case rates. Specifically, Google’s “residential” index [google_covid19_nodate] and Facebook’s “single tile user” and “tiles visited” indices [facebook_movement_nodate] were expected to capture behavior related to staying at home (i.e., sheltering-in-place). Additional indices from Google [google_covid19_nodate] captured changes in workplace attendance (“workplaces”), non-essential travel (“retail and recreation”), and use of public transportation (“transit stations”). Likewise, the “Device Exposure Index” (DEX) and “Adjusted Device Exposure Index” (DEX-A) [couture_measuring_2020] captured the density of human traffic at locations visited. Finally, Descartes Labs’ “m50” and “m50 Index” [descartes_labs_mobility_nodate] captured distance traveled outside the home. Full definitions of each index are provided in Appendix A.1. The ten mobility indices were only reported in counties where sufficient mobile phone observations were available to provide accurate and anonymous data, typically more populous counties. To address this and make the results more comparable across different indices, we excluded counties with fewer than 40,000 residents. This led to the exclusion of 1,948 of 3,139 counties (62%) in our main analysis, many of them rural and sparsely populated. The 1,182 included counties represent 90% of the U.S. population. A comparison of the counties included/excluded is provided in Appendix A.2. Notably for the interpretation of results, this inclusion/exclusion criteria changes our target population from all U.S. counties to the most populous U.S. counties.

To account for measured differences between counties on factors potentially influencing mobility and COVID-19 case rates (i.e., confounders), we collected a wide range of covariate values from public data sources. These spanned several domains: socioeconomic (ex: levels of poverty, unemployment, education, median household income, crowded housing, population density, urbanization); health (ex: rates of smoking and obesity, pollution, extreme temperatures); demographic (ex: distribution of age, race, sex); and political (ex: 2016 presidential election result, mask mandate level). Additionally, since the one-week-behind case rate could influence that week’s mobility (e.g., people may be less mobile if the perceived danger is higher) and subsequent transmission, previous week case rates were also included in the confounder set . The full list of variables collected and links to their sources are provided in Appendix A.3.

2.1 Data structure, causal model, & causal parameter

To assess impact over time while avoiding holiday- and weekday-level case reporting idiosyncrasies [noauthor_analysis_nodate], mobility and case data were binned into weeks using a simple mean. We then repeated our analysis in cross-sectional snapshots over each week during the period beginning June 1, 2020 and ending November 14, 2020 and separately for each mobility index. Our observed data consisted of our outcome , the new case rate; exposure , the mobility level; and the set of potential confounders . For each week and mobility index, we assumed that the following nonparametric structural equation model (NPSEM) characterized the data generating process [pearl_causality_2000]:

where , , and

are unobserved random variables and

, and are unspecified (non-parametric) functions. This causal model also encoded our assumption of independence between counties. (We discuss the plausibility of this assumption in the limitations section.) For each week and mobility index, we assumed the county-level observed data were generated by sampling from a distribution compatible with this causal model.

Following [diaz_nonparametric_2021], we then modified in the causal model to generate counterfactual case rates under the shifted exposure distributions. In our application, shifts did not depend on the covariates . Therefore, we simplified the treatment rule to and for shifts by a multiplicative constant or an additive constant , respectively. We selected shift amounts to reduce mobility or based on the index definition and range of observed values. For example, Facebook’s “single tile user” is a proportion of users who stayed in a single location (presumably, their residence) in a day and is thus bounded by 0 and 1. Based on its observed distribution (pooling over all follow-up weeks), we considered relative increases of 5, 7, and 8 percentage points, truncated at the upper bound of 1 if needed. For some indices, such as the DEX-A, which measures the density of mobile devices visiting commercial locations, larger multiplicative shifts were reasonable; an example of a reduction shift of 0.75 is shown in the left panel of Figure 1. As a third example, consider Google’s “retail and recreation” index, defined as the percentage of visits to non-essential businesses relative to a pre-COVID baseline of 100%. Here, we examined additive shifts of -4, -5, and -6 percentage points, bounded by the index limit of -100, an example of which is shown in the right panel of Figure 1. A list of shifts examined for each variable is given in Appendix A.1.

Figure 1: Example distribution shifts of mobility exposure variables. Observed values in black, shifted values in red.

Given the selected shift for each mobility index, we then specified the causal parameter as the expected counterfactual COVID-19 case rate if the mobility index had been shifted to versus if mobility index remained at its observed level :


With for multiplicative shifts or for additive shifts, setting or in the treatment rule recovers the observed exposure and outcome . In other words, the counterfactual outcomes under no shift equal the outcomes we observe.

In order to pare down our large covariate set, we chose eight variables to include in based on which had the strongest univariate associations with the outcome and exposure, pooled over all time slices. In supplementary analyses, we also tested strategies where (i) covariates were chosen week-by-week instead of being held constant across all time slices and (ii) a smaller covariate set was chosen to make possible examinations of larger shift values without practical positivity violations (discussed below in Section 2.2).

2.2 Identification & the statistical estimand

In order for the causal parameter (Eq. 1, written in terms of a summary measure of the distribution of counterfactuals) to be identified in terms of the observed data distribution, several assumptions were required [munoz_population_2012, haneuse_estimation_2013, diaz_nonparametric_2021]:

  • No unmeasured confounding: There are no unmeasured common causes of the mobility index and subsequent COVID-19 case rates, which we can formalize as . This can also be expressed in terms of the following distributional assumptions on the unmeasured factors: and either or . This can also be expressed in terms of the weaker conditional exchangeability assumption [haneuse_estimation_2013]: For each within the support of and , . Intuitively, this means that counties with mobility level could have been assigned to mobility level

    (by some hypothetical randomization mechanism that may involve W) with an unchanged outcome probability distribution for

    . In our context, this assumption would be violated if, for example, an unmeasured variable influenced both the county-level mobility and the COVID case rate. One possible example of this could be issuing of and compliance with NPIs; an NPI might induce reduced mobility and also influence non-mobility-related behavior that is associated with future case rates. We attempted improve the plausibility of this assumption by considering a very large confounder set that includes known correlates for our outcome variable and exposure. Given plausible concerns about data support (described next) and desire to avoid controlling for instrumental variables, we reduced the adjustment set based on univariate correlations between the exposure and outcome (pooling over follow-up weeks). Altogether, we cannot guarantee that this assumption holds, and therefore limit our interpretations to be statistical associations as opposed to causal effects.

  • Positivity: If is within the support of , then must also be within the support of . In practice, this means that for any given time-slice and set of adjustment covariates, there is a positive probability of finding a county with the same covariate values and a mobility level that matches the shifted value. As noted earlier, this is a weaker positivity assumption than for other causal effects, which require nonzero probability of all observations receiving a specified level of treatment (say, “low” or “high” mobility) within values of adjustment variables. Our weaker positivity assumption, while reasonable in theory, is viable in practice only when the specified shifts in exposure are not too drastic and the adjustment set not too large. In our main analysis, we restricted to shifts that were small enough to make this assumption tenable, while recognizing that this approach makes our causal effect data-adaptive. In a supplementary analysis, we used a smaller adjustment set and tested larger shifts.

  • There are three additional assumptions that are implied by our adoption of the NPSEM. (1) Independence of counties. This independence assumption is likely violated since there were state-level policies (e.g., stay-at-home orders) that plausibly created dependence between counties in a state. The independence assumption also implies no interfence: The exposure level for a given county does not affect the outcomes of the other counties. Given that infectious disease spread does not respect town, county, state, or country boundaries, this assumption may be unrealistic. (2) Consistency: If for any county, then , and hence the full observed set of outcomes when is simply . This means that the counterfactual outcome for a county with its observed mobility level is the observed outcome. This assumption is implied by our definition of counterfactuals as derived quantities through interventions on the NPSEM. (3) Time-ordering: The confounders precede the mobility exposure , which also precedes the case rate outcome .

If the identifiability assumptions hold, we could specify and focus our estimation efforts on a statistical estimand that equals the wished-for causal effect. In the (likely) case that they are not satisfied, we could still specify and focus our estimation efforts on a statistical estimand that is as close as possible to the causal parameter. Factoring the joint distribution of the observed data

into , it was shown in [haneuse_estimation_2013] that the statistical estimand corresponding to expected counterfactual outcome under treatment rule , , is given by


with the joint density of received exposures and covariate levels being integrated over. Equation 2 is equivalent to the extended g-computation formula of Robins et al. [robins_effects_2004]. Extending this slightly, our statistical estimand of interest, corresponding to the expected difference in COVID-19 case rates under shifted and observed mobility, is given by :

where denotes the new case rate and the observed data distribution. In words, our statistical estimand is the difference in expectation of new cases per 100,000-residents under a specified mobility shift , after adjusting for measured confounders, and the observed expectation of new cases per 100,000-residents . All interventions we examined were reductions in mobility; hence, under this definition, we expect to be negative if mobility reductions are associated with lower case rates.

3 Estimation

To estimate the expected outcome under the observed exposure , we can use the empirical mean outcome. For estimating the shifted parameter , we used targeted minimum loss-based estimation (TMLE) [laan_targeted_2011, laan_targeted_2018]

. TMLE is a general framework for constructing asymptotically linear estimators with an optimal bias-variance tradeoff for a target parameter; in our case,

. Typically, TMLE uses the factorization of into an outcome regression and an intervention mechanism , following which an initial estimate of the outcome regression is updated by a fluctuation that is a function of the intervention mechanism. The update step tilts the preliminary estimates toward a solution of the efficient influence function estimating equation, endowing it with desirable asymptotic properties such as double-robustness, described below. Further, it allows practictioners to utilize machine learning algorithms in the estimation of and while still maintaining valid statistical inference.

TMLE is a general framework; the exact method may vary depending on the data structure and (causal) question of interest. In this application, we used the TMLE implementation of Díaz [diaz_nonparametric_2021]. We summarize briefly here the algorithm; further technical details are supplied in [diaz_nonparametric_2021].

First we define the density ratio


where is the conditional density of under the shifted distribution and the conditional density as observed. Unfortunately, estimating the conditional density directly can be difficult in high dimensions. Few data-adaptive algorithms exist [diaz_nonparametric_2021], and for those that do [benkeser_highly_2016], computation time can be demanding and data sparsity can make estimates unstable. Luckily, an equivalent estimate for can be found by recasting it as a classification problem [diaz_nonparametric_2021]. In this formulation, a data set consisting of two copies of each observation is created, one assigned the observed exposure and one assigned the shifted exposure. Further, each is given an indicator for whether it received the natural value of treatment or the shifted value. Using Bayes’ rule and the fact that ,


In this formulation, the density ratio estimate can be calculated as a simple function of , which can be estimated using any binary classification method and which we will write as . This innovation by Díaz [diaz_nonparametric_2021] drastically improved computation time compared to estimating directly (as is done in several algorithms for stochastic interventions [hejazi_txshift_2020, hejazi_tmle3shift_2021]).

To implement estimation of and the conditional outcome regression flexibly, we used the ensemble machine learning algorithm Super Learner, allowing us to minimize parametric assumptions [laan_super_2007]

. Super Learner creates a convex combination of candidate algorithms chosen through cross-validation to minimize a user-specified loss function. In our case, the Super Learner ensemble included generalized linear models, generalized additive models, gradient boosted decision trees, random forest, and a simple mean.

With these nuisance function estimates in hand, we used the following process to estimate , modified for the single time point case from [diaz_nonparametric_2021].

  • Define and calculate weights for , calculated from estimated via Super Learner

  • After scaling our bounded continuous outcome to [gruber_targeted_2010]

    , fit the logistic regression on the observed (scaled)


    with weights , calculating the estimated intercept

  • Update the estimates as

After unscaling, with the updated estimates in hand, define as:

The TMLE estimator is ‘doubly robust’ in that it is consistent if either the estimators of or

are consistently estimated and converge at appropriate rates. Further, if both meet those conditions, along with certain regularity conditions, the TMLE converges to a normal distribution with mean 0 and variance equal to the non-parametric efficiency bound, as shown in Theorem 3 of

[diaz_nonparametric_2021]. While convergence at the required rate for each nuisance parameter is not guaranteed, the inclusion of a wide variety of algorithms in our Super Learner ensemble maximizes our chance of achieving these rates [laan_super_2007]. The efficiency bound for this TMLE depends on the variance of the outcome conditional on and , the amount of treatment effect variability, and the intensity of shift applied, as measured by the density ratio [diaz_nonparametric_2021].

Using the empirical mean to estimate , we hence have our desired estimate

From this result, variance estimation for can be calculated using the estimated influence curves from and the empirical mean:

with the influence curve defined as follows: [diaz_nonparametric_2021]

. This allowed for the calculation of the 95% Wald-style confidence interval


All statistical analysis was performed in using R version 4.0.3 [r_core_team_r_2020], in particular the ltmp_tmle function in the lmtp package [williams_lmtp_2020]. Code and the dataset to reproduce the analysis is available at the authors’ GitHub repository (https://github.com/joshua-nugent/covid-mtp).

4 Results

The following figures show estimates of , the difference between the COVID-19 case rate that would be expected under the shifted distribution, after adjusting for measured confounders, and the observed average case rate, with associated 95% confidence intervals (CIs), for weekly time slices from June 1, 2020 to November 14, 2020. For comparison, we also show the unadjusted estimate, which is defined analogously to , simply without any covariates included in the adjustment set.

Figure 2: Difference in mean predicted case rates two weeks ahead with shifted distribution vs. observed case rates for weekly time slices from June 1 - November 14, 2020. Unadjusted analyses show some correlation between mobility and case rates (negative gray values), but after adjustment (colored values), there is no consistent association across the set of weeks examined. The +5% increase corresponds to a reduction in mobility (more people staying at home).

As an illustrative example, consider Figure 2, showing for the exposure “Facebook single tile users,” which measures the proportion of mobile devices in a county that stayed in a single location (presumably, a home) each day, averaged over the week. In the first time-slice (June 1, 2020), the unadjusted results, in gray, showed a small association between reduced mobility (5% increase in the proportion of people staying at home) and reduced case rates: -3.2 (95% CI: -4.4, -1.9) cases per 100,000 residents two weeks later. After adjusting for confounders (in red), the change in case rates shrinks to -0.7 (-2.2, 0.7). In the week of September 28, the results are similar but more extreme: a 5% increase in the proportion of people staying at home in the unadjusted analysis suggests a change of -16.0 (-18.6, -13.3) cases per 100,000 residents two weeks later, while that association is attenuated to -4.5 (-7.0, -2.0) cases per 100,000 residents after confounder adjustment. Across all of the time slices for that mobility index, we see the same pattern: Unadjusted estimates of vary, but are negative, as we would expect; lower mobility appears to be associated with lower case rates. After adjustment, however, the estimates shrink towards zero and many of the CIs overlap with zero.

Figure 3: Difference in mean predicted case rates two weeks ahead with shifted distribution vs. observed case rates for weekly time slices from June 1 - November 14, 2020. Unadjusted analyses show some correlation between mobility and case rates (gray values with 95% CIs below the null line of 0), but after adjustment (colored values), no indices show consistent associations across the weeks examined. For most indices, the majority of confidence intervals include 0 after adjustment. Recall that the two stay-at-home measures, Facebook’s ‘single tile users’ and Google’s ‘residential’, were assigned positive shifts, which corresponded to a reduction in mobility.

Broadening the scope to the full set of indices examined in Figure 3, we see some consistent patterns. First, almost all unadjusted estimates are larger than the adjusted ones, though the point estimates are usually in the same direction after adjustment. Second, the CIs increase in October and November, suggesting higher outcome variance, more treatment effect variability, or higher density ratios (perhaps from near practical positivity violations). Third and most notable, after adjustment for baseline confounders and one-week lagged case rates, the associations were generally small, CIs overlapped with zero, and point estimates showed almost no consistent trends of positive or negative association.

For example, only six of the indices showed even short-term consistent associations with the outcome after adjustment, perhaps due to seasonal effects: Facebook’s ‘single tile users’ stay-at-home metric (Aug, Sept, Oct), PlaceIQ’s ‘DEX’ and ‘DEX-A’ measures of device density (Jun, Aug), Decartes Labs ‘m50 index’ of distance traveled from the home (Aug, Sept, Oct), Google’s workplace index (Sept, Oct), and Google’s stay-at-home residential index (Sept, Oct). However, these same indices had an uncertain associations in other time periods, with CIs overlapping zero, and in several instances had point estimates (and sometimes and CIs) unexpectedly above zero.

Surprisingly, three indices showed positive unadjusted point estimates in June and July (Facebook ‘tiles visited’ and Google’s ‘retail & recreation’ and ‘residential’ indices), with Facebook’s ‘tiles visited’ index showing an unusually large swing between summer and fall. After adjustment, the estimates were strongly attenuated. These data points may be displaying most clearly the confounding effect of the covariates included in our adjustment set. More discussion of possible explanations for the mixed results are presented in Section 5.

We performed several sensitivity analyses that did not meaningfully change our results (Appendix B). Those include using current case rates instead of one-week lagged case rates in the adjustment set, and using one-week leading case rates as an outcome instead of two-week leading rates. Further, after noting the limitations of tree-based methods in TMLE discussed in [balzer_demystifying_2021]

, we re-ran the analysis omitting the random forest algorithm from the Super Learner and adding multivariate adaptive regression splines; this change did not substantively alter the results. Our two supplementary analyses, one with a time-varying confounder set

and the other with a smaller confounder set (four rather than eight) and larger shifts, are presented in Appendix C. Neither supplementary analysis differed significantly from the main results.

5 Discussion

To the best of our knowledge, we are the first to apply a doubly-robust MTP/TMLE approach to provide insights into a pressing question in infectious disease research. Specifically, we aimed to answer: Did reductions in mobility, as captured the mobile phone data, lead to reduction in subsequent COVID-19 case rates across U.S. counties in the summer and fall of 2020? We also hope this paper will serve as a demonstration of how MTP can be used by other researchers to minimize (likely unjustifiable) parametric assumptions when defining causal effects and how TMLE can be applied to estimate the corresponding statistical parameters. We again note that the causal inference literature for binary exposures is vast and well-established, especially as compared to the corresponding literature for continuous exposures. We believe MTP is a powerful and under-utilized approach to defining effects for continuous exposures.

Our findings on how the associations between mobility and COVID spread vary over time and with adjustment are in line with the results found in earlier research. In an analysis using data from the 25 hardest-hit U.S. counties through mid-April 2020, Badr et al. [badr_association_2020] fit an unadjusted statistical model for future case growth rates based on a mobility measure using number of trips made per day compared to a pre-pandemic baseline. While strong associations were found between changes in their mobility index and county-level case growth 9-12 days later, a later updated analysis [badr_limitations_2020], using data through September 16 and expanding the dataset to all U.S. counties, showed that the association disappeared after April. Similarly, Gatalo et al. [gatalo_associations_2020], using a slightly different methodology, found strong correlations between mobility and case growth from March 27 to April 20, but only a weak correlation from then until July 22. Another analysis [xiong_mobile_2020], using data through June 9, 2020, treated the inflow of traffic into a county as the exposure, adjusted for a small number of socio-demographic covariates, and found a strong relationship between inflow and new cases 7 days later. Finally, an examination [li_association_2020] of the correlations between Google’s six mobility indices (retail & recreation, transit stations, workplaces, residential, parks, grocery & pharmacy) and 11-day ahead case growth rates at the county level, stratifying by urban-rural classification, suggested strong associations in the February to April timeframe, especially in more dense counties, but those correlations were weaker when the timeframe was extended to June.

Though our unadjusted analyses seem to suggest that decreased mobility was associated with reductions in COVID-19 case rates for the majority of timepoints examined, adjusted results suggest that the associations between commonly studied indices and COVID-19 case rates may be highly confounded by county characteristics and current case rates. For example, our adjustment set included variables for county-level political lean, which may be a proxy for adherence to NPIs such as mask-wearing mandates. On first blush, it could appear that counties with lower mobility have fewer new cases, but if low-mobility counties are also more consistently adhering to mask-wearing due to underlying political dynamics, the association with mobility could vanish after adjustment. Including one-week lagged case rates in the adjustment set (as we argue is necessary in the infectious disease context) also demonstrates important confounding by epidemic arrival time and intensity. Our work adds to other evidence of confounding in COVID-19 research; for example, McLaren [mclaren_racial_2020] found that some correlations between racial/ethnic groups and COVID-19 mortality disappeared once confounding factors were taken into account, and Wong and Balzer [wong_evaluating_2021] found that associations between mask mandates and future case rates were attenuated after confounder adjustment. We suspect this confounding may explain why our results contradict those of Wellenius et al. [wellenius_impacts_2021], who found that mobility levels (using many of the same county-level metrics) in Spring 2020 were strongly associated with COVID case growth rates two weeks later. Their work differed from this analysis in that the researchers did not include any socioeconomic covariates in their parametric model nor extend their analysis beyond March and early April.

There are several limitations to our analysis, some of which may explain the inconsistent patterns of association we see over time and suggest avenues for future investigation. First, while we chose a timeframe when COVID testing was widely available, differential ascertainment of case rates across counties could bias our estimates. Second, while county-level mobility and covariate data are easily accessible, there might be important sub-county level effects that are masked when aggregating to the county level. Performing this analysis at the zip code or census tract level might reveal stronger or more consistent associations. Indeed, an analysis of zip-code level mobility data and case rates in five cities [glaeser_how_2020], while finding a strong association, also reported substantial heterogeneity across space and time. Third, our assumption of independence between counties is unlikely to hold, given state-level policy enactment and spillover effects between counties. Network effects can be unpredictable [lee_network_2021]

, so we hesitate to speculate about the directions in which they might have biased our point estimates or standard errors, but they are likely in play.

Fourth, despite our large adjustment set, it is possible that we excluded an important confounder, or that we have unmeasured confounding, as is common in studies with observational rather than experimental data. For example, we had no data on the strength of a county’s public health infrastructure or compliance with NPIs, which could presumably impact both the level of mobility observed and new case rates. Related to the dimension of the adjustment set , practical positivity violations may have also limited our ability to detect associations. After adjusting for demographics, political lean, and economic variables, the variability in exposure and outcome across counties may have been small, limiting the amount of shift we could apply without generating extreme density ratios and the accompanying large standard errors (recall that the density ratio is a key component of the efficiency bound). If mobility index data were available for the sparsely populated counties that we excluded, it might be possible to expand our target population and investigate a wider range of intervention shift values that might detect stronger or more consistent associations. Fifth, there are still many unknowns about the dynamics of COVID transmission over space and time. Including one week lagged case rates in our adjustment set may have helped address some of those unknowns, but is likely not able to account for all of the complex dynamics of COVID transmission. For example, the return to schools in fall 2020 may partially explain how several indices show different behavior in the summer and fall: the adjusted DEX and DEX-A point estimates change sign and the adjusted Descartes Labs estimates shift from minimal associations to stronger ones. Finally, our choice of a cross-sectional time-series analysis in weekly slices may have hidden complex patterns over time. A longitudinal analysis that examines the full scope of mobility over time and its impact on cumulative cases per capita could provide a more complete view of the full impact (or lack thereof) of mobility on case transmission. Doubly-robust longitudinal MTP methods are available in the ltmp R package and offer a promising avenue to answer those questions.

Despite these limitations, our analysis provides evidence that mobility indices may not be a simple proxy measure for future case rates with airborne infectious diseases. Using MTP allowed us to consider shifts in mobility suppored by our observed data, and TMLE allowed for the leveraging of machine learning to minimize parametric assumptions while still providing statistical inference. For continuous exposures, MTP can be a useful tool for important research and policy questions.

Appendix A Data details

a.1 Mobility index sources, definitions, and shifts considered

Facebook Movement Range

  • Change in movement: Percentage change in the number of ‘Bing tiles’ (roughly 600 m by 600 m) a device pinged in a given day, compared to the baseline average for that weekday in February 2020, excluding Presidents’ Day, aggregated at the county level. We examined additive shifts of -3 and -4.

  • Stay put: Percentage of people who stay in a single Bing tile over an entire day. We examined multiplicative shifts of 1.05, 1.07, and 1.08.

Google COVID-19 Community Mobility Reports

For Google indices, all of the baseline values are defined as the median value for that day of the week from January 3, 2020, to February 6, 2020.

  • Retail & recreation: Percentage change in number of visits and length of stay from baseline for restaurants, malls, movie theaters, and retail stores (excluding grocery stores and pharmacies). We examined additive shifts of -4, -5, and -6.

  • Transit stations: Percentage change in number of visits and length of stay from baseline for mass transit stations (buses, subways, etc). We examined additive shifts of -5, -10, and -15.

  • Workplaces: Percentage change in number of visits and length of stay from baseline for an individual’s place of work. We examined additive shifts of -2, -3, and -4.

  • Residential: Percentage change from baseline of duration time spent at one’s residence each day. Since the baseline values are high, the variation in this index is smaller than the others. We examined additive shifts of +1, +2, and +3.


  • Device exposure index (DEX): number of distinct devices that entered the same commercial spaces that day as the target device, then averages that number across all devices in the county. We examined multiplicative shifts of 0.75, 0.7, and 0.65.

  • Adjusted DEX: Scaled version of the DEX that adjusts for the possibility of devices sheltering in place, generating no pings, and hence not having their zero-DEX values be counted in the overall average. We examined multiplicative shifts of 0.75, 0.7, and 0.65.

Descartes Labs m50

  • m50: Median maximum distance traveled from starting location by all tracked devices in a county. We examined multiplicative shifts of 0.6, 0.7, and 0.8.

  • m50 index: Relative value of m50 relative to median baseline value for that day of the week between February 2, 2020, and March 7, 2020. We examined multiplicative shifts of 0.9 and 0.85.

a.2 Comparison of counties included/excluded

All mobility indices require sufficient sample size within a county to report reliable data on a given day. In low-population counties, this can make the mobility data sparse or nonexistent over the timeframe in question. Hence, to improve comparability across different indices and address positivity violations, we restricted our analysis to counties with 40,000 or more residents. The figure below compares the covariate distribution across the included/excluded counties. While a vast number of counties were excluded, they represent only 10% of the U.S. population, and covariate means were similar to the counties included.

All counties Included Excluded
N 3,158 1,182 1,926
Mean population 105,362 250,886 16,053
Mean population density (/sq. mi) 27,167 62,366 5,608
% Adults without health insurance 14.3% 13% 15%
% Under 18 22.3% 22.6% 22.2%
% Households below poverty line 15.2% 13.8% 16.1%
% Hispanic 9.3% 10.8% 8.4%
% Women 49.9% 50.6% 49.5%
% Black 8.9% 9.9% 8.5%
% Clinton vote in 2016 election 31.7% 38.6% 27.8%
% Drive alone to work 79.2% 80.5% 78.9%

a.3 Adjustment variables and their sources

We gathered a rich set of social, economic, physical, and demographic variables from the U.S. Census American Community Survey, U.S. Bureau of Labor Statistics, U.S. Centers for Disease Control, U.S. Department of Agriculture, University of Washington Population Health Institute and Department of Political Science, National Oceanic and Atmospheric Administration, and MIT Election Data and Science Lab. These included the covariates below, which we screened into the adjustment set after examining their correlations with the exposure (mobility index) and outcome (leading new cases per 100,000 people).

  • U.S. Census American Community Survey

    • Percentage of households below the poverty line

  • University of Washington Population Health Institute, via Robert Wood Johnson Foundation County Health Rankings

    • Percentage of people that identify as Black

    • Percentage of people that identify as Hispanic

    • Percentage of people that drive alone to work

    • Percentage of people that are women (note: values under 49% are often an indicator of a large prison being in the county)

    • Percentage of people that are under 18 years of age

    • Percentage of adults without health insurance

  • MIT Election Data and Science Lab

    • Percentage of vote for Hillary Clinton in the 2016 Presidential election

The following covariates were also examined, but were removed from the final adjustment set after the screening process.

  • U.S. Census American Community Survey

    • Population density

    • Natural logarithm of population density

    • Average household size

    • Percentage of people living below the poverty line

    • Percentage of households with income below the poverty line

    • Percentage of people in the county who live outside of a metropolitan area

  • University of Washington Population Health Institute, via Robert Wood Johnson Foundation County Health Rankings

    • Percentage of adults who smoke

    • Percentage of people that identify as Asian

    • Percentage of people that are 65 years of age or older

    • Percentage of people with more than a high school education

    • Index of income inequality

    • Natural logarithm of median household income

    • Level of air particulate matter (pollution)

  • State-level social distancing policies in response to the 2019 novel coronavirus in the U.S., University of Washington Department of Political Science

    • Mask mandate level (statewide data only)

  • Centers for Disease Control and Prevention

    • Percentage of people who live in crowded housing

    • Percentage of people categorized as obese

    • Percentage of people living with diabetes

  • National Oceanic and Atmospheric Administration

    • Absolute daily high temperature deviation from 70 degrees Farenheit

  • U.S. Bureau of Labor Statistics

    • Unemployment rate

  • U.S. Department of Agriculture

    • Presence of a level 5 meatpacking plant (largest plants)

    • Presence of a level 4 meatpacking plant (second largest plants)

    • Presence of a level 4 or level 5 meatpacking plant

Appendix B Sensitivity analyses

b.1 Using one-week leading case rates as outcome

b.2 Using current-week cases in the adjustment set instead of one-week lagged cases

b.3 Observed exposure distributions over time

Appendix C Supplementary analyses

c.1 Using a time-varying confounder set

We tested three of the indices allowing for a different confounder set , screened in each week time-slice by univariate correlations with the exposure and outcome. As with the main analysis, the adjusted results (shown below for several shift value possibilities) show inconsistent patterns over time and CIs frequently overlap with zero.

c.2 Using a smaller confounder set with larger shifts

We tested three of the indices with a smaller, time-varying confounder set (four rather than eight), screened in each week time-slice by univariate correlations with the exposure and outcome, with larger shift values. As with the main analysis, the adjusted results (shown below for several shift value possibilities) show inconsistent patterns over time and CIs frequently overlap with zero.