1 Introduction
During a pandemic caused by a virus novel to the human immune system, such as today’s SARS-CoV-2 (COVID-19), policymakers have only nonpharmaceutical interventions (NPIs) to rely upon while treatments and vaccines are developed, tested, and distributed. Many NPIs, such as school closings or stay-at-home orders, are designed to reduce the number of close contacts a person has in order to curtail community spread. However, NPIs are only effective if they actually create the desired changes in behavior and if the changes in behavior have a causal impact on disease transmission.
To aid in COVID-19 response, private companies and research groups [huang_twitter_2020, pepe_covid19_2020, couture_measuring_2020, gao_mapping_2020] developed and maintain data sets aiming to quantify mobility and social distancing in the U.S. and worldwide [google_covid19_nodate] [facebook_movement_nodate]. These mobility indices are generated from anonymized mobile device tracking data aggregated at some geographic level, such as ZIP code or county. Despite privacy concerns, these indices may be a useful tool for researchers and public health officials [buckee_aggregated_2020]. However, it is not clear which of these indices, if any, are meaningful proxies for behaviors that are related to COVID-19 spread, and whether any associations identified persist over time [grantz_use_2020]. Evaluating the potential causal impact of mobility on disease spread is complicated by, among other things, the presence of confounders, varying epidemic arrival time across different areas, and changing governmental policies.
Our goal was to evaluate the effect of mobility, as captured through shifts in the distributions of various indices, on subsequent COVID-19 case rates during Summer and Fall 2020, a period when social distancing was generally encouraged in the U.S. To do so, we employed state-of-the-art methods in causal inference to first define the causal effect using a modified treatment policy (MTP) and then estimate the corresponding statistical parameter using targeted minimum loss-based estimation (TMLE) with Super Learner to adjust for a wide range of confounders and avoid unsubstantiated modeling assumptions [laan_super_2007, diaz_nonparametric_2021, laan_targeted_2011, munoz_population_2012, haneuse_estimation_2013]. We focused our analysis on U.S. counties, the administrative level where the most fine-grained data are available on mobility as well as a rich set of economic and demographic confounders.
The remainder of this article is organized as follows. We first provide a brief overview of prior work investigating mobility and COVID-19 transmission. We then discuss the definition and use of MTPs in the context of our research question, detail our causal identification assumptions and statistical estimation approach, and present our results. We conclude with a discussion of how our findings compare to prior research and some limitations of our analysis. Additionally, we highlight software options for researchers interested in using MTPs in their own work. Finally, detailed descriptions of the observed data and sensitivity analyses are included in the appendices.
1.1 Brief review of the literature on mobility & COVID-19 transmission
Recent papers investigating the associations between mobility, COVID-19 spread, and NPIs can be roughly categorized into one of three bins: work that uses mobility as an input into a more complex model to forecast COVID-19 cases and deaths (see, e.g., [chang_mobility_2020, chernozhukov_causal_2021, karaivanov_face_2020]); work that assesses the effectiveness of, or compliance with, NPIs, using mobility as an outcome variable (see, e.g., [jay_neighbourhood_2020, gupta_tracking_2020, lai_unsupervised_2020, yabe_noncompulsory_2020, allcott_what_2020, wellenius_impacts_2021]); and finally work that treats mobility as an exposure variable and new COVID-19 case rates or deaths as the outcome. This work falls clearly in the last category.
Most papers focusing on the early-pandemic (Spring 2020) timeframe found strong correlations between mobility and COVID-19 spread [li_association_2020, badr_association_2020, xiong_mobile_2020, persson_monitoring_2021, wellenius_impacts_2021]. However, these early studies are potentially confounded by the spatial dynamics of where the pandemic arrived first and unmeasured variables such as underlying changes in behavior including mask-wearing or the propensity to socialize outdoors rather than in closed indoor environments (see, e.g., the heterogeneity identified by [allcott_what_2020]). In addition, they may be subject to bias from undercounted cases due to lack of testing capacity in many parts of the U.S. [noauthor_covid_nodate]. Given these challenges, it is perhaps unsurprising that the associations between mobility and COVID-19 cases were strongly attenuated in studies extended through mid-summer of 2020 [badr_limitations_2020, gatalo_associations_2020].
Our work contributes to the existing literature in four critical ways. First, we extend the examination of the relationships between mobility and COVID-19 case rates until just before the U.S. Thanksgiving holiday. Second, we use an MTP approach, described next, to define relevant causal effects and avoid arbitrary categorizations of the continuous exposure. Third, we use TMLE with Super Learner for estimation and inference, allowing us to adjust for a rich set of county-level confounders while relaxing the parametric assumptions used in much of the related modeling research. Finally, we examine a wide set of mobility indices to determine which, if any, might be associated with COVID-19 case rates, and hence be possible proxies for the underlying behaviors that drive COVID-19 spread.
1.2 The modified treatment policy approach
Given the extensive literature on causal inference for binary exposures, one approach to examining mobility effects would be to categorize the continuous exposure into “high” and “low” levels and ask what would be the expected difference in counterfactual COVID-19 case rates if all counties had a “high” vs. “low” level of mobility. However, selecting a binary cut point for this categorization is arbitrary and risks losing information. Additionally, counties with a high proportion of essential workers might not be able to reduce their overall mobility to the specified “low” level [bembom_practical_2007], violating the positivity assumption that all counties must have nonzero probability of receiving the specified exposure level within all values of measured adjustment variables [petersen_diagnosing_2012]. As an alternative to categorizing the exposure, we could consider targeting the parameters of a working marginal structural model [robins_marginal_1998, neugebauer_nonparametric_2007] or directly attempt to estimate a causal ‘dose-response’ curve to capture how a continuous measure of mobility impacts the expected counterfactual case rates [westling_causal_2019, kennedy_nonparametric_2017]. However, as before, there may be challenges with data support from structural and practical violations of the positivity assumption.

Instead, we took a nonparametric approach to specify a causal effect that minimized parametric assumptions and naturally satisfied the positivity assumption [laan_targeted_2011, laan_targeted_2018]. To do so, we defined our research question using an MTP, which aims to quantify the impact on the counterfactual outcome of an intervention that shifts the exposure for each observation to a new level $d(a, w)$, set by a deterministic function of the observed ‘natural’ exposure level $a$ and covariates $w$ through some shift function $d$ [haneuse_estimation_2013]. In our example, we might consider an MTP where the mobility shift amount depended on population density: sparse counties would be assigned a mobility reduction of 10% while dense counties would be assigned a reduction of 20%. However, for simplicity and demonstration, we ignored covariates when determining shift amounts, instead examining the expected counterfactual outcome if the mobility distribution across counties were shifted by an additive or multiplicative constant; for example, if the percentage of people who stayed in their home all day were increased by 5 points from the observed level in each county.
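The shift functions described above can be sketched in code. The following is an illustrative Python sketch (the analysis itself was conducted in R); the function names are ours, and the population-density cutoff of 500 is a hypothetical value chosen only to mirror the sparse/dense example in the text.

```python
def additive_shift(a, delta, upper_bound=None):
    """Shift the observed exposure a by an additive constant delta,
    truncating at the index's upper bound if one exists."""
    shifted = a + delta
    if upper_bound is not None:
        shifted = min(shifted, upper_bound)
    return shifted

def multiplicative_shift(a, k):
    """Shift the observed exposure a by a multiplicative constant k."""
    return a * k

def density_dependent_shift(a, pop_density, dense_cutoff=500.0):
    """A hypothetical covariate-dependent MTP d(a, w): dense counties get a
    20% mobility reduction, sparse counties a 10% reduction (the cutoff is
    an illustration value, not one used in the analysis)."""
    return a * (0.80 if pop_density >= dense_cutoff else 0.90)
```

For instance, a stay-at-home proportion of 0.97 shifted by +0.05 would be truncated at the bound of 1, exactly the kind of boundary handling the MTP definition accommodates.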
An MTP is similar to a stochastic intervention, where the exposure is a random draw from a shifted distribution of the observed exposures [didelez_direct_2012, munoz_population_2012, stock_nonparametric_1989] and where the shift amount can depend on covariate values. For example, Díaz and van der Laan [munoz_population_2012] considered patients’ pre-existing health conditions when evaluating the effect on mortality of a shift in exercise levels, the exposure of interest. Recent work has extended the MTP approach to longitudinal settings and introduced a less computationally intensive estimation algorithm of the corresponding statistical parameter [diaz_nonparametric_2021].
The MTP approach was attractive in our context because it allows, as per Haneuse and Rotnitzky [haneuse_estimation_2013], “the set of feasible treatments for each subject [to depend] on attributes that are not fully captured by baseline covariates but which are reasonably captured by the treatment actually received.” There is no assumption that, for example, the densely populated New York County would be shifted to the mobility level of a rural area, even if some observed covariate values (say, demographics) were similar. Full details of the assumptions and estimation strategy are given in Section 2. The MTP approach has been used in several biomedical settings [diaz_causal_2020, kamel_relationship_2019, hubbard_timedependent_2013] and in at least one public policy context [rudolph_when_2021], but to our knowledge this is the first time it has been applied to infectious diseases.
2 Methods
At the county level, we aimed to evaluate the effect of shifting mobility on new COVID-19 cases per 100,000 residents two weeks later during the period from June 1 to November 14, 2020. Data on confirmed COVID-19 case rates were obtained from the New York Times [the_new_york_times_coronavirus_2020] and USAFacts [noauthor_coronavirus_nodate]. In line with other studies [badr_association_2020, li_association_2020], a two-week “leading” period was selected based on the reported incubation time for COVID-19 [linton_incubation_2020, lauer_incubation_2020, li_early_2020] and to account for delays in case reporting. In sensitivity analyses, we examined the impact on one-week leading case rates.
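The pairing of each week’s exposure with a leading outcome, together with the weekly binning described in Section 2.1, can be sketched as follows. This is an illustrative Python sketch with function names of our own choosing, not the analysis code.

```python
from statistics import mean

def weekly_means(daily_values, days_per_week=7):
    """Bin a daily series into weekly simple means, dropping any partial
    trailing week."""
    n_full = len(daily_values) // days_per_week * days_per_week
    return [mean(daily_values[i:i + days_per_week])
            for i in range(0, n_full, days_per_week)]

def pair_with_leading_outcome(weekly_exposure, weekly_case_rate, lead=2):
    """Pair each week's mobility exposure with the case rate `lead` weeks
    later (a two-week lead in the main analysis, one week in sensitivity
    analyses)."""
    n_pairs = min(len(weekly_exposure), len(weekly_case_rate) - lead)
    return [(weekly_exposure[t], weekly_case_rate[t + lead])
            for t in range(n_pairs)]
```

Each resulting (exposure, outcome) pair then forms one cross-sectional snapshot in the week-by-week analysis.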
We considered 10 mobility indices, generated from mobile phone tracking data and selected to represent a wide range of behaviors that potentially impact and are impacted by COVID-19 case rates. Specifically, Google’s “residential” index [google_covid19_nodate] and Facebook’s “single tile user” and “tiles visited” indices [facebook_movement_nodate] were expected to capture behavior related to staying at home (i.e., sheltering in place). Additional indices from Google [google_covid19_nodate] captured changes in workplace attendance (“workplaces”), nonessential travel (“retail and recreation”), and use of public transportation (“transit stations”). Likewise, the “Device Exposure Index” (DEX) and “Adjusted Device Exposure Index” (DEXA) [couture_measuring_2020] captured the density of human traffic at locations visited. Finally, Descartes Labs’ “m50” and “m50 Index” [descartes_labs_mobility_nodate] captured distance traveled outside the home. Full definitions of each index are provided in Appendix A.1. The ten mobility indices were only reported in counties where sufficient mobile phone observations were available to provide accurate and anonymous data, typically more populous counties. To address this and make the results more comparable across different indices, we excluded counties with fewer than 40,000 residents. This led to the exclusion of 1,948 of 3,139 counties (62%) in our main analysis, many of them rural and sparsely populated. The 1,182 included counties represent 90% of the U.S. population. A comparison of the counties included and excluded is provided in Appendix A.2. Notably for the interpretation of results, this inclusion/exclusion criterion changes our target population from all U.S. counties to the most populous U.S. counties.
To account for measured differences between counties on factors potentially influencing mobility and COVID-19 case rates (i.e., confounders), we collected a wide range of covariate values from public data sources. These spanned several domains: socioeconomic (e.g., levels of poverty, unemployment, education, median household income, crowded housing, population density, urbanization); health (e.g., rates of smoking and obesity, pollution, extreme temperatures); demographic (e.g., distribution of age, race, sex); and political (e.g., 2016 presidential election result, mask mandate level). Additionally, since the one-week-behind case rate could influence that week’s mobility (e.g., people may be less mobile if the perceived danger is higher) and subsequent transmission, previous-week case rates were also included in the confounder set $W$. The full list of variables collected and links to their sources are provided in Appendix A.3.
2.1 Data structure, causal model, & causal parameter
To assess impact over time while avoiding holiday and weekday-level case reporting idiosyncrasies [noauthor_analysis_nodate], mobility and case data were binned into weeks using a simple mean. We then repeated our analysis in cross-sectional snapshots over each week during the period beginning June 1, 2020 and ending November 14, 2020 and separately for each mobility index. Our observed data consisted of our outcome $Y$, the new case rate; exposure $A$, the mobility level; and the set of potential confounders $W$. For each week and mobility index, we assumed that the following nonparametric structural equation model (NPSEM) characterized the data generating process [pearl_causality_2000]:

$W = f_W(U_W); \quad A = f_A(W, U_A); \quad Y = f_Y(W, A, U_Y)$

where $U_W$, $U_A$, and $U_Y$ are unobserved random variables and $f_W$, $f_A$, and $f_Y$ are unspecified (nonparametric) functions. This causal model also encoded our assumption of independence between counties. (We discuss the plausibility of this assumption in the limitations section.) For each week and mobility index, we assumed the county-level observed data $O = (W, A, Y)$ were generated by sampling from a distribution compatible with this causal model.

Following [diaz_nonparametric_2021], we then modified $f_A$ in the causal model to generate counterfactual case rates under the shifted exposure distributions. In our application, shifts did not depend on the covariates $W$. Therefore, we simplified the treatment rule to $d(a) = a \cdot k$ and $d(a) = a + \delta$ for shifts by a multiplicative constant $k$ or an additive constant $\delta$, respectively. We selected shift amounts to reduce mobility (or increase sheltering in place) based on the index definition and range of observed values. For example, Facebook’s “single tile user” index is a proportion of users who stayed in a single location (presumably, their residence) in a day and is thus bounded by 0 and 1. Based on its observed distribution (pooling over all follow-up weeks), we considered increases of 5, 7, and 8 percentage points, truncated at the upper bound of 1 if needed. For some indices, such as the DEXA, which measures the density of mobile devices visiting commercial locations, larger multiplicative shifts were reasonable; an example of a multiplicative reduction shift of 0.75 is shown in the left panel of Figure 1. As a third example, consider Google’s “retail and recreation” index, defined as the percentage of visits to nonessential businesses relative to a pre-COVID baseline of 100%. Here, we examined additive shifts of 4, 5, and 6 percentage points, bounded by the index limit of 100, an example of which is shown in the right panel of Figure 1. A list of shifts examined for each variable is given in Appendix A.1.
Given the selected shift for each mobility index, we then specified the causal parameter as the expected counterfactual COVID-19 case rate if the mobility index had been shifted to $A^d = d(A)$ versus if the mobility index remained at its observed level $A$:

$\theta = E[Y(A^d)] - E[Y(A)] \quad (1)$

With $d(a) = a \cdot k$ for multiplicative shifts or $d(a) = a + \delta$ for additive shifts, setting $k = 1$ or $\delta = 0$ in the treatment rule recovers the observed exposure $A$ and outcome $Y$. In other words, the counterfactual outcomes under no shift equal the outcomes we observe.
In order to pare down our large covariate set, we chose eight variables to include in $W$ based on which had the strongest univariate associations with the outcome and exposure, pooled over all time slices. In supplementary analyses, we also tested strategies where (i) covariates were chosen week-by-week instead of being held constant across all time slices and (ii) a smaller covariate set was chosen to make possible examinations of larger shift values without practical positivity violations (discussed below in Section 2.2).
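The screening step can be illustrated with a simple correlation ranking. This Python sketch is our own illustration of the general idea (ranking by absolute univariate association with the outcome); the exact screening statistic used in the analysis may differ, and the function names are ours.

```python
def pearson(x, y):
    """Plain Pearson correlation, implemented without external dependencies."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5

def screen_covariates(candidates, outcome, k=8):
    """Keep the k candidate covariates with the strongest absolute univariate
    correlation with the outcome. `candidates` maps name -> list of values."""
    ranked = sorted(candidates,
                    key=lambda name: -abs(pearson(candidates[name], outcome)))
    return ranked[:k]
```

In the analysis itself, associations with both the outcome and the exposure were considered, pooled over all time slices.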
2.2 Identification & the statistical estimand
In order for the causal parameter (Eq. 1, written in terms of a summary measure of the distribution of counterfactuals) to be identified in terms of the observed data distribution, several assumptions were required [munoz_population_2012, haneuse_estimation_2013, diaz_nonparametric_2021]:

No unmeasured confounding: There are no unmeasured common causes of the mobility index and subsequent COVID-19 case rates, which we can formalize as $U_A \perp U_Y$. This can also be expressed in terms of the following distributional assumptions on the unmeasured factors: $U_A \perp U_Y$ and either $U_A \perp U_W$ or $U_Y \perp U_W$. This can also be expressed in terms of the weaker conditional exchangeability assumption [haneuse_estimation_2013]: for each $a$ within the support of $A$ and $w$ within the support of $W$, $Y(a) \perp A \mid W = w$. Intuitively, this means that counties with mobility level $a$ could have been assigned to mobility level $d(a, w)$ (by some hypothetical randomization mechanism that may involve $W$) with an unchanged outcome probability distribution for $Y$. In our context, this assumption would be violated if, for example, an unmeasured variable influenced both the county-level mobility and the COVID-19 case rate. One possible example of this could be issuing of and compliance with NPIs; an NPI might induce reduced mobility and also influence non-mobility-related behavior that is associated with future case rates. We attempted to improve the plausibility of this assumption by considering a very large confounder set that includes known correlates of our outcome variable and exposure. Given plausible concerns about data support (described next) and a desire to avoid controlling for instrumental variables, we reduced the adjustment set based on univariate correlations between the exposure and outcome (pooling over follow-up weeks). Altogether, we cannot guarantee that this assumption holds, and therefore limit our interpretations to statistical associations as opposed to causal effects.
Positivity: If $(a, w)$ is within the support of $(A, W)$, then $(d(a, w), w)$ must also be within the support of $(A, W)$. In practice, this means that for any given time slice and set of adjustment covariates, there is a positive probability of finding a county with the same covariate values and a mobility level that matches the shifted value. As noted earlier, this is a weaker positivity assumption than for other causal effects, which require nonzero probability of all observations receiving a specified level of treatment (say, “low” or “high” mobility) within values of adjustment variables. Our weaker positivity assumption, while reasonable in theory, is viable in practice only when the specified shifts in exposure are not too drastic and the adjustment set not too large. In our main analysis, we restricted to shifts that were small enough to make this assumption tenable, while recognizing that this approach makes our causal effect data-adaptive. In a supplementary analysis, we used a smaller adjustment set and tested larger shifts.

There are three additional assumptions that are implied by our adoption of the NPSEM. (1) Independence of counties. This independence assumption is likely violated since there were state-level policies (e.g., stay-at-home orders) that plausibly created dependence between counties in a state. The independence assumption also implies no interference: the exposure level for a given county does not affect the outcomes of the other counties. Given that infectious disease spread does not respect town, county, state, or country boundaries, this assumption may be unrealistic. (2) Consistency: If $A = a$ for any county, then $Y(a) = Y$, and hence the counterfactual outcome for a county with its observed mobility level is the observed outcome. This assumption is implied by our definition of counterfactuals as derived quantities through interventions on the NPSEM. (3) Time-ordering: The confounders $W$ precede the mobility exposure $A$, which also precedes the case rate outcome $Y$.
If the identifiability assumptions hold, we could specify and focus our estimation efforts on a statistical estimand that equals the desired causal effect. In the (likely) case that they are not satisfied, we could still specify and focus our estimation efforts on a statistical estimand that is as close as possible to the causal parameter. Factoring the joint distribution of the observed data into $P(Y \mid A, W) \, P(A \mid W) \, P(W)$, it was shown in [haneuse_estimation_2013] that the statistical estimand corresponding to the expected counterfactual outcome under treatment rule $d$, $E[Y(A^d)]$, is given by

$\psi_d = \iint E(Y \mid A = d(a, w), W = w) \, dP_{A,W}(a, w) \quad (2)$

with the joint density of received exposures and covariate levels being integrated over. Equation 2 is equivalent to the extended g-computation formula of Robins et al. [robins_effects_2004]. Extending this slightly, our statistical estimand of interest, corresponding to the expected difference in COVID-19 case rates under shifted and observed mobility, is given by

$\theta = \psi_d - E(Y),$

where $Y$ denotes the new case rate and $P$ the observed data distribution. In words, our statistical estimand is the difference between the expectation of new cases per 100,000 residents under a specified mobility shift $d$, after adjusting for measured confounders, and the observed expectation of new cases per 100,000 residents. All interventions we examined were reductions in mobility; hence, under this definition, we expect $\theta$ to be negative if mobility reductions are associated with lower case rates.
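Given a fitted outcome regression, the plug-in (g-computation) form of this estimand can be sketched directly. This Python sketch is illustrative only: `m_hat` stands in for whatever outcome regression estimator is used, and the actual analysis uses the TMLE-updated regression described in the next section rather than this raw plug-in.

```python
def gcomp_theta(data, m_hat, shift):
    """Plug-in estimate of theta = psi_d - E(Y).
    data:  list of (w, a, y) tuples, one per county;
    m_hat: m_hat(a, w) estimates E(Y | A = a, W = w);
    shift: shift(a, w) is the treatment rule d."""
    n = len(data)
    # psi_d: average predicted outcome under the shifted exposure
    psi_d = sum(m_hat(shift(a, w), w) for (w, a, y) in data) / n
    # E(Y): empirical mean of the observed outcomes
    mean_y = sum(y for (w, a, y) in data) / n
    return psi_d - mean_y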
3 Estimation
To estimate the expected outcome under the observed exposure, $E(Y)$, we can use the empirical mean outcome. For estimating the shifted parameter $\psi_d$, we used targeted minimum loss-based estimation (TMLE) [laan_targeted_2011, laan_targeted_2018]. TMLE is a general framework for constructing asymptotically linear estimators with an optimal bias-variance tradeoff for a target parameter; in our case, $\psi_d$. Typically, TMLE uses the factorization of $P$ into an outcome regression $m(a, w) = E(Y \mid A = a, W = w)$ and an intervention mechanism $g(a \mid w)$, following which an initial estimate of the outcome regression is updated by a fluctuation that is a function of the intervention mechanism. The update step tilts the preliminary estimates toward a solution of the efficient influence function estimating equation, endowing the estimator with desirable asymptotic properties such as double-robustness, described below. Further, it allows practitioners to utilize machine learning algorithms in the estimation of $m$ and $g$ while still maintaining valid statistical inference. Because TMLE is a general framework, the exact method may vary depending on the data structure and (causal) question of interest. In this application, we used the TMLE implementation of Díaz [diaz_nonparametric_2021]. We briefly summarize the algorithm here; further technical details are supplied in [diaz_nonparametric_2021].
First we define the density ratio

$r(a, w) = \frac{g_d(a \mid w)}{g(a \mid w)}, \quad (3)$

where $g_d(a \mid w)$ is the conditional density of $A$ given $W$ under the shifted distribution and $g(a \mid w)$ is the conditional density as observed. Unfortunately, estimating the conditional density directly can be difficult in high dimensions. Few data-adaptive algorithms exist [diaz_nonparametric_2021], and for those that do [benkeser_highly_2016], computation time can be demanding and data sparsity can make estimates unstable. Luckily, an equivalent estimate for $r(a, w)$ can be found by recasting it as a classification problem [diaz_nonparametric_2021]. In this formulation, a data set consisting of two copies of each observation is created, one assigned the observed exposure and one assigned the shifted exposure. Further, each is given an indicator $\Delta$ for whether it received the natural value of treatment ($\Delta = 0$) or the shifted value ($\Delta = 1$). Using Bayes’ rule and the fact that $P(\Delta = 1) = P(\Delta = 0) = 1/2$,

$r(a, w) = \frac{g_d(a \mid w)}{g(a \mid w)} = \frac{p(\Delta = 1 \mid a, w)}{p(\Delta = 0 \mid a, w)} = \frac{p(\Delta = 1 \mid a, w)}{1 - p(\Delta = 1 \mid a, w)} \quad (4)$

In this formulation, the density ratio estimate can be calculated as a simple function of $p(\Delta = 1 \mid a, w)$, which can be estimated using any binary classification method, yielding the estimate $\hat{r}(a, w)$. This innovation by Díaz [diaz_nonparametric_2021] drastically improved computation time compared to estimating the conditional densities directly (as is done in several algorithms for stochastic interventions [hejazi_txshift_2020, hejazi_tmle3shift_2021]).
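The dataset-doubling construction behind Equation 4 can be sketched as follows (an illustrative Python sketch with names of our own choosing; any fitted binary classifier can play the role of `p_delta1`).

```python
def stack_for_classification(observed, shift):
    """Create the doubled dataset: each county contributes one row with its
    natural exposure (delta = 0) and one with its shifted exposure (delta = 1),
    so P(delta = 1) = P(delta = 0) = 1/2 by construction."""
    rows = []
    for (w, a) in observed:
        rows.append({"w": w, "a": a, "delta": 0})
        rows.append({"w": w, "a": shift(a, w), "delta": 1})
    return rows

def density_ratio(p_delta1, a, w):
    """Recover r(a, w) from a fitted classifier's probability
    p(delta = 1 | a, w), as in Equation 4."""
    p = p_delta1(a, w)
    return p / (1.0 - p)
```

A classifier that outputs 0.75 at some $(a, w)$, for example, implies a density ratio of 3 at that point.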
To implement estimation of $r(a, w)$ and the conditional outcome regression $m(a, w)$ flexibly, we used the ensemble machine learning algorithm Super Learner, allowing us to minimize parametric assumptions [laan_super_2007]. Super Learner creates a convex combination of candidate algorithms chosen through cross-validation to minimize a user-specified loss function. In our case, the Super Learner ensemble included generalized linear models, generalized additive models, gradient boosted decision trees, random forest, and a simple mean.
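The core idea of the convex combination can be illustrated in miniature. The sketch below reduces Super Learner to two candidates, a squared-error loss, and a grid search over the single mixing weight; the real algorithm handles many candidates and solves a constrained optimization over cross-validated predictions, so this is a simplification, not the SuperLearner implementation.

```python
def convex_weight(cv_preds_1, cv_preds_2, y, n_grid=101):
    """Choose the weight alpha in [0, 1] minimizing the cross-validated mean
    squared error of the blend alpha * pred1 + (1 - alpha) * pred2."""
    best_alpha, best_mse = 0.0, float("inf")
    for i in range(n_grid):
        alpha = i / (n_grid - 1)
        mse = sum((alpha * p1 + (1.0 - alpha) * p2 - yi) ** 2
                  for p1, p2, yi in zip(cv_preds_1, cv_preds_2, y)) / len(y)
        if mse < best_mse:
            best_alpha, best_mse = alpha, mse
    return best_alpha
```

If one candidate’s cross-validated predictions match the outcome exactly, the grid search puts all weight on that candidate, which is the behavior one wants from the ensemble.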
With these nuisance function estimates in hand, we used the following process to estimate $\psi_d$, modified for the single time point case from [diaz_nonparametric_2021].

1. Calculate weights $\omega_i = \hat{r}(a_i, w_i)$ for $i = 1, \dots, n$, obtained from the classification probabilities estimated via Super Learner.

2. After scaling our bounded continuous outcome $Y$ to $[0, 1]$ [gruber_targeted_2010], fit the logistic regression

$\text{logit } E(Y \mid A, W) = \epsilon + \text{logit } \hat{m}(A, W)$

on the observed (scaled) $y_i$ values with weights $\omega_i$, calculating the estimated intercept $\hat{\epsilon}$.

3. Update the estimates as $\tilde{m}(a, w) = \text{expit}\{\text{logit } \hat{m}(a, w) + \hat{\epsilon}\}$.

4. After unscaling, with the updated estimates $\tilde{m}$ in hand, define the estimator as

$\hat{\psi}_d = \frac{1}{n} \sum_{i=1}^{n} \tilde{m}(d(a_i, w_i), w_i).$
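The intercept-only weighted logistic fluctuation above can be solved with a few Newton iterations, since the fitted intercept is the root of the weighted score equation. The following is a self-contained Python sketch of that update step, written by us for illustration; it is not the lmtp implementation.

```python
import math

def expit(x):
    return 1.0 / (1.0 + math.exp(-x))

def logit(p):
    return math.log(p / (1.0 - p))

def fluctuation_intercept(y_scaled, m_scaled, weights, max_iter=100, tol=1e-12):
    """Solve sum_i w_i * (y_i - expit(logit(m_i) + eps)) = 0 for the
    fluctuation intercept eps via Newton's method (an intercept-only
    weighted logistic regression with offset logit(m_i))."""
    eps = 0.0
    for _ in range(max_iter):
        score, hess = 0.0, 0.0
        for y, m, w in zip(y_scaled, m_scaled, weights):
            p = expit(logit(m) + eps)
            score += w * (y - p)           # weighted score
            hess -= w * p * (1.0 - p)      # derivative of the score in eps
        step = score / hess
        eps -= step
        if abs(step) < tol:
            break
    return eps

def tmle_update(m, eps):
    """Tilted estimate m-tilde = expit(logit(m-hat) + eps)."""
    return expit(logit(m) + eps)
```

When the initial regression already fits the (scaled) outcomes perfectly, the fitted intercept is zero and the update leaves the estimates unchanged, as it should.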
The TMLE estimator is ‘doubly robust’ in that it is consistent if either of the nuisance estimators of $m(a, w)$ or $r(a, w)$ is consistent and converges at an appropriate rate. Further, if both meet those conditions, along with certain regularity conditions, the TMLE converges to a normal distribution with mean 0 and variance equal to the nonparametric efficiency bound, as shown in Theorem 3 of [diaz_nonparametric_2021]. While convergence at the required rate for each nuisance parameter is not guaranteed, the inclusion of a wide variety of algorithms in our Super Learner ensemble maximizes our chance of achieving these rates [laan_super_2007]. The efficiency bound for this TMLE depends on the variance of the outcome conditional on $A$ and $W$, the amount of treatment effect variability, and the intensity of the shift applied, as measured by the density ratio [diaz_nonparametric_2021].

Using the empirical mean to estimate $E(Y)$, we hence have our desired estimate

$\hat{\theta} = \hat{\psi}_d - \frac{1}{n} \sum_{i=1}^{n} y_i.$

From this result, variance estimation for $\hat{\theta}$ can be calculated using the estimated influence curves from $\hat{\psi}_d$ and the empirical mean:

$\widehat{\text{Var}}(\hat{\theta}) = \frac{1}{n} \, \widehat{\text{Var}}\{D(O) - (Y - \hat{E}(Y))\},$

with the influence curve of $\hat{\psi}_d$ defined as $D(o) = \hat{r}(a, w)\{y - \tilde{m}(a, w)\} + \tilde{m}(d(a, w), w) - \hat{\psi}_d$ [diaz_nonparametric_2021]. This allowed for the calculation of the 95% Wald-style confidence interval $\hat{\theta} \pm 1.96 \sqrt{\widehat{\text{Var}}(\hat{\theta})}$.
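The influence-curve-based interval is mechanical once the per-county influence-curve values are in hand. A minimal Python sketch (`ic` holds the estimated influence-curve values for each county; names are ours):

```python
def wald_ci(estimate, ic, z=1.96):
    """95% Wald-style CI using the empirical variance of the estimated
    influence curve: estimate +/- z * sqrt(Var(IC) / n)."""
    n = len(ic)
    mean_ic = sum(ic) / n
    var_ic = sum((v - mean_ic) ** 2 for v in ic) / n
    se = (var_ic / n) ** 0.5
    return (estimate - z * se, estimate + z * se)
```

The interval is symmetric about the point estimate, and its width shrinks at the usual $1/\sqrt{n}$ rate as the number of counties grows.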
All statistical analysis was performed using R version 4.0.3 [r_core_team_r_2020], in particular the lmtp_tmle function in the lmtp package [williams_lmtp_2020]. Code and the dataset to reproduce the analysis are available at the authors’ GitHub repository (https://github.com/joshuanugent/covidmtp).
4 Results
The following figures show estimates of $\theta$, the difference between the COVID-19 case rate that would be expected under the shifted mobility distribution, after adjusting for measured confounders, and the observed average case rate, with associated 95% confidence intervals (CIs), for weekly time slices from June 1, 2020 to November 14, 2020. For comparison, we also show the unadjusted estimate, which is defined analogously to $\theta$, simply without any covariates included in the adjustment set.
As an illustrative example, consider Figure 2, showing $\hat{\theta}$ for the exposure “Facebook single tile users,” which measures the proportion of mobile devices in a county that stayed in a single location (presumably, a home) each day, averaged over the week. In the first time slice (June 1, 2020), the unadjusted results, in gray, showed a small association between reduced mobility (a 5% increase in the proportion of people staying at home) and reduced case rates: −3.2 (95% CI: −4.4, −1.9) cases per 100,000 residents two weeks later. After adjusting for confounders (in red), the change in case rates shrinks to −0.7 (−2.2, 0.7). In the week of September 28, the results are similar but more extreme: a 5% increase in the proportion of people staying at home in the unadjusted analysis suggests a change of −16.0 (−18.6, −13.3) cases per 100,000 residents two weeks later, while that association is attenuated to −4.5 (−7.0, −2.0) cases per 100,000 residents after confounder adjustment. Across all of the time slices for that mobility index, we see the same pattern: unadjusted estimates of $\theta$ vary, but are negative, as we would expect; lower mobility appears to be associated with lower case rates. After adjustment, however, the estimates shrink towards zero and many of the CIs overlap with zero.
Broadening the scope to the full set of indices examined in Figure 3, we see some consistent patterns. First, almost all unadjusted estimates are larger in magnitude than the adjusted ones, though the point estimates are usually in the same direction after adjustment. Second, the CIs widen in October and November, suggesting higher outcome variance, more treatment effect variability, or higher density ratios (perhaps from near practical positivity violations). Third, and most notable, after adjustment for baseline confounders and one-week lagged case rates, the associations were generally small, CIs overlapped with zero, and point estimates showed almost no consistent trends of positive or negative association.
For example, only six of the indices showed even short-term consistent associations with the outcome after adjustment, perhaps due to seasonal effects: Facebook’s ‘single tile users’ stay-at-home metric (Aug, Sept, Oct), PlaceIQ’s ‘DEX’ and ‘DEXA’ measures of device density (Jun, Aug), Descartes Labs’ ‘m50 index’ of distance traveled from the home (Aug, Sept, Oct), Google’s workplace index (Sept, Oct), and Google’s stay-at-home residential index (Sept, Oct). However, these same indices had uncertain associations in other time periods, with CIs overlapping zero, and in several instances had point estimates (and sometimes CIs) unexpectedly above zero.
Surprisingly, three indices showed positive unadjusted point estimates in June and July (Facebook ‘tiles visited’ and Google’s ‘retail & recreation’ and ‘residential’ indices), with Facebook’s ‘tiles visited’ index showing an unusually large swing between summer and fall. After adjustment, the estimates were strongly attenuated. These data points may be displaying most clearly the confounding effect of the covariates included in our adjustment set. More discussion of possible explanations for the mixed results is presented in Section 5.
We performed several sensitivity analyses that did not meaningfully change our results (Appendix B). Those include using current case rates instead of one-week lagged case rates in the adjustment set, and using one-week leading case rates as an outcome instead of two-week leading rates. Further, after noting the limitations of tree-based methods in TMLE discussed in [balzer_demystifying_2021], we re-ran the analysis omitting the random forest algorithm from the Super Learner and adding multivariate adaptive regression splines; this change did not substantively alter the results. Our two supplementary analyses, one with a time-varying confounder set and the other with a smaller confounder set (four covariates rather than eight) and larger shifts, are presented in Appendix C. Neither supplementary analysis differed significantly from the main results.

5 Discussion
To the best of our knowledge, we are the first to apply a doubly robust MTP/TMLE approach to provide insights into a pressing question in infectious disease research. Specifically, we aimed to answer: did reductions in mobility, as captured by mobile phone data, lead to reductions in subsequent COVID-19 case rates across U.S. counties in the summer and fall of 2020? We also hope this paper will serve as a demonstration of how MTPs can be used by other researchers to minimize (likely unjustifiable) parametric assumptions when defining causal effects and how TMLE can be applied to estimate the corresponding statistical parameters. We again note that the causal inference literature for binary exposures is vast and well-established, especially as compared to the corresponding literature for continuous exposures. We believe the MTP is a powerful and underutilized approach to defining effects for continuous exposures.
Our findings on how the associations between mobility and COVID-19 spread vary over time and with adjustment are in line with the results of earlier research. In an analysis using data from the 25 hardest-hit U.S. counties through mid-April 2020, Badr et al. [badr_association_2020] fit an unadjusted statistical model for future case growth rates based on a mobility measure comparing the number of trips made per day to a pre-pandemic baseline. While strong associations were found between changes in their mobility index and county-level case growth 9-12 days later, a later updated analysis [badr_limitations_2020], using data through September 16 and expanding the dataset to all U.S. counties, showed that the association disappeared after April. Similarly, Gatalo et al. [gatalo_associations_2020], using a slightly different methodology, found strong correlations between mobility and case growth from March 27 to April 20, but only a weak correlation from then until July 22. Another analysis [xiong_mobile_2020], using data through June 9, 2020, treated the inflow of traffic into a county as the exposure, adjusted for a small number of sociodemographic covariates, and found a strong relationship between inflow and new cases 7 days later. Finally, an examination [li_association_2020] of the correlations between Google's six mobility indices (retail & recreation, transit stations, workplaces, residential, parks, grocery & pharmacy) and 11-day-ahead case growth rates at the county level, stratifying by urban-rural classification, suggested strong associations in the February-to-April timeframe, especially in denser counties, but those correlations weakened when the timeframe was extended to June.
Though our unadjusted analyses seem to suggest that decreased mobility was associated with reductions in COVID-19 case rates for the majority of timepoints examined, the adjusted results suggest that the associations between commonly studied indices and COVID-19 case rates may be highly confounded by county characteristics and current case rates. For example, our adjustment set included variables for county-level political lean, which may be a proxy for adherence to NPIs such as mask-wearing mandates. At first blush, it could appear that counties with lower mobility have fewer new cases, but if low-mobility counties also adhere more consistently to mask-wearing due to underlying political dynamics, the association with mobility could vanish after adjustment. Including one-week lagged case rates in the adjustment set (as we argue is necessary in the infectious disease context) also demonstrates important confounding by epidemic arrival time and intensity. Our work adds to other evidence of confounding in COVID-19 research; for example, McLaren [mclaren_racial_2020] found that some correlations between racial/ethnic groups and COVID-19 mortality disappeared once confounding factors were taken into account, and Wong and Balzer [wong_evaluating_2021] found that associations between mask mandates and future case rates were attenuated after confounder adjustment. We suspect this confounding may explain why our results contradict those of Wellenius et al. [wellenius_impacts_2021], who found that mobility levels (using many of the same county-level metrics) in Spring 2020 were strongly associated with COVID-19 case growth rates two weeks later. Their work differed from this analysis in that the researchers did not include any socioeconomic covariates in their parametric model, nor did they extend their analysis beyond March and early April.
There are several limitations to our analysis, some of which may explain the inconsistent patterns of association we see over time and suggest avenues for future investigation. First, while we chose a timeframe when COVID-19 testing was widely available, differential ascertainment of case rates across counties could bias our estimates. Second, while county-level mobility and covariate data are easily accessible, there might be important sub-county effects that are masked when aggregating to the county level. Performing this analysis at the ZIP code or census tract level might reveal stronger or more consistent associations. Indeed, an analysis of ZIP-code-level mobility data and case rates in five cities [glaeser_how_2020], while finding a strong association, also reported substantial heterogeneity across space and time. Third, our assumption of independence between counties is unlikely to hold, given state-level policy enactment and spillover effects between counties. Network effects can be unpredictable [lee_network_2021], so we hesitate to speculate about the directions in which they might have biased our point estimates or standard errors, but they are likely in play.
Fourth, despite our large adjustment set, it is possible that we excluded an important confounder, or that we have unmeasured confounding, as is common in studies with observational rather than experimental data. For example, we had no data on the strength of a county's public health infrastructure or compliance with NPIs, which could presumably impact both the level of mobility observed and new case rates. Related to the dimensionality of the adjustment set, practical positivity violations may also have limited our ability to detect associations. After adjusting for demographics, political lean, and economic variables, the variability in exposure and outcome across counties may have been small, limiting the amount of shift we could apply without generating extreme density ratios and the accompanying large standard errors (recall that the density ratio is a key component of the efficiency bound). If mobility index data were available for the sparsely populated counties that we excluded, it might be possible to expand our target population and investigate a wider range of intervention shift values, which might detect stronger or more consistent associations. Fifth, there are still many unknowns about the dynamics of COVID-19 transmission over space and time. Including one-week lagged case rates in our adjustment set may have helped address some of those unknowns, but likely cannot account for all of the complex dynamics of transmission. For example, the return to schools in fall 2020 may partially explain why several indices behave differently in the summer and fall: the adjusted DEX and adjusted DEX point estimates change sign, and the adjusted Descartes Labs estimates shift from minimal associations to stronger ones. Finally, our choice of a cross-sectional time-series analysis in weekly slices may have hidden complex patterns over time.
A longitudinal analysis that examines the full trajectory of mobility over time and its impact on cumulative cases per capita could provide a more complete view of the overall impact (or lack thereof) of mobility on case transmission. Doubly-robust longitudinal MTP methods are available in the lmtp R package and offer a promising avenue for answering those questions.
Despite these limitations, our analysis provides evidence that mobility indices may not be a simple proxy measure for future case rates of airborne infectious diseases. Using MTP allowed us to consider shifts in mobility supported by our observed data, and TMLE allowed us to leverage machine learning to minimize parametric assumptions while still providing statistical inference. For continuous exposures, MTP can be a useful tool for important research and policy questions.
Appendix A Data details
a.1 Mobility index sources, definitions, and shifts considered
Facebook Movement Range

Change in movement: Percentage change in the number of ‘Bing tiles’ (roughly 600 m by 600 m) a device pinged in a given day, compared to the baseline average for that weekday in February 2020, excluding Presidents’ Day, aggregated at the county level. We examined additive shifts of 3 and 4.

Stay put: Percentage of people who stay in a single Bing tile over an entire day. We examined multiplicative shifts of 1.05, 1.07, and 1.08.
Google COVID-19 Community Mobility Reports
For Google indices, all of the baseline values are defined as the median value for that day of the week from January 3, 2020, to February 6, 2020.

Retail & recreation: Percentage change in number of visits and length of stay from baseline for restaurants, malls, movie theaters, and retail stores (excluding grocery stores and pharmacies). We examined additive shifts of 4, 5, and 6.

Transit stations: Percentage change in number of visits and length of stay from baseline for mass transit stations (buses, subways, etc). We examined additive shifts of 5, 10, and 15.

Workplaces: Percentage change in number of visits and length of stay from baseline for an individual’s place of work. We examined additive shifts of 2, 3, and 4.

Residential: Percentage change from baseline in the amount of time spent at one's residence each day. Since the baseline values are high, the variation in this index is smaller than in the others. We examined additive shifts of +1, +2, and +3.
PlaceIQ DEX

Device exposure index (DEX): The number of distinct devices that entered the same commercial spaces on a given day as a target device, averaged across all devices in the county. We examined multiplicative shifts of 0.75, 0.7, and 0.65.

Adjusted DEX: A scaled version of the DEX that adjusts for the possibility of devices sheltering in place, generating no pings, and hence not having their zero DEX values counted in the overall average. We examined multiplicative shifts of 0.75, 0.7, and 0.65.
Descartes Labs m50

m50: Median maximum distance traveled from starting location by all tracked devices in a county. We examined multiplicative shifts of 0.6, 0.7, and 0.8.

m50 index: Value of m50 relative to the median baseline value for that day of the week between February 2, 2020, and March 7, 2020. We examined multiplicative shifts of 0.9 and 0.85.
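The additive and multiplicative shifts listed above act on an index value in the straightforward way sketched below; the function name is ours and the example values are hypothetical, chosen only to mirror the units of the indices (percentage points for the Google indices, a scale factor for m50).

```python
def apply_shift(value, kind, amount):
    """Apply an MTP-style shift to a single mobility index value.

    Additive shifts add (signed) percentage points to an index reported
    as a percentage change from baseline; multiplicative shifts scale a
    raw index such as m50.
    """
    if kind == "additive":
        return value + amount
    if kind == "multiplicative":
        return value * amount
    raise ValueError(f"unknown shift kind: {kind!r}")

# Hypothetical examples: a residential index value of 8 (% change from
# baseline) under an additive shift of +2, and an m50 value of 4.0
# under a multiplicative shift of 0.7.
residential_shifted = apply_shift(8.0, "additive", 2.0)
m50_shifted = apply_shift(4.0, "multiplicative", 0.7)
```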
a.2 Comparison of counties included/excluded
All mobility indices require a sufficient sample size within a county to report reliable data on a given day. In low-population counties, this can make the mobility data sparse or nonexistent over the timeframe in question. Hence, to improve comparability across different indices and address positivity violations, we restricted our analysis to counties with 40,000 or more residents. The table below compares the covariate distributions across the included and excluded counties. While a large number of counties were excluded, they represent only 10% of the U.S. population, and their covariate means were similar to those of the included counties.
All counties  Included  Excluded  

N  3,158  1,182  1,926 
Mean population  105,362  250,886  16,053 
Mean population density (/sq. mi)  27,167  62,366  5,608 
% Adults without health insurance  14.3%  13%  15% 
% Under 18  22.3%  22.6%  22.2% 
% Households below poverty line  15.2%  13.8%  16.1% 
% Hispanic  9.3%  10.8%  8.4% 
% Women  49.9%  50.6%  49.5% 
% Black  8.9%  9.9%  8.5% 
% Clinton vote in 2016 election  31.7%  38.6%  27.8% 
% Drive alone to work  79.2%  80.5%  78.9% 
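In code, the inclusion rule above amounts to a simple population filter; the county records and field names in this sketch are hypothetical, not our actual data schema.

```python
# Hypothetical county records keyed by FIPS code.
counties = [
    {"fips": "06075", "population": 873_965},
    {"fips": "31009", "population": 416},
    {"fips": "36061", "population": 1_632_480},
    {"fips": "48301", "population": 64},
]

POPULATION_CUTOFF = 40_000  # minimum residents for reliable mobility data

included = [c["fips"] for c in counties if c["population"] >= POPULATION_CUTOFF]
excluded = [c["fips"] for c in counties if c["population"] < POPULATION_CUTOFF]
```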
a.3 Adjustment variables and their sources
We gathered a rich set of social, economic, physical, and demographic variables from the U.S. Census American Community Survey, U.S. Bureau of Labor Statistics, U.S. Centers for Disease Control and Prevention, U.S. Department of Agriculture, University of Washington Population Health Institute and Department of Political Science, National Oceanic and Atmospheric Administration, and MIT Election Data and Science Lab. These included the covariates below, which we screened into the adjustment set after examining their correlations with the exposure (mobility index) and outcome (leading new cases per 100,000 people).

U.S. Census American Community Survey

Percentage of households below the poverty line


University of Washington Population Health Institute, via Robert Wood Johnson Foundation County Health Rankings

Percentage of people that identify as Black

Percentage of people that identify as Hispanic

Percentage of people that drive alone to work

Percentage of people that are women (note: values under 49% are often an indicator of a large prison being in the county)

Percentage of people that are under 18 years of age

Percentage of adults without health insurance


MIT Election Data and Science Lab

Percentage of vote for Hillary Clinton in the 2016 Presidential election

The following covariates were also examined, but were removed from the final adjustment set after the screening process.

U.S. Census American Community Survey

Population density

Natural logarithm of population density

Average household size

Percentage of people living below the poverty line

Percentage of households with income below the poverty line

Percentage of people in the county who live outside of a metropolitan area


University of Washington Population Health Institute, via Robert Wood Johnson Foundation County Health Rankings

Percentage of adults who smoke

Percentage of people that identify as Asian

Percentage of people that are 65 years of age or older

Percentage of people with more than a high school education

Index of income inequality

Natural logarithm of median household income

Level of air particulate matter (pollution)


Statelevel social distancing policies in response to the 2019 novel coronavirus in the U.S., University of Washington Department of Political Science

Mask mandate level (statewide data only)


Centers for Disease Control and Prevention

Percentage of people who live in crowded housing

Percentage of people categorized as obese

Percentage of people living with diabetes


National Oceanic and Atmospheric Administration

Absolute daily high temperature deviation from 70 degrees Fahrenheit


U.S. Bureau of Labor Statistics

Unemployment rate


U.S. Department of Agriculture

Presence of a level 5 meatpacking plant (largest plants)

Presence of a level 4 meatpacking plant (second largest plants)

Presence of a level 4 or level 5 meatpacking plant
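The univariate screening step described in this section can be sketched as follows. This is an illustration, not our actual screening code: the correlation threshold, variable names, and toy data are hypothetical, and the sketch assumes non-constant inputs (a constant covariate would make the Pearson denominator zero).

```python
import math

def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def screen(covariates, exposure, outcome, threshold=0.1):
    """Keep covariates whose absolute univariate correlation with either
    the exposure or the outcome exceeds the threshold. The threshold
    here is illustrative, not the value used in the analysis."""
    return [name for name, values in covariates.items()
            if abs(pearson(values, exposure)) > threshold
            or abs(pearson(values, outcome)) > threshold]

# Toy example: one covariate tracks the exposure; the other tracks neither.
kept = screen(
    {"poverty": [1.0, 2.0, 3.0, 4.0], "noise": [1.0, -1.0, -1.0, 1.0]},
    exposure=[2.0, 4.0, 6.0, 8.0],   # e.g., a mobility index
    outcome=[1.0, 1.0, 2.0, 2.0],    # e.g., leading case rates
)
```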

Appendix B Sensitivity analyses
b.1 Using oneweek leading case rates as outcome
b.2 Using currentweek cases in the adjustment set instead of oneweek lagged cases
b.3 Observed exposure distributions over time
Appendix C Supplementary analyses
c.1 Using a timevarying confounder set
We tested three of the indices allowing for a different confounder set, screened in each weekly time slice by univariate correlations with the exposure and outcome. As with the main analysis, the adjusted results (shown below for several shift values) show inconsistent patterns over time, and CIs frequently include zero.
c.2 Using a smaller confounder set with larger shifts
We tested three of the indices with a smaller, time-varying confounder set (four confounders rather than eight), screened in each weekly time slice by univariate correlations with the exposure and outcome, with larger shift values. As with the main analysis, the adjusted results (shown below for several shift values) show inconsistent patterns over time, and CIs frequently include zero.