A Double Machine Learning Trend Model for Citizen Science Data

10/27/2022
by Daniel Fink, et al.

1. Citizen and community-science (CS) datasets have great potential for estimating interannual patterns of population change given the large volumes of data collected globally every year. Yet, the flexible protocols that enable many CS projects to collect large volumes of data typically lack the structure necessary to keep consistent sampling across years. This leads to interannual confounding, as changes to the observation process over time are confounded with changes in species population sizes.

2. Here we describe a novel modeling approach designed to estimate species population trends while controlling for the interannual confounding common in citizen science data. The approach is based on Double Machine Learning, a statistical framework that uses machine learning methods to estimate population change and the propensity scores used to adjust for confounding discovered in the data. Additionally, we develop a simulation method to identify and adjust for residual confounding missed by the propensity scores. Using this new method, we can produce spatially detailed trend estimates from citizen science data.

3. To illustrate the approach, we estimated species trends using data from the CS project eBird. We used a simulation study to assess the ability of the method to estimate spatially varying trends in the face of real-world confounding. Results showed that the trend estimates distinguished between spatially constant and spatially varying trends at a 27 km resolution. There were low error rates on the estimated direction of population change (increasing/decreasing) and high correlations on the estimated magnitude.

4. The ability to estimate spatially explicit trends while accounting for confounding in citizen science data has the potential to fill important information gaps, helping to estimate population trends for species, regions, or seasons without rigorous monitoring data.
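
The Double Machine Learning adjustment described in the abstract can be illustrated with a small, self-contained sketch. This is not the authors' implementation: it is a generic "partialling-out" DML estimator with cross-fitting, run on synthetic data. Everything named here is an assumption made for illustration, including the single covariate `x` (standing in for something like search effort), the continuous "year" variable `t`, the outcome `y` (standing in for a log-scale abundance index), the simulated relationships, and the toy binned-mean learner that replaces the machine learning models a real analysis would use.

```python
import random


def fit_binned_mean(xs, ys, n_bins=20):
    """Toy nuisance learner (an assumption, standing in for an ML model):
    predicts the mean of ys within equal-width bins of xs on [0, 1)."""
    sums = [0.0] * n_bins
    counts = [0] * n_bins
    for x, y in zip(xs, ys):
        b = min(int(x * n_bins), n_bins - 1)
        sums[b] += y
        counts[b] += 1
    overall = sum(ys) / len(ys)  # fallback for empty bins
    means = [s / c if c else overall for s, c in zip(sums, counts)]
    return lambda x: means[min(int(x * n_bins), n_bins - 1)]


def ols_slope(a, b):
    """Slope from a simple least-squares regression of b on a."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b))
    var = sum((ai - ma) ** 2 for ai in a)
    return cov / var


def dml_trend(x, t, y, n_folds=2, seed=0):
    """Partialling-out DML with cross-fitting.

    1. Split the data into folds.
    2. On the other folds, learn E[t | x] (the propensity-style model)
       and E[y | x] (the outcome model).
    3. Residualize t and y on the held-out fold.
    4. Regress outcome residuals on year residuals to get the trend.
    """
    n = len(x)
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    folds = [idx[k::n_folds] for k in range(n_folds)]
    res_t, res_y = [0.0] * n, [0.0] * n
    for k, held_out in enumerate(folds):
        train = [i for j, f in enumerate(folds) if j != k for i in f]
        x_tr = [x[i] for i in train]
        m_t = fit_binned_mean(x_tr, [t[i] for i in train])
        m_y = fit_binned_mean(x_tr, [y[i] for i in train])
        for i in held_out:
            res_t[i] = t[i] - m_t(x[i])
            res_y[i] = y[i] - m_y(x[i])
    return ols_slope(res_t, res_y)


def simulate(n=4000, theta=-0.3, seed=1):
    """Synthetic data (hypothetical): x drifts with year and also raises
    the outcome, so it confounds the true trend theta."""
    rng = random.Random(seed)
    x, t, y = [], [], []
    for _ in range(n):
        xi = rng.random()                   # covariate, e.g. effort
        ti = xi + rng.gauss(0.0, 0.5)       # "year" depends on x
        yi = theta * ti + 2.0 * xi * xi + rng.gauss(0.0, 0.5)
        x.append(xi)
        t.append(ti)
        y.append(yi)
    return x, t, y


if __name__ == "__main__":
    x, t, y = simulate()
    print("naive slope of y on t:", ols_slope(t, y))
    print("DML trend estimate:   ", dml_trend(x, t, y))
```

In this simulation the true trend is a decline (`theta = -0.3`). Because `x` increases with `t` and also increases `y`, the naive regression of `y` on `t` is biased upward, while the residual-on-residual estimate should land close to the true value. Cross-fitting keeps each residual from being computed by a model that saw that observation during training.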

