Preferential sampling for presence/absence data and for fusion of presence/absence data with presence-only data

09/05/2018
by Alan E. Gelfand, et al.

Presence/absence data and presence-only data are the two customary sources for learning about species distributions over a region. We illuminate the fundamental modeling differences between the two types of data. Most simply, locations are considered fixed under presence/absence data, whereas locations are random under presence-only data. As a result, the definition of "probability of presence" is incompatible between the two. We therefore take issue with modeling strategies in the literature that ignore this incompatibility: strategies that assume presence/absence modeling can be induced from presence-only specifications and, therefore, that fusion of presence-only and presence/absence data sources is routine. We argue that presence/absence data should be modeled at point level. That is, we need to specify a surface that provides the probability of presence at any location in the region. A realization from this surface is a binary map yielding the results of Bernoulli trials across all locations. Presence-only data should be modeled as a point pattern driven by the specification of an intensity function. We further argue that, with just presence/absence data, preferential sampling, using a shared process perspective, can improve our estimated presence/absence surface and our prediction of presence. We also argue that preferential sampling can enable a probabilistically coherent fusion of the two data types. We illustrate with two real datasets, one presence/absence and one presence-only, for invasive species presence in New England in the United States. We demonstrate that potential bias in sampling locations can affect inference with regard to presence/absence and show that inference can be improved with preferential sampling ideas. We also provide a probabilistically coherent fusion of the two datasets to again improve inference with regard to presence/absence.
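
The distinction drawn above between fixed and random locations can be made concrete with a small simulation. The following is only a minimal sketch under assumptions that are not taken from the paper: the unit-square region, the covariate, the logistic and log-linear forms, all coefficient values, and every function name are hypothetical choices made for illustration. It does not implement the preferential sampling or fusion models. Presence/absence is drawn as Bernoulli trials at fixed sites from a probability surface, while presence-only data is drawn as an inhomogeneous Poisson point pattern (via thinning) from an intensity function.

```python
import math
import random


def covariate(s):
    """Hypothetical environmental covariate on the unit square (illustrative only)."""
    x, y = s
    return math.sin(2.0 * math.pi * x) + y


def presence_probability(s, b0=-0.5, b1=1.0):
    """Assumed point-level probability of presence p(s), using a logistic link."""
    return 1.0 / (1.0 + math.exp(-(b0 + b1 * covariate(s))))


def intensity(s, a0=3.0, a1=1.0):
    """Assumed log-linear intensity lambda(s) for the presence-only point pattern."""
    return math.exp(a0 + a1 * covariate(s))


def simulate_presence_absence(sites, rng):
    """Locations are fixed; each site yields one Bernoulli trial with probability p(s)."""
    return [1 if rng.random() < presence_probability(s) else 0 for s in sites]


def poisson_count(mean, rng):
    """Draw a Poisson(mean) count by counting unit-rate arrivals in [0, mean)."""
    count = 0
    t = rng.expovariate(1.0)
    while t < mean:
        count += 1
        t += rng.expovariate(1.0)
    return count


def simulate_presence_only(rng, a0=3.0, a1=1.0):
    """Locations are random: thin a homogeneous Poisson process on the unit square."""
    # covariate(s) < 2 on the unit square, so this bounds the intensity when a1 >= 0.
    lam_max = math.exp(a0 + 2.0 * a1)
    points = []
    for _ in range(poisson_count(lam_max, rng)):
        s = (rng.random(), rng.random())
        # Keep each candidate with probability lambda(s) / lam_max.
        if rng.random() < intensity(s, a0, a1) / lam_max:
            points.append(s)
    return points


rng = random.Random(0)
sites = [(i / 4, j / 4) for i in range(5) for j in range(5)]  # fixed survey sites
pa = simulate_presence_absence(sites, rng)  # one 0/1 response per fixed site
po = simulate_presence_only(rng)            # random number of random locations
print(len(sites), sum(pa), len(po))
```

In this sketch the presence/absence output always has one entry per chosen site, whereas both the number and the positions of the presence-only points are random. This is the sense in which the two data types call for different model specifications.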


research
11/15/2017

Spatial Joint Species Distribution Modeling using Dirichlet Processes

Species distribution models usually attempt to explain presence-absence ...
research
02/18/2022

Preferential Sampling for Bivariate Spatial Data

Preferential sampling provides a formal modeling specification to captur...
research
08/20/2021

A comparison of different clustering approaches for high-dimensional presence-absence data

Presence-absence data is defined by vectors or matrices of zeroes and on...
research
10/14/2014

Presence-absence reasoning for evolutionary phenotypes

Nearly invariably, phenotypes are reported in the scientific literature ...
research
03/30/2021

Integration of presence-only data from several sources. A case study on dolphins' spatial distribution

Presence-only data are a typical occurrence in species distribution mode...
research
03/31/2022

On site occupancy models with heterogeneity

Site occupancy models are routinely used to estimate the probability of ...
research
02/08/2020

The Bloom Tree

The Bloom tree is a probabilistic data structure that combines the idea ...
