Integration of presence-only data from several sources. A case study on dolphins' spatial distribution

by Sara Martino, et al.

Presence-only data are common in species distribution modeling. They record the locations where a species was observed but carry no information on absences, and models fitted to them usually do not account for detection biases. In this work, we merge three different sources of information to model the presence of marine mammals. The approach is fully general and is applied, as a case study, to two species of dolphins in the Central Tyrrhenian Sea (Italy). Data come from research campaigns run by the Italian Environmental Protection Agency (ISPRA) and Sapienza University of Rome, and from a careful selection of social media (SM) images and videos. We build a log-Gaussian Cox process in which a different detection function describes each data source. For the SM data, we analyze several choices of detection function that account for detection biases. Our findings allow for a correct understanding of the distribution of Stenella coeruleoalba and Tursiops truncatus in the study area. The results show that the proposed approach is broadly applicable and is easily implemented in R using the INLA and inlabru packages. We provide example code with simulated data in the supplementary materials.
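The idea of one underlying point process observed through source-specific detection functions can be illustrated with a small simulation. The sketch below is not the authors' INLA/inlabru code, and it uses a fixed log-linear intensity rather than a Gaussian random field. Everything in it is an assumption made for illustration: the intensity coefficients, the half-normal detection shape, the transect at y = 0.5, the harbour at (0, 0), and all function names. It simulates a true point pattern on the unit square and then thins it independently for a survey source and a social media source.

```python
import math
import random

def intensity(x, y):
    """Hypothetical log-linear intensity (animals per unit area) on the unit square."""
    return math.exp(5.0 + 1.0 * x - 0.5 * y)

def half_normal(distance, sigma):
    """Half-normal detection probability, equal to 1 at distance 0."""
    return math.exp(-0.5 * (distance / sigma) ** 2)

def detect_survey(x, y):
    # Assumed survey design: a single transect along the line y = 0.5.
    return half_normal(abs(y - 0.5), sigma=0.1)

def detect_social_media(x, y):
    # Assumed observer effort: detection decays with distance from a harbour at (0, 0).
    return half_normal(math.hypot(x, y), sigma=0.4)

def sample_poisson(mean, rng):
    """Draw a Poisson count by counting unit-rate exponential arrivals before `mean`."""
    count, total = 0, rng.expovariate(1.0)
    while total < mean:
        count += 1
        total += rng.expovariate(1.0)
    return count

def simulate_points(rng, lam_max):
    """Simulate the true pattern by thinning a homogeneous Poisson process of rate lam_max."""
    n = sample_poisson(lam_max, rng)
    candidates = [(rng.random(), rng.random()) for _ in range(n)]
    return [(x, y) for x, y in candidates if rng.random() < intensity(x, y) / lam_max]

def observe(points, detect, rng):
    """Keep each true point independently with its source-specific detection probability."""
    return [(x, y) for x, y in points if rng.random() < detect(x, y)]

rng = random.Random(42)
lam_max = intensity(1.0, 0.0)  # the intensity is largest at x = 1, y = 0
true_points = simulate_points(rng, lam_max)
survey_obs = observe(true_points, detect_survey, rng)
social_obs = observe(true_points, detect_social_media, rng)
print(len(true_points), len(survey_obs), len(social_obs))
```

Each observed pattern is itself a Poisson process whose intensity is the true intensity multiplied by that source's detection probability. This is why fitting a model to either source alone, while ignoring detection, would confuse where the animals are with where the observers are.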







Spatial Joint Species Distribution Modeling using Dirichlet Processes

Species distribution models usually attempt to explain presence-absence ...

spOccupancy: An R package for single species, multispecies, and integrated spatial occupancy models

Occupancy modeling is a common approach to assess spatial and temporal s...

Accounting for spatial varying sampling effort due to accessibility in Citizen Science data: A case study of moose in Norway

Citizen Scientists together with an increasing access to technology prov...

A comparison of different clustering approaches for high-dimensional presence-absence data

Presence-absence data is defined by vectors or matrices of zeroes and on...

Preferential sampling for presence/absence data and for fusion of presence/absence data with presence-only data

Presence/absence data and presence-only data are the two customary sourc...

The use of the GARP genetic algorithm and internet grid computing in the Lifemapper world atlas of species biodiversity

Lifemapper ( is a predictive electronic atlas ...

Bayesian Heatmaps: Probabilistic Classification with Multiple Unreliable Information Sources

Unstructured data from diverse sources, such as social media and aerial ...