A Bayesian latent allocation model for clustering compositional data with application to the Great Barrier Reef

by Luiza Piancastelli, et al.

Relative abundance is a common metric for estimating species composition in ecological surveys, reflecting patterns of commonness and rarity in biological assemblages. This paper focuses on measurements of coral reef compositions, formed by four communities, gathered along Australia's Great Barrier Reef (GBR) between 2012 and 2017. We undertake the task of finding clusters of transect locations with similar community composition and investigate changes in clustering dynamics over time. During these years, an unprecedented sequence of extreme weather events (cyclones and coral bleaching) impacted the 58 surveyed locations. The dependence between the constituent parts of a composition presents a challenge for existing multivariate clustering approaches. In this paper, we introduce a finite mixture of Dirichlet distributions with group-specific parameters, where cluster memberships are dictated by unobserved latent variables. Inference is carried out in a Bayesian framework, and MCMC strategies are outlined to sample from the posterior distribution. Simulation studies are presented to illustrate the performance of the model in a controlled setting. The application of the model to the 2012 coral reef data reveals that clusters were spatially distributed in similar ways across reefs, which suggests that wave exposure may be a driver of coral reef community composition. The number of clusters estimated by the model decreased from four in 2012 to two from 2014 through 2017. Posterior probabilities of transect allocations to the same cluster increase substantially over time, indicating a potential homogenization of community composition across the whole GBR. The Bayesian model highlights the diversity of community composition within a coral reef and the rapid changes across large spatial scales that may contribute to undermining the future of the GBR's biodiversity.
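
To make the latent allocation idea concrete, the following is a minimal Python sketch of a finite mixture of Dirichlet distributions in which each composition carries an unobserved cluster label. It is not the authors' sampler: the function names, the mixture weights, and the Dirichlet parameters are all hypothetical, and only the allocation update (drawing labels given fixed parameters) is shown, not the updates for the weights or the group-specific parameters.

```python
import numpy as np
from math import lgamma


def dirichlet_logpdf(x, alpha):
    """Log density of Dirichlet(alpha) at a composition x (positive parts summing to 1)."""
    x = np.asarray(x, dtype=float)
    alpha = np.asarray(alpha, dtype=float)
    log_norm = lgamma(alpha.sum()) - sum(lgamma(a) for a in alpha)
    return log_norm + float(np.sum((alpha - 1.0) * np.log(x)))


def allocation_probs(x, weights, alphas):
    """Posterior probability that composition x belongs to each cluster,
    given mixture weights and group-specific Dirichlet parameters."""
    log_post = np.array(
        [np.log(w) + dirichlet_logpdf(x, a) for w, a in zip(weights, alphas)]
    )
    log_post -= log_post.max()  # stabilise before exponentiating
    probs = np.exp(log_post)
    return probs / probs.sum()


def sample_allocations(X, weights, alphas, rng):
    """Draw one latent label per row of X from its allocation probabilities.
    This corresponds to a single allocation step of a Gibbs-style sampler."""
    probs = np.array([allocation_probs(x, weights, alphas) for x in X])
    labels = np.array([rng.choice(len(weights), p=p) for p in probs])
    return labels, probs


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Hypothetical values: two clusters, four-part compositions.
    weights = np.array([0.6, 0.4])
    alphas = np.array([[20.0, 5.0, 3.0, 2.0], [2.0, 3.0, 5.0, 20.0]])
    # Simulate 58 compositions (the number of surveyed locations in the abstract).
    true_labels = rng.choice(2, size=58, p=weights)
    X = np.vstack([rng.dirichlet(alphas[k]) for k in true_labels])
    labels, probs = sample_allocations(X, weights, alphas, rng)
    print("agreement with simulated labels:", np.mean(labels == true_labels))
```

In a full MCMC scheme this allocation step would alternate with updates of the weights and Dirichlet parameters, and co-clustering probabilities of the kind reported in the abstract could be estimated by averaging, over iterations, how often two transects receive the same label.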


Bayesian Heterogeneity Pursuit Regression Models for Spatially Dependent Data

Most existing spatial clustering literatures discussed the cluster algor...

Bayesian clustering of high-dimensional data

In many applications, it is of interest to cluster subjects based on ver...

Bayesian Nonparametric Mixed Effects Models in Microbiome Data Analysis

Detecting associations between microbial composition and sample characte...

Bayesian spatial clustering of extremal behaviour for hydrological variables

To address the need for efficient inference for a range of hydrological ...

Multilayer Adjusted Cluster Point Process Model: Application to Microbial Biofilm Image Data Analysis

A common problem in spatial statistics tackles spatial distributions of ...

A Bayesian Approach to Restricted Latent Class Models for Scientifically-Structured Clustering of Multivariate Binary Outcomes

In this paper, we propose a general framework for combining evidence of ...

A method for Bayesian regression modelling of composition data

Many scientific and industrial processes produce data that is best analy...