A Bayesian Semiparametric Jolly-Seber Model with Individual Heterogeneity: An Application to Migratory Mallards at Stopover

11/05/2018 ∙ by Guohui Wu, et al. ∙ 0

We propose a Bayesian hierarchical Jolly-Seber model that can account for individual heterogeneity in departure and the dependence of arrival time on covariates. Additionally, our model provides a semiparametric functional form for modeling capture probabilities. The model is flexible and can be used to estimate the stopover duration and stopover population size, which are key to stopover duration analysis. From the modeling perspective, our model allows for individual heterogeneity in departure due to a continuous intrinsic factor that varies with time and individual. A stochastic process is considered to model the change of this intrinsic factor over time. Moreover, our model links extrinsic factors to capture probabilities and arrival time. Consequently, our proposed model enables us to draw inference about the impacts of the intrinsic factor on departure, and extrinsic factors on both capture outcome and arrival time. Through the use of a semiparametric model for capture probabilities, we allow the data to suggest the functional relationship between extrinsic factors and capture probabilities rather than relying on an imposed parametric model. By using data augmentation, we develop a well customized Markov chain Monte Carlo algorithm that is free of tuning. We demonstrate the effectiveness of our model through a motivating example of stopover duration analysis for mallards (Anas platyrhynchos) studied during fall migration in Sweden.



There are no comments yet.


page 37

page 38

This week in AI

Get the week's most popular data science and artificial intelligence research sent straight to your inbox every Saturday.

1 Introduction

Migration is a common phenomenon in birds, especially in areas with pronounced seasonal variation. However, in most species, migration is not conducted as a single flight from the breeding area to the non-breading area; rather it is broken down into shorter legs interspersed with stopovers of variable length at suitable sites where energy spent during migration can be replenished (e.g., see Newton, 2010, and the references therein). Mostly determined by the time spent at stopover sites, the overall speed of migration is tightly linked to behaviors at stopover sites, and the distribution and quality of stopover sites impacts the success and survival of birds during migration. A key to stopover duration analysis rests on understanding various species-specific stopover behaviors and how intrinsic and extrinsic factors contribute to these behaviors. For this reason, primary objectives in stopover studies are to estimate the timing of arrival and departure, stopover duration (i.e., the length of stay at a stopover site), stopover population size, and to understand the impacts of intrinsic and extrinsic factors. To accomplish these objectives, capture-recapture studies have been used extensively over the past few decades, with a variety of models being proposed for stopover duration analysis (e.g., see Pledger et al., 2009; King et al., 2010; Matechou, 2010, and the references therein).

The two most commonly used capture-recapture models for stopover duration analysis are the Cormack-Jolly-Seber (CJS) (Cormack, 1964) and Jolly-Seber (JS) models (Jolly, 1965; Seber, 1965). Among the many underlying assumptions for the CJS and JS models, two important assumptions are: (1) every individual that is captured needs to be correctly and uniquely marked; (2) every individual that is alive and present in the study area has an equal likelihood of capture and survival (i.e., homogeneous capture probabilities and survival probabilities) (Williams et al., 2002). The fundamental difference between the CJS and JS models is that the former conditions on the first capture while the latter does not. In relation to stopover duration analyses, the CJS model allows estimation of survival probabilities (i.e., stopover retention probabilities in stopover duration analysis), based on which one can adopt the life expectancy equation (Seber, 1982) to derive an estimator of the stopover duration (e.g., see Morris et al., 2006). To exemplify this, Kaiser (1995), Dinsmore and Collazo (2003), and Rice et al. (2007) demonstrate applications of the CJS model for estimating stopover duration. Importantly, the resulting estimate of the stopover duration from the CJS model can be biased due to the conditional nature of the model and unknown arrival time (Pledger et al., 2009).

Unlike its CJS counterpart, in addition to estimating capture probabilities and survival probabilities, the JS model can be used to estimate population size and entrance probabilities (i.e., the probability of entering the study area right before each sampling period). Schwarz and Arnason (1996)

present a general, yet flexible, formulation of the JS model that is advantageous in the sense that their approach explicitly incorporates the entrance probabilities into the likelihood function. As a result, the Schwarz and Arnason (SA) formulation of the JS model allows for a versatile modeling framework capable of imposing restrictions or incorporating covariates for the entrance probabilities. Moreover, it is shown that unbiased estimators for the entrance probabilities and their derived quantities can be achieved in the presence of heterogeneous capture probabilities

(see Schwarz, 2001, and the references therein). Based on the SA formulation, Royle and Dorazio (2008) provide a state-space formulation of the JS model under the Bayesian hierarchical modeling paradigm. In this setup, data augmentation (Tanner and Wong, 1987) is considered to facilitate Bayesian model estimation using freely available software such as WinBUGS (Lunn et al., 2000).

Building upon the SA formulation of the JS model, Pledger et al. (2009) develop a flexible stopover model under the frequentist framework to allow capture and stopover retention probabilities to depend on an unknown time since arrival . Apart from deriving indirect estimate of the mean stopover duration, they also consider modeling the stopover retention curve to examine different stopover departure patterns. To extend the stopover model by Pledger et al. (2009), Matechou et al. (2014) develop a mixture model where captured individuals do not need to be correctly and distinctly marked. In other words, data for such an extended model consists of counts of individuals captured in each sampling period rather than encounter histories of uniquely marked individuals. Subsequently, Lyons et al. (2015) develop a Bayesian stopover model that accommodates both encounter histories of uniquely marked individuals and counts of unmarked individuals. Their model allows for the estimation of capture and stopover retention probabilities, entrance probabilities, stopover population size, and stopover duration. In particular, the estimator of the stopover duration is derived from latent state variables that are introduced via data augmentation, following Royle and Dorazio (2008). Recently, Matechou et al. (2016) develop a stopover model by extending the JS model to allow individuals to arrive in different groups and hence their model accounts for heterogeneity in departure due to a group effect. Additionally, to address individual heterogeneity in arrival time due to a group effect, entrance probabilities are modeled using a finite mixture.

Despite the usefulness of the aforementioned stopover models, many real-world applications require development of a data-specific model. As in our motivating example, there is a need to address individual heterogeneity in migratory bird departure decisions due to a continuous intrinsic factor that varies with both time and individual. In addition, there is also a need to link the arrival time and capture probabilities to extrinsic factors and to infer the functional relationship between them. As a consequence, we develop a stopover model using data augmentation under the Bayesian hierarchical state-space framework. The methodological contributions can be described as follows. First, our model accounts for individual heterogeneity in departure due to a time-varying continuous individual covariate. Second, our model allows for a data-driven functional relationship between the capture probabilities and extrinsic factors through the use of smoothing splines, which enables us to detect a nonlinear temporal trend. Furthermore, our model links the arrival time to extrinsic factors and hence allows us to draw inference about their impacts on the time of arrival. More importantly, we develop a well-tailored Markov chain Monte Carlo (MCMC) algorithm for our proposed model to avoid tedious user-defined tuning.

This paper is organized as follows. Section 2 introduces the motivating data from mallard monitoring study. Section 3 presents the proposed state-space model and provides two goodness-of-fit criteria for model assessment. Section 4 describes the MCMC algorithm for our proposed model. A simulated example is presented in Section 5, illustrating the effectiveness of our modeling approach. Section 6 demonstrates the application of our methodology through a stopover duration analysis for our motivating data collected by the Ottenby Bird Observatory in Sweden. Discussion is provided in Section 7. Further details surrounding the full conditional distributions and the MCMC sampling algorithm are provided in a Supplementary Appendix.

2 The Mallard Data

The mallard (Anas platyrhynchos), is the most common and widespread dabbling duck in the Northern hemisphere and an important model species for studies of ecological processes (Gunnarsson et al., 2012), harvest management (Nichols et al., 2007), and epidemiology of bird borne infections (Latorre-Margalef et al., 2009). It is a partial migrant, where southernly populations in the distribution range tend to be resident and the northernmost obligatory migrants, and in other populations a mix of resident and migrants (Cramp and Simmons, 1977). The mallard is a medium-sized bird with heavy wing loading where migration is energetically costly. From ringing and telemetry studies it is clear that migratory mallards break up their journey into shorter flights and spend a large proportion of their migration time at stopover sites, replenishing resources and preparing for the next leg of migration (Gunnarsson et al., 2012). Thus, stopover sites have a key role for successful migration and survival of mallards, and a priority for sustainable mallard management is to better characterize the ecology of birds at stopover. This includes assessing the timing of migration and densities of birds at specific stopover sites and to what extent intrinsic and extrinsic factors (e.g., body condition and weather) affect stopover behaviors.

In birds, fat is the main fuel for migration and it remains to be known how mallards adjust their stopover behavior and departure according to their refueling rates at the stopover site and their current body condition in terms of fat loads (Berthold, 2001). In addition, weather is known to be linked with bird migration during departure but also aloft. In general, birds prefer initiating a flight when winds provide flight assistance, i.e., tailwinds, and under other conditions favorable for flying, such as under low rainfalls (Berthold, 2001). Furthermore, understanding how and when mallards use stopover sites is a key step in forecasting avian influenza dynamics at these sites (Gunnarsson et al., 2012).

Despite their importance in research, a lot remains to be determined in regards to mallard migration ecology, especially during the less well-studied stopover periods. Key objectives for monitoring studies of mallards—and indeed for other migratory birds more generally—are to understand stopover retention probabilities, stopover duration, total stopover population size (i.e., the total number of individuals present) at specific sites, and the effects of intrinsic and extrinsic factors on migratory decisions and stopover behaviors. Here we use long-term capture series of mallards carried out at Ottenby Bird Observatory on the Swedish island of Öland in the Baltic Sea (N, E) (see Figure 1). This scheme started in 2002, and originally aimed for monitoring presence of influenza A virus in birds, but the data of banded individuals over time is also very suitable for addressing stopover ecology questions. The southernmost part of this island is an attractive stopover site for mallards within the Northwest European flyway, offering habitats for both roosting and foraging (Bengtsson et al., 2014). Mallards that utilize our study site—Ottenby—mainly originate from mainland Sweden, Estonia, Finland, and Russia (Gunnarsson et al., 2012). After leaving Öland, these mallards migrate to wintering areas in Northwestern Europe, predominantly in southern Denmark, northern Germany, and the Netherlands (Gunnarsson et al., 2012).

To collect data, Ottenby Bird Observatory used a stationary trap at the study site to catch mallards for ringing and epidemiological studies. In particular, mallards were attracted by bait grain and by the presence of a few (normally around 10) domestic ducks kept in a compartment of the trap. Traps were inspected daily during the field seasons and any wild duck captured was ringed and measured for structural size (i.e., the distance from the tip of the bill to the back of the head) and body mass, and subsequently released. This data collection process, over the course of a stopover season, results in the capture-recapture data. The data available for analysis was collected from 2004-2011, during the autumn migration season, which begins on August 1st and ends on December 16th of each year.

Motivated by the mallard data at hand, our primary goal is to develop a model that accomplishes three important research objectives. The first objective is to determine whether there is individual heterogeneity in mallards’ departure due to the intrinsic factor—body condition (i.e., body mass corrected by the structural size). The second objective is to estimate stopover duration, daily stopover population sizes, and total stopover population size, as well as to detect whether there is a temporal trend for daily stopover population sizes. The third objective is to understand how extrinsic factors such as wind and temperature relate to the timing of arrival and departure for mallards at our study site.

3 Methodology

3.1 Parameters and Notation

Consider a capture-recapture experiment with sampling occasions at distinct times studying a population regarding a particular species of interest. Further, we assume the population size for population during the study is , an unknown parameter that needs to be estimated. For , let denote the time interval between two consecutive sampling occasions and . Without loss of generality, we assume ; i.e., for . In addition, let be the total number of individuals that are caught during the study. For each individual being caught, denote as the corresponding capture history, where

is a binary variable indicating if individual

is caught at occasion for and ; that is, if individual is caught at occasion and 0 otherwise. Upon the capture of each individual animal, measurements on a set of individual covariates are taken and recorded.

Motivated by the mallard data, we consider the single covariate case and allow the individual covariate to be continuous and time-varying. In the current context, we emphasize that the values of such a covariate for an individual are observable only when the individual is captured. As a result, we need to model the evolution of the time-varying continuous individual covariate.

3.2 Modeling Continuous Covariates

Let be a continuous variable at time . We assume that follows an Ornstein-Uhlenbeck (OU) process; i.e., satisfies a stochastic differential equation of the form



controls the noise variance,

describes the rate of mean reversion, is the long-term (or asymptotic) mean, and is a standard Wiener process on . It is straightforward to see that by setting , (1) reduces to the von Bertalanffy growth equation (von Bertalanffy, 1938). The use of the OU process in the current context is advantageous. The extra random noise term in the OU process provides increased flexibility, accounting for random noise resulting from several factors, e.g., measurement error and/or random variation due to changes in the environmental conditions (Filipe et al., 2010). For and denote , the OU process is stationary (i.e., and are identically distributed), Markovian (i.e., ), and follows a multivariate Gaussian distribution (see Finch, 2004, and the references therein) for and .

The two moments of the OU process are:

and . For , it follows that the transition distribution takes the following form


(see Filipe et al., 2010, and the references therein). Compared with the diffusion process used by Bonner and Schwarz (2006, 2009) and Schofield and Barker (2011), the OU process we consider provides estimates for the rate parameter and long-term mean .

For and , the time-varying continuous individual covariate is assumed to satisfy the OU process defined by (1). Hence, at discrete sampling times , the conditional distribution for takes the following form

where is the realization of and .

3.3 Semiparametric Jolly-Seber Model with Individual Heterogeneity

The JS model we propose is formulated under the state-space framework. In particular, our proposed model is characterized by a state model, observation model, and parameter model. The state model describes the states of an individual over time, whereas the observation model describes the capture outcome of an individual over time. Throughout this article, we use the term “state” to describe two statuses of an individual in the population, which are either alive and present in the study area (denoted by 1) or not having entered the population or death (denoted by 0). The parameter model describes how certain model parameters are linked to the intrinsic and extrinsic factors.

Let where is a binary latent variable to indicate the state of individual at time for and . Note that the dimension of varies with , a parameter that is unknown. Consequently, the number of parameters is not fixed in each iteration of MCMC, which will cause some computational disadvantages. To maintain a constant number of parameters, a data augmentation technique is often utilized (e.g., see Royle and Dorazio, 2008). For our model, the data augmentation technique involves two steps. The first step is to introduce a parameter , and augment the observed data configuration by , where for . Second, for , we associate a binary membership indicator with each of individuals; i.e., . In other words, if individual is a member of and 0 otherwise.

3.3.1 State Model

Following Royle and Dorazio (2008), the state model can be defined by


where indicates whether an individual can enter the population right after time for and . In addition, is the indicator function that takes value 1 if and 0 otherwise. For , refers to survival probability (or stopover retention probability in a stopover model), i.e., the probability that an individual of will remain in the study area at time given its presence in the study area at time . Moreover, denotes the conditional entrance probability at time given that an individual has not entered the study area, that is,

for and denotes the proportion of that enters the study area between time and . By definition, it follows that .

The interpretation of the state model described in (2) and (3) is straightforward. First, (2) indicates that individual is subject to entrance with probability at time only if it is a member of (i.e., ). In (3), we see that if individual has not entered the study area right before time (i.e., ), it is subject to entrance with probability given it is a member of . Second, if individual has already entered and is present in the study area at time , it will remain in the study area at time with probability .

3.3.2 Observation Model

We proceed with the observation model. For , denote as the capture probability at time . The observation model is given by


for . According to (4), we are solely interested in the capture outcome for individuals that are members of (i.e., for any such that ). Moreover, for individual that is captured at least once during the study (i.e., ), it is clear that is implied. In addition, individual is subject to capture at time only if it has entered and still remains in the study area (i.e., ).

An important feature of building the JS model from the “individual” up is that it enables us to estimate certain quantities that are important in stopover duration analysis fairly easily. For example, the total stopover population size, , can be estimated as . The stopover population size at time , , can be estimated as . Moreover, we can estimate the mean stopover duration averaged over all captured individuals as (Lyons et al., 2015)

The number of individuals alive at both times and , say , can be calculated as .

3.3.3 Parameter Model

The parameter model links capture, departure, and entrance parameters with various types of covariates. We consider a semiparametric model for the capture probabilities. The departure probabilities are linked to a time-varying continuous individual covariate to account for individual heterogeneity. Additionally, we consider a model that links the entrance probabilities to time dependent covariates to infer the impacts of these covariates on the timing of arrival.

Starting with capture probabilities , we consider a semiparametric model of the form


where and is the number of knot points. Here is a vector consists of values for covariates at time ; and denotes a vector of regression coefficients. Moreover, it is assumed that where is a matrix whose th entry is for . Following Ruppert et al. (2003), the fixed knot

is chosen to be sample quantile of the

’s corresponding to probability for where . Let be the matrix with th row , (5) can be reparameterized as


where and is the th row of the matrix . Due to this reparameterization, it holds that where is a identity matrix.

From a modeling perspective, the parametric part of (6) posits a linear relationship between covariates and the logit of . In comparison, the nonparametric part of (6) allows for a greater flexibility in the sense that the shape of the functional relationship between the covariate and the logit of is determined by the data instead of assuming a particular parametric form a priori. For the nonparametric part of the model in (6), we consider low-rank thin-plate splines due to their appealing numerical properties in Bayesian computation. That is, the parameters associated with low-rank thin-plate splines tend to be less correlated than parameters associated with other basis functions, which leads to better mixing of the MCMC chains in Bayesian analysis (Crainiceanu et al., 2005).

Define as the departure probability of individual at time for . Strictly speaking, departures can arise from three outcomes—start of a migratory flight, relocation to another habitat that is not covered by traps, and death. When the sampling period is relatively short, as it is the case in our motivating mallard example, death between two consecutive sampling periods is almost negligible. As a result, the term departure primarily refers to start of another migratory flight or relocation to another habitat. We link to an intrinsic factor as follows


Here the realization of a time-varying continuous individual covariate (i.e., ) accounts for individual heterogeneity in departure. As previously mentioned, the inclusion of a time-varying continuous individual covariate raises some computational concerns. First, for an individual that is not captured at time , the value of is not observable. Further, for individuals that are never captured during the study, we do not observe any values for

. Accordingly, the implementation of the JS model we propose requires us to establish a model for the covariate such that missing values can be “imputed” by conditioning on the observed data. To achieve this goal, we assume

follows the OU process discussed in Section 3.2.

For entrance probabilities, we consider the following model


where denotes a vector consists of the values of covariates at time for . Furthermore, is a vector of regression coefficients. Due to the implied restriction , (8) is equivalent to the following

3.4 Priors and Posteriors

To complete the specification of our model, we need to assign prior distributions for the model parameters and derive the full conditional distributions. Denote , the set of parameters in the model we propose is . Denote

as the inverse gamma distribution with shape parameter

and scale parameter , we assign prior distributions as follows: ; ; ; ; for ; ; ; ; ; ; ; and . In our implementation, we choose vague priors that are noninformative relative to the scale of data.

Let denote the observed capture history. Assuming conditional independence, the joint posterior distributions of the model parameters can be derived as

3.5 Model Assessment

An extremely important aspect of Bayesian modeling is to evaluate goodness-of-fit for the model being considered. In the context of capture-recapture models, the Bayesian p-value is often considered (e.g., see King et al., 2010, and the references therein)

. Roughly speaking, the Bayesian p-value is a posterior probability that measures the similarity between the data generated from the posterior predictive distribution under a specified model and the observed data. To calculate the Bayesian p-value, we first define a discrepancy function

, where and denote the data and the parameters for the model being considered, respectively. Then, we calculate the value of the discrepancy function for both the observed data and the simulated data , which is generated conditioning on the posterior distribution of model parameters. Finally, the Bayesian p-value is defined as the percentage of times that values of the discrepancy function for exceeds those of the discrepancy function for . Mathematically, the definition of the Bayesian p-value, , can be formulated as . As a rule of thumb, a Bayesian p-value close to 0 or 1 indicates that the model being considered does not provide a good fit to the data and that there is inconsistence between the model and data (see Chapter 6 in Gelman, 2003).

For the model we propose, goodness-of-fit requires the assessment of two components. On the oned hand, we need to assess the goodness-of-fit for the overall JS model to the data. On the other hand, we need to evaluate the use of the OU process regarding modeling the time-varying continuous individual covariate. Consequently, it suffices to calculate the Bayesian p-values for the JS model and for modeling the individual covariate using the OU process. Among many choices of the discrepancy function (e.g., see Brooks et al., 2000), we used the complete log-likelihood function for ; i.e., , where is the complete log-likelihood function of given all model parameters excluding (i.e., ) and the data . Different from Bonner and Schwarz (2009), for , we compare the observed and expected value of the individual covariate for each capture rather than recapture and consider the discrepancy function to be

where is the total number of captures over sampling occasions and

denotes the standard deviation for the distribution of


4 MCMC Algorithm

We describe our customized MCMC sampling algorithms for , , , and . For the rest of model parameters, the details are provided in the Supplementary Appendix.

4.1 Sampling

We now discuss how to update the latent variables . For , we first define three sets as follows:

The update of will depend on which category an individual falls into. For example, if an individual is not a member of , i.e., , we always fix . Second, for an individual , it is captured at least once during the sampling occasions. As a consequence, would necessarily imply for , since an individual needs to be alive and present in the study area in order to be available for capture. In this case, the simulation of depends on the structure of . Consider a capture history of the form


with . It is clear that the corresponding latent states takes the form of , where denotes missing states to be simulated.

We start with the updating scheme of for . To simplify notation, we denote and as the first and last times that an individual is captured. We adopt a block updating scheme similar to Dupuis and Schwarz (2007). Specifically, let be the Type I block that consists of state variables corresponding to sample times up to . Further, denote as the Type II block that consists of state variables corresponding to sampling occasions after . For example, for the capture history in (9), we have and . Before we proceed with the simulation for Type I and Type II blocks, we need to introduce some further notation. Let denote the probability that individual enters the population, is still alive, and is not seen before time , the following recursive relationship holds

for and . Consequently, for Type I block , we can update according to where and

for .

Next, we discuss the simulation for latent state variables in the Type II block. Let denote the probability that an individual of leaves the study area after time , we can then obtain using the recursion

for and . Accordingly, for , we can update by first simulating from

and then update according to

Lastly, we address the simulation of latent state variables for an individual of that is never captured during the entire study (i.e., ). To achieve this goal, let denote the probability that individual of is never captured. We can derive the following


To perform Type I block simulation, we first determine the time that individual of first enters the population according to with and

for . After determining the time of entrance into the population, we need to perform Type II block simulation to ascertain the status of individual after its entrance. For the sake of brevity, the details are omitted here due to its similarity with the Type II block simulation for in the previous discussion.

4.2 Sampling and

Denote , the joint conditional distribution of takes the form of


which is not of standard form. To avoid tuning, we take advantage of the following results (Polson et al., 2013)


Here denotes a Pólya–Gamma distribution with parameters and

and the corresponding probability density function being

(Polson et al., 2013):

Combining (11) and (12) yields with

where and . Moreover, for . Regarding the conditional distribution of , we have with

where is a matrix whose th row consists of . Advantageously, by introducing another layer of data augmentation using Pólya–Gamma distribution random variates, the full conditional distributions for and have a standard form.

4.3 Sampling

We describe the sampling algorithm for membership indicator , . For individuals , it is straightforward to see that , i.e., . In other words, for individuals that are captured at least once during the study, they are members of . For an individual that is never captured, i.e., , we can apply Bayes rule to arrive at:

and hence, we can sample according to .

5 Simulated Example

To evaluate the performance of our proposed model, we consider a simulated example where the exact model specification is chosen for illustration. For this simulation, we set and . For the parameters specific to the OU process, we set , , , , , and (for ). In terms of the model for departure probability , we consider

where , , and . In addition, is the realization of a time-varying continuous individual covariate satisfying the OU process with the aforementioned parameter specification.

For the model associated with the capture probabilities , we consider

where for ; and , , and are three time dependent covariates. These three covariates are simulated according to for . For the regression coefficients , we consider . In addition, is the th row of matrix . Here is the matrix with th row for and ; and is a matrix whose th entry is for . Moreover, the th fixed knot is chosen as the sample quantile of