Hierarchical Bayesian state-space modeling of age- and sex-structured wildlife population dynamics

05/15/2020 ∙ by Sabyasachi Mukhopadhyay, et al. ∙ IIM Udaipur 0

Biodiversity is declining at alarming rates worldwide, including for large wild mammals. It is therefore imperative to develop effective population conservation and recovery strategies. Population dynamics models can provide insights into processes driving declines of particular populations of a species and their relative importance. We develop an integrated Bayesian state-space population dynamics model for wildlife populations and illustrate it using a topi population inhabiting the Masai Mara Ecosystem in Kenya. The model is general and integrates ground demographic survey with aerial survey monitoring data. It incorporates population age- and sex-structure and life-history traits and relates birth rates, age-specific survival rates and sex ratio with meteorological covariates, prior population density, environmental seasonality and predation risk. The model runs on a monthly time step, enabling accurate characterization of reproductive seasonality, phenology, synchrony and prolificacy of births and juvenile recruitment. Model performance is evaluated using balanced bootstrap sampling and comparing predictions with aerial population size estimates. The model is implemented using MCMC methods and reproduces several well-known features of the Mara topi population, including striking and persistent population decline, seasonality of births and juvenile recruitment. It can be readily adapted for other wildlife species and extended to incorporate several additional useful features.



There are no comments yet.


page 1

page 2

page 3

page 4

This week in AI

Get the week's most popular data science and artificial intelligence research sent straight to your inbox every Saturday.

1 Introduction

Biodiversity is declining worldwide at such an alarming rate that biologists have christened the contemporary biodiversity loss as the sixth mass extinction (McCallum2015; Ceballosetal2017). Large mammal populations are particularly at risk in many ecosystems. Across continental Africa, in particular, many populations of large mammal species are undergoing disturbing declines (Craigieetal2010; Chaseetal2016). In Kenya, for example, large herbivore populations declined by about 70% on average between 1977 and 2016 (Ogutupiepho2011; Ogutupiephoetal2016). It is therefore imperative to advance our understanding of large herbivore population dynamics as a basis for developing effective species conservation and management and population recovery strategies. A reliable population dynamics model can help quantify and evaluate the relative importance of multiple processes driving declines of particular populations of a species.

For populations inhabiting seasonally variable environments and reproducing seasonally, such models can additionally help quantify shifts in seasonality, phenology, synchrony and prolificity of births, juvenile recruitment and sex ratio in response to climate change. The models can also be used to estimate population trajectories and assess likely population responses to conservation and management interventions, projected future scenarios of climate change, human population growth, socio-economic development, land use and other changes.

Animal population dynamics models often use independent data collected using various methods, such as ground demographic surveys and aerial surveys. Population dynamics models are also increasingly using information from multiple sources to make inferences on various features of populations. Notably, integrated state-space models are becoming widely used to combine different types of data from disparate sources to make joint inferences on animal population dynamics (Trenkeletal2000; Besbeasetal2005; Rhodesetal2011; Maunderetal2015; Mosnieretal2015).

Here, we develop an integrated population dynamics model for large wild herbivores that integrates aerial survey data with fine-resolution ground demographic survey data. The model can be used to predict large herbivore population dynamics and evaluate the relative importance of various factors driving their population dynamics. The model accounts for influences on large herbivore population dynamics of variation in climatic components, notably rainfall and temperature and their interactions; predation, density-dependence, population age and sex (adult sex ratio) structure, gestation length, weaning period, adult female pregnancy status, adult females available to conceive, females reaching the age of first-time conception, birth, juvenile and adult recruitment, age- and sex-specific survival rates and environmental seasonality. The model runs on a discrete monthly time step to reliably track temporal variation in female pregnancy status, birth, juvenile and adult recruitment, age and sex-specific survival rates, adult sex ratio, population size and inter-annual variation in reproductive seasonality, phenology, synchrony and prolificity of births. The model assumes that all births, recruitment or mortality occurs at the end of the month. The model is illustrated using the topi (Damaliscus lunatus korrigum) population inhabiting the Masai Mara Ecosystem of south-western Kenya but is very general and applicable to other large herbivore species. The model is developed within a hierarchical Bayesian framework and implemented using the Markov Chain Monte Carlo (MCMC) method for parameter estimation, prediction and inference and bespoke code in the R software (rprog).

Our modeling approach possesses several attractive and desirable properties. First, it integrates different types of data from multiple sources, accommodates and permits straightforward representation of complex non-linear relationships between birth rate, age- and sex- specific survival rates and multiple covariates. Second, it allows efficient computation of posterior distributions of many parameters and uses a flexible Transformation MCMC (TMCMC) technique to enhance computational efficiency and accelerate convergence of iterations. Third, it is validated using balanced bootstrap sampling to generate multiple population trajectories and establish robustness. Lastly, it uses the Importance Resampling MCMC (IRMCMC) technique for the first time to accelerate MCMC iterations for multiple data sets each of which involves high computational costs.

The rest of the paper is organized as follows. We describe the data in Section 2. The Bayesian state-space model is described in Section 3 along with an evaluation of its performance using balanced bootstrap samples. In Section 4, we discuss the formulation of survival and adult sex ratio models, prior distribtions and other model components. In Section 5, we discuss the convergence of the MCMC chain and model validation. In Section 6, we present results of applying the state-space model to the Mara-Serengeti topi population. Finally, in Section 7 we discuss the results and extensions of the model.

2 The data

2.1 Ground vehicle age and sex composition sample surveys

The Masai Mara Ecological Monitoring Program (MMEMP) carried out monthly vehicle ground sample counts of seven ungulate species, including topi from July 1989 to December 2003. The monitoring sample surveys were conducted in the 1530 km Masai Mara National Reserve (1–1S, E) and in a small sliver of the pastoral lands adjoining the Reserve to the north and northeast. This region (called Mara) is located in the Narok County of Kenya and is the northern-most section of the Greater Mara–Serengeti ecosystem, covering about 40,000 km in south-western Kenya and northern Tanzania. The Mara was partitioned into three counting blocks, each with a fixed strip transect, using major rivers and roads (OgutuPiephoDublin2008a OgutuPiephoDublin2008a, Ogutuetal2009). The three transects jointly sampled 29.4% (450.4 km) of the Mara Reserve. During the 174 months of monitoring, counts were not carried out on only one of the three transects during 6 months. Moreover, no counts were made at all in another 17 months because of flooding, logistical and other constraints. The 17 months were distributed over nine years and eight calendar months (Ogutuetal2015b)

. To determine the functional forms of the relationships between birth rate, survival rates, or adult sex ratio and the covariates, all the missing ground counts were imputed using a transect-specific state-space model, separately for each age and sex class (

PiephoOgutu2007, PiephoOgutu2007; Ogutuetal2015b, Ogutuetal2015b). The full model was fit, however, using the original ground count data with all the missing values.

The monthly vehicle ground counts were carried out along the three strip-transects out to 1000 m either side of the transect centre line. The vehicle was driven off the transect path to within 200 m of each sighted group and stopped to reliably assign individuals to age and sex classes and count animals using binoculars. All sighted animal herds were counted for the target species. The same transects and counting procedures were used throughout the monitoring period to ensure representative and consistent coverage. Year round accessibility by car dictated the choice of transect paths. Each count took 4 days to conduct on average, with a counting day starting at 7:00–7:30 h and ending at 14:00–15:00 h.

The complete groups were recorded by date, counting block, species and number of individuals in each age and sex class during each count. A combination of body size, coat colour, horn length and shape was used to assign animals to size classes (newborn, quarter, half, yearling, three-quarter and adult size). Approximate ages of topi in each size class, in months, are presented in Table 2 in OgutuPiephoDublin2008a (OgutuPiephoDublin2008a, Ogutupiepho2011) and Table 1. Ages were not assigned to adult topi. Adult topi were sexed using the presence, size and shape of horns, dimorphic morphology of the external genitalia and other secondary sexual characters (OgutuPiephoDublin2008a). Animals were highly visible because of open grasslands, minimising the likelihood of misclassification into age–sex classes. Chances of misclassification of fully grown topi were further reduced by amalgamating the adult and three-quarter size classes (OgutuPiephoDublin2008a, OgutuPiephoDublin2008b, OgutuPiephoDublin2008b, Ogutuetal2015b). During the entire monitoring period 91,582 topi were aged and 78,738 were sexed (OgutuPiephoDublin2008a).

The apparent monthly birth rate (or recruitment) for topi was estimated as the total number of newborns counted in each month, divided by the corresponding number of adult females. The month of conception was estimated by backdating recorded birth months by the gestation length of topi (Western1979). The complete ground sample count data and total montly rainfall data for 15 stations are available at
https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0133744 (Ogutuetal2015a).

Age class Age class limit
Newborn Age 1 month
Quarter 1 month Age 6 months
Half-yearling 6 months Age 19 months
Adult 19 months Age
Table 1: Age-limits of the various topi age groups.

2.2 Aerial sample surveys

These data are independent of the vehicle ground sample age and sex structured counts. The Directorate of Resource Surveys and Remote Sensing of Kenya (DRSRS) started monitoring wildlife population size in Kenya’s rangelands, including the Mara, using systematic reconnaissance flights in 1977. The same sampling procedure was used to conduct 75 surveys in the Masai Mara ecosystem (6665.6 km) from 1977 to 2018 using 662 flights. The flights are carried out using high-winged aircraft (Cessna 186 or Partenavia 68) equipped with Global Navigation Systems, a Global Positioning System, Internal Communication System and radar altimeters for accurate navigation and mapping of animal distribution. The survey flights follow systematic, east-west oriented and parallel flight lines or transects spaced typically 5 km apart, following the Universal Transverse Mercator (UTM) coordinate system. North-south transects are used only where the terrain makes east-west flying at low altitude too dangerous to undertake. Each transect is subdivided into equal sample intervals, generally 5 km long (Ogutupiephoetal2016). A typical sampling unit is thus 5 5 km ( = 46 surveys) but 27 surveys had a higher (5 2.5 km) whereas the first two surveys in 1977 had a lower (5 10 km) spatial resolution. The total number of sampling units varied slightly across surveys during 1977–2018 (261.260.99, range 134–498 units, = 75 surveys). Animals are counted only within observation strips defined on the ground by two parallel rods (streamers) attached to the wing struts of the aircraft. The crew consists of a pilot, two rear-seat observers who count and record animals sighted within the strip width and one front-seat observer who records land use and cover and other environmental variables. Prior to each survey the aircraft is calibrated to the defined survey strip width. The average strip width was 266.1 29.78 m (range 210–366 m, = 75 surveys) and the average flying height was 75–122 m above ground level. The flight speed varies with terrain features but never falls below 190 km /hour. The average coverage or sampling intensity was 5.25 1.61% (range 2.0–10.88%, = 75 surveys).

Observations were recorded on video tapes. A 35 mm analogue camera was used to take oblique photos of groups of more than 10 animals for later correction of the visual estimates from 1977 to 1990s. The photos were captured on film rolls, which were then projected onto a large screen for accurate counting. From 1990s onwards, 35 mm digital cameras were used. The digital photos are downloaded and animals counted on large computer or digital television screens. Animals in earlier photos were counted with the aid of binocular microscopes or data projectors. Jolly’s method 2 (JollyGM1969)

for transects of unequal lengths is used to estimate population size and its standard error. The accuracy of population estimates derived from the aerial sample surveys ranges between 71% and 83% or higher

(OttichiloKhaemba2001; Georgiadis2011)

. A semiparametric generalized linear mixed model regression assuming a negative binomial error distribution and a log link function was used to interpolate the 75 aerial surveys to cover all the 174 months spanned by the vehicle ground surveys (

Ogutuetal2015b, Ogutuetal2015b; Ogutupiephoetal2016, Ogutupiephoetal2016). Only 14 aerial surveys were done in the period July 1989–December 2003 covered by the ground counts.

2.3 The Mara-Serengeti topi population

Topi weigh 100-120 kg, have elongate and narrow faces, ringed and lyrate horns. They are widely but patchily distributed across Africa from South Africa and Botswana in the south where they are called tsessebe, to Sudan in the north and Nigeria in the west, where they are called tiang. They select mesic, perennial grassland or savanna. The Greater Mara-Serengeti Ecosystem supports one of the largest remaining topi populations in East Africa (JarmanSinclair1979).

Topi numbers declined persistently by 73% (from 35,898 to 9,686 animals) in Kenya’s Narok County and by 80% (from 20,204 to 4,031 animals) within the Maasai Mara National Reserve between 1977 and 2016 (Ogutupiephoetal2016, Ogutupiephoetal2016; Veldhuis2019, Veldhuis2019). The causes of the troubling declines and their relative importance have not yet been fully quantified.

Topi is a resident grazer in the Mara–Serengeti ecosystem. There, births are seasonal, start in July and peak at the onset of the early rains in October-November, whereas conceptions peak at the start of the long rains in February-March. Births occur in all months but are rare from January to July (SinclairMduma2000, SinclairMduma2000; OgutuPiepho2010, OgutuPiepho2010; OgutuPiepho2014, Ogutuetal2015b). The gestation period of 8 months is followed by a lactation period of 3 months. Consequently, topi young are weaned after 3–4 months and nursing ceases before conceptions. Topi thus take about 11 months from one conception cycle to the next and give birth to one young per year. The young go through a hiding stage before following their mothers (Estes1991; SkinnerChimimba2005). Females attain sexual maturity after about 18 months. The pregnancy rate in Mara-Serengeti is 100% for adult but 86% for 2-year-old females (Duncanthesis1975).

Topi likely compete for limiting high-quality food in the dry season and the intensity of this competition likely increases with population density (JarmanSinclair1979). We therefore use topi population size to index intraspecific competition for resources. Lastly, topi predation is likely higher for the younger age classes and in the wet season. This is because numerous wildebeest (Connochaetes taurinus), Plains zebra (Equus quagga) and Thomson’s gazelle (Eudorcas thomsoni) migrating to, and occupying the Mara in the dry season, from the adjoining Serengeti Ecosystem in Tanzania to the south (southern migration) and Loita Plains in Kenya to the north-east (northern migration, which is virtually extinct) absorb most of the lion (panthera leo) and spotted hyena (crocuta crocuta) predation pressure falling on resident Mara ungulates, such as topi (ScheelPacker1979).

2.4 Rainfall and temperature

In African savannas vegetation production, quantity and quality are controlled by rainfall (Deshmukh1984, Deshmukh1984; Bouttonetal1988a, Bouttonetal1988a). Rainfall seasonality generates and sustains seasonality in food availability and quality for large herbivores (Bouttonetal1988b). Accordingly, rainfall is widely used to index food availability and quality for savanna herbivores. Seasonal temperature fluctuations additionally affect food quality for herbivores by governing the retention period of green plant leaf through the dry season.

In the Mara, rain falls in the wet (November–June) and dry (July–October) seasons. The wet season consists of the early wet-season (November–February, “short rains”) and the late wet-season (March–June, “long rains”) components. Rainfall distribution is thus bimodal, with a secondary peak in December–January separated from a primary peak in April–May by a short dry season in February (OgutuPiephoDublin2008a). Monthly rainfall was averaged over a network of 15 monitoring gauges spread over the Mara to account for spatial variation (Mukhopadhyayetal2019). The rainfall averaged 785.0 151.7 mm in the wet season and 213.7 76.4 mm in the dry season. The total monthly rainfall averaged 82.1 51.6 mm during 1989–2003. The annual rainfall total over 1965–2003 averaged 1010.1 187.3 mm, with all months receiving an average of no less than 46 mm of rainfall (Ogutuetal2015b). Hence, the dry-season nutritional shortfalls for ungulates should be relatively low. Severe El Niño Southern Oscillation (ENSO) droughts in 1993, 1997 and 1999–2000, plus mild droughts in 1991 and 1994 characterised rainfall during 1989–2003, whereas extreme floods, coincident with the strongest ENSO episode on instrumental record up to 2003, occurred in 1997–98 (Bartzke2018, Bartzke2018; OgutuPiephoDublin2008b, OgutuPiephoDublin2008b).

Maximum daily temperatures are lowest from May to August (Ogutuetal2015b). The monthly minimum temperature records for 639 months spanning January 1960 to March 2013 for Narok Town situated 75 km northeast of the Mara Reserve averaged 9.2 2.3 ℃ (range 0.3 - 17.4 ℃). The corresponding monthly maximum temperature averaged 24.9 2.1 ℃ (range 19.8 - 30.6 ℃). Temperatures are rising in the Mara with the minimum component increasing more steeply than the maximum (OgutuPiephoDublin2008b). The monthly averages of blended satellite-station maximum and minimum temperatures data for each 5 5 km grid cell in the Mara Ecosystem were also extracted from the Chirps data (Funketal2015) and used as covariates.

3 The integrated population dynamics state-space model

We aim to construct a general age- and sex-structured population dynamics model for large wild herbivores. The model uses the number of animals observed in the ground and aerial surveys (Sections 2.1 and 2.2) that are only samples from the unobserved true population about which we wish to make inferences. Thus, the population dynamics model entails two parallel but connected processes. The first is the unobserved true population that evolves over time (called state process) and the second are the observed counts over time (called observation process). A common approach to modelling both processes simultaneously is to use state-space models (Thomasetal2005, Thomasetal2005; Bucklandetal2007, Bucklandetal2007; Newmanetal2009, Newmanetal2009; Newmanetal2014, Newmanetal2014; Maunderetal2015, Maunderetal2015).

We develop an integrated population dynamics state-space model which couples a hypothetical mechanistic model of large herbivore population dynamics (state process model), with a statistical observation model of aerial survey and ground demographic data (observation process model). In the state-space model, the state process model predicts the true but unknown future state of the large herbivore population given its current state. The observation model weights the predictions by the likelihood of the data and thus links the process model to the observations. Consequently, the model integrates the aerial survey monitoring data with the contemporaneous but independent ground demographic survey data.

The state-space model involves quantifying birth recruitment and survival rates of various age and sex classes of the population and sex ratio as functions of climatic factors (e.g., rainfall and temperature), intraspecific competition, population density, predation and seasonality. Our state-space model shares similarities with the general approaches proposed by Bucklandetal2004 and Newmanetal2006 (Newmanetal2006; Newmanetal2014

) but also has some notable differences. In particular, we make different distributional assumptions for the initial states and the other components of the state and observation processes. Most crucially, our approach differs from theirs with respect to several structural assumptions and our proposal to model transition probabilities of the state process using log-linear models in which covariates such as rainfall, temperature and population density are used as predictors of birth recruitment and survival probabilities and adult sex ratio, similar to the Bayesian approach of

Brooksetal2004. Also, unlike Newmanetal2006, we illustrate our model using a non-migratory species. Lastly, our model incorporates several key life-history traits of large herbivores crucial to understanding their biology and population dynamics and uses rare fine-resolution ground demographic survey data and long-term monitoring aerial survey data.

In Sections 3.1 to 4.4, we describe the state process and observation models and the associated notations. In particular, we describe parameters of the state process model and how they link birth recruitment and age- and sex-specific survival probabilities and adult sex ratio with covariates in Sections 4.14.2. Accurate estimation of birth recruitment and age- and sex-specific survival probabilities and sex ratio is therefore a crucial step in developing the state-space model. As a result, the model explicitly allows for the dependence of birth recruitment and age- and sex-specific survival probabilities and sex ratio on food availability and quality, density-dependent intraspecific competition for food and large carnivore predation. Influences of these factors are indexed by past rainfall, minimum and maximum temperatures and their interactions, prior total population size and environmental seasonality.

The relatively large number of parameters considerably complicates their estimation using classical techniques. We overcome this difficulty using a flexible Bayesian state-space model and present forms of the prior and full conditional distributions in Sections 4.4 and 5 and in Section S.2 in the Supplementary materials.

3.1 The age and sex structured state-space model-process model

To construct the age- and sex-structured state-space model we first introduce notations for the different topi age and sex classes at time of the observation process as follows:
= observed number of newborns.
= observed number of quarter-sized animals.
= observed number of half-yearlings.
= observed number of adult females.
= observed number of adult males.

The state process involves the same age and sex classes. The true but unknown numbers of animals in each age and sex class are denoted by
= actual number of newborns.
= actual number of quarters.
= actual number of half-yearlings.
= actual number of adult females.
= actual number of adult males.

3.2 Assigning topi to actual ages in months

Since the exact age of topi is hard to determine through visual observation in the field, animals were only assigned to age and sex classes. However, the probability of survival likely varies with age and other temporally varying covariates, such as rainfall. For example, a newborn topi has to survive the first month of its life to join the quarter-size class. Likewise, a quarter-size topi has to survive through 5 consecutive months before graduating to the half-yearling class. So, we assign topi in the quarter and half-yearling age classes to actual ages in months as follows.

= Number of individuals in the quarter-size class with actual ages lying between and ( included but excluded), months,
= Number of individuals in the half-yearling class with actual ages lying between and ( included but excluded), months.
Note that and .

For adult females, tracking the reproductive cycle is essential for understanding population dynamics. Young topi start reproducing at about 19 months of age. The topi reproductive cycle spans 11 months, including 8 months for gestation and 3 months for lactation. We assume that a female cannot conceive during this 11-month period. If pregancy is prematurely terminated, however, then a fresh conception may occur within the 11-month period. But we do not have data to estimate the probability that a pregnant female topi fails to carry pregnancy to term and so do not consider it in the model. We track the pregnancy status of adult females in each of the the 11 months spanned by the reproductive cycle as follows:

= Number of adult females at time , that gave birth exactly months ago, ;
= Number of adult females at time , that gave birth at least 12 months ago.

The adult males are not split into subgroups. Male and female half-yearlings at time , denoted by , that join the adult class at time , are denoted by and , respectively. We denote by the probability that an individual graduating from the half-yearling class to the adult class is a female at time (henceforward referred to as sex ratio). is therefore the number of new adult female recruits that can conceive at time and give birth 8 months later. So, we add the new adult female recruits to , the number of adult females that gave birth exactly three months ago, and track future changes in the resulting total number. Note that . From the preceding definitons, it follows that a total of only females can conceive at time . All these processes are illustrated diagrammaticality in Figure 1.

Figure 1: Flow chart showing the reproduction and recruitment processes in topi. The notations are defined in the text.

3.3 Stochastic topi population dynamics

The stochastic topi population dynamics model for time can therefore be cast as follows:

As defined before, is the number of females that gave birth exactly one month ago. We assume that the number of females that gave birth at time is the same as the number of newborns recorded at time . (This is an underestimate because some newborn calves are almost certainly lost to predators and other mortality sources before they are counted.) The population dynamics of adult females is thus modelled as follows:


3.4 Initialising population size for topi age and sex classes

The time = 0 for our data corresponds to June 1989, one month before the start of the ground sample surveys. To model population size at time , we need to know the initial population size at time = 0. We denote the initial population distribution by
= Number of newborns at time = 0.
= Number of quarters of age at time = 0, 6.
= Number of half-yearlings of age at time = 0, 19.
= Number of adult females that gave birth months before time = 0.
= Number of adult males at time = 0.

The initial states for the different age and sex classes at time = 0 (namely, ,

, etc.) are assumed to follow normal distributions with means determined by the estimated population age structure at the initial time

= 0 and variances assumed to be all equal to 20,000.

We used the aerial survey and ground survey data to estimate the unknown age and sex structure of the initial population. First, we selected the ground sample counts for the month of June. We calculated the proportion of animals in all the different age and sex classes in each of the 15 years spanning 1989 to 2003 and averaged the proportion for each age and sex class across all the 15 years. We used this average to represent the population proportion for each age and sex class at time = 0 for the month of June. To derive the initial population size estimate for each age and sex class, we multiplied the total population size estimated from the aerial survey data at time = 0 (June 1989) with the average proportion for each age and sex class for June. We use these initial population size estimates for each age and sex class as the means of the normal distributions for the corresponding initial states, , , etc.

3.5 Observation process model

The hidden states are linked to the observed counts ( , , ,

) assuming Poisson distributions as described in equation



with parameters ,

each assumed to follow a gamma distribution as in equation



where and . The parameter is a constant and can be interpreted as the variance of the actual population size. Similarly, we define , , , , , , and .

We assume that has expected value . But some newborn topi are almost certainly missed during the ground surveys because topi hide their young for some time soon after birth, carnivores kill some newborns whereas others are simply missed due to visibility bias. To account for potential underestimation, we therefore multiplied with a correction factor of 1.7, based on experimentation, before determining birth rates in section 4.2. More generally, however, sightability bias in can be modeled by allowing to follow a Poisson distribution with parameter , where is a proportionality factor.

4 Predicting expected number of animals in each age and sex class using simultaneous linear equations

We used an interdependent system of linear regression equations

(Theil1971) to estimate the expected total number of animals in each age and sex class present in the ground sample in each month. The current month endogenous variables appear as regressors in equations for other age or sex classes in the system of simultaneous equations. The model accounts for potential correlation of errors for the set of related regression equations to improve the efficiency of parameter estimates. The modelling framework used estimation procedures that produce consistent and asymptotically efficient estimates for the system of linear regression equations. We imposed linear restrictions on some of the parameter estimates and fitted the model in the SAS SYSLIN procedure (SAS Institute 2019). The SAS code used to fit the simultaneous linear equations are provided in S1 Text in the Supplementary Materials.

4.1 Notations for predictors of birth recruitment and survival rates

We use the following notations for the covariates used to model topi birth recruitment and survival rates and adult sex ratio. We denote time by and the 174 months covered by the vehicle ground counts by . The calender month at time is denoted by
= Month corresponding to January-December and numbered as 1, 2, , 12.
= Average total monthly rainfall including lags 6 to 10 at time (i.e., 7 to 11 months before the birth month).
= Total topi population size at lag 7 at time (i.e., around conception time 8 months ago).
= Minimum temperature at time .
= Maximum temperature at time .
= Minimum temperature at lag at time , , 11 (i.e., up to 4 months pre-conception).
= Maximum temperature at lag at time , , 11.
= Total population size at lag 1 at time .
= Total monthly rainfall at lag at time , . Note that stands for total monthly rainfall at time .
= Total wet season rainfall at lag 1 (i.e., in the immediately preceding year) at time .
= Total dry season rainfall at lag 1 (i.e., in the immediately preceding year) at time .
= Moving average of rainfall between lags and () at time .

4.2 Determining the birth recruitment, sex ratio and age-specific survival functions

We used a logistic regression model with a binomial error distribution and a logit link function to relate the proportion of adult females that gave birth (1.7

) to the total population size of topi averaged over 7–11 months before the birth month and prior minimum and maximum temperatures and prior rainfall. We similarly related the survival rates for the quarter size class, half-size class plus yearlings, adult female and adult male topi to the total topi population size in the preceding 1 to 3 months, seasonality (indexed by month), prior minimum and maximum temperatures and prior rainfall. For modelling adult sex ratio, seasonality (indexed by the dry (June-September) and wet (January-May, November-December) seasons), prior rainfall during the wet and dry seasons and prior temperature are used as covariates in the logistic regression model. The definition of wet and dry seasons used with adult sex ratio is based on the distribution of topi numbers across months and differs from the meteorological wet season that spans November-June and dry season that covers July-October. The particular lagged values of topi population size, lagged and moving average values of minimum and maximum temperatures and rainfall considered for each age class and adult sex class are summarized in Tables 2 and 3.

Rainfall component
Covariate Months covered Moving averages Lags Lagged Moving averages
Monthly rainfall Each month Mavrain1–Mavrain5 Rain1–Rain5 Mavrain6-9, Mavrain6-10, Mavrain7-10, Mavrain7-11
Early wet season Nov–Feb Mavearlywet1–Mavearlywet3 Earlywet1
Late wet season Mar–Jun Mavlatewet1–Mavlatewet3 Latewet1
Early dry season Jul–Aug Mavearlydry1–Mavearlydry3 Earlydry1
Late dry season Sep–Oct Mavlatedry1–Mavlatedry3 Latedry1
Wet season Nov–Jun Mavdry1–Mavdry3 Wet1
Dry season Jul–Oct Mavlatewet1–Mavlatewet3 Dry1
Annual Nov–Oct Mavannual1–Mavannual3 Annual1
Table 2: Rainfall components, the months covered by each component, moving averages, lags and lagged moving averages computed for each component and used as predictors of survival probabilities and adult sex ratio.
Temperature Component
Covariate Months covered Moving averages Lags
Min Nov–Oct Mavmin1–Mavmin5 Min1–Min5
Max Nov–Oct Mavmax1–Mavmax5 Max1-Max5
Population component
Covariate Months covered Moving averages Lags
Population size Each month Pop1-Pop8
Table 3: Temperature, population components, the months covered by each component, moving averages, lags and lagged moving averages computed for each component and used as predictors of survival probabilities and adult sex ratio.

For each age and sex class we considered all the pertinent lagged topi population size, lagged and cumulative moving average values of the minimum and maximum temperatures and rainfall. We observed the principle of marginality (Nelder2000)

to ensure that interaction, quadratic and cubic terms in the covariates are included only if the main effects are already included in the model. We used forward selection to select variables to retain in the model. We used automatic model selection based on the Akaike (AIC) and Corrected (AICc) Akaike Information Criteria to select the best supported model by using the SAS HPGENSELECT procedure (SAS Institute, 2019). We re-fit the selected best model for each age and sex class and adult sex ratio using the SAS GLIMMIX procedure (SAS Institute, 2019). We estimated and tested parameters in the final model for statistical significance and estimated the 95% confidence limits of the parameters. The selected best model was simplified by dropping clearly insignificant effects. The estimates of parameters generated by SAS GLIMMIX procedure were used as hyperparameters of the prior distributions of the regression coefficients.

The functional forms of the logit regression models relating birth recruitment and age-specific survival rates and adult sex ratio to covariates are given in equations 5 through 8. Descriptions of the notations used in these equations are provided in Section 4.1.


The SAS codes for relating birth recruitment and survival rates and adult sex ratio to the various covariates are provided in S2 Text in the Supplementary Materials.

4.3 Determining predation risk

The birth and survival rates are adjusted for environmental seasonality and seasonality in predation risk. During the dry season (July-October) migratory herbivores occupy the Mara, generating a superabundance of food for large predators thereby considerably reducing predation risk for resident large herbivores, such as topi. So, we assume, based on parameter tuning, that predation risk for topi during the dry season is 70% of the risk during the wet season when the migrants are absent from the Mara. Also, adult male topi often fight each other and defend mating territories, potentially elevating their susceptibility to predation. Thus, we assume, also based on parameter tuning, that the survival rate for adult males is 99.7% that of adult females.

4.4 Prior distributions

The prior distributions for the regression coefficients , , , and of the logit regression models are assumed to be normal with the same means and diagonal covariance matrices with diagonal elements equal to the estimated covariance components for the corresponding regression coefficients from the SAS GLIMMIX procedure. The empirical choice of the priors ensures good mixing of the MCMC chains.

We assumed a prior on in equation 4 following a gamma distribution, Gamma(50,000, 0.000001), which has a very large average variance for topi population size of 50,000,000,000. Prior distributions for the initial states (, , etc.) are specified in Section 3.4.

5 Full conditional distributions and convergence of MCMC chains

The forms of the full conditional distributions of , , , , and other parameters are presented in Section S.2 of the Supplementary materials. The functional forms of the full conditional distributions of , , and (collectively refered to as ’s) and other parameters are also presented in Section S.2 of the Supplementary materials.

The functional forms of the full conditional distributions for the ’s are not conformable to Gibbs sampling. Moreover, the resulting Metropolis-Hastings chain for the ’s converge quite slowly, making the algorithm highly inefficient. To accelerate the rate of convergence of the chain we implement Transformation Markov Chain Monte Carlo (TMCMC) at the Metropolis-Hastings step. A theoretical discussion of TMCMC can be found in DuttaBhattacharya2014. Details on how TMCMC was specialized for our chain are discussed in Section S.3 of the Supplementary materials. The MCMC simulations were continued for 1,000,000 iterations after discarding the initial 100,000 iterations. Trace plots for the various ’s are shown in Figures S1–S4 in the Supplementary materials and indicate good convergence of all the chains.

5.1 Model validation

To establish robustness of the model, we performed a model validation test using balanced bootstrap samples. Details of the sampling method are provided in Section 5.2.

5.2 Resampling topi aerial sample counts using balanced bootstrap sampling

We first drew 10 different samples from each of the 75 aerial surveys conducted between 1977 and 2018 using balanced bootstrap sampling. The balanced bootstrap selection was performed by using the algorithm of Gleason1988 in SAS PROC SURVEYSELECT (sas2019). The balanced bootstrap method was used to select 10 samples from each of the total of 75 aerial surveys with equal probability and with replacement, where each aerial survey had 232 to 705 sampling units each measuring 5 5 km, 2.5 5 km or 10 5 km. Because the bootstrap selection is balanced the overall total number of selections is the same for each sampling unit (Davison1986). We then estimated the total topi population size for each bootstrap sample using Jolly’s method 2 (JollyGM1969).

5.3 Cross-validation results

Before running the model to produce the parameter estimates and interpreting their significance, we validated the model to establish its suitability for predicting population dynamics. The estimates of population sizes from the bootstrap samples served as the actual population size of a hypothetical topi population. Using these estimates, we generated a set of 10 time series of hypothetical ground survey data each of length 174 months and having the same age and sex classes as the actual ground sample count data (further details in Section S.4 of the supplementary materials). We call these hypothetical ground data sets generated data and the corresponding population size estimates bootstrap population estimates. We then fit our model separately to each of the generated data and obtained estimates of total population size and the corresponding 95% credible limits of the estimates from the state process model in Section 3.1. We call these estimates generated estimates. Next, we compare the bootstrap population estimates with the generated estimates by observing whether the bootstrap estimates lie within the 95% credible limits of the generated estimates for each of the 10 bootstrap population time series. However, running the model separately for each time series is computationally very expensive. To reduce the computing time we used the idea behind the Importance Resampling MCMC (IRMCMC) proposed by BhattaHaslett2007. Though developed for inverse problems, this method can be generalised to tackle the current problem. The precise details of this generalization are discussed in Section S.4.1 of the Supplementary materials. We ran our model using IRMCMC for each of the generated data sets and calculated the percentage coverage of the bootstrap populations at each of the 174 time points. The coverages for the 10 time series of bootstrap samples vary from 85% to 97%. The R code used to implement the full model is provided in S3 Text in the Supplementary Materials.

6 Results

6.1 Topi population trajectory by age and sex class

The model captures the essential and well-known features of the Mara topi population dynamics. First, it accurately captures the declining population trajectory of all the age and sex classes; adult female, adult male, half-yearling, quarter-size and newborn topi in the Mara between 1989 and 2003 (Figures 2 and 3). Second, it accurately tracks inter-annual variation in the phenology, synchrony and prolficity of topi births and juvenile recruitment (Figure 2; SinclairMduma2000, SinclairMduma2000; OgutuPiepho2010, OgutuPiepho2010, Ogutupiepho2011). Figures 2 and 3

show the observed and predicted population sizes for each age and sex class for each of the 174 time points (July 1989–December 2003) and the associated 95% credible limits averaged across the 1,000,000 MCMC simulation replications. The 95% credible intervals for the predicted values are not too wide indicating convergence of the MCMC chains. The birth recruitment rates (Figures

1(a) and S3a in the Supplementary materials) show strong seasonality consistent with the pronounced reproductive seasonality characteristic of the Mara-Serengeti topi. Further, the prolificacy of topi births is strongly time-varying, reflecting the controlling influence of the seasonally and inter-annually varying rainfall (OgutuPiepho2014, OgutuPiepho2014, Ogutuetal2015b). The pronounced seasonality in prolificacy of births and birth recruitment also carry over to the trajectories of the quarter and half-yearling size classes (Figures 2b–c) but not to the adult age class (Figures 3a–b). The persistent declines in the trajectories of topi birth recruitment, quarter and half-yearling classes, adult males and females and the overall topi population size are consistent with the overall topi population decline in the Mara Ecosystem from 1977 to 2016 (Figure 3c; Ogutupiephoetal2016, Ogutupiephoetal2016; Veldhuis2019, Veldhuis2019). Finally, there is evident seasonality in the overall topi population trajectory generated by the strong seasonality of births and juvenile recruitment in the ecosystem (Figure 3c).

6.2 Adult female recruitment and females available to conceive

Adult female recruitment is strongly seasonal, consistent with the seasonality in births and juvenile recruitment. The expected number of new adult females recruited into the population per month peaked in 1991–1994 and 1999 but was noticeably low in the other years (Figure S5a). Moreover, the number of adult females that gave birth exactly 8 months ago (Figure S5b) and the number of adult females that were available to conceive (Figure S5c) decreased persistently and markedly. The latter reduced from a maximum of nearly 7,000 in 1989 to around 2,000 animals by 2003 (Figure S5c). Thus, the decline in newborns was associated with a decline in the number of adult females.

6.3 Temporal variation in age structure and adult sex ratio

The expected population proportions of newborns, quarter-size and half-yearlings all tended to increase as the topi population decreased but the proportion of the adult population segment remained rather constant over time apart from sustained seasonal oscillations from 1989 to 2003 (Figures S2a–d). The proportion of adult females increased from a monthly maximum of 50% to 57% whereas the proportion of adult males decreased correspondingly from a peak of about 50% to 43% between 1989 and 2003 (Figures S2e–f).

6.4 Factors influencing birth recruitment and survival rates and adult sex ratio

The posterior means, standard deviations and 95% credible intervals for parameters of the models relating birth recruitment, survival rates and adult sex ratio to various covariates are summarized in Tables S3–S7. Similarly, posterior densities for a sample of the parameters are displayed in Figures S5–S8. There was evident seasonality in birth recruitment and survival of quarter-size topi but not in the survival of half-yearlings or adults. Birth recruitment was positively density-dependent for the declining topi population and was also influenced by prior rainfall and average temperatures (Table S3). Besides seasonality, quarter-size survival was influenced only by prior rainfall (Table S4). Half-yearling survival was influenced only by past rainfall (Table S5). Lastly, adult survival was negatively density-dependent and also varied with past rainfall amounts (Table S6). Adult sex ratio varied seasonally and with rainfall, minimum and maximum temperatures but was apparently not density-dependent (Table S7).

Figure 2: Modelled (black lines) population trajectories and the associated 95% lower (red lines) and upper (blue lines) credible limits for a) newborn, b) quarter and c) half-yearling topi.
Figure 3: Modeled (black lines) population trajectories and the associated 95% lower (red line) and upper (blue line) credible limits for a) adult female, b) adult male and c) total topi population size overlaid with the population size estimates (green filled circles) and their standard errors (vertical whiskers) based on the DRSRS aerial surveys. The DRSRS population size estimates and their standard errors were multiplied by 1.3 to correct for sightability bias (Stelfoxetal1986).

6.5 Birth recruitment rates, survival rates and sex ratio

The estimated birth recruitment, survival rates and sex ratios are consistent with expectation and are displayed in Figures S7 – S8 in Sections S.7 and S.8 of the Supplementary materials.

6.6 Comparing predicted topi population size with aerial survey data

Fourteen of the 75 DRSRS aerial surveys for the Masai Mara Ecosystem for 1977–2018 (Section 2.2) fell within the period spanned by the ground survey data (July 1989–December 2003). The total population size estimates from these 14 aerial surveys were in reasonable agreement with the total topi population size predicted by the Bayesian state-space model. Notably, the total topi population size predicted by the Bayesian state-space model was within one standard error of most of the total population size estimates derived from the 14 DRSRS aerial surveys (Figure 2(c)).

7 Discussion

We develop a flexible and general, integrated state-space model in a Bayesian framework for estimating large herbivore population demographic parameters, dynamics and the associated uncertainties. The model is illustrated using the Mara-Serengeti topi population inhabiting the World-famous Greater Mara-Serengeti Ecosystem of Kenya and Tanzania in East Africa. The state-space model allows estimation of both process and observation error variances (ValpineHastings2002, ValpineHastings2002; Bucklandetal2004, Bucklandetal2004). The model incorporates age and sex structure and key life-history characteristics (e.g., gestation length, lactation period, pregnancy status, females available to conceive) essential to understanding large herbivore population dynamics and efficiently integrates ground demographic survey with aerial survey data. The model relates birth recruitment and age-class specific survival rates and adult sex ratio to various covariates, such as prior population density, predation risk, past rainfall and temperature, environmental seasonality and their interactions. To estimate model parameters, we used the MCMC method in a Bayesian framework because of its flexibility (Brooksetal2004, Brooksetal2004; HoyleMaunder2004, HoyleMaunder2004; Schaubetal2007, Schaubetal2007). The convergence of the MCMC chains of the parameters is accelerated using the Transformation MCMC (TMCMC) technique (DuttaBhattacharya2014) and the idea behind IRMCMC (BhattaHaslett2007) to reduce the computational time for model validation.

The predicted population trajectories show persistent and marked declines in the Mara topi population from 1989 to 2003 in accord with the trends derived from aerial surveys for 1977–2018 (Ogutupiephoetal2016, Ogutupiephoetal2016; Veldhuis2019, Veldhuis2019). Importantly, whereas birth recruitment fluctuated around a stable average, the survival rates for quarter size, half-yearling size and adults decreased gradually and contemporaneously with the overall topi population decline. The disturbing and sustained decline in numbers of topi and other species in the Mara and across Kenya (Ogutupiephoetal2016) and continental Africa (Craigieetal2010) increases the urgency for establishing their leading causes and developing effective population conservation and recovery strategies.

The modelling framework can be extended to (i) identify the most influential processes driving population declines and assess their relative importance, (ii) test interesting ecological hypotheses, (iii) enable rigorous forward-projection of large herbivore population dynamics allowing for both parameter and future demographic uncertainty (HoyleMaunder2004). Further extension can also (iv) enable assessment of likely future populating trajectories under various scenarios of climate change, land use change, socio-economic development, human population growth, livestock population density, conservation and management interventions, such as formation of new wildlife conservancies (Rhodesetal2011, Rhodesetal2011; Maunderetal2015, Maunderetal2015; Mosnieretal2015, Mosnieretal2015). Moreover, the monthly time step allows the model outputs to be used to study potential shifts in reproductive seasonality, phenology, synchrony and prolificacy of births, juvenile recruitment and adult sex ratio (OgutuPiepho2010, OgutuPiepho2010; Ogutupiepho2011, OgutuPiepho2014, Ogutuetal2015b) in response to future changes in climate and other factors. It would also be interesting and useful to adapt the model for other species, with contrasting life-history traits and strategies, such as hartebeest (Alcelaphus buselaphus cokeii), impala (Aepyceros melampus), waterbuck (Kobus ellpsyprimnus), zebra, warthog (Pharcocoerus africana) and giraffe (Giraffa camelopardalis).

The model can also be extended to include several additional features. First, females that are reproducing for the first time can be allowed to have a lower pregnancy rate (86%) than older (100%) females (Duncanthesis1975). Second, females that lose their calves soon after birth can be moved to the group of females available to conceive. Third, the survival rate of calves aged 0 to 1 month can be explicitly incorporated in the model, especially if empirical estimates of such rates can be obtained. Fourth, the dependence of birth recruitment and age-specific survival rates and sex ratio on covariates may alternatively be made time-varying to potentially better account for temporal variation in the influence of the covariates. Fifth, the model can be made spatially explicit to allow for spatial variation in the type and intensity of factors influencing survival and recruitment rates and sex ratio. Sixth, sightability bias in newborns, , can be modeled explicitly by specifying to follow a Poisson distribution with parameter , where the proportionality constant

is sampled from a suitable probability distribution. Lastly, birth recruitment and survival rates and sex ratio can be related to additional covariates, particularly anthropogenic covariates, such as human population density, livestock population density, settlement density and progressive habitat loss.


The Masai Mara Ecological Monitoring Program was designed and supervised by Dr. Holly T. Dublin and executed by Messes Paul Chara, John Naiyoma, Charles Matankory and Alex Obara. It was funded by the World Wide Fund for Nature–East Africa Program (WWF–EARPO) and Friends of Conservation (FOC). The program also received financial, material or logistical support from WWF–US, WWF–Sweden, the Darwin Initiative (DICE), the University of British Columbia, United States Fish and Wildlife Service, Kenya Wildlife Service, Cottar’s Camp, Kichwa Tembo, Keekorok Lodge/Balloon Safaris and Kerr and Downey Safaris. WWF–EARPO, the Kenya Meteorological Department and Prof. K. Holekamp provided the rainfall data. Chris Shitote helped with extracting and processing the Chirps rainfall and temperature data. We thank the Director of the Directorate of Resource Surveys of Kenya (DRSRS), Dr. Patrick Wargute, for permission to use the aerial survey data. Gordon Ojwang, Lucy Njino, Mohammed Said, Bella Omollo, Haron Tanui, Timothy Kavithi and Shem Kifugo assisted with the verification, processing and making the aerial survey data available. Gordon Ojwang, Mohammed Said and Shem kifugo helped us understand the structure of the DRSRS data. This project has received funding from the European Union’s Horizon 2020 research and innovation programme under Grant Agreement No. 641918. JOO was additionally supported by a grant from the German Research Foundation (DFG; Grant # 257734638) and by The Planning for Resilience in East Africa through Policy, Adaptation, Research, and Economic Development (USAID PREPARED) project.