Modelling spine locations on dendrite trees using inhomogeneous Cox point processes

07/29/2019 ∙ by Heidi S. Christensen, et al. ∙ 0

Dendritic spines, which are small protrusions on the dendrites of a neuron, are of interest in neuroscience as they are related to cognitive processes such as learning and memory. We analyse the distribution of spine locations on six different dendrite trees from mouse neurons using point process theory for linear networks. Besides some possible small-scale repulsion, we find that one of the spine point pattern data sets may be described by an inhomogeneous Poisson process model, while the other point pattern data sets exhibit clustering between spines at a larger scale. To model this we propose an inhomogeneous Cox process model constructed by thinning a Poisson process on a linear network with retention probabilities determined by a spatially correlated random field. For model checking we consider network analogues of the empirical F-, G-, and J-functions originally introduced for inhomogeneous point processes on a Euclidean space. The fitted Cox process models seem to catch the clustering of spine locations between spines, but also posses a large variance in the number of points for some of the data sets causing large confidence regions for the empirical F- and G-functions.

READ FULL TEXT VIEW PDF
POST COMMENT

Comments

There are no comments yet.

Authors

page 13

This week in AI

Get the week's most popular data science and artificial intelligence research sent straight to your inbox every Saturday.

1 Introduction

Point patterns on linear networks arise in a broad range of fields, where the network for example represents roads, a river network, or a dendrite tree. This paper focuses on the latter type of data: the left panel in Figure 1 shows six linear networks each representing a dendrite tree from a mouse neuron grown in vivo. On the dendrites small protrusions called spines are found that among other things help transmitting electrical signals to the soma. In neuroscience, the behaviour of spines are of interest as changes can be linked to changes in cognitive processes. The spine locations can be viewed as a point pattern on the dendrite tree and thus analysed using point process theory for linear networks.

Over the last two decades, methods for analysing point patterns on linear networks have been developed. Particularly, a network analogue of Ripley’s -function was first presented in Okabe and Yamada (2001) and later modified and extended to the inhomogeneous case in Ang et al. (2012). When defining the -function, Ang et al. (2012) required that the underlying point process model fulfils an invariance property called second-order pseudo-stationarity (an analogue to second-order intensity-reweighted stationarity as introduced in Baddeley et al., 2000). This property is fulfilled whenever the pair correlation function is isotropic, i.e. when it only depends on the shortest path distance. Baddeley et al. (2017) showed that certain constructions, e.g. special types of Cox point processes that lead to point processes in the Euclidean space with an isotropic pair correlation function rarely result in second-order pseudo-stationary point processes when adapted to linear networks. Even without the requirement of pseudo-stationarity, there are only a limited number of point process models available for linear networks. For point processes on directed acyclic linear networks, Rasmussen and Christensen (2019) presented both regular and clustered models defined by a generalisation of the conditional intensity function for temporal point processes. Anderes et al. (2017) supplied a list of valid isotropic covariance functions for connected linear networks that can be used to construct Cox point processes, particularly log Gaussian Cox processes (LGCPs; see also Møller et al., 1998).

Figure 1: Spine data sets for dendrite 1 to 6 (from top to bottom), where each main branch is coloured black and the side branches grey and the marks the vertex closest to the dendrite’s attachment to soma. Left: projection of the original three-dimensional network onto a plane. Middle: spine locations on the simplified networks embedded in

so that distances are preserved. Right: non-parametric kernel intensity estimates.

Only few studies use point process theory to analyse the behaviour of spines: treating the dendrite tree as a directed tree, Rasmussen and Christensen (2019) analysed one of the six spine point pattern data sets (hereafter, the ‘spine data’) from Figure 1. The distribution of spines (and their shape) have further been investigated using point process theory in Jammalamadaka et al. (2013) (testing a homogeneous Poisson process model) and Baddeley et al. (2014) (using multitype Poisson process models to account for the shape classification) for in vitro grown neurons. Based on the network -function, Jammalamadaka et al. (2013) concluded that a homogeneous Poisson process model seems adequate to describe the spine locations. However, Jammalamadaka et al. (2013) also stated that their results for the in vitro setting are unlikely to hold in an in vivo setting.

Instead of Poisson process models, this paper suggests a new class of Cox process models on a linear network. Such a model applies for an undirected graph and is not a LGCP, but its construction still exploits a Gaussian random field so that the covariance functions from Anderes et al. (2017) become useful. Moreover, seemingly for the first time in connection to point process model fitting on linear networks, we demonstrate the use of minimum contrast and composite likelihood estimation procedures. Finally, we introduce new empirical summary functions and demonstrate their usefulness for model checking.

The paper is organised as follows. The spine data is described in more detail in Section 2 along with the general notion of a linear network. In Section 3 we discuss existing as well as our new summary functions for point processes on linear networks; these are used for analysing the spine locations in Section 4. We initially suggest to model the spine locations by an inhomogeneous Poisson process model in Section 4.1, but due to clustering between spines we propose in Section 4.2 an inhomogeneous Cox process model. Lastly, we discuss in Section 5 possible extensions and future research directions.

2 Dendritic spine data

Figure 1 shows six examples of a linear network. Specifically, a linear network is a union of a finite number of line segments , , with finite length and intersecting only at the end points. A linear network may also be viewed as a graph consisting of a set of vertices and a set of weighted edges, where the edges coincide with the line segments , the vertices correspond to the end points of these line segments, and the weight of an edge is the length of the corresponding line segment.

Throughout this paper the distance between two points is measured by the shortest path distance and is denoted by . For the spine data, the linear network is a tree, meaning that there is only one path between any pair of points in . Naturally, in other applications more complicated networks than a tree occur in which case we may need to take more care when choosing the distance metric (see Section 5 for details).

The linear networks visualised in the left column of Figure 1 are approximations of the underlying apical dendrite trees which were extracted from six different mouse neurons grown in vivo. The vertices of each of the linear networks are described by three-dimensional coordinates which represent a spine location or another point chosen to obtain the approximation. To simplify each of the networks, for any pair of edges meeting at a vertex with degree two, we replace the two edges and the vertex with one single edge, and to preserve distances within the network, we let the weight of the new edge be the sum of the two old edges. Note that this transformation of the original linear network into a simpler one, relies on the fact that the linear network forms a tree. Further, to utilise the functionalities of the R-package spatstat (Baddeley et al., 2015), we embed the simplified network in in a way that also preserves distances. The simplified and embedded versions of the networks are shown in the middle column of Figure 1 along with the spine locations. As the models and tools we use in Section 4 to analyse the spine data do not directly depend on the three-dimensional coordinates but on distances, we can without loss of information consider the spine locations as a point pattern on the simplified and embedded network.

The spine data origin from six neurons, with two neurons from each of three different mice. The numbering of the dendrites is as follows: dendrite 1 and 2 come from mouse no. 1; dendrite 4 and 5 from mouse no. 2; and dendrite 3 and 6 from mouse no. 3.

For each dendrite tree, we talk about two subsets: the main branch and the side branches. Main branch refers to the tree’s stem, while side branches constitute the rest of the tree. Figure 1 shows which parts of the original trees and the simplified embedded trees belong to the main branch and which to the side branches. In the following, denotes the whole dendrite tree, where is the main branch, and is the union of the side branches. Further, and denote the number of spines on and , respectively. Lastly, we let denote the size of or more precisely the total length of the (partial) line segments constituting ; note that . Table 1 summarises the number of spines and sizes for each dendrite tree.

Dendrite
1 51 72 212 202 0.240 0.356
2 36 145 204 430 0.176 0.337
3 69 63 211 202 0.328 0.312
4 33 308 225 652 0.134 0.477
5 34 83 286 450 0.119 0.184
6 30 62 178 250 0.168 0.248
Table 1: Number of spines, length, and intensity estimates for the main and side branches separately.

3 Point processes on linear networks

The six spine point pattern data sets are modelled as realisations of six point processes defined on the six dendrite trees. In general, by a point process on a linear network we mean a random finite subset of ; we use this generic notation throughout this paper. In this section we consider summary functions useful for analysing point processes on linear networks, including the introduction of new empirical summary functions.

3.1 Summary functions for first and second-order moment properties

We assume that has intensity , that is, for ,

(3.1)

where is the number of points from falling in and denotes integration with respect to one-dimensional arc-length along . Intuitively, is the probability of having a point in an infinitesimal small subset of that contains and has size . If the intensity is constant, we say that is homogeneous; otherwise is said to be inhomogeneous. In case of homogeneity, is the expected number of points per unit length.

We also assume that has pair correlation function , that is, for disjoint ,

We can interpret as the joint probability that two infinitesimal small regions around and of size and , respectively, each contains a point from .

If the pair correlation function only depends on the shortest path distance, we say that it is isotropic and write . When has an isotropic pair correlation function, the (geometrically corrected network) -function introduced by Ang et al. (2012) can be expressed as

(3.2)

Alternatively, may be written as an expectation with respect to a Palm distribution, see Ang et al. (2012, Theorem 3). If the -function or the pair correlation function is expressible on closed form, we can use a minimum contrast or composite likelihood procedure to estimate the model parameters; this is described further in Section 4.2.2.

The simplest point process model is a Poisson process, which is characterised by that

follows a Poisson distribution with mean given by (

3.1) with and further that the points of conditioned on are independent and identically distributed, with density proportional to . For a Poisson process, and .

3.2 New empirical summary functions

For estimating the pair correlation function and the -function we follow Ang et al. (2012). These empirical summary functions can be used in minimum contrast or composite likelihood estimation procedures as well as for model checking. Obviously, if the -function or pair correlation function have been used to fit the model, neither should be used to check the adequacy of the model. Due to the shortage of summary functions for point processes on linear networks, we may let a simple visual comparison of the observed point pattern and simulations from the fitted model serve as a model check. It is needless to say that a more rigorous model checking would be preferred.

Therefore, we now introduce three purely empirical summary functions. These are obtained by modifying the empirical -, -, and -functions for inhomogeneous point patterns on a Euclidean space (introduced by van Lieshout, 2011) to linear networks. The modification simply consists of replacing the Euclidean space with the linear network, introducing the shortest path distance instead of the Euclidean distance, and adapting the notion of an eroded set to linear networks. The functions are then defined as follows. Assume that the intensity is known or has been estimated by and that , where either or . For , let consist of the points in with distance greater than to any vertex of with degree one. Furthermore, let be a finite ‘lattice’. For an observed point pattern , the empirical summary functions , , and are then defined for by

(3.3)
(3.4)
(3.5)

where we restrict attention to -values small enough to ensure that for and that for .

In Ang et al. (2012), the -function and its empirical estimate include a factor that corrects for the network geometry, such that its shape can be compared for point patterns on different networks. As it was not obvious to us how to extend such a correction to , , and in a meaningful way, our definitions in (3.3)–(3.5) do not correct for the network geometry. Further, we do not have any theoretical counterpart to , , and and therefore their shapes alone can in general not be used to conclude anything about e.g. the presence of regularity or clustering. However, , , and are still useful tools for providing a so-called global rank envelope; this is a confidence region for a given test function obtained from simulations under a fitted model (for details, see Myllymäki et al., 2017). In a global rank envelope procedure, the shape of the test function for the data is compared to that of the simulations and Myllymäki et al. (2017) discussed how this provides a test and an interval with lower and upper bounds given by a liberal and a conservative -value, respectively.

4 Modelling spine locations

In this section each of the six data sets is analysed with the aim of finding a model that adequately describe the spine locations.

4.1 Fitting a Poisson process model

The simplest model we can propose is a Poisson process. To investigate the behaviour of the spine intensity, we calculated the non-parametric intensity estimate suggested by McSwiggan et al. (2016); the resulting estimates are seen in the right panel of Figure 1. The spine intensity tend to be higher on the side branches than on the main branch, except perhaps for dendrite 3. Therefore, we allowed the intensity of the Poisson process to be different on the main and side branches, that is, recalling the notation in Section 2,

(4.1)

for non-negative parameters and ; here denotes the indicator function. The maximum likelihood estimates of the intensity parameters are easily found and given by

(4.2)

cf. the notation in Section 2. These estimates are shown in Table 1.

To test whether the proposed inhomogeneous Poisson process model adequately describes the spine locations, we performed global rank envelope tests using as test function, cf. Section 3.2. Results from these tests are shown in Figure 2. For all dendrites the conservative -value is below , suggesting that the fitted Poisson process models do not describe the spine locations adequately. Specifically, the empirical -functions for dendrite 2, 4, 5, and 6 fall above the global rank envelopes for certain -values, indicating that the spines tend to cluster at these distances. Further, for all six spine data sets the empirical -function falls below the global rank envelope for small -values, indicating a small-scale repulsion between spines. For dendrite 1 and 3, this small-scale repulsion is the only deviation from the Poisson process model revealed by the global rank envelope test. Disregarding the small distances () for the global rank envelope test with as test function, does not change the -intervals significantly for dendrite 2, 4, 5, and 6. For dendrite 1 and 3 on the other hand, the -intervals change from to and from to , respectively, giving (most clearly for dendrite 3) no evidence against the proposed Poisson process model. Global rank envelopes with a concatenation of , , and as test function, and where distances less than were disregarded, are shown in Figure C.1 in Appendix C; these do not provide any evidence against the Poisson process model for dendrite 1 and 3 either.

Figure 2: For each spine data set: the empirical -function for the data minus the -function for a Poisson process (solid line) along with global rank envelopes (grey region) based on 2499 simulations from the fitted inhomogeneous Poisson process model; -intervals for each of the associated global rank envelope tests are also displayed.

As the physical scale of the spine data is quite small (the dendrites range in size from to , cf. Table 1) and as there is uncertainty in the exact choice of the point representing a spine’s location, we must expect some degree of imprecision and therefore we may not want to put too much value into the observed small-scale repulsion. In the following we will not take the small-scale repulsion into account but rather focus on modelling the large scale clustering.

4.2 Fitting a Cox process model

In addition to inhomogeneity in the location of spines, Figure 1 also shows a tendency to large bare areas where no spines occur. To model this behaviour we considered a point process model introduced by Lavancier and Møller (2016) for the Euclidean space, which is easily adapted to a linear network . The point process is constructed as a thinning of a Poisson process on with intensity function , where the retention probabilities are determined by a random field that may be spatially correlated. That is, the point process is given by , where

consist of independent uniform random variables on

, and where and are independent. Thus is a Cox point process driven by the random field . We let

(4.3)

where and are parameters, and are IID zero-mean unit-variance Gaussian random fields with correlation function . If depends only on the shortest path distance, we say that is isotropic; see Anderes et al. (2017) for a list of isotropic correlation functions for linear networks. For the spine data, we considered the exponential correlation function, that is,

(4.4)

which is a valid correlation function for any parameter value and any tree network but not necessarily for other kinds of linear networks (see Section 5 for a comment on this).

We have that and

implying that has intensity

(4.5)

and pair correlation function

(4.6)

If is isotropic, then is isotropic and the -function can be expressed by (3.2). Closed form expressions of the -function are given in Appendix A for equal to the exponential correlation function and .

Note that for each , in (4.3) follows a -distribution with degrees of freedom. That is, the skewness decreases as increases, while the range is stretched/compressed depending on the value of . The larger is, the more is thinned to obtain and also the more variation in the thinning probabilities. The pair correlation function in (4.6) is an increasing function of both and . When is given by (4.4), controls the correlation of the retention probabilities: the smaller , the longer range of correlation in and thus larger coherent bare/populated areas in . Finally, the pair correlation function decreases towards 1, when increases.

For the spine data, we still want to model the intensity function by (4.1), which by (4.5) requires to have a similar intensity structure, that is,

(4.7)

for non-negative parameters and .

4.2.1 Simulation

To perform model checking with global rank envelopes or to carry out simulation studies, we need to be able to simulate point patterns from the model of interest. Fortunately, it is straightforward to simulate a point pattern on from the proposed Cox process model by the following three steps:

  1. Simulate a discretised version of the random field by first simulating the independent Gaussian random fields , , at chosen grid locations along the network and then transforming the random fields according to (4.3) to obtain the retention probabilities.

  2. Simulate a point pattern from a Poisson process on with intensity .

  3. Thin using the retention probabilities simulated in (a).

4.2.2 Estimation procedure

In the following we describe a procedure for estimating the model parameters of the proposed Cox process model where. For specificity we consider the case where is given by (4.4) and where is given by (4.7).

To begin we assume that is known, whereas the remaining parameters are estimated through a two-step procedure (Waagepetersen, 2007; Waagepetersen and Guan, 2009). In short, we first estimate and then plug in these estimates in a second-order procedure where is estimated. Lastly, an estimate of can be found by using (4.5).

First, to estimate we use the first order composite likelihood (Waagepetersen, 2007) which simply corresponds to a Poisson likelihood yielding the estimates in (4.2).

Second, as we know explicit formulas for the pair correlation and -function, we can estimate using a minimum contrast procedure (Guan, 2009; Diggle, 2014) or a second-order composite likelihood approach (Waagepetersen, 2007; Lavancier et al., 2018). The latter is not considered here but described in Appendix B, where results from a simulation study comparing the two approaches also can be found. The simulation study suggests that the minimum contrast procedure provides far more reliable estimates than the second-order composite likelihood.

For a chosen summary function which depends on , the minimum contrast estimate of is given by

(4.8)

where and are user-specified tuning parameters, and is an empirical estimate of the summary function. In our case, is given by or , and is given as in Ang et al. (2012). A frequently seen choice is , while general recommendations of and can be found in Guan (2009) and Diggle (2014) for point patterns on the Euclidean space.

The minimum contrast procedure can easily be extended to include estimation of too, but a simulation study indicated that it may be difficult to estimate and simultaneously as an increase in seemingly can be balanced out by an increase in . In practice we may therefore simply make a choice of ; for simplicity we chose in the following. Note that for both and the estimate in (4.8) depends on the intensity (Ang et al., 2012); here we simply plug-in the estimated intensity obtained in the first step of the estimation procedure.

A drawback of using is the need of choosing a bandwidth for the non-parametric kernel estimate presented in Ang et al. (2012). However, the simulation study in Appendix B suggests that using with the default bandwidth and kernel from the spatstat-package generally performs better than when fitting the proposed Cox process model. This is consistent with results from a simulation study in Lavancier and Møller (2016) for point processes on a Euclidean space.

In the simulation study found in Appendix B, we also investigated how different choices of , , and affect the estimates of and given by (4.8). We observed that the choice of often is a matter of trade-off between bias and variance: a large value of may entail a large bias, while a small often leads to a greater variance of the estimates. Furthermore, the best choice of seems to be quite depending on what the true underlying model parameters are. For example, a larger range of correlation in the retention probabilities, that is, a smaller value of , requires a larger . Naturally we should also take the size of the network into consideration when choosing . In the simulation study we found that gives the best estimates, and that or behave equally well for , while is preferred over for . For parameter values yielding models close to the Poisson process model, that is, when is close to zero or is large, the estimation procedures were not very successful regardless of the tuning parameters. This does not come as a surprise as many combinations of - and -values yield similar Poisson processes. Lastly, the estimation procedure seems quite stable with respect to the choice of start parameter values for the optimisation algorithm (optim in R) used to minimize (4.8).

4.2.3 Model fit and model check

The Cox process model was fitted to each of the spine data sets using the two-step procedure with fixed, cf. Section 4.2.2. For the minimum contrast procedure we let , , and in accordance with the simulation results discussed in Section 4.2.2. Further, we initially let and obtained a set of initial parameter estimates for each data set. Then we performed a small simulation study based on 500 simulations from the initially fitted models to investigate which of , , results in the best estimates (in terms of bias and variance) for these specific models. For dendrite 1, 3, and 6, the initially fitted models are close to the Poisson process case, and as a consequence the model parameters are hard to estimate regardless of the choice of . However, for dendrite 2, 4, and 5, it seems that is the best choice. Using for all six data sets, we obtained the parameter estimates in Table 2. The fitted model for dendrite 3 is practically a Poisson process model (in consistency with the conclusions made in Section 4.1) while the remaining fitted models are not. In Figure C.2 in Appendix C, one simulation from each of the fitted random fields is shown to illustrate the behaviour of the retention probabilities. For example, is considerably larger for dendrite 5 than dendrite 1, resulting in more fluctuating retention probabilities.

Dendrite
1
2
3
4
5
6
Table 2: Estimates of , , , and for each spine data set.

Figure 3: For each spine data set: the concatenation of , , and for the spine locations (black solid line) along with global rank envelopes (grey region) based on 2499 simulations from the fitted Cox process model; -intervals for each of the associated global rank envelope tests are also displayed.

As discussed in Section 3.2, for the statistical analyses of point patterns on linear networks there is only a limited number of options for model checking, especially when the -function or the pair correlation function have already been used to estimate the model parameters. One simple option is to look at simulations from the fitted model as in Figures C.3C.8 in Appendix C. Comparing these simulations visually to the observed point patterns, it seems that the simulations mimic the behaviour of the data quite well. For a more rigorous model checking, we performed global rank envelope tests with a concatenation of , , and as test function, where distances less than were disregarded as discussed in Section 4.1; results are shown in Figure 3. Except from dendrite 4, where the data curve for falls below the global rank envelope for some of the smaller -values, the tests do not reveal any evidence against the fitted models. However, for some of the dendrites (especially dendrite 6) the global rank envelopes for the part concerning and are very wide due to a large variance in the number of points.

5 Discussion

As discussed in Section 4.1, the spines seem to posses a small-scale repulsion, which was not modelled by the Cox process proposed in Section 4.2. To accommodate the repulsion, the inhomogeneous Poisson process used to build the Cox process could be replaced by an inhomogeneous and repulsive point process. The simplest case may be to use a dependent thinning as in a Matérn hard core process of type I (Matérn, 1960, 1986): let be an inhomogeneous Poisson process (with constant intensity on the main branch respectively the side branches), and let

where is a hard core parameter; that is, a point in is included in if and only if no other point in is within distance . However, it is doubtful whether an expression for the -function or the pair correlation function can be found for such a Cox process model, posing new challenges with respect to parameter estimation.

To avoid using the rather ad hoc created summary functions , , and , it is needed to develop new summary functions for (inhomogeneous) point processes on linear networks for which we both have a theoretical interpretation and an estimate that do not depend on the geometry of the network. We leave this challenging issue for future research.

Rakshit et al. (2017) discussed the importance of how distance is measured when analysing point patterns on a linear network and they generalised the -function to allow the use of any distance metric. In fact, following Anderes et al. (2017) all methods as well as Poisson and Cox process models in this paper immediately apply for more general linear networks, called graphs with Euclidean edges, when the correlation function is isotropic with respect to the shortest path distance as well as another metric called the resistance metric. For the dendrite networks or any other tree network, the resistance metric is equivalent to the shortest path distance. Anderes et al. (2017) showed that correlation functions that are isotropic with respect to the shortest path distance only are guaranteed to be valid for a small class of linear networks, whereas they are valid for any linear network when considering the resistance metric instead. Thus, depending on the network, it may be preferable to consider the resistance metric over the shortest path distance when specifying a correlation function. Anderes et al. (2017) provided a list of valid isotropic covariance functions for graphs with Euclidean edges.

Appendix A Expressions for

For the Cox point process presented in Section 4.2 with equal to the exponential correlation function in (4.4), the -function is

(A.1)

Let , then is given by

if ,

if ,

if ,

if , and

if .

Appendix B Simulation study concerning estimation procedure

b.1 Second order composite likelihood

In Section 4.2 we fitted the parameters of the Cox process models using a two step procedure involving a minimum contrast procedure for estimating and . Another option is to consider a second order composite likelihood approach (adapted from Waagepetersen, 2007, to a point pattern on a linear network). That is, for an observed point pattern , the maximum composite likelihood estimate is obtained by maximizing the log composite likelihood

where is a weight function and over the summation sign means that . We can for example let

(B.1)

for some user specified value . To estimate , we either directly maximise (B.1) or alternatively solve the associated estimating equation obtained by setting the score equal to 0. The score function is in our set-up given by

(B.2)

To improve the composite likelihood estimation procedure, Lavancier et al. (2018) suggested an adaptive version of (B.2) where the weight function depends on the model parameters. We may for example let be the indicator function given by

(B.3)

where and is a small user-specified number, e.g.  or . Note that for an isotropic correlation function , we have . Another weight function suggested in Lavancier et al. (2018) is

(B.4)

where .

For approximating the double integral in (B.2) (or in the adapted version), note that this is of the form . We split up the integration area into the line segments constituting , that is,

Note that for the spine data is constant on any line segment (as the line segment is either fully contained in or ). Further, if is a tree, is given by (B.1), (B.3), or (B.4), and is isotropic, then depends only on distance; this will ease the approximation of the integral:

where . Each of these integrals can then be approximated by Monte Carlo integration using uniform variables on and .

b.2 Simulation study

In the following we describe and summarise results from a simulation study investigating how well and are estimated using either the minimum contrast procedure with or (as described in Section 4.2.2) or the adaptive composite likelihood procedure using (B.3) or (B.4). We considered the network for dendrite 4 and simulated from the Cox process described in Section 4.2 with different parameter values given in Table 3 and when was fixed.

For the minimum contrast procedure we investigated different values of the tuning parameters , , and as well as different start parameters for the optimisation algorithm, see Table 3. Here run no. 1 is the reference run from which one (or two) model or tuning parameters are changed at the time. For run no. 1, we chose and resulting in a model rather far away from the case of a Poisson process. Note that decreasing will increase the range of correlation in the thinning probability, whilst increasing will increase the probability of thinning. Thus a small and a large yield a model very different from the Poisson process. For each choice of model parameters we simulated 500 point patterns and estimated using minimum contrast and for a few selected runs we also estimated using the adaptive composite likelihood method.

For the adaptive composite likelihood method the integral in (B.2) was approximated using simulations. Estimates of were found by minimising the length of the score over a grid centred around the true values of and . The finer and broader grid, the better, but as a grid was already quite time consuming we settled with that.

Run no. for MCE- (MCE-)
1 5 0.1 0.8 1.2 1 (0.25) 0 30 (0.5, 0.5)
2 5 0.1 0.8 1.2 0.5 (0.5) 0 30 (0.5, 0.5)
3 5 0.1 0.8 1.2 1 (0.25) 0 50 (0.5, 0.5)
4 5 0.1 0.8 1.2 1 (0.25) 0 20 (0.5, 0.5)
5 5 0.1 0.8 1.2 1 (0.25) 0 30 (3, 0.2)
6 5 0.1 0.8 1.2 1 (0.25) 0 30 (0.2, 3)
7 5 0.1 0.3 0.7 1 (0.25) 0 30 (0.5, 0.5)
8 5 0.1 1 1 1 (0.25) 0 30 (0.5, 0.5)
9 5 0.5 0.8 1.2 1 (0.25) 0 30 (0.5, 0.5)
10 5 1 0.8 1.2 1 (0.25) 0 30 (0.5, 0.5)
11 1 0.1 0.8 1.2 1 (0.25) 0 30 (0.5, 0.5)
12 1 0.5 0.8 1.2 1 (0.25) 0 30 (0.5, 0.5)
13 5 0.1 0.8 1.2 1 (0.25) 30 (0.5, 0.5)
14 5 0.1 0.8 1.2 1 (0.25) 30 (0.5, 0.5)
15 5 0.1 0.8 1.2 1 (0.25) 30 (0.5, 0.5)
16 5 0.1 0.8 1.2 1 (0.25) 30 (0.5, 0.5)
Table 3: Overview of runs made in the simulation study for the minimum contrast procedures. Here denote the start parameters for the optimisation algorithm and is the automatically selected bandwidth used to calculate in spatstat.

b.2.1 Results using the minimum contrast procedure

In the following we let MCE- and MCE- refer to minimum contrast estimation with and , respectively.

Histograms of the obtained estimates are shown in Figures B.1B.4. In general it seems that MCE- performs better than MCE-. For parameter values that result in models close to the Poisson model, neither of the estimation procedures estimate successfully. For simulations more distinguishable from Poisson (as for example run no. 1), the MCE- gives more satisfactory estimates.

Neither MCE- or MCE- seem to be sensitive to the choice of start parameters for the reference run. Further, for the reference run it seem that was the best choice, but in general this depend on the model we are trying to fit and naturally on the size of the network. The best choice of seem to be , while seems preferable over for MCE-, and and perform equally well for the MCE-. The choice of seem to be important with respect to bias and variance: a too high may lead to a large bias, while a too smale may lead to a large variance in the estimates. It is therefore recommendable to perform a small simulation study for the specific network and proposed model at hand, such that the best choice of can be made.

Figure B.1: Estimates of and using either MCE- or MCE- for 500 simulated point patterns of models with parameters no. 1–4 in Table 3 (one set of parameters for each row, starting with no. 1 in the top). From left to right: estimates of and found by MCE- (column 1 and 2), followed by estimates of and based on MCE- (column 3 and 4). Blue dashed line is the true parameter value, and red dashed line is the mean of the estimates. OBS: the histograms have been truncated such that estimates above 15 for (column 1 and 3) and 5 for (column 2 and 4) have been omitted in the frequency count; in each histogram it is stated how many values were discarded.
Figure B.2: Estimates of and using either MCE- or MCE- for 500 simulated point patterns of models with parameters no. 5–8 in Table 3 (one set of parameters for each row, starting with no. 5 in the top). From left to right: estimates of and found by MCE- (column 1 and 2), followed by estimates of and based on MCE- (column 3 and 4). Blue dashed line is the true parameter value, and red dashed line is the mean of the estimates. OBS: the histograms have been truncated such that estimates above 15 for (column 1 and 3) and 5 for (column 2 and 4) have been omitted in the frequency count; in each histogram it is stated how many values were discarded.
Figure B.3: Estimates of and using either MCE- or MCE- for 500 simulated point patterns of models with parameters no. 9–12 in Table 3 (one set of parameters for each row, starting with no. 9 in the top). From left to right: estimates of and found by MCE- (column 1 and 2), followed by estimates of and based on MCE- (column 3 and 4). Blue dashed line is the true parameter value, and red dashed line is the mean of the estimates. OBS: the histograms have been truncated such that estimates above 15 for (column 1 and 3) and 5 for (column 2 and 4) have been omitted in the frequency count; in each histogram it is stated how many values were discarded.
Figure B.4: Estimates of and using either MCE- or MCE- for 500 simulated point patterns of models with parameters no. 13–16 in Table 3 (one set of parameters for each row, starting with no. 13 in the top). From left to right: estimates of and found by MCE- (column 1 and 2), followed by estimates of and based on MCE- (column 3 and 4). Blue dashed line is the true parameter value, and red dashed line is the mean of the estimates. OBS: the histograms have been truncated such that estimates above 15 for (column 1 and 3) and 5 for (column 2 and 4) have been omitted in the frequency count ; in each histogram it is stated how many values were discarded.

b.2.2 Results using the adaptive composite likelihood

For the CLE procedure we restricted ourselves to a small number of runs as the grid search was very time consuming. Specifically, we simulated 500 point patterns from the Cox process models with the parameters from run no. 1, 10, and 11 in Table 3. For each choice of model parameters we estimated using both weight functions and . Further, we also ran the CLE procedure with for the model parameters from run no. 1. Figure B.5 shows histograms of the resulting estimates. It is clear that for all three choices of model parameters the grid should be broader in order to find the parameter values that give the smallest length of (B.2) and that the estimates are worse than the ones obtained by the minimum contrast procedures. Finally, there is no seemingly advantage of choosing one weight function over the other or of choosing over .

Figure B.5: Results from simulation study using adaptive composite likelihood: estimates of and for model parameters from run no. 1 (first row), 10 (second row), and 11 (third row) with and run no. 1 with (fourth row); see Table 3. The two first columns display estimates of and (in that order) when using the indicator weight function for the CLE procedure, while estimates in the two right columns are found using CLE with the exponential weight function. Blue dashed line is the true parameter value, red dashed line is the mean of the estimates.

Appendix C Analysis of spine locations

This Appendix contains figures related to the analysis of the six spine data sets. Figure C.1 shows global rank envelopes under the fitted inhomogeneous Poisson model using a concatenation of , , and as test function. Further, Figure C.2 show one simulation of the fitted random field for each network. Finally, each of Figures C.3C.8 display the data along with five simulated point patterns from the fitted Cox process model.

Figure C.1: For each spine data set: the concatenation of , , and for the spine locations (black solid line) along with global rank envelopes (grey region) based on 2499 simulations from the fitted inhomogeneous Poisson model; -intervals for each of the associated global rank envelope tests are also displayed.
Figure C.2: For each dendrite tree, a simulated realisation of the random field determining the retention probabilities in the fitted Cox process models.
Figure C.3: Upper left corner: observed spine locations on dendrite 1. Remaining: simulated point patterns from the fitted Cox process model. The simulated retention probabilities used to obtain the point pattern in the upper right corner are shown in Figure C.2.
Figure C.4: Upper left corner: observed spine locations on dendrite 2. Remaining: simulated point patterns from fitted Cox process model. The simulated retention probabilities used to obtain the point pattern in the upper right corner are shown in Figure C.2.
Figure C.5: Upper left corner: observed spine locations on dendrite 3. Remaining: simulated point patterns from fitted Cox process model. The simulated retention probabilities used to obtain the point pattern in the upper right corner are shown in Figure C.2.
Figure C.6: Upper left corner: observed spine locations on dendrite 4. Remaining: simulated point patterns from fitted Cox process model. The simulated retention probabilities used to obtain the point pattern in the upper right corner are shown in Figure C.2.
Figure C.7: Upper left corner: observed spine locations on dendrite 5. Remaining: simulated point patterns from fitted Cox process model. The simulated retention probabilities used to obtain the point pattern in the upper right corner are shown in Figure C.2.
Figure C.8: Upper left corner: observed spine locations on dendrite 6. Remaining: simulated point patterns from fitted Cox process model. The simulated retention probabilities used to obtain the point pattern in the upper right corner are shown in Figure C.2.

Acknowledgements

We would like to thank Abdel-Rahman Al-Absi who collected the spine data. We have been supported by The Danish Council for Independent Research | Natural Sciences, grant DFF – 7014-00074 "Statistics for point processes in space and beyond", and by the "Centre for Stochastic Geometry and Advanced Bioimaging", funded by grant 8721 from the Villum Foundation.

References

  • Anderes et al. (2017) Anderes, E., Møller, J., and Rasmussen, J. G. (2017). Isotropic covariance functions on graphs and their edges. Available at arXiv:1710.01295.
  • Ang et al. (2012) Ang, Q. W., Baddeley, A., and Nair, G. (2012). Geometrically corrected second order analysis of events on a linear network, with applications to ecology and criminology. Scandinavian Journal of Statistics, 39:591–617.
  • Baddeley et al. (2014) Baddeley, A., Jammalamadaka, A., and Nair, G. (2014). Multitype point process analysis of spines on the dendrite network of a neuron. Journal of the Royal Statistical Society: Series C (Applied Statistics), 63:673–694.
  • Baddeley et al. (2000) Baddeley, A., Møller, J., and Waagepetersen, R. P. (2000). Non- and semi-parametric estimation of interaction in inhomogeneous point patterns. Statistica Neerlandica, 54:329–350.
  • Baddeley et al. (2017) Baddeley, A., Nair, G., Rakshit, S., and McSwiggan, G. (2017). “Stationary” point processes are uncommon on linear networks: Point processes on linear networks. Stat, 6:68–78.
  • Baddeley et al. (2015) Baddeley, A., Rubak, E., and Turner, R. (2015). Spatial Point Patterns: Methodology and Applications with R. Chapman & Hall/CRC Press, Boca Raton.
  • Diggle (2014) Diggle, P. J. (2014). Statistical Analysis of Spatial and Spatio-Temporal Point Patterns. CRC Press, Lancaster, 3rd edition.
  • Guan (2009) Guan, Y. (2009). A minimum contrast estimation procedure for estimating the second-order parameters of inhomogeneous spatial point processes. Statistics and Its Interface, 2:91–99.
  • Jammalamadaka et al. (2013) Jammalamadaka, A., Banerjee, S., Manjunath, B., and Kosik, K. (2013). Statistical analysis of dendritic spine distributions in rat hippocampal cultures. BMC Bioinformatics, 14:287.
  • Lavancier and Møller (2016) Lavancier, F. and Møller, J. (2016). Modelling aggregation on the large scale and regularity on the small scale in spatial point pattern datasets. Scandinavian Journal of Statistics, 43:587–609.
  • Lavancier et al. (2018) Lavancier, F., Poinas, A., and Waagepetersen, R. P. (2018). Adaptive estimating function inference for non-stationary determinantal point processes. Available on arXiv:1806.06231.
  • Matérn (1960) Matérn, B. (1960). Spatial variation: Stochastic models and their application to some problems in forest surveys and other sampling investigations. Meddelanden från Statens Skogforskningsinstitut, 49:1–144.
  • Matérn (1986) Matérn, B. (1986). Spatial Variation. Lecture Notes in Statistics 36. Springer-Verlag, Berlin.
  • McSwiggan et al. (2016) McSwiggan, G., Baddeley, A., and Nair, G. (2016). Kernel density estimation on a linear network. Scandinavian Journal of Statistics, 44:324–345.
  • Møller et al. (1998) Møller, J., Syversveen, A., and Waagepetersen, R. P. (1998). Log Gaussian Cox processes. Scandinavian Journal of Statistics, 25:451–482.
  • Myllymäki et al. (2017) Myllymäki, M., Mrkvička, T., Grabarnik, P., Seijo, H., and Hahn, U. (2017). Global envelope tests for spatial processes. Journal of Royal Statistical Society Series B (Statistical Methodology), 79:381–404.
  • Okabe and Yamada (2001) Okabe, A. and Yamada, I. (2001). The -function method on a network and its computational implementation. Geographical analysis, 33:271–290.
  • Rakshit et al. (2017) Rakshit, S., Nair, G., and Baddeley, A. (2017). Second-order analysis of point patterns on a network using any distance metric. Spatial Statistics, 22:129–154.
  • Rasmussen and Christensen (2019) Rasmussen, J. G. and Christensen, H. S. (2019). Point processes on directed linear networks. Available at arXiv:1812.09071.
  • van Lieshout (2011) van Lieshout, M. N. M. (2011). A -function for inhomogeneous point processes. Statistica Neerlandica, 65:183–201.
  • Waagepetersen (2007) Waagepetersen, R. P. (2007). An estimating function approach to inference for inhomogeneous Neyman-Scott processes. Biometrics, 63:252–258.
  • Waagepetersen and Guan (2009) Waagepetersen, R. P. and Guan, Y. (2009). Two-step estimation for inhomogeneous spatial point processes. Journal of the Royal Statistical Society Series B (Statistical Methodology), 71:685–702.