Bayesian reconstruction of cosmic density fields
We present a self-consistent Bayesian formalism to sample the primordial density fields compatible with a set of dark matter density tracers after cosmic evolution observed in redshift space. Previous works on density reconstruction considered redshift space distortions as noise or included an additional iterative distortion correction step. We present here the analytic solution of coherent flows within a Hamiltonian Monte Carlo posterior sampling of the primordial density field. We test our method within the Zel'dovich approximation, presenting also an analytic solution including tidal fields and spherical collapse on small scales using augmented Lagrangian perturbation theory. Our resulting reconstructed fields are isotropic and their power spectra are unbiased compared to the true one defined by our mock observations. Novel algorithmic implementations are introduced regarding the mass assignment kernels when defining the dark matter density field and optimization of the time step in the Hamiltonian equations of motions. Our algorithm, dubbed barcode, promises to be especially suited for analysis of the dark matter cosmic web implied by the observed spatial distribution of galaxy clusters --- such as obtained from X-ray, SZ or weak lensing surveys --- as well as that of the intergalactic medium sampled by the Lyman alpha forest or perhaps even by deep hydrogen intensity mapping. In these cases, virialized motions are negligible, and the tracers cannot be modeled as point-like objects. It could be used in all of these contexts as a baryon acoustic oscillation reconstruction algorithm.READ FULL TEXT VIEW PDF
I present a new approach to recover the primordial density fluctuations ...
Weak gravitational lensing is a powerful probe of the large-scale cosmic...
Galaxy imaging surveys enable us to map the cosmic matter density field
Hamiltonian Monte Carlo (HMC) is an efficient Bayesian sampling method t...
Following [Diggle 2011, Greenland 1995], we give a simple formula for th...
Uncertainty quantification for full-waveform inversion provides a
For over twenty years, the term 'cosmic web' has guided our understandin...
Bayesian reconstruction of cosmic density fields
The Cosmic Web of the Universe arises from the gravitational instability caused by tiny primordial density perturbations, which presumably have their origin in quantum fluctuations. At initial cosmic times, perturbations are linear and the statistics describing them is extremely close to Gaussian, though some deviations from Gaussianity could be expected depending on the inflationary phase the Universe has probably experienced after its birth(Starobinsky, 1980; Planck Collaboration et al., 2018a). Therefore, the Cosmic Web encodes the information to understand nonlinear structure formation (Bond et al., 1996; van de Weygaert, 1996; Cautun et al., 2014; Alpaslan et al., 2014) and disentangle the interplay between dark matter and dark energy (Park & Lee, 2007; Bos et al., 2012; Vogelsberger et al., 2014). One of the great challenges of the coming years is to rigorously test these theories using observations (Werner et al., 2008; Lee et al., 2016; Tanimura et al., 2018) However, observations are based on some biased tracers, which are affected by their proper motions, as they are measured in so-called redshift space (Kaiser, 1987; Hamilton, 1998). We are facing three problems here: one is the action of gravity linking the linear primordial fluctuations to the final nonlinear density field, the second one is modeling the bias of the dark matter tracers and the third one is solving for redshift space distortions. There have been dramatic developments, ever since the pioneering works by Peebles (1989) — proposing a least action method linking the trajectories of particles from initial to final cosmic times — Weinberg (1992) — proposing a rank-ordering scheme, realizing the different statistical nature between the initial and final density fluctuations — and Nusser & Dekel (1992) — proposing a time reversal machine based on the Zel’dovich approximation (Zel’dovich, 1970). The minimization of an action led to two approaches: the FAM method (Nusser & Branchini, 2000; Branchini et al., 2002) and the MAK method (Brenier et al., 2003; Mohayaee et al., 2006; Lavaux & Wandelt, 2010). Time reversal machines based on higher order corrections have been proposed (Gramann, 1993; Monaco & Efstathiou, 1999; Kitaura & Angulo, 2012). The same concept has been proposed to enhance signatures encoded in the primordial density fluctuations, such as the Baryon Acoustic Oscillations (BAOs) (see Eisenstein et al., 2007), as this linearizes the density field transferring information from the higher order statistics to the two point correlation function (see Schmittfull et al., 2015). Constrained simulations have been applied to study the Local Universe (Gottloeber et al., 2010), but it is hard to find a configuration of constrained peaks that give rise to a desired Cosmic Web configuration after non-linear gravitational evolution. The reverse Zel’dovich approximation can provide a first order correction to this limitation of constrained simulations (Doumler et al., 2013; Sorce & Tempel, 2018). The problem of these methods is that they are limited by the approximation of reversing gravity, which breaks down with so-called shell crossing (Hidding et al., 2016). Bayesian approaches have permitted the incorporation of forward modeling, which solves this problem, at a considerably higher computational cost (Kitaura, 2013; Jasche & Wandelt, 2013; Wang et al., 2013). We will follow here this approach and more specifically the setting described in Wang et al. (2013) based on the combination of a Gaussian prior describing the statistical nature of the primordial fluctuations and a Gaussian likelihood minimizing the squared distance between the reconstructed density and the true one. Nonetheless, we would like to remark that the algorithmic approach of connecting the primordial density field with the final one followed by Wang et al. (2013) is based upon the one described in Jasche & Wandelt (2013), as we also do here. Our novel contribution to this formalism consists of including coherent redshift space distortions (RSD) in an analytical way, within the Hamiltonian Monte Carlo sampling of the posterior distribution function (resulting from the product of the prior and the likelihood). While Jasche & Wandelt (2013) do not give a treatment of RSD, Wang et al. (2013) rely on an iterative RSD correction applied prior to the Bayesian reconstruction method (Wang et al., 2009, 2012). In a recent study in the context of Bayesian reconstructions of the primordial fluctuations, Jasche & Lavaux (2018) apply a redshift space transformation to the dark matter field, in which the bias is defined with respect to redshift space. Our formalism entails a more natural approach, and includes a biasing scheme with respect to Eulerian space (see also Bos, 2016). This implies additional terms in the Hamiltonian equations, and has the advantage that it admits natural real space bias models that couple the large scale dark matter distribution to the galaxy population (see e.g. Wang et al., 2013; Kitaura et al., 2014; Neyrinck et al., 2014; Ata et al., 2015; Kitaura et al., 2016a; Vakili et al., 2017, and references therein)
. The only restriction to our approach is to include a treatment for the virialized motions, as we are not able at this moment to include a random term within the analytical posterior sampling expression. Virial motions can, in general, be corrected for with fingers-of-god (FoGs) compression(Tegmark et al., 2004; Tully, 2015; Tempel et al., 2016; Sorce & Tempel, 2017). Within a Bayesian context, it is possible to include a virialized cluster term following the approach of Kitaura (2013), which includes a likelihood comparison step object by object and not cell by cell as in Jasche & Wandelt (2013) or Wang et al. (2013). In the object based approach, the likelihood comparison can be done in redshift space, including virialized motions as was demonstrated in Heß et al. (2013). This method requires two steps within an iterative Gibbs-sampler to handle the reconstruction problem, one to transform dark matter tracers from redshift space at final cosmic times to real space at initial cosmic times, and a second one to make the link between the discrete distribution to the continuous Gaussian field. It is, in fact, also possible to include an additional Gibbs-sampling step in the Bayesian formalism to self-consistently sample peculiar motions and solve for RSD (Kitaura et al., 2016b; Ata et al., 2017).
In this work, we propose a self-consistent transformation of the data model into redshift space, allowing to sample directly from the posterior distribution function in one step within a Hamiltonian Monte Carlo sampler. Thus, this represents the first self-consistent (coherent) redshift space data based primordial density fluctuation reconstruction code based on one iterative sampling step, which can also naturally deal with masks. Our implementation of this approach is called barcode. A high level overview of the method and the problem it solves is presented in figure 1. In this code, we transform from Lagrangian coordinates to Eulerian coordinates using a structure formation model and within the same equation also from to the corresponding redshift space . To implement the redshift space transformation into the Hamiltonian Monte Carlo formalism, we need to derive the corresponding analytical derivatives. For our structure formation model, we will restrict our studies to the Zel’dovich approximation, as we seek the most efficient, simple formalism, which has been shown to be accurate when it is fully treated (see e.g. White, 2015).
barcode has already been referred to in various works giving details of its implementation (Bos et al., 2016; Bos, 2016). One of our main motivations for developing barcode is the analysis of galaxy clusters for which the mass has been measured in X-ray (Sarazin, 1986; Mulchaey, 2000; Ikebe et al., 1996; Nulsen et al., 2010), Sunyaev-Zel’dovich cluster surveys (e.g. Birkinshaw, 1999; Bleem et al., 2015; Planck Collaboration et al., 2016; Hilton et al., 2018) and weak lensing studies of clusters (e.g. Kaiser & Squires, 1993; Lee et al., 1997; Umetsu, 2010; Laureijs et al., 2011; de Jong et al., 2012; Hoekstra et al., 2013; Sartoris et al., 2016; Radovich et al., 2017; The LSST Dark Energy Science Collaboration et al., 2018). This frees us from having to consider galaxy bias, contrary to many similar studies in the literature. Nonetheless, we have implemented a power-law bias in barcode for more general cases, as it was done in Wang et al. (2013). More details to the context and motivation of our focus on clusters have been given in Bos (2016). We note that in such a context, the approach proposed here is potentially more appropriate than the ones cited above modeling tracers as point-like objects, because the shapes and orientations of clusters play a decisive role in their tidal effects on their environments (Bond et al., 1996; van de Weygaert & Bertschinger, 1996). We leave a more detailed investigation of this to later work.
Broader cosmological applications are also envisioned. Extending this work to make growth rate sampling is trivial, and can be done in a similar fashion to Granett et al. (2015), but including a more complex and accurate algorithm, as presented here. In fact, redshift space distortions can also be leveraged as a proxy for constraining the nature of gravity and cosmological parameters (e.g. Berlind et al., 2001; Zhang et al., 2007; Jain & Zhang, 2008; Guzzo et al., 2008; Nesseris & Perivolaropoulos, 2008; Song & Koyama, 2009; Song & Percival, 2009; Percival & White, 2009; McDonald & Seljak, 2009; White et al., 2009; Song et al., 2010; Zhao et al., 2010; Song et al., 2011). To this end, many recent studies focus on the measurement of redshift space distortions (Cole et al., 1995; Peacock et al., 2001a; Percival et al., 2004; da Ângela et al., 2008; Okumura et al., 2008; Guzzo et al., 2008; Blake et al., 2011; Jennings et al., 2011; Kwan et al., 2012; Samushia et al., 2012; Reid et al., 2012; Okumura et al., 2012; Samushia et al., 2013; Zheng et al., 2013; Blake et al., 2013; de la Torre et al., 2013; Samushia et al., 2014; Sánchez et al., 2014; Bel et al., 2014; Tojeiro et al., 2014; Okumura et al., 2014; Beutler et al., 2014).
This paper is structured as follows. First we present the methodology, giving details of the analytical implementations (see Section 2). In particular we focus on the novel redshift space formalism implemented in barcode. We briefly recap the most important barcode concepts and equations in Appendix A. In Section 3, we set the stage for our analyses by describing and illustrating our mock observations and the reconstructions that barcode yields from them.
The core results of our investigation are described in Section 4. The reconstructions including the redshift space model demonstrate the major benefits of our novel approach. For contrast, we show what would happen if one used barcode without our redshift space extension on observational data in redshift space coordinates in Appendix B.
These results are followed by a more technical discussion on the performance aspects of barcode with this new model in Section 5. We evaluate whether the added stochasticity that redshift space transformations add cause the MCMC chain to evolve differently than without our redshift space model. We close with a discussion of the results in Section 6.
Additional material is available in the appendices. The two dimensional correlation function, which is used extensively in the results sections, is defined in Appendix C. An important technical aspect in barcode
is the use of kernel density estimation, the details of which we discuss in AppendixD. Finally, in Appendix E, we give an extensive derivation of the equations necessary for configuring barcode to use a second-order Lagrangian Perturbation Theory (2LPT) structure formation model, optionally including the spherical collapse terms of the “ALPT” model.
The Bayesian reconstruction algorithm connecting the evolved dark matter density to the primordial fluctuation field has already been presented in several works (Kitaura et al., 2012; Kitaura, 2013; Jasche & Wandelt, 2013; Wang et al., 2013; Bos et al., 2016). We summarize the central equations used in barcode — our C++ implementation of this algorithm — in Appendix A.
Let us now consider the Hamiltonian Monte Carlo calculations in redshift space (equation 1) instead of Eulerian coordinates . Densities are then functions of , e.g. . For convenience, in this section we write instead of for the initial density field (the sample in the MCMC chain), where
is a grid location (i.e. a component of the sample vector).
In the first part of this section, we will first define what we mean by redshift space coordinates. Next, in Section 2.2, we give a brief overview of the different kinds of non-linearities involved in the redshift space transformation and our Lagrangian Perturbation Theory models of structure formation. In particular, we discuss what this means for the representation of non-linear cosmic structure. We then dive into the derivation of the main quantity necessary for the HMC algorithm: the Hamiltonian likelihood force in Section 2.3. In Section 2.4 we simplify the fully general form we derived to arrive at the plane parallel approximation that we employed to produce the results in this paper.
From a practical perspective we will describe redshift space as follows. Redshift space coordinates can be defined as
where is a particle’s location in redshift space, is the Eulerian comoving coordinate, is the particle’s (peculiar) velocity and is the velocity of the observer, and is the comoving unit vector. is the cosmological expansion factor, which converts from physical to comoving coordinates, and is the Hubble parameter at that expansion factor, which converts the velocity terms to distances with units . We call the redshift space displacement field, analogously to the displacement field that transforms coordinates from Lagrangian to Eulerian space in LPT (equation 50).
One important point to stress is that in equation 1 (and all equations henceforth), the comoving coordinates are defined such that the observer is always at the origin. The unit vector , and thus the entire redshift space, changes completely when the origin changes.
From redshift measurements , we can derive the radial component of the total velocity of galaxies and other objects:
where is the Hubble flow velocity and is the peculiar velocity. Redshift space (comoving) coordinates can be written in terms of the total velocity as
We can compare such observational redshift space positions directly with theoretical models by transforming our models to redshift space as given in equation 1.
Lagrangian Perturbation Theory (LPT) is accurate in its position and velocity estimates up to (quasi-)linear order. Note that since our objective is to use them in a model based on data at low redshift, and we are dealing with strong non-linearities, LPT is not a fully accurate model.
To clarify, there are two types of non-linearity involved in redshift space distortion theory (White et al., 2012). Kaiser (1987) and others — when talking about linear redshift space distortions111 Linear redshift space theory is about finding analytical solutions to (Kaiser, 1987). One of the great results of this theory is the so-called Kaiser effect, which predicts a boost on large scales and a deficiency at small scales in the power spectrum. Note that this is a linear effect; one assumes to derive it. — are dealing with non-linearities in the mapping from Eulerian to redshift space. One marked example of this non-linear mapping is the creation of caustics in the triple-value regions. In the centers of clusters, we also have to take into account that the velocity field itself is non-linear, which, for instance, leads to FoGs.
Since we make no approximations in the redshift space mapping, we are dealing with its full non-linear form. The (quasi-)linear velocity field of LPT, on the other hand, will become increasingly inaccurate as we venture into the non-linear density and velocity-field regime, leading to increasingly inaccurate redshift space representations.
Large scale (linear regime) coherent flows are still well modeled in our LPT-based framework. Bulk velocities in LPT scale linearly with the displacement field :
where is summed up to the used perturbation order ( for the Zel’dovich approximation and for 2LPT) and is the linear velocity growth factor:
In linear theory, the displacement is directly proportional to the gravitational acceleration, which follows directly from the initial density field. On large scales, structure still resides mostly in the linear regime, so that LPT manages to describe the velocity field fairly accurately. The squashing effect on large scales will therefore be well represented in our model.
Cluster infall is modeled poorly in Lagrangian Perturbation Theory. Virialized and multistream motion are not at all included in LPT. This means we do not model FoGs nor the triple value regime within the turnaround radius (see e.g. van Haarlem & van de Weygaert, 1993) in this work. Any coincidental triple value regions cannot be trusted in our approach, even though there might be FoG-like features around clusters in LPT as well. Since LPT breaks down there, they cannot be considered as true features. We shortly come back to this in the discussion in Section 6.2.6.
The Hamiltonian likelihood force222 The Hamiltonian prior force is defined in Lagrangian space only. We need not take redshift space into account there. — defined for observations in comoving coordinates in equation 43 — is redefined for observations in redshift space as
where is the Eulerian density field in redshift space, at grid location , which is found by evolving the Lagrangian density field forward (see Appendix D), and is the Lagrangian density field (the signal) at grid location . The first right-hand side multiplicative term in equation 6 is given analogously to equation 48 by
where is a transfer function that performs a simple linear density transformation to values in the used structure formation model given values from an -body simulation (which should correspond to the observed quantities). For the non-RSD models we could use functions from Leclercq et al. (2013). However, the fitted functions from that work were not calibrated for redshift space densities, so we would need to rederive such functions. We highlight this possibility for future applications (see Section 6.2.1), but without any loss of generality, we can define here. Since the term in this equation does not depend on the primordial density field, its exact form is not important in our derivation.
Note that indices and run over completely different regular grids. The index here refers to the initial Lagrangian space grid, while the index is for the present-time Eulerian grid defined by the observations.
With that we can write the second right-hand side term of equation 6 as
where the gradient of is given in equation 62 (replacing by ). The second term is where the real difference with the non-RSD method comes in, because
where, besides the derivative of , we now have the extra derivative of redshift space displacement term to deal with. It is this term that we will derive further in the following subsections. Putting this back into equation 6, we can rewrite as
where the first part is the same as the “regular” without redshift space from equation 43 and the second part represents the redshift space contribution.
We carry on to derive in detail the multiplicand in this added term, which is the derivative or gradient of the redshift space displacement field with respect to the signal (the primordial density field ). For convenience, we multiply by , yielding:
We can further expand the gradient of :
This is the most explicit form we can give in the general case. To find a specific solution for our models, we must specify . In LPT, given , we know :
For convenience, in what follows we split into two parts:
where stands for the LPT order of the and corresponding terms. This way we can treat the observational velocity contribution separately from the LPT part(s). In what follows we work out the observational part and the Zel’dovich first order LPT term.
To facilitate insight into this expression we introduce a few auxiliary variables:
following the definition:
One may verify that in the Zel’dovich case the terms are also first order. The and terms have units and is unitless. It is interesting that, while the first three terms represent a contribution in the radial direction, the last term corresponds to a contribution in the direction of the displacement field. Depending on the density field this may be an arbitrary direction. The term in front of it contains a division by the distance to the particle, meaning that the closer you look, the stronger this component will be. The same is true for the term. The derivative of the redshift space displacement field is thus quite different from what one might naively expect from regular (radial) RSD effects, especially near the observer. The term is the most straight-forward, simply representing the component of the derivative of the displacement field in the radial direction.
Elaborating on the remaining part of equation 12, in combination with equation 13, yields the observational term. This term is valid for any structure formation model encoded in a displacement field .
This expression has the same structure as equation 17, except for the first “” term. The latter disappears because it is constant, yielding:
where we define
The similarity between the expressions reveals that it has the same behavior of non-radial contributions near the observer as the Zel’dovich term.
The above two terms can be greatly simplified by taking a plane parallel approximation of redshift space, effectively putting the observer at an infinite distance. The will then make the and terms go to zero, as well as the full observational term of the derivative of in equation 23.
For the Zel’dovich model in plane parallel redshift space, we are left with
It is good to point out that and are intrinsically different. Only in this specific case are their derivatives equal, as the observational part is equal to zero.
Without loss of generality we will restrict our studies to the plane parallel approximation for simplicity.
Finally, we insert this back into the term of equation 11:
where we define as
We are left with almost the same result as in equation 3.27 of Bos (2016), the only difference being the replacement of with . We can then conclude that in order to reuse equation 44, we need to merely replace by , defined as
This results in:
In the implementation of this algorithm it is convenient to choose the direction of the observer parallel to one of the coordinate axes. In this way, only changes one component of . This modification can easily be implemented in a pre-existing HMC sampling code.
Our novel approach towards integrating redshift space in the Hamiltonian solver code needs to be validated. The question of whether the code converges correctly towards the true Lagrangian density and in what way it diverges from it (if differently than in the regular Eulerian space treatment of previous works) will be answered in Sections 4 and 5. To this end, we sampled a set of reconstructions based on mock observed density distributions. We describe and illustrate the mocks and reconstructions in the rest of this section.
In our numerical experiments we use four categories of parameters, detailed in the following subsections:
For the cosmological parameters we use the maximum likelihood results from the WMAP 7-year observations (Komatsu et al., 2011)333 For more up to date parameters, see Planck Collaboration et al. (2018b). . The relevant parameters for barcode are given in table 1.
|Matter density parameter|
|Baryonic matter density parameter|
|Dark energy (DE) density parameter|
|DE equation of state parameter|
|Primordial power spectrum power-law index|
|DE linear time dependency parameter|
|Density field fluctuation rms at|
|Hubble parameter (units of )|
|Comoving box size in|
The cosmological parameters also determine the power spectrum. We use CAMB (Challinor & Lewis, 2011) to generate the power spectrum which is used for the prior computation in barcode. We consider cubical volumes of . The Zel’dovich structure formation model is applied to transform from Lagrangian to Eulerian coordinates at .
barcode has a number of parameters to tune HMC and some choices of statistical models. The results in this paper use the following statistical parameters:
Leap-frog step size : We use an adaptive step size (Appendix A.2.1). It is bounded to be between and , but within those limits it is adapted to give an acceptance rate of (calculated over the previous steps).
Number of leap-frog steps per iteration :
The number of leap-frog steps is randomly drawn from a uniform distribution of integers betweenand . For Section 5.2.3 the maximum was increased to to assess performance as a function of this parameter.
Hamiltonian mass: We use the inverse correlation function type mass (see Bos (2016) for more details):
Likelihood: We use a Gaussian likelihood as described in equation 39. For this work, we set and to for all cells, effectively removing these parameters from the algorithm.
Mock observed density field: The algorithm is based on the comparison of an observed density field with sampled fields from the posterior PDF. For the runs in this study, we generated the mock observed fields in redshift space (instead of Eulerian space which we did in Bos (2016)) as follows. First, we generate a Gaussian random field given the cosmological power spectrum. This Lagrangian space field is mapped to an Eulerian space density field at , using the assumed structure formation model. The Eulerian space field is, in turn, transformed to redshift space on the basis of the ballistic velocities according to the Zel’dovich approximation. Following this, to simulate actual data, we add random noise to the redshift space density field. The noise is drawn from a Gaussian with zero mean and standard deviation.
Note that since our likelihood is Gaussian, we must also allow negative densities to be produced by the noise. In Bos (2016)
, we truncated negative values so as not to generate unphysical negative masses. This turned out to be the cause of the power deficiency in the reconstructions we discussed at length in that work. In this work, we show the potential of the formalism, including its ability to converge on its target, which is why we allow unphysical negative densities in order to satisfy our model choice of a Gaussian likelihood. In a more realistic use case, using real observed data, one would have to choose an error model that does not allow negative values, like a gamma or log-normal distribution.
The lower right panel of figure 2 displays the resulting noisy mock observed field (the panel above that describes the end result of the mock field used in a model where density is defined in regular comoving coordinates instead of redshift space).
For our comparison study, two types of runs were set up, in addition to the runs from Bos (2016) (in this work referred to as “regular” runs, i.e. runs with data and the model in regular comoving space):
Window / mask: The window parameter in the likelihood was not used in this work, i.e. it was set to for all cells .
Starting configuration: To start the MCMC random walk, we need to provide a starting point, or “initial guess”, in the -dimensional sampling space. We supply the code with a non-informative guess about the cosmic web structure: a field with value zero in every grid cell. This is also the way to run the code using real observations.
We run with a resolution of , corresponding to a grid of cells in each direction and a volume side of . The total number of cells is . In Bos (2016) we studied computations with . The Zel’dovich approximation as a structure formation model correlates well with a full N-body simulation above scales of , which is why for this work we chose a higher grid cell size.
We used three seeds to initialize the random number generator in the code. Each of these seeds was used for a separate code run, giving separate, differently evolving chains.
We output the field realizations every 50 steps of the chain. This assures that the saved samples are sufficiently far apart, although it is a bit overcautious, because when the chain performs well, the HMC algorithm should already make sure that subsequent samples are uncorrelated (indeed the likelihood panel in figure 14 shows that this is the case). In the statistics of the following sections (means and standard deviations of reconstructions) we use 101 samples, the range from sample 500 to 5500.
We used the SPH density estimator (Section D). We set the SPH scale parameter equal to the cell size .
First, in figure 4 we show the difference between the mean and true fields, but now in redshift space coordinates
. The mean reconstruction in redshift space shows a high degree of similarity to the true field’s configuration, both in the density values and in the shapes of the web structures. The variance around the mean (bottom panel) reflects the propagation of data, model and systematic errors embodied by the posterior model of the cosmic densities that we sample. As is the case when sampling densities in regular comoving coordinates (see e.g.Bos (2016), figures 3.14 and 3.15), the highest variance can be found in the high density regions, reflecting the higher number statistics of the particles that cluster to form these regions. These same statistics lead the voids in reconstructions to be dominated by random fluctuations from the prior. These fluctuations average out in the mean field. The underdense voids, hence, contain less structure in the reconstructions’ mean than in the true field.
In figure 5 we compare the redshift space results to the regular Eulerian space density (we also show the true densities, for convenience). The coherent inflow part of the redshift space distortions causes expected features along the line of sight in redshift space: squashing and enhancement of density contrast for overdense large scale structures and a stretching of underdense voids444 These features may be hard to appreciate from static figures. We provide an animated GIF at http://egpbos.nl/rsd-eul__rsd-rss/ for more convenient comparison of the right-hand side panels of figure 5. . These effects are quantified in the power spectrum, which in redshift space shows the Kaiser effect (figure 3; see also figure 19 of Appendix B, which illustrates the imprint of the Kaiser effect on reconstructions that were not corrected for redshift space distortions), and the two-dimensional correlation function, further discussed in Section 4.2.
In this section we will show the benefits of using our self-consistent redshift space (RSS) Zel’dovich model over a regular Zel’dovich model. When properly dealt with, the anisotropies caused by RSDs can be modeled in the data and hence eliminated from the reconstruction. This enables reconstruction of the mean density field based on redshift space data. By directly incorporating RSDs in the formalism, ad-hoc corrections for RSDs are no longer necessary.
From an algorithmic point of view, the most pressing question is whether the code does converge correctly towards the true Lagrangian density. Also, we want to characterize in what way the samples diverge from it.
One might wonder what would happen if redshift space data are used without a redshift space model. In Appendix B we illustrate the non-negligible impact of the redshift space effects in such a naive approach. This clearly motivates the need for the model used in our approach.
We first inspect the match of the Lagrangian density obtained by the chain samples, after burn-in, to the true underlying density. Comparing the true field to the ensemble average in the top panels of figure 6, we find a good match in terms of the large scale structures. Many high peaks and low troughs have their counterpart in the samples, even though their exact shapes and heights do consistently differ. In some cases peaks or troughs are combined, in some they are split and in some cases random peaks or troughs are added that should not exist at all.
We should note that these kinds of fluctuations are to be expected. In the individual samples, such fluctuations will be even more pronounced (as we discuss in Section 5.1). In the mean sample, the algorithm is expected to fluctuate around the true density field to a great extend. However, it is highly unlikely that the mean field will ever exactly match the true field. This is prevented by the uncertainties that are inherent to all the processes involved in the posterior.
The algorithm not only manages to reconstruct the structures, also quantitatively it works well: the difference between true and reconstructed density is typically on the same order of magnitude as the values themselves. It leads to a good reconstruction of the density distribution function, but still allows for significant variation. The standard deviation of sampled densities is typically around 1 (bottom left panel of figure 6). Using the RSS model gives a very comparable amount of variation to when one does not need it, as can be seen by comparing to figure 18. In principle, given the added uncertainty in spatial mass distribution configurations along the line of sight introduced by redshift space distortions, one might expect that the variation around the mean would be smaller when using the RSS model. Stated differently, the degeneracy of the -axis could lead to the “leaking” of the MCMC chain of some of its movement in the statistical coordinate space to velocity space. Here we see no such effect, which may indicate that our chosen HMC parameters, in particular , cause a sufficiently uncorrelated sampling process for both cases. We mention this here explicitly, because in our previous work on higher resolution runs (Bos, 2016) we did see a difference in sample variability between runs with and without RSS model. The sample variation is further discussed in Section 5.
Following gravitational evolution, the corresponding Eulerian density fields reveal an evolved web. Our analysis therefore also allows a similarity inspection of the prominent web-like mass distribution features in the Cosmic Web. On the basis of figure 7, it becomes clear that this model performs very well in reconstructing the large scale features that we are interested in.
As expected, there is still some difference, as seen in the bottom right panel. With the redshift space distortions included in the reconstructions, the outstanding -directionality or bi-polarity that can be noted in the uncorrected process (see figure 18 of Appendix B) have fully disappeared. Moreover, the variation seems evenly spread out over the entire field. If the variation were caused by redshift space distortions, one would expect a concentration of differences around clusters and “great walls”. As expected, compared to the Lagrangian case, the standard deviation of the samples is somewhat lower, but of the same order of magnitude.
Looking further at the sample power spectra in figure 8, we find an excellent match to the expected true spectrum. The difference of the mean to the true spectrum in the bottom panel shows that the Kaiser effect has been completely accounted for. The deficiency in the reconstructed power, peaking at , that we discussed in Bos (2016) is no longer present here, due to allowing the mock observed density distribution to have values , so as to properly match the Gaussian likelihood (see Section 3.3).
Because of the anisotropy of redshift space, RSD effects show up more clearly in directional statistics like the 2D correlation function (see Appendix C for more detail). In , the three spatial directions are collapsed into two. One uses the directions along and perpendicular to the line of sight. Anisotropies in the data will show as a deviation from a perfectly circular . RSDs will thus have a marked signature, see e.g. Hawkins et al. (2003); Peacock et al. (2001a); Guzzo et al. (2008).
The two-dimensional correlation function of the reconstructions matches the true one when using the redshift space model. This indicates that the large scale redshift space anisotropies have been eliminated555 We use the notation instead of , replacing and . . In figure 9 we compare the correlation functions of the ensemble mean of chain samples and the true field. The difference between them, , is maximally , with the mean difference being .
The nature of the MCMC sampling process itself introduces additional anisotropies in the samples. No single realization will perfectly match the 2D correlation function of the true field. This sampling effect is illustrated in figure 10, where we show the difference of the correlation functions of 5 individual realizations with that of the mean of all realizations. The differences are very small, on the order of . To account for this case-by-case variation, we only look at sample ensembles instead of just at single samples.
Note that for reasons of cosmic variance, some amount of anisotropy is always expected. A finite sized box will never be perfectly isotropic. This fact is illustrated in figure 11. The first three rows are each based on a different true Lagrangian density field, generated with a different random seed. It is immediately apparent from their deviation from perfect circularity on larger scales that the true fields (left column) contain a considerable amount of anisotropy. To account for this effect, one should always compare to the true field and take its anisotropies into consideration. Indeed, it can be seen that the true and reconstructed anisotropies match well. Alternatively, one could run a large number of chains with different random input fields and average out their statistics. For the entire Universe we expect a perfectly isotropic condition, reflecting the cosmological principle. Even averaging over only three different chains does not suffice, as may be appreciated from the bottom row of figure 11.
Some questions remain regarding performance of the algorithm given our new model, in terms of both speed and accuracy. Here we formulate some answers to the questions of how the chain evolves compared to the case without RSDs presented in Bos (2016) and whether we see any remaining imprints of the RSDs.
As demonstrated above, the variation of samples around the mean seems similar to that obtained with a regular Eulerian space model. One could have expected the algorithm to have somewhat more trouble evolving the MCMC chain in the redshift space case. However, as argued above, it seems that in our situation, the leap-frog step size was sufficiently large for the chain to orbit similar volumes of posterior parameter space in both cases.
In figure 12 we take another close-up look at the evolution of the chain. The same region of the reconstructed Eulerian density field is shown for six different iterations of the chain in figure 11(a). These can be compared to the true density field in the bottom panel. We see here the stochastic sampler at work. The main large scale features can be identified in all samples, like the very massive peak in the bottom right corner and the filamentary structure in the top left. The massive peak is rather stable, with some additional protrusions sometimes being formed at random. The filaments, however, undergo some major variations in shape, orientation and connectedness.
We compare this behavior to that of a regular space run in figure 11(b) (see Appendix B for details on this run). While there are clear differences with the redshift space corrected runs of figure 11(a), the results do not seem qualitatively different by eye in terms of amount of structural variation, which is in line with what we discussed above. At the same time, we see deeper reds in clusters and blues in voids in figure 11(b) than in the corrected runs. This is because the uncorrected runs reproduce features that look like redshift space distortion — amongst which: enhanced density contrast — but then in real space. The enhanced contrast can also be seen in the redshift space zoom-ins of figure 11(c), which corroborates this fact. The uncorrected runs are discussed in more detail in Appendix B.
When trying to analyze chain mixing speed, one obvious place to begin is the burn-in phase. In figure 13 we compare the likelihood of three different chains for each of the two models, the Zel’dovich model in Eulerian and in redshift space. From this we conclude that burn-in does not seem at all slower. The runs seem equally fast in Eulerian and in redshift space. That also means that defining step 150 as the cut-off for burn-in is acceptable for the redshift space model as well.
In figure 14 we show chain parameters for the whole chain (after burn-in). Note that the iteration number on the horizontal axis includes both accepted and rejected steps. The variation in the likelihood can be used as a measure of the variability of the chain, as it compares the sample to the input field. Two Eulerian space and two redshift space chains are shown in these figures.
The variation in likelihood is nearly identical for the regular and redshift space chains, with a standard deviation of slightly over 400 for both chains. This is more directly illustrated in the top right panel, where we subtract the median value of the chain, so that the actual variation around the median can be compared for the different chains. The random patterns are statistically indistinguishable.
In the bottom panel we additionally show (bottom left) and (bottom right). Here finally the expected difference in the chain performance between the regular and rss runs shows. The leap-frog step-size adapts to smaller values in the redshift space model runs. This means that for some reason the orbit through the posterior’s parameter space accrues more errors, i.e. its Hamiltonian is more sensitive to the exact path taken. The redshift space model constrains the sampler less tightly, since it allows for more possible configurations along the line of sight than when using a regular space model. This means that it is easier for the sampler to end up in a location that fits less well with the true field than in the regular model case. This explains why a shorter leap-frog path may be necessary in the redshift space model.
In Bos (2016), we found that for runs with and , a higher number of leap-frog steps than improved the performance of the chain in that its samples covered a larger part of the posterior’s parameter space. We tried a higher value in this work as well to test whether we have reached optimal chain performance.
In figure 15, we show the results of another redshift space model runs with a higher value of . This significantly higher value does not seem to make any noticeable difference in the chain evolution parameters. It does makes the code slower, as it scales with . We can conclude that 256 leap-frog steps per iteration are plenty for proper chain performance.
Figure 16 shows, for each grid cell, the density of the true Eulerian density field versus the corresponding density in the ensemble average of the chain. The right panel shows the results for the redshift space model, the left panel for the Eulerian space model. These plots corroborate some of our previous findings. The variation (width of the distribution around the line) is more or less the same, corresponding to the similar variation in the samples we saw before. Generally, the correspondence is good for both models.
We developed and tested a self consistent method for the reconstruction of an ensemble of likely primordial density fields based on mock observed data in redshift space. We showed that this significantly improves the quality of the reconstructions. They are more isotropic and redshift space artifacts are removed. This forms a contrast to a naive application of an Eulerian space algorithm to redshift space data, which would retain these features.
The novelty of our method relates to the explicit treatment of redshift space. We extend existing Hamiltonian MC methods (Jasche & Wandelt, 2013; Wang et al., 2013) to a slightly more realistic data model. The rationale is that pre-processing data should be kept to a minimum. Properly modeling the data is always preferable. This way, all the possible interacting complexities in the data will be properly handled, whereas pre-processing data might lead to additional noise and/or errors. For instance, to artificially remove Fingers of God one must make a lot of simplifying assumptions, like sphericity of the clusters (Tegmark et al., 2004; Tempel et al., 2012, e.g.). Simple, automated FoG removers will inevitably generate false positives and also miss out on true positives. In our integrated redshift space approach, no assumptions are needed at all. This is not only considerably easier, but also seems like the only way that one can ever consistently model all the complexities in the data.
Note that redshift distortions have been treated in a different way by Wang et al. (2013). Their treatment of redshift distortions (Wang et al., 2009) is based on a linear approximation of the velocity field, which they use to correct for the positions of dark matter haloes. This is a pre-processing step that is done only once and in a unique way, before running their Hamiltonian sampler. In so doing, the intrinsic uncertainty from using redshift as a proxy for distance is circumvented rather than accounted for.
In the rest of this section, we will discuss some of the (astro)physical and astronomical applications of our formalism. The method presented in this work is still not fully complete with respect to modeling all complexities of the cosmic structure formation process, which we will also touch upon. In addition, a range of technical aspects and open questions will be addressed and we will discuss some further points pertaining to the novel redshift treatment.
Using the barcode framework, we aim to use cluster data as input and statistically study the Cosmic Web that forms in between. All the stochastic effects involved in the evolution of large scale structure and in the observations thereof can be self-consistently included. The analysis of the resulting ensemble of reconstructions should then be straightforward. This matter was explored in Chapter 5 of Bos (2016).
One point of attention when using this code is the fact that it is biased, and is better tuned towards constraining high density regions than low density regions. Because many Lagrangian locations will end up in clusters on the real-space or grid, the real-space constraints in barcode are biased towards more high density regions. This means that even though you might have put constraints on in void-like regions, they will be poorly probed. In a future paper, we will explore this effect in more detail by trying to constrain voids.
Given our aim of using clusters as constraints, this circumstance is actually quite fortunate. One might, however, want to modify the algorithm to better suit void-region constraints. One possibility is to sample directly in Eulerian (volume) space , as opposed to the Lagrangian (mass) space. This poses problems with the non-uniqueness of the translation from Eulerian to Lagrangian coordinates. Another option might be to use to compensate for the voids. One could lower the variance in those regions, thus making the constraints tighter and more likely to be held in a random sample. However, it still leaves the possibility of intermediate fluctuations, given that the amount of particles ending up in voids will be low. The effects of this will also be tested in a future study. This approach is consistent with that of Wang et al. (2013), who define the variance as
with a constant; i.e. simply a linear function of the density.
A number of improvements may be identified that would enable our formalism to improve the accuracy, reliability and level of realism of our reconstructions. For future work, especially for applying the formalism on real observational data, we here list the most significant ideas towards improvement.
When one wants to compare true density fields to fields derived with LPT, one of the first aspects encountered is the difference of one-point density distributions (PDFs). Perturbation theory approaches do not prevent particles from pursuing their initial track. Beyond shell crossing, it leads to the overshooting of the position they would take if the mutual gravitational interaction would have been taken into account in the calculations. True gravitational clustering will give much higher peak density values than LPT, where this artificial shell crossing artifact creates “fuzzy” clusters. A simple 1st order correction to this would be to apply a rank ordered substitution of density values. A transfer function from a “LPT Eulerian space” density to, for instance, a “-body Eulerian space” density can be constructed. Leclercq et al. (2013) did this by comparing the rank order of density values in LPT and -body densities at (given the same initial conditions) and reassigning densities in one or the other in such a way that the density value PDFs match after the procedure. This way, at least one is comparing apples to apples for as far as 1st order statistics are concerned.
We have shown that in the perfect situation of a model that exactly describes our (mock) reality, our algorithm almost perfectly reconstructs the desired mean field and its statistics. Although the used models describe cosmic structure well on large scales, they are far from perfect for describing non-linear regime structures. There are a few models that would give a more complete picture of gravitational evolution.
The current LPT (Zel’dovich) based framework is easily adapted to using second order LPT (2LPT) or ALPT (Kitaura & Heß, 2013). 2LPT has the advantage of matching even better on the largest scales. At cluster scales 2LPT is more affected by artificial shell crossing than Zel’dovich, leading to “puffy” structures. The latter can be fixed by combining 2LPT with a spherical collapse model on small scales. This is what ALPT accomplishes. Both models are fully analytical, so that they can be implemented in the same manner as the Zel’dovich approximation described in this work. In Appendix E we work out the equations necessary to extend barcode with 2LPT and ALPT structure formation models. A similar option would be to apply the Multiscale Spherical Collapse Evolution (MUSCLE) model (Neyrinck, 2016). This analytical model was shown to perform slightly better than ALPT, especially when combined with 2LPT on large scales.
It would be even better if we could use an -body simulation as our structure formation model. Wang et al. (2014) indeed for the first time successfully invoked a particle mesh -body algorithm in this context. The particle mesh equations are analytical, so every single particle mesh evolution step can be derived to the signal. By writing the derivative in the form of a matrix operator and combining subsequent particle mesh time steps by means of matrix multiplications, the full likelihood force can be analytically derived. By means of adjoint differentiation, the large matrices can be efficiently multiplied and the computational cost of this method stays within reasonable limits. The resulting model accuracy using only 10 particle mesh steps is remarkably high. When one needs high accuracy on small scales, this seems like the way forward. More recently, this approach was also adopted by Jasche & Lavaux (2018).
Possibly, the method by Wang et al. (2014) could be augmented by using an -body solver that also has baryonic particles. Whether this is analytically tractable within an HMC framework remains to be investigated. Another interesting extension might be to employ the COLA method (Tassev et al., 2013, 2015) as an alternative gravitational solver. COLA combines the LPT and particle mesh methods, trading accuracy at small scales for computational efficiency. It yields accurate halo statistics down to masses of , which would be more than sufficient for cluster studies. In fact, the COLA method has already found uses in the context of Bayesian reconstruction (Leclercq et al., 2015b; Leclercq, 2015), but in these cases COLA was applied after the Bayesian reconstruction step, not self-consistently within the model like in the work of Wang et al. (2014).
As, in practice, observations concern galaxies, stars and gas, instead of dark matter, it is of key importance to address the issue of in how far the galaxy distribution reflects the underlying dark matter distribution (see Desjacques et al., 2018, for a recent review). For the application to clusters observed in X-ray, SZ or via weak lensing, this is not directly necessary, since for clusters the biasing problem is far less. Schaller et al. (2015) showed that the ratio of halo mass in -body simulations with and without baryons is from mass upwards. It drops off towards for galaxies. Similarly, the halo abundance in cluster-sized haloes was shown to be similar in simulations with and without baryons. However, we might want to extend the algorithm towards a formalism capable of processing the galaxy distribution (like e.g. Wang et al., 2013; Leclercq et al., 2015a).
A natural way to solve this would be to incorporate gas particles in the structure formation model. However, this would still be a mere approximation, due to the complexities of baryonic astrophysical processes. A more statistical approach would be to explicitly incorporate biasing prescriptions (see e.g. Ata et al., 2015; Kitaura et al., 2015, and references therein). We have implemented such models in barcode and will explore their use further in an upcoming work.
One of the more observationally oriented issues is that of masking and selection functions. In our general Gaussian likelihood (equation 39), we included a weight factor . This can be set to zero if nothing is known about cell in the data and to one if there is. Another option is to set it to zero if the area is observed, but was found to be empty (at least up to the depth of the observation). Either of these constitute a very basic form of masking.
Given the nature of observations, one could think of more advanced selection mask treatments. For instance, we should take into account the depth of a null-observation and the fact that this implies a lower limit to the density, not an absolute absence of mass. One could implement this by setting if for some cut-off density (per cell). A theoretically more sound model may be to use a Heaviside step function convolved with a Gaussian kernel (i.e. an error function, plus one, divided by two) with a step at the cut-off density . This reflects the fact that there is some uncertainty about whether the density is above the cut-off value that we derived from the depth of our observation. It further tells us that we are quite sure it is not much higher and that it is most likely lower than the cut-off value.
Such a selection function will give an additional
dependent term in the Hamiltonian force. The derivative is simply a 1D Gaussian distribution. This implies that it should be straight-forward to derive and should give only minimal extra overhead in the algorithm.
These masking treatments will be explored further in an upcoming work.
Another parameter in our Gaussian model we hardly gave any attention is the cell-wise dispersion of the data. It is highly likely that a more suitable value can be found than the one we used ( for all cells). In general, the statistical model in the form of prior and likelihood that we use in our formalism might be improved upon. It seems like the prior is well defined and encapsulates what it must in a proper way. The likelihood is a different story.
Other authors have mainly employed Poissonian data models (Kitaura & Enßlin, 2008; Wang et al., 2013; Jasche & Wandelt, 2013) instead of the Gaussian likelihood we used. The way to find out what model should be used is to meticulously map out the errors and systematics involved in the observations. One particular example of this is that of defining a likelihood model for an observational dataset of X-ray clusters. In this case, we have to take into account the following aspects:
X-ray photon counts, which have Poissonian errors;
X-ray photon energies, which are also Poissonian due to CCD photon counts;
the propagation of photon errors into a -squared fit for the temperature and density profile parameters, e.g. using a -model (King, 1972; Sarazin, 1986; Mulchaey, 2000); if the confidence ranges of such parameters are estimated with bootstrapping, that could lead to complicated likelihood shapes for the density profile parameters;
many systematics arising from the use of certain fitting models for density estimation, like cool cluster cores which are not included in simple -models (e.g. Schindler, 1996);
the combination of the density/mass profile errors with errors in the distance estimators, e.g. from redshifts of the brightest cluster galaxies or some weighted average of all clusters’ galaxies.
Our Gaussian likelihood was chosen as a general error model that is likely to be close to (but not equal to) many true data likelihood functions. This approach is, in fact, not uncommon in the literature for X-ray cluster density error estimation. In Eckert et al. (2011) an error in the density profile of a cluster is estimated using a Monte Carlo approach, by simulating realizations (assuming Poisson statistics) and taking the variation in values as errors. In Samsing et al. (2012), an MCMC sampling of the parameter space is done to characterize the probability function around the best fitting model. Subsequently, the width of the projected PDFs for every parameter were characterized by the root mean square, meaning that essentially they are approximating the error on the parameters as Gaussian, i.e. they leave out higher order PDF characteristics. For the NORAS survey, Böhringer et al. (2000) assume statistically independent, Gaussian distributed errors, even though they state this is not 100% correct. Finally, many authors use the -statistic for determining the goodness-of-fit in their analyses. This statistic is also based on a Gaussian distribution of noise.
The HMC algorithm is quite sensitive to deviations of the data model from the actual data errors. Using a wrong model can lead to poor reconstructions, as we found in Bos (2016). In that work, we found that the power spectrum was not properly reconstructed, deviating from expected values by as much as 10% in some ranges. For this work, to show the potential of the redshift space correction model, we accommodated for the Gaussian likelihood by allowing negative densities in the mock density fields produced by adding Gaussian noise to the Eulerian redshift space density field derived from the “true” Lagrangian initial density field. In other use-cases, when reconstructions deviate significantly from expectations, one could use this as an indicator of model deficiencies.
The use of Lagrangian Perturbation Theory imposes two major limitations on a redshift space distortion treatment. Because of the inaccuracy of the non-linearities in the velocity field, Fingers of God and triple value regions are not accurately represented. However, just as clustering in density was augmented in the ALPT model, one could try to think of ways to augment LPT to better represent these non-linear density features.
The virialized motion of FoGs is hard to model as a direct function of the signal (which is necessary for the derivative of ). It is possible to fix this to some extent by adding a dispersion term to the velocity field model (Heß et al., 2013; Kitaura et al., 2014). However, this implementation needs an extra stochastic step to sample from a Gaussian velocity distribution (the mean and dispersion of which depend on the local density). This breaks the philosophy of barcode, in which we sample from one posterior that integrates all the available models. It might be possible to find a way to add a similar model to our posterior. This is a topic of ongoing work.
Another possible way would be to split up very high density particles into a distribution of smaller particles with a virialized velocity distribution. An easy way to identify these particles would be to use the spherical collapse criterion of the lower limit. To simplify matters, one could collect clustered particles and collectively split them into one big virialized kernel. However, this would then destroy any remaining substructure and shape.
The triple value regions may require a similar augmentation as the virialized motion, but maybe a simpler form is sufficient. It is possible that in ALPT these regions are already partly modeled. However, whether the velocities are also correct in ALPT should be further investigated. If so, this would indeed solve the triple value region problem.
In the relativistic limit, the correspondence between our theoretical redshift space definition and observational redshifts breaks down. It might be worth looking into whether a relativistic redshift space mapping could be fashioned. One would ideally also include in this the proper cosmological distance instead of the Hubble approximation that we usually make. Such a mapping may lead to rather small corrections on small scales. However, on the largest possible scales, quadratic and higher order terms must be taken into account, as they become dominant. Furthermore, one should start thinking about relativistic structure formation equations.
Within the limits of the framework described in this work, the above cosmological distance arguments are moot. A far more important consideration would be that at some point the data that is used can no longer be assumed to be in the same evolutionary state. Conceivably, LPT should be adaptable to this. The calculation of a field’s evolved state at one point in time is just as easily done as another point in time. One could then imagine that each particle in the field is simply evolved to a different point in time, depending on where it ends up. The MCMC nature of the model should be able to iteratively find optimal values for this.
In Section 2, we derived the redshift space equations necessary for an application in a fully non-plane parallel redshift space. Subsequently, we neglected these terms, due to our plane-parallel approximation. We have not implemented nor tested the additional terms that the full treatment entails. However, this can easily be done. The algorithm will then become fractionally slower, but not significantly. For real galaxy redshift catalogs the plane-parallel approximation can not be used. When one wants to use these to reconstruct real Universe densities, one must implement the full redshift space equations.
Galaxy surveys provide information of the luminous matter distribution, such as galaxy, or cluster catalogs. They are affected by selection effects, yielding an incomplete biased and noisy sample of the underlying nonlinear matter distribution. In addition, the tracers are affected by their peculiar motions causing redshift-space distortions (RSDs). A proper analysis of the cosmological large scale structure should account for all these effects. To this end, Bayesian methods have been developed in the past years connecting the observed data to the primordial density fluctuations, which summarize all the cosmic information for a given set of cosmological parameters and structure formation model.
In this work, we have presented barcode, an algorithm for the analysis of the large scale structure, which for the first time self-consistently solves for coherent RSDs within an analytical Bayesian framework. The contribution of this work in the context of Bayesian techniques relies on the analytical derivation of the Hamiltonian equations taking into account the transformation of the biased tracer to redshift space. We present its numerical implementation and a number of tests.
This method could be extended to deal with virialized motions by including a random term accounting for it in the equations, or by a prior fingers-of-god collapse, as suggested in a number of works.
Here we have chosen the grid-based Bayesian approach, which can easily account for survey mask and geometry of a typical galaxy survey. We have restricted the numerical study to the Zel’dovich approximation, but have made derivations including the tidal field tensor within second order Lagrangian perturbation theory and small scale spherical collapse based corrections. This approach could also be extended at the expense of a higher computational cost to particle mesh solvers.
From the detailed analytical derivations and numerical tests here presented, we draw the following main conclusions:
Using our self-contained redshift space model we can overcome large scale redshift space distortion effects in observations and reconstruct the true densities to a great degree of accuracy.
When our model is not applied, but rather a naive model based purely on Eulerian real space is used, the redshift space distortions create an anisotropic imprint on the real space reconstructions (Appendix B). We propose to call this effect the Kaiser effect echo or Kaiser echo for short.
We have demonstrated that barcode yields unbiased initial density fluctuations from a biased tracer of the dark matter density field in redshift space in terms of the cell-to-cell correlation of the true and reconstructed density field, the respective power spectra and the 2D correlation functions.
In particular, we find that the features of the power spectra characterized by the cosmic variance of the considered volume are recovered in detail solving for the Kaiser factor and beyond (as coherent flows have also an impact on non-linear RSDs), and that the 2D correlation functions are isotropized after applying our method.
Component-wise scrutiny of the Hamiltonian likelihood force in redshift space (Section 2.3) shows that it contains an unexpected non-radial term that points in the direction of the displacement field and has an amplitude that grows inversely with distance to the observer. This component may play a strong role when the plane parallel approximation is abandoned. This remains to be investigated.
Our adaptive leap-frog time step method was shown to give good MCMC chain performance, yielding uncorrelated samples with a high, stable degree of structural variability, showing that the chain successfully explores the posterior parameter space of initial conditions that match the input (mock) data.
The number of leap-frog steps per iteration of 256 is sufficient for good chain performance and may even be reduced slightly for this set of parameters (box size, resolution, etc.).
The redshift space model constrains the sampler less strongly, which means the chain has to progress through the posterior space slightly slower than a regular space model. Our adaptive scheme automatically takes care of this given a large enough value. When is too small (as in Bos (2016)) subsequent samples will be more strongly correlated and the sampler will cover a smaller part of the parameter space.
This code can find applications in a large variety of cosmological problems, such as baryon acoustic oscillations reconstruction or cosmic web reconstruction from the Lyman alpha forest, galaxy or cluster distributions. The here presented code, which is made publicly available666 Our source code under MIT license can be found at http://ascl.net and https://github.com/egpbos/barcode. , thus has the flexibility to tackle realistic large-scale structure analysis from galaxy cluster survey data.
We are grateful for the opportunity offered by IAU Symposium 308 “the Zeldovich Universe” (June 2014) to present the details of the code and formalism described in this paper777The talk can be found at https://doi.org/10.5281/zenodo.1451723.. PB and RvdW thank the Leibniz-Institut für Astrophysik Potsdam for the great hospitality during the various work visits that defined the basis for this publication. FSK thanks the Karl-Schwarzschild-Fellowship funding for allowing him to regularly invite PB to Potsdam in this period. PB further thanks his eScience Center managers Jisk Attema and Rob van Nieuwpoort for supporting the final editing of this manuscript during the past four years. We thank Jelle Kaastra for the encouragement and motivation given during (mainly the early, defining stages of) this project, and Johan Hidding, Maarten Breddels and Bernard Jones for many interesting and encouraging discussions. PB acknowledges support by the NOVA project 10.1.3.07. We would like to thank the Center for Information Technology of the University of Groningen for their support and for providing access to the Peregrine high performance computing cluster. For the analysis and the figures created for this paper we used a number of scientific software libraries, most importantly: Matplotlib (Hunter, 2007) and Seaborn (Waskom et al., 2018) for visualization, NumPy and SciPy (Jones et al., 2018) for numerical computation and IPython (Pérez & Granger, 2007) and Jupyter (Kluyver et al., 2016) for interactive analysis. barcode itself was written in C++, making use of the FFTW3 (Frigo & Johnson, 2005) and GSL (Galassi et al., 2009) libraries.
Hoffman M. D., Gelman A., 2012, Journal of Machine Learning Research,pp 1–30
Jaynes E., Bretthorst G., 2003, Probability Theory: The Logic of Science. Cambridge University Press
In this section we briefly recap the basics behind barcode that are necessary for this paper. barcode is a code that aims at reconstructing the full Lagrangian primordial density perturbation field corresponding to a configuration of observed objects. It has been designed specifically for clusters of galaxies as principal constraining objects.
We want this reconstruction to reflect the inherently stochastic nature of both observations and the non-linear structure formation process. This stochasticity can be encoded in a probability distribution function, or posterior. The posterior describes the complete physical process leading to an observed density field; starting from the initial density perturbations, going through gravitational collapse and finally including observational effects. barcode uses an MCMC method called Hamiltonian Monte Carlo (Neal, 1993; Taylor et al., 2008; Jasche & Kitaura, 2010; Neal, 2012) to sample this posterior, comparing the results to the actual observed density field and in this way finding the initial density distributions that match the data. This yields not just one reconstruction, but an ensemble of possible reconstructed Lagrangian density fields. The variation in these realizations reflects the full array of stochastic effects as included in the model.
In the HMC method, the PDF is represented by a Hamiltonian (Section A.1), where the PDF’s stochastic variable (the signal we sample) is treated as a position analogue. The HMC algorithm then samples the posterior PDF by repeating the following procedure:
Initialize the signal: define the “starting position” in the parameter space (either a manually supplied initial guess or the position from the previous sample). Our “signal” is defined as
where is the first order growing mode linear growth factor and is the spatial part of the first order term of the LPT primordial density field expansion.
Stochastic step: draw a random “momentum” vector (we use a Gaussian field momentum distribution with the Hamiltonian mass as the covariance matrix).
Dynamical step: solve the Hamiltonian equation of motion using a leap-frog solver (Section A.2) to find the next “position”, i.e. the next sample candidate.
Accept or reject the candidate based on the acceptance criterion (Section A.3).
This process can continue indefinitely. Usually, the user supplies some maximum number of desired samples.
The Hamiltonian is the sum of the potential energy and the kinetic energy ,
The kinetic term in this is the momentum (row-vector) times the inverse of the mass matrix times the momentum (column-vector),
The potential is coupled to the probability distribution function (PDF) of our signal .
We define this PDF in a Bayesian way as the posterior , which is the product of a prior with a likelihood 888
Note that the notation we use is not necessarily the most common in Bayesian literature.
For instance, Jaynes &
Bretthorst (2003, page 89) uses a notation that would translate in this work to something like , where is the “likelihood” of the hypothesis and is a probability function that describes it in terms of the random variable
is a probability function that describes it in terms of the random variableand conditional variable (the hypothesis). . The connection between the potential and the PDF is given as a canonical ensemble distribution. In logarithmic form, this gives us an equation for in terms of the posterior :
so and are the terms we need our (statistical) model to specify so that we can calculate , up to a constant. However, as we will see below, we only need to know or the derivative of the potential. We can therefore forget about the constants.
In this work we use a Gaussian prior and likelihood:
These are two different types of Gaussian exponents. The likelihood one is a simple sum of one dimensional Gaussians; one for each grid cell. The prior exponent is a little more computationally intensive, due to the matrix inversion. However, we can approximate the by treating it as a convolution in real space and thus a simple multiplication in Fourier space, where the (inverse) power spectrum (the FT of ) is known and easy (diagonal) (see Appendix 3.B.3 of Bos (2016) for computational details). The same considerations apply for the kinetic term , though the mass matrix is a different quantity. Often it is taken to be , meaning that indeed a similar procedure can be applied, only now with the non-inverted power spectrum.
After drawing a random momentum vector, we “move” the MCMC chain forward through the posterior’s parameter space by solving the Hamiltonian equations of motion using a leap-frog estimation scheme. Numerically, we’ll be solving the following discretized equations for each time step (based on the previous step at ):
We will need the derivative of to each component of the signal, i.e. every sampled on a regular grid :
This is called the Hamiltonian force and its components are called the prior force and likelihood force respectively. For our Gaussian prior, the prior force is
where is the correlation matrix corresponding to the cosmological power spectrum, given an inverse power spectrum mass (Section 3.2).
In general, we can write the likelihood force as
where the derivative of the divergence of the displacement field to the signal encodes the structure formation model and the term primarily represents the Gaussian model that we use for our likelihood (see Chapter 3 of Bos (2016) for more details). For the Zel’dovich model of structure formation, the likelihood force is given by
The function is subsequently defined as: