1 Introduction
Diffusions are popular within a variety of application areas as models for stochastic dynamical systems. These applications include the physical sciences (for example Picchini et al. [2009]), the life sciences (for example Golightly and Wilkinson [2006, 2008]), and perhaps most extensively within finance (for example Black and Scholes [1973], Merton [1973, 1976], Eraker et al. [2003], BarndorffNielsen and Shephard [2004]). Given a model for a dynamical system of interest it is natural to consider how to conduct inference on the model parameters. The complexity of inference for diffusions often necessitates the use of advanced techniques in computational statistics (such as Markov chain Monte Carlo (MCMC)). These often require the ability to sample diffusion bridges with different parameter values, while ensuring the bridge sampled is coherent with the data observed.
To introduce diffusion bridges, first consider the following stochastic differential equation (SDE):
(1.1) 
where denotes the drift coefficient, the diffusion coefficient and is one dimensional Brownian motion. Throughout, we assume regularity conditions for the existence of a unique, global, weak solution (for instance, it would suffice if and are both globally Lipschitz and grow at most linearly at infinity [Kloeden and Platen, 2011, §II.4.5], or for weaker conditions see AïtSahalia [2002, §2.1]). We will refer to the law of (induced by (1.1)) conditioned such that as a diffusion bridge.
Even if we were to disregard the complications arising from the conditioning of (1.1), we would still face a number of challenges in simulating unconditioned diffusions. For very simple classes of unconditioned diffusions in which the transition function is known, efficient schemes for simulating trajectories over large intervals of time are available—one can simply apply the strong Markov property and iteratively simulate partial trajectories (and so computational cost scales linearly in the time interval to be simulated). For most classes of algorithm one requires access to the transition function, which for many interesting classes of diffusion is intractable. Although it is natural to approximate the transition function, over long intervals there is an accumulation of error. (It is sufficient to be able to simulate from the transition function (by means of an embedded Monte Carlo algorithm, such as a pathspace rejection sampler), which is one strand of work we explore in this paper.) The problem is exacerbated when we consider the additional complications arising from the conditioning; the strong Markov property cannot be applied in the same direct manner, and the linearintime scaling is also not direct.
Approaches found in the literature for carrying out diffusion bridge simulation fall into two broad categories: methods based on simulating from timediscretisations (and thus approximations) of (1.1), and those which rely on constructions directly on diffusion pathspace for (1.1). The latter eliminates any discretisation error, but usually at a cost of introducing technical restrictions on and . In both cases various embedded Monte Carlo algorithms can be used (for instance, importance and rejection sampling or MCMC), with repeated simulation from an appropriately chosen proposal distribution.
A key consideration is ensuring that the proposal bridge and target bridge are well matched, while simultaneously accounting for the computational complexities of the proposal. Within the context of discretised algorithms, Pedersen [1995] was an early breakthrough in which proposals were made using the unconditioned target process, but this approach suffers from degeneracy between the target and proposal distribution as increasingly fine timediscretisations are considered. Improvements on this notion were made by Durham and Gallant [2002] and Golightly and Wilkinson [2008], which although not degenerate are both computationally expensive for the fine discretisations required to reduce their inherent bias. Other successful approaches (such as Ozaki [1992] and Delyon and Hu [2006]) to avoid the degeneracy of Pedersen [1995] start by constructing continuoustime proposals and then discretising the resulting infinitedimensional algorithms; the discretisation still fundamentally contributes a bias. Delyon and Hu [2006] considered proposals in which a ‘pulling’ drift term was added to an unconditioned (1.1) to ensure the endpoint was reached, and this has inspired a number of extensions and variants including guided and residual proposals [Van Der Meulen and Schauer, 2017, Schauer et al., 2017, Whitaker et al., 2017]. Other approaches incorporating information about the endpoint include a Sequential Monte Carlo resampling scheme proposed by Lin et al. [2010], whereas Hairer et al. [2009]
approach the problem by means of simulating solutions to Langevintype stochastic partial differential equations (SPDEs).
More recently, a major innovation in discretisation approaches for onedimensional diffusion bridges was proposed by Bladt and Sørensen [2014], termed by the authors as simple diffusion bridges (SDB). This forms one key ingredient of the work we introduce in this paper, and so is fully discussed in Section 2. However in short, the novelty of Bladt and Sørensen [2014] is that it only requires the forward simulation of unconditioned diffusions, one of which attains the start point of the desired bridge, another the end. These two unconditioned diffusions are then ‘spliced’ together at a well specified crossing time to form a single proposal path (which we call a ‘confluent diffusion bridge’), which is then either accepted or rejected by means of auxiliary unconditioned diffusions. Although conceptually complex, the beauty of this procedure is that as only unconditioned diffusions are ever simulated, the highly desirable linearintime computational cost of such simulation is retained, and so it offers a practical solution to the simulation of diffusion bridges with distant endpoints. Bladt and Sørensen [2014] was further extended in Bladt et al. [2016] to consider multidimensional diffusion bridges.
Simulating diffusion bridges by means of any of the above timediscretisation approaches introduces bias, in addition to the inherent Monte Carlo error. Although the bias can be mitigated somewhat by increasing the fineness of the discretisation, this introduces considerable computational burden (particularly in our setting where we are interested in simulating diffusion bridges with distant endpoints). This is especially important for many practical problems, where simulating a diffusion bridge is only a single embedded step within an MCMC scheme to conduct parameter inference for partiallyobserved diffusions. (In particular, in the Bayesian setting Roberts and Stramer [2001] circumvent the intractability of the likelihood by applying a dataaugmentation strategy—i.e., simulating diffusion bridges). Eliminating bias is of particular importance in many modern MCMC schemes, such as pseudomarginal MCMC [Andrieu and Roberts, 2009].
Alternatively, a number of methods free of discretisation have been developed (socalled pathspace rejection samplers (PSRS) [Beskos and Roberts, 2005, Beskos et al., 2006a, b, 2008a, Chen and Huang, 2013, Pollock et al., 2016]), which allow sample paths of both unconditioned and conditioned diffusions to be simulated at finite collections of time points without any approximation error. Following the initial work of Beskos and Roberts [2005]
, there has been considerable interest in developing unbiased estimators for classes of stochastic differential equations (including
[Rhee and Glynn, 2015, Fearnhead et al., 2017, Chenxu and Linjia, 2019], perhaps most notably the related work of Blanchet et al. [2017] which is instead based on the the theory of rough paths. PSRS algorithms are based on rejection sampling: sample paths are drawn from a (target) measure by means of drawing sample paths from an equivalent proposal measure (in which access to the conditioned transition density is available), and are accepted or rejected with the correct probability by means of an unbiased estimator of a quantity proportional to the RadonNikodým derivative of the two measures. These variant approaches have not focussed on simulating diffusion bridges. Particular application of PSRS to Bayesian inference for partiallyobserved diffusions has been considered [Beskos et al., 2006a, 2008b, Sermaidis et al., 2013]; however, they have some limitations. A key limitation is that their computational cost increases exponentially with bridge duration, and so are unsuitable in many applications. In addition, there are a number of practical conditions on the coefficients of the underlying SDE which are difficult to satisfy (particularly for multidimensional diffusion bridges).In this paper we provide for the first time an exact method for simulating diffusion bridges with computational cost that is linear in the duration of the bridge. To do this we develop a novel way of incorporating pathspace rejection sampling (PSRS) into the Simple Diffusion Bridge (SDB) framework of Bladt and Sørensen [2014]. Our resulting confluent diffusion bridge (CDB) approach retains the key advantages of both constituents: Just like the PSRS, it is a discretisationfree (‘exact’) methodology which produces unbiased samples from the desired target path measure; like SDB, it retains the attractive feature that the computational cost scales linearly with the distance between the bridge’s endpoints. To achieve this, a number of new simulation strategies of independent interest have been developed and tailored to our approach. These include sampling from the first passage time density of a Brownian bridge, determining whether (and when) two Brownian bridges cross, and determining whether a Brownian bridge crosses a Brownian meander or a Bessel bridge. We find practical application of our work motivated by Monte Carlo Fusion [Dai et al., 2019], by considering the simulation of Langevin diffusion bridges in which the tdistribution is the invariant measure of the unconditioned diffusion. Although we limit our consideration to the simulation of onedimensional diffusion bridges (with application to Langevin bridges), we give some directions in Section 6 as to how our approach may be extended.
The paper is organised as follows: Sections 2 and 3 briefly summarise the important aspects of the existing SDB and PSRS approaches that are used within our confluent diffusion bridge (CDB) methodology, which is presented in Section 4. In Section 5 we benchmark the performance of our method against PSRS and SDB by considering the simulation of Langevin diffusion bridges. Finally, in Section 6 we conclude by discussing the limitations of our approach and how they may be mitigated, and by providing specific direction and application of our work for future research. For clarity of exposition all assumptions are stated in Appendix A, and all proofs are collected in Appendix B.
1.1 Notation
For a vector
, we write to denote its coordinate and . denotes a transpose. On the other hand, we write ,to denote multiple random variables or (random) functions. By convention:
and . For a function its graph is denoted by . We also write , with (or if necessary) to denote a timereversed function: , defined on . For brevity we write , whenever is clear from the context. Finally, we use the following convention to distinguish between the conditional and unconditional laws induced by SDEs. We write to denote the unconditioned law induced by diffusion (on ) started from . On the other hand, we use to denote the law induced by .2 Simple diffusion bridges
In this section, let denote the law induced by (1.1). The output of the SDB sampler, as introduced by Bladt and Sørensen [2014], is an approximate draw from the law . The algorithm comprises two components: simulation of a proposal bridge via rejection sampling; and a MetropolisHastings step to correct for the discrepancy between the law of the proposal and the law of the target. Bladt and Sørensen [2014, Theorem 2.1] describe how to choose the proposal. Consider three independent diffusions , , solutions to (1.1) with set to respectively , and , and conditioned on , , and , respectively. We refer to these three as the forward diffusion, backward diffusion, and auxiliary diffusion respectively. Define further , and let us refer to it as the timereversed diffusion. Let and define:
(2.1) 
Suppose that 1 holds (see Appendix A). Then we have the following equivalence in distribution:
We denote the law of by . The above result says that is equal to the law of the diffusion bridge, conditioned on the bridge being hit by an independent auxiliary diffusion started from at time . The result is used both for choosing proposals in Section 2.1 and for the Metropolis correction step in Section 2.2. Each of the steps—as they were presented in Bladt and Sørensen [2014]—employed various approximations, such as discretisation schemes to sample , , or approximate techniques for finding the first hitting times and (with defined in Algorithm 2). Consequently, the final draws of SDB are biased.
2.1 Sampling proposals
The proposal bridges are simply chosen to be the paths defined by (2.1), conditioned on . This can be achieved by simple rejection. To sample paths and we can employ any discretisation scheme based on the Stochastic Taylor Expansion [Kloeden and Platen, 2011]. The pairs are generated until the first occurrence of , upon which is returned. Algorithm 1 below summarises this sampling procedure.
2.2 Markov Chain Monte Carlo (MCMC)
We follow the notation of Bladt and Sørensen [2014]: let denote the canonical space of continuous functions defined on the time interval , and define to be the set of functions that intersect :
Moreover, let represent the set of pairs of intersecting functions:
Define also:
(2.2) 
where , are independent and denote the target diffusion bridge and the auxiliary diffusion respectively. Supposing and have common dominating measure, then by Bladt and Sørensen [2014, Theorem 2.1]:
(2.3) 
Equation (2.3) gives an explicit connection between the law of the proposal bridges and the law of the target . This immediately suggests a Markov Chain Monte Carlo (MCMC) algorithm with independent proposals. More precisely, denoting by the sample path kept at the iteration of the Markov Chain, we sample a new path and set with probability:
(2.4) 
where we suppress from the superscripts of the laws for the sake of clarity. Otherwise is kept: . Although cannot be computed in closed form, it is possible to obtain a positive, unbiased estimate. This is sufficient for implementing the algorithm: by substituting the exact value of the likelihood for its positive, unbiased estimator, the invariant distribution of the Markov Chain is unaltered and the unbiasedness of the entire algorithm is preserved [Andrieu and Roberts, 2009]; this is an example of pseudomarginal MCMC. For a proposal sample one such estimator can be defined as
with , , iid copies of the auxiliary diffusion . Once again, simulation of by Bladt and Sørensen [2014] is achieved via discretisation. The resulting MCMC algorithm is summarised in Algorithm 2.
Remark 2.1.
Bladt and Sørensen [2014] use an average of independent samples of as an unbiased estimator for . Our choice of is motivated by the fact that sampling is the most expensive part of our algorithm (see Section 4.3.3) and that for the regime of interest with very high probability [Bladt and Sørensen, 2014, §2.2].
3 PathSpace Rejection Sampler (PSRS)
Our goal is to describe a method of drawing , , and , which does not introduce any approximation error to Algorithms 1 and 2. To this end, we give a brief summary of PSRS, on which our proposed algorithm will rely.
The Path Space Rejection Sampler (PSRS)—also known as the Exact Algorithm [Beskos and Roberts, 2005]—is a rejection sampling algorithm targeting the law induced by the SDE (1.1) conditioned on or on . The output of the algorithm is a skeleton of the path : that is, a set (where we use the convention , ) revealed at a random time grid . Upon acceptance the path can be retrospectively revealed at any additional timepoints, and thus on a conceptual level the output of PSRS can be understood as an entire diffusion path . We remark that when (which in view of the existence of the Lamperti transformation (A.1) can be assumed without loss of generality) and the proposals are drawn from the Wiener measure, then the accepted path between any two revealed skeletal timepoints and is distributed as a Brownian bridge joining and . Thus retrospective revealing of amounts to drawing samples from the Brownian bridge measure.
PSRS can be employed only when assumptions 1–3 are satisfied (see Appendix A). When they are satisfied, PSRS should be preferred to other sampling schemes, as it outputs exact draws from the target law at a computational cost on par with the EulerMaruyama scheme. In this paper we additionally assume 4, which is perhaps the most restrictive assumption but which grants some technical shortcuts needed by the CDB algorithm.
Remark 3.1.
A naïve application of PSRS for has acceptance probability decaying exponentially in . However, using the Markov property we can apply PSRS sequentially to shorter subintervals (with length , and chosen independently of ), making up the entire interval . Each call to PSRS on an interval shorter than has cost (as ), yielding a linear, , rather than exponential, cost for the entire algorithm [Beskos et al., 2006a]. The same argument does not apply to however, so PSRS for the bridge measure scales exponentially with the timedistance between the bridge’s endpoints. Below, we show how this can also be improved to a linear cost.
4 Confluent diffusions bridge
CDB aims to incorporate PSRS within SDB to mitigate the shortcomings of the latter, which, inter alia, opens a new possibility for unbiased inference for sparsely observed diffusions. In view of the existence of Lamperti transformation (A.1), we henceforth assume without loss of generality that the diffusion coefficient in (1.1) is identically equal to .
4.1 Unbiased Sampling of Proposals
We sample from as follows. We start by sampling , as defined in Section 2 using a standard application of PSRS with biased Brownian motion proposals. The paths are revealed only at some random timepoints: , (see the top plot in figure 1). Next we must find the first crossing time of the two diffusions and . To this end it is easier to first crosspopulate the paths by sampling them at a common timegrid: , and then to sequentially consider intervals , , on which both paths are revealed only at their endpoints. The crosspopulation is achieved by sampling the forward path at additional timepoints , and the backwards path at (middle plot in figure 1) by sampling each retrospectively as in the standard formulation of PSRS.
Finally, for each subinterval it is verified whether the two paths and cross (the bottom plot in figure 1 illustrates this for one of the subintervals). To this end, notice that conditionally on all points that have been sampled up until this step, between and is distributed as a Brownian bridge joining at time and at time . Similarly, between and (conditioned on all available information) is distributed as another, independent Brownian bridge joining at time and at time . Our next goal is to characterise the law of the difference . The following result relies on elementary calculations so a proof is omitted.
Lemma 4.1.
Let , be two independent Brownian bridges, joining respectively at time with at time , . Define their sum and difference processes as and respectively. Then, the two dimensional process is distributed as a two dimensional Brownian bridge with a scaled covariance matrix:
joining at time with at time .
It is immediate from Lemma 4.1 that the process defined between the times and as is simply a Brownian bridge joining at time and at time
, with a variance modified to
. Because , it is enough to find the first passage time of to level , which we denote by . In the following section we show how to sample exactly.4.1.1 Simulation of the First Passage Time of a Brownian bridge to Zero
Let be some given constants, a Brownian motion scaled by a factor —i.e. a solution to an SDE , and let , denote a scaled Brownian bridge. Notice that the distribution of on coincides with defined here with and relabelling , . Consequently, we drop from the notation. Let as before be the first passage time of to level . Then can be simulated exactly as follows:

If , then a.s. and we can proceed to (iii).

Otherwise and are nonzero and are of the same sign, so there is a positive probability that will not cross zero before time . The event happens with probability [Karatzas and Shreve, 2012, Metwally and Atiya, 2002]:
(4.1) Toss a coin (see remark 4.1)—upon success return and terminate the subroutine, otherwise proceed to (iii).
Lemma 4.2.
The density in (4.2) coincides with the density of a random variable , where and IGau
denotes the InverseGaussian distribution.
Remark 4.1.
Many subroutines of the CDB algorithm require samples from Bernoulli(p) (where is problemdependent). It is important to note that evaluation of itself may not be essential, so long as there exist an algorithm which outputs with probability . We refer to such algorithms as constructions of coins. For instance, in Section 4.3.2 we use the following, generic construction of a coin. Suppose that it is possible to approximate to an arbitrary precision and simultaneously provide bounds on the maximal error. More precisely, denoting by the approximation to at the call to an approximating algorithm and by the respective maximal error () satisfying property and , , we have [Devroye, 1986, §IV.5.1]:
The output of this algorithm is an exact draw from Bernoulli(p).
4.1.2 Exact Sampling of Proposals—summary
is sampled on each subinterval —starting from the leftmost and proceeding sequentially to the right until the first occurrence of . Any crossings in the subintervals further to the right are irrelevant and thus sampling of on them should be omitted. We then set , where is the index of the first interval on which .
Once has been sampled, by Lemma 4.1 we can draw from the scaled Brownian bridge measure, and thus compute the values of , and .
If and do not cross on any subintervals, the pair is rejected and the entire procedure, starting from sampling using PSRS is repeated. We summarise the protocol in Algorithm 4 below.
4.2 Unbiased estimator for the probability of hitting an auxiliary diffusion
To simulate without bias we construct a coin whose probability of coming up heads is exactly equal to the probability of an auxiliary diffusion intersecting proposal path —in fact each coin toss is an independent experiment in which an iid path of is sampled, followed by an unbiased check for whether it and intersect. If we denote by , , the iid tosses of such coin, then can be defined as .
4.2.1 First stage of sampling , —drawing paths
We start by simulating using PSRS (top plot in figure 2), revealing the path at a random collection of time points (we sample on the interval , however we ignore the part from the first half and suppress any simulated points on that interval from the notation).
Next, we crosspopulate the paths , , , so that they lie on a common timegrid: (middle plot in figure 2). For this amounts to drawing samples from the Brownian bridge measures at times . The other two processes are drawn similarly anywhere to the right of , however their simulations to the left of differ because we already know the value of .
Consider first the interval , with . Without loss of generality suppose that and . Then on are a pair of Brownian bridges, conditioned on . Sampling of at any fixed can be conducted via rejection. Pairs are drawn from the two independent Brownian bridge measures: if occurs, the pair are immediately rejected and resampled, otherwise two additional coins are sampled with the probabilities of success given respectively by:
with defined as a solution to: . The exact expressions for these coins follow from (4.1). The proposal pair are accepted upon the first successful throw of both coins above.
Sampling the pair for a in the interval requires another approach. Without loss of generality suppose that . Then , is distributed as a (scaled) dimensional Bessel bridge, which can be simulated thanks to the identity Pitman and Yor [1982, 5.d]):
with , denoting three independent standard Brownian bridges joining at time and at time , and:
By Lemma 4.1, is a scaled Brownian bridge, independent of , and thus simulation of at any point can be accomplished by drawing and deterministically setting: and
. We remark that the case and requires only mild alteration to the calculations above.
4.2.2 Second stage of sampling , —checking for the presence of intersections
The final step consists of checking whether and intersect on any interval . If they do, is set to and otherwise it is set to . Since only the endpoints of and have been revealed on , any such check is a toss of some coin, with the probability that no crossing occurs inside . If , the crossing must happen and can be trivially set to . In the nontrivial case we distinguish three regimes for drawing the coins.
The most straightforward verifications are done on the intervals situated to the right of , i.e. . For these: , and are pairs of independent Brownian bridges. Consequently, the probability of and not intersecting is given by the expression based on (4.1):
Comments
There are no comments yet.