Bayesian Inference for Fluid Dynamics: A Case Study for the Stochastic Rotating Shallow Water Model

In this work, we use a tempering-based adaptive particle filter to infer from a partially observed stochastic rotating shallow water (SRSW) model which has been derived using the Stochastic Advection by Lie Transport (SALT) approach. The methodology we present here validates the applicability of tempering and sample regeneration via a Metropolis-Hastings algorithm to high-dimensional models used in stochastic fluid dynamics. The methodology is first tested on the Lorenz '63 model with both full and partial observations. Then we discuss the efficiency of the particle filter the SALT-SRSW model.

Authors

• 10 publications
• 14 publications
• 2 publications
• 4 publications
10/01/2019

Data assimilation for a quasi-geostrophic model with circulation-preserving stochastic transport noise

This paper contains the latest installment of the authors' project on de...
02/22/2022

A pathwise parameterisation for stochastic transport

In this work we set the stage for a new probabilistic pathwise approach ...
10/02/2018

Validation of a PETSc based software implementing a 4DVAR Data Assimilation algorithm: a case study related with an Oceanic Model based on Shallow Water equation

In this work are presented and discussed some results related to the val...
11/24/2017

Hierarchical Bayesian modeling of fluid-induced seismicity

In this study, we present a Bayesian hierarchical framework to model flu...
07/27/2019

A Particle Filter for Stochastic Advection by Lie Transport (SALT): A case study for the damped and forced incompressible 2D Euler equation

In this work, we apply a particle filter with three additional procedure...
11/09/2020

Shape optimization of a microalgal raceway to enhance productivity

We consider a coupled physical-biological model describing growth of mic...
12/11/2020

Stochastic dynamics of storm surge with stable noise

The Advanced Circulation (ADCIRC) and Simulating Nearshore Waves (SWAN) ...
This week in AI

Get the week's most popular data science and artificial intelligence research sent straight to your inbox every Saturday.

1 Introduction

Let and

be two processes defined on the probability space

. is usually called the signal process or the truth and is the observation process. In our case is given by a pathwise solution of the stochastic rotating shallow water system computed on a staggered grid. We have proven in [8] that such a solution exists.111Note that in [8] we work with infinite dimensional function spaces, while here the state space will be .

The nonlinear filtering problem consists in finding the best approximation of the posterior distribution of the signal

given the observations 222For a mathematical introduction on the subject see [1]. For an introduction from the data assimilation perspective see [23] and [17].. The posterior distribution is usually denoted by . We denote the dimension of the state space by and the dimension of the observation space by .

A particle filter is a sequential Monte Carlo method in which the posterior is approximated using a set of particles, that is random measures of the form

 πt≈∑ℓwℓtδ(xℓt)

where is the Dirac delta function, are the weights of the particles and are their corresponding positions [1]

. One can make inferences about the signal process using Bayes’ theorem, the time-evolution induced by the model, and observations

[17], [1]. Observations are modelled as noisy measurements of the truth, using the observation operator:

 H:RdX→RdZ (1)
 Zt=H(Xt)+Vt (2)

where

are independent identically distributed random variables with standard normal distribution and

is a Borel-measurable function. Observations are incorporated into the system at assimilation times. The ensemble of particles is evolved between assimilation times according to the law of the signal. At each assimilation time the observation is incorporated into the system through the likelihood function:

 gztt:RdX→[0,1], gztt(x)=gt(zt−H(xt))=P(Zt∈dzt|Xt=xt) (3)

that is

 ∫Ag(zt−H(xt))dzt=P(Zt∈A|Xt=xt) (4)

where . The following recursion formula holds (see [1])

 πt=gt⋆πt−1Kt (5)

where

 Kt:RdX×B(RdX)→[0,1], Kt(Xt−1,B)=P(Xt∈B|Xt−1) (6)

for any measurable set and by ’’ we denoted the projective product as defined in the Appendix. Schematically, this can be arranged as

 πa,z0:t−1t−1Kt−−−−−−→modelforecastpredictionπa,z0:t−1t−1Kt=:πbt=:pttempering, gztt⋆−−−−−−−−−→assimilationanalysisupdategztt⋆πbt=πa,z0:tt. (7)

The indices and stand for analysis and background respectively. Here is the prior distribution of the signal, that is a probability measure which belongs to the space with , and corresponds to the path space generated by . By

we mean the random vector

with the corresponding observed values . Taking into account the definition of the projective product, the recursion formula (5) can also be written as

 πt(B)=∫Bgztt(xt)pt(dxt)∫RdXgztt(xt)pt(dxt)=β−1t∫Bgztt(xt)pt(dxt) (8)

where and is a normalising constant. We are looking for an approximation of the form

 πt≈πNt=N∑ℓ=1wℓtδ(xℓt). (9)

The typical particle filter runs as follows. Each particle is propagated forward in time using the model (SRSW or Lorenz ’63 in our case). We model the evolution of the signal discretely in time through a map . In the following, is a discrete realisation of the Lorenz ’63 or the SRSW model. We have

 Mt:RdX→RdX (10)

and if , is the position of the particle at time respectively then

 xℓt2=Mt2(xℓt1). (11)

The particle trajectory is independent of the trajectory of the signal. At each assimilation time, every particle is weighted depending on the likelihood of its position, given the observation. The weight measures how close the particle trajectory is to the signal trajectory.

 πat2=N∑ℓ=1wℓt2δ(Mt2(xℓt1))=N∑ℓ=1wℓt2δ(xℓt2) (12)

Particles which are close to the truth and therefore have higher weights will be multiplied, while those which are far away will be eliminated. There are several studies ([22],[24]) which have shown that in high-dimensional spaces the tendency is to have one particle gaining a weight close to one while all the others are discarded, having weight close to zero. Indeed, in [1] it has been proven that a particle filter will provide a good approximation of the posterior distribution only when enough particles are used, which explains the difficulty of the problem at hand. The standard rapid divergence of the particles’ trajectory from the signal trajectory is known in the literature as

[22], [23]. Nonetheless, considerable progress has been made in this direction over the last years. A state-of-the-art analysis of the most recent efforts on tackling the filter degeneracy problem can be found in [24] and [22]. Innovative remedies arise from different directions: optimal transportation [17], tempering [2],[13],[4],[5], localisation [3],[16], model reduction [4], data assimilation as a boundary value problem [18], jittering [2], nudging [20], and proposal densities [25]. Some of them have been tested in operational numerical weather prediction systems, e.g. [16].

Our work is part of these attempts of developing a particle filter methodology for high dimensional models originating in (stochastic) fluid dynamics. We have implemented a particle filter with adaptive tempering and jittering for the stochastic rotating shallow water model. The suitability of these two methods used in tandem for high-dimensional problems has been proven in [2] and tested in [4], [5], [20]. In [5] the method is used for the stochastic incompressible 2D Euler model with damping and forcing, while in [20] it is tested for the 2D quasi-geostrophic model. We have no knowledge of any previous application of this methodology for the stochastic rotating shallow water system. The complexity of this model makes the problem harder. The aim of this paper is to show the applicability of the particle filter described below and the stochastic rotating shallow water model running alongside. We start with first coupling the particle filter with the classical Lorenz ’63 model and then we move to the targeted SRSW model. The results are positive in both cases.

2 The Algorithm

In tempering, an artificial dynamics is introduced via a sequence of artificial target distributions between prior and posterior. Each intermediate distribution has a characteristic temperature chosen such that a reasonable number of good particles survive. We first follow a resampling procedure 333This is a static procedure. in which the particles with low weights are replaced with particles with higher weights such that at the end an ensemble of equal-weighted particles is obtained. However, the particles can be concentrated in the wrong place. In order to quantify the spread of the weights with respect to the posterior, we use the effective sample size statistic:

 ess(w)=1N∑ℓ=1(wℓ)2 (13)

The ess

will be chosen to be above a certain threshold in order for the ensemble of particles to be oriented in the right direction. In tempering one increases gradually the variance of the distribution using a sequence of

temperatures to ensure that the remains above the chosen threshold. Once the temperature is (dynamically) chosen, a resampling procedure is applied. The output is a sequence of tempered posterior distributions with corresponding normalised tempered weights. In this case the intermediate tempered posterior distribution at the tempering step is given by

 πrt(B)=∫B(gztt(xt))ϕrpt(dxt)∫RdX(gztt(xt))ϕrpt(dxt) (14)

for any . We denote by the ensemble of particles. Then

 πr,Nti=N∑ℓ=1wr,ℓti(ϕr,x)δ(xℓti) (15)

where

 wr,ℓti(ϕr,x)=(gztiti(xℓti))ϕr−ϕr−1,   with   ∑lwℓti=1. (16)

The corresponding is given by

 essi(ϕr,x):=∥wti(ϕr,x)∥−1ℓ2. (17)

The following is the tempering and jittering algorithm (see e.g. [13],[4]):

• At initial time : sample particles from the prior distribution.

• On the time interval we have an ensemble x of particles with positions and we want to assimilate observational data in order to obtain a new ensemble that defines :

• Evolve .

• Set temperature .

• While do

• Find such that . Resample according to and apply MCMC with jittering if required (i.e. if there are duplicates) a new ensemble .

• Set and .

• If then Stop and go to the filtering step with .

The jittering procedure is standard (see e.g. [4]) but we will briefly explain the idea. Without jittering we have

 xℓt2=M(xℓt1) (18)

which can also be written as

 xℓt2=M(xℓt1,W(t1:t2)) (19)

where is the driving Brownian motion of the model and with we denoted the Brownian path between and . In this case the prior distribution is given by

 pt2=1NN∑ℓ=1δ(xℓt2). (20)

The problem is that, when the number of independent observations is large, after resampling, the particles end up in the same place and we can have a large number of duplicates. In order to overcome this issue we modify the last part of the particle trajectory using a jittering parameter and a Gaussian random variable which is orthogonal to . Then the new dynamics is given by

 ~xℓt2=Mt2(xℓt1,ρW(t1:t2)+√1−ρ2Z(t1:t2)) (21)

where

 xℓti=Mti(xℓti−1,W(ti−1:ti)). (22)

3 Applications for the Lorenz 1963 Model

3.1 Model Description

The Lorenz ’63 model is a classical nonlinear three-dimensional model that is a precursor of turbulence theory which has been introduced in [14]. It reads

 dxdt=α(y−x) (23a) dydt=(β−z)x−y (23b) dzdt=xy−γz (23c)

where are real positive parameters. The model is well-known for the broad spectrum of patterns displayed for different values of and its well-known butterfly attractor. The original values chosen by Lorenz in [14] were and . For a discussion on the behaviour of the solutions for different parameter values see e.g. [21]. This model is implemented using a Runge-Kutta scheme of order 4, with initial conditions . The details of the Runge-Kutta scheme can be found in Appendix. The Lorenz ’63 model has noise in it: after using the Runge-Kutta scheme to implement the three variables from system (23), we generate a random field and perturb the system in a manner which is similar to the one explained in (21). The parameter is equal to here.

3.2 Data Assimilation Results

We perform the data assimilation analysis using an ensemble of 50 particles which evolve for 500 time steps and each time step has size 0.01. In the standard scenario (Figures 3(a)-4(b)

) all three variables of the system are observed every 20 time steps. The initial uncertainty is equal to 1, while the observational uncertainty and the model error are equal to 0.1. We present below a couple of scenarios obtained for different values of these parameters. The output is displayed for the first and the third variable. The ensemble of particles is plotted one standard deviation region around the ensemble mean. We first show in Figures

0(a)-0(b) how the model evolves without any assimilation of data .

As it can be seen in Figure 0(a) and 0(b), in the absence of the data assimilation step, the particles spread gradually around the entire attractor. This behaviour is exhibited when plotting both the first variable and the third variable . In other words, the uncertainty increases (at times quite dramatically). The root mean square error (RMSE) and the ensemble spread (ES) is plotted in Figure 3. In this case the RMSE and the ES oscillate around a value which is comparable with the size of the attractor (the particles fill out the attractor).

Compared to the previous scenario, we show in Figures 3(a)-4(b) that by assimilating observations every 20 time steps, we manage to substantially reduce the uncertainty. Through a repeated application of the forecast/data assimilation step, the cloud of particles successfully tracks the truth, evolving around the attractor set.

In Figure 4(a) we exhibit the first three instances to emphasize the reduction of uncertainty resulting from the application of the particle filter with tempering and jittering, as described in Section 1.

We plot in Figure 4(b)

the RMSE and the ES. We can see that the RMSE and the ES are comparable. This is a feature highly appreciated by data assimilation practitioners because it shows that our estimate of the uncertainty as measured by the width of the ensemble is a good estimate of the actual error in the ensemble mean. Based on this measure of success we can conclude that the particle filter performs satisfactory in the case where all the three variables of the system are observed.

In all previous settings the observation operator was linear. We now perform a couple of tests (Figures 5(a)-8(b)) with nonlinear observation operators. We first make the observations fully nonlinear (Figures 5(a)-6(b)). More precisely, the observation operator becomes

 H(x,y,z)=(x2,y2,z2)

with observations given by

 Z=H(Xt)+Vt

as explained in Section 1. As we can see from Figure 5(a) the information available is not enough to keep the posterior concentrated around the truth at all times, as the system misses information in which wing of the butterfly it is. This is the case particularly with the direction. All particles follow the attractor, while some jump from one wing of the attractor to the other. The behaviour of the posterior in the direction is much better, as the variable is always positive, and hence only one solution is present at all times for this variable. This can be observed in Figure 5(b). The values of the RMSE and the ES are plotted in Figure 6(b). We can see that both can become very large, essentially of the order of the support of the diffusion (the attractor), as the particles can spread around the entire attractor.

Note that these results do not point to a failure of the particle filter. The exact posterior will also show this bimodal behavior, and the particles keep following the attractor as they should. Methods based on linearizations, such as Ensemble Kalman Filters, would show a cloud of particles between the two wings, which is the wrong solution.

To study this further we make the observation nonlinear just for the first variable (Figures 7(a)-8(b)). That is,

 H(x,y,z)=(x2,y,z).

In contrast to the previous case, the particle filter performs much better. The (partial) linearity of the observation operator imposes a massive uncertainty reduction. The cloud of particles remains concentrated around the signal trajectory and the data assimilation steps keep the uncertainty in check. Both the and the variable (Figures 7(a)-7(b)) are successfully tracked, as highlighted in Figure 8(a). The RMSE and the ES plotted in Figure 8(b) confirm the findings: most of the time, the RMSE remains in the interval with rare excursions away from this interval. The ensemble spread remains also small, with some oscillations around the assimilation time. The underlying reason for this behaviour is that in this case the system knows in which wing it is via the linear observation.

4 Application to the Stochastic Rotating Shallow Water Model

4.1 Model Description

The rotating shallow water model (RSW) is a classical nonlinear fluid dynamics model which contains key aspects of the oceanic and atmospheric dynamics. A detailed analytical description of this model has been provided in [8]. From a numerical perspective, challenges are generated especially by the nonlinear advective terms. Nonlinear advection is a dispersive process. While in a linear case the wave-like solutions have constant amplitude and propagate at constant speed, in the nonlinear setting waves of different wavenumbers can propagate at varying speeds. The short waves can be amplified by the nonlinear structure, with direct impact on the accuracy of the solution ([10]). These intricacies can be overcome by making use of the natural dominant balances which appear in atmosphere and oceans (geostrophic and hydrostatic balance) and by implementing the models using suitable numerical schemes.

The stochastic rotating shallow water model used in our work is given by the following set of equations:

 dvt+[ut⋅∇vt+f^z×ut+∇pt]dt+∞∑i=1[(Li+Ai)vt]∘dWit=0 (24a) dht+∇⋅(htut)dt+∞∑i=1[∇⋅(ξiht)]∘dWit=0 (24b)

where , ,, is the horizontal fluid velocity vector, is the pressure term, is the total depth, is the bottom topography, , is the Coriolis parameter, is a unit vector pointing away from the centre of the Earth, is the Rossby number, is the Froude number, is the vector potential of the divergence-free rotation rate about the vertical direction, with

We start with a velocity vector field which is in geostrophic balance at the initial time . Nonetheless, we eliminate the geostrophic balance condition on the velocity field from onwards. On the other hand, the random fields are assumed to stay in geostrophic balance at each time step and we give the explicit formula for this in the next subsection, equations (34). Intuitively, the geostrophic balance can be explained as follows ([26] pp. 58): initially there exists a pressure gradient within the fluid, which generates a fluid dynamics from high-pressure regions towards low-pressure regions; while the fluid flow evolves in time, it is deflected by the Coriolis force; in the Northern hemisphere the Coriolis force deflects the fluid to the right, while the pressure gradient does not change direction much, so that the two forces can come into equilibrium. The resulting direction of the fluid motion is perpendicular on both the Coriolis force and the pressure force. In the atmosphere and oceans this balance tends to be stable, especially away from the boundaries. Small disturbances to it lead to the appearance of gravity waves. The geostrophic balance is dominant as the wind and the currents are usually weak in comparison to the speed of the Earth rotation ([26], pp.59). The Rossby number in this case is small (), meaning that the rotation dominates the advective part and it is balanced by the pressure gradient force ([9], pp. 86). If the rotation is fast, then the horizontal flow is in ’near’ geostrophic balance, that is ’nearly’ divergence-free ([27], pp.95). We also assume geostrophic balance for the stochastic forcing described in Section 4.1.1, to strongly reduce the generation of artificial gravity waves.

We initialise the model by first calculating the Coriolis parameter and the pressure term for each grid point. Then we generate using the geostrophic balance condition. The periodicity of the domain inspires a rigorous numerical setup for generating the stochastic forcing. This is described in Section 4.1.1. The system (24) is implemented on a staggered grid, using a Runge-Kutta scheme of order 4. The domain corresponds to a strip situated between 30 and 60 degrees north latitude. The numerical scheme is described in Appendix. A staggered grid (also known as Arakawa C-grid) is usually preferred when implementing weather prediction models due to the low dispersion errors and the lack of computational modes associated with it ([9], pp. 313).

4.1.1 Stochastic forcing

For a state vector of dimension the nonlinear stochastic forcing is given by

 ∞∑i=1Bi(ξi)XtdWit∼√QdW(X)∼(∞∑i=1Bi(ξi)dWi)X=:RiX (25)

where is a general operator which depends on the vector fields . In our case can be given by or by . With we denoted the equivalence in notation of the terms above, it has no mathematical meaning other than expressing different ways of rewriting the same stochastic term. Then we interpret as a random operator applied to the state vector and we denote it by . Here is a -dimensional vector of independent Brownian motions and is a space-covariance operator. We discretize the SPDE in time and space, and therefore becomes a covariance matrix of dimension .

Given the fact that generating is computationally expensive, we first do this in the spectral space, and then return to the physical space. The covariance matrix is symmetric and circulant. We determine it in the spectral space for a system which is periodic in both the and

directions, using the fast Fourier transform. Since our original domain is not periodic, we then truncate the resulting field to the physical domain, in order to avoid a periodic random field. In particular, we compute the Fourier transform of a column corresponding to the circulant matrix. This column has a Gaussian correlation structure. At every time step, we generate a Gaussian random field (again in the spectral space) and then perform the multiplication between the column of the covariance matrix and this newly generated random field. Finally, we use the inverse fast Fourier transform to return to the physical domain.

It is known (see [19]) that any continuous random field can be expressed as a Fourier-Stieltjes integral over a complex-valued Fourier increment :

 Ri(x)=Ri(x,y)=∫∞−∞eiμ⋅xdYRi(μ)i=1,2,… (26)

where the random component must satisfy the following properties:

 E[YRi(μ)]=0 (27) E[YRi(μ)Y∗Rj(μ′)]=0forμ≠μ′ E[YRi(μ)Y∗Rj(μ′)]=Qij(μ)dμ

where is the complex conjugate of , is the covariance between the two processes and , is the dimension of the spatial vector , and corresponds to the wave number. Therefore, the key ingredient in generating the random fields is the accurate retrieval of the processes . In practice this should be effectuated in discrete time, so that one can implement the integral numerically. Inspired by the periodic structure of the domain, one way of efficiently performing this is by using a complex-valued spectral decomposition ([11]). We know from classical Fourier analysis that the vector fields can also be expressed as

 Ri(x)=∫∞−∞eiμ⋅x^Ri(μ)dμ (28)

where is the Fourier transform of . Therefore we can write

 Ri(xk1,yk2)=∑j,k^Ri(μj,μk)ei(μjxk1+μkyk2)Δμ (29)

with , , , , and

 ^Ri(μj,μk)=C√Δμe−μ2j+μ2kσ2+2πiφjk (30)

where is the dimension of the grid and is a random number which introduces a random phase shift for any given wave number. Therefore

 Ri(xk1,yk2) =C√Δμ∑j,ke−μ2j+μ2kσ2+2πiφjkei(μjxk1+μkyk2) (31) =|Δμ|1/2∑j,keφjk(μ)Ceiμ⋅x ≈ΔYRi(μ)≈dYRi(μ)

where is a random process which corresponds to the (discretized) wave number domain with volume element given by for any fixed wave number . Therefore

 Ri(x) =∫∞−∞eiμ⋅xdYRi(μ) (32) ≈∑j,keφjk(μ)Ceiμ⋅x|Δμ|1/2

that is, a summation over all the discrete wave numbers of the grid. By the central limit theorem,

are normally distributed. In order for the expression in 32 to be well-defined, and must be orthogonal in the limit as the discretization becomes finer ([19]). An efficient way of computing 32 is by using a fast Fourier transform algorithm. One has

 E[Ri(x)R∗j(x′)] =∫∞−∞∫∞−∞eiμ⋅x−iμ′⋅x′E[dYRi(μ)dY∗Rj(μ′)] (33) =∫∞−∞eiμ(x−x′)Qij(μ)(dμ)

that is the covariance of and can be written as the inverse Fourier transform of . More details on random field generators can be found in [19] and [11]. As mentioned in the previous section, we assume that the dominant balance for the random vectors in this system is the geostrophic balance, that is the pressure gradient force and the Coriolis force balance each other. We have three random vectors: corresponding to each component of the velocity , and which corresponds to the pressure field . We obtain them as follows: first, we compute using equation (29); then we use to compute and under the geostrophic balance assumption:

 Rv1=−gf∂Rp∂y (34a) Rv2=gf∂Rp∂x. (34b)

4.1.2 Connection with SALT

The SRSW model considered here has been derived using the SALT approach described in [12] (see Appendix). Stochasticity is introduced in the advection part of the dynamics to model the uncertain transport behaviour within the fluid flow. To the best of our knowledge this is the first implementation of the SALT SRSW model in a data assimilation setting. Numerical implementations and particle filter algorithms for other SALT models (2D Euler, SQG) have been extensively developed in [4]-[7]. Our numerical implementation of the stochastic forcing is equivalent to the one described in [4] up to an isomorphism. More precisely, if one extends the approach described in [4] to a periodic domain, then the stochastic vector fields correspond to the Fourier modes of the transport covariance matrix and they form a basis of the underlying space. This basis may be different from the basis chosen by us above. However, an isomorphism can be established between any two orthogonal bases corresponding to the same underlying space.

4.2 Data Assimilation Results

We perform the data assimilation analysis using an ensemble of 50 particles. We plot the ensemble of particles one standard deviation region about the ensemble mean and compare it with the truth. The truth is a pathwise realisation of the SRSW model. In the standard setting (Figures 9(b)-9(d)) we observe the system for 50 time steps, with a time step size of 90 seconds (hence we observe every 1.25 hour). We use one observation at each analysis time, and the observational uncertainty and the initial uncertainty are equal to 1. The stochastic model error is set to 200 metres. As mentioned, the random forcing is implemented using a stochastic advection velocity in SALT. We generate these stochastic advection fields by first generating a stochastic pressure field of amplitude 200 m, then determining the velocity fields in the zonal and meridional directions, and using the geostrophic balance, as explained in the previous section. These then form the stochastic advection fields that are applied in all three evolution equations.

We first show the output obtained when no data is assimilated. Figure 9(a) contains the evolution of the ensemble of particles corresponding to the pressure field at a grid point in the middle of the domain. We can see that the prior distribution is spread out and the truth is not successfully tracked (the ensemble mean remains far from the truth).

Now we start to observe the system every 10 time steps (Figures 9(b)-9(d)), so every 15 minutes. In Figure 9(b) one can see that the assimilation of data improves the performance of the ensemble of particles: the standard deviation is reduced and the particle trajectories are corrected at assimilation times. We then decrease the stochastic model error to 50 and observe the system every 5 time steps (Figures 10(a)-10(b)). In figure 10(a) one can notice that the particle filter efficiency is significantly improved by the data assimilation time window: by observing the system more often we obtain a cloud of particles which are better concentrated around the truth. This success is due also to the fact that we decreased the model error to 50. The RMSE and SE illustrated in Figure 10(b) are much smaller than before.

Remark 1

Note that all plots are rescaled automatically.

For the next scenario (Figures 11(a)-11(b)) the system is observed every 5 time steps and we use 100 observations at each analysis time. Figure 11(a) shows that a drastic increase in the number of observations (100 now as opposed to 1 as we had before) greatly improves the efficiency of the particle filter. In Figure 11(b) one can see that the ensemble spread is substantially reduced and the accuracy is now also much better. The true innovation is that although we do not perform any localisation, we can assimilate 100 observations without filter degeneracy. This suggest that this particle filter is an important step towards beating the curse of dimensionality that plagued particle filtering for so long.

We show below (Figures 12(a)-12(b)) the evolution of the SRSW model for 100 time steps, when observed every 5 time steps, with 5 observations at each analysis time. Although the particles have a constant tendency to diverge from the truth, especially after the first assimilation step, by observing the system quite often and making the observations informative enough, we capture the truth most of the time.

5 Conclusions and further work

We conclude that our methodology based on a bootstrap particle filter with tempering and jittering can be successfully used to assimilate data for the Lorenz ’63 model. The performance of the data assimilation methodology is shown to be influenced by the following factors:

• Observational uncertainty: we test this for three values of the observation error (0.1, 0.5, and 5); the results improve as the observation error is reduced.

• Initial uncertainty: we test this in two cases (initially this parameter is set to 1, and then increased to 3). The results improve as the initial uncertainty is reduced, although that advantage decreases over time as the initial uncertainty is forgotten.

• Linearity of the observation operator - three observation operators are used: a fully linear one (), a partially linear one (), and a fully nonlinear one (). We obtain the best results in the linear case. However, it is worth highlighting that we obtain good results also in the nonlinear case, where we see a bimodal distribution. In contrast to this, a standard Kalman filter would probably fail, due to the difficulties generated by multiple modes.

In future research we intend to explore also the influence of other parameters such as: the number of particles, the size of data assimilation window, the threshold, the model error.

We conclude that our particle filter based on tempering and jittering can be successfully used also for the SRSW model. We highlight that this particle filter has been designed without using any other approximation methods (such as localisation). According to our results, there are a couple of parameters which have a relevant influence on the proficiency of the particle filter:

• DA time window: the results are better when we observe the system every 5 time steps, compared to when we observe it every 10 time steps.

• Number of observations per assimilation time: we test this for 1 observation, 5 observations, and 100 observations, respectively, per analysis time. The best output is obtained in the last case. However, by reducing the number of observations from 100 to 5, but observing the system every 5 time steps, the truth can be well tracked for a long period of time (Figure 12(a)).

• Model error: the particle filter is shown to be robust with respect to increases in model error.

The experiments show that the system is underobserved. This is to be expected, as the number of degrees of freedom is very high (

). Nonetheless, the findings in this paper are sufficiently promising to encourage further in-depth investigations based on this particle filter. We intend to explore also the influence of other parameters such as: the number of particles, observational and initial uncertainty, the threshold, variations of the stochastic forcing. Moreover, we intend to use this work as a stepping stone for modelling a slice of the real atmosphere using pressure observations collected by DWD using commercial aircraft.