1 Introduction
The software package VolnaOP2 has been used for tsunami simulations by a number of research groups around the world since its first introduction in 2011 [1]. The driving force behind the code's development was the need within the tsunami research community for a solver applicable to the analysis of realistic tsunami events and aimed at aiding operational tsunami research [1, 2]. The code solves the depth-averaged Nonlinear Shallow Water Equations (NSWE) in two horizontal dimensions using modern numerical methods for the solution of hyperbolic systems. VolnaOP2 can efficiently simulate the complete life cycle of a tsunami: generation induced by bathymetry displacement, propagation and inundation onshore. It can be used for cases of a simplified bathymetry represented by a mathematical formula, but also for the complex bathymetry and topography of the examined geographical region. Owing to the use of an unstructured triangular mesh, irregular bathymetric and topographic features can be efficiently captured and represented.
The first operational use of VolnaOP2 for a realistic scenario was for the modelling of sliding and tsunami generation in the St. Lawrence estuary in Canada [3]. The code has been used to model various tsunamigenic episodes [4, 5]; in several cases it has been used in conjunction with statistical modelling to perform comprehensive sensitivity analysis tests and uncertainty quantification [6, 7, 8, 9, 10].
Both the originally developed and the newly parallelised version of VolnaOP2 have been carefully validated against well-known benchmarks available to the tsunami community [1, 2]. However, in the present paper, we pay particular attention to the accuracy of the new GPU version of the code, with special emphasis on dispersion and dissipation errors as well as its computational efficiency on a general-purpose GPU cluster. Gaining a deeper understanding of the numerical errors present can guide the user towards more accurate real case simulations. The manuscript is organised as follows: in the next section (2), we describe the mathematical model and numerical schemes implemented in the parallel version of the code. Section (3) is dedicated to an analytic benchmark used to explore the accuracy of the code, and it is followed by the analysis of the dispersion and dissipation errors in Section (4). Section (5) discusses the application of the code to real cases and the scalability of its GPU implementation. The paper is wrapped up by concluding remarks and perspective developments in Section (6).
2 Mathematical Model and Algorithms
2.1 Nonlinear shallow water equations
The shallow water theory is efficiently used to describe the physics of long waves like tsunamis, which are characterised by very large wavelengths in comparison to the depth of the basin over which they propagate. Neglecting dispersion, the NSWE read:
(1)  ∂H/∂t + ∇·(Hu) = 0,

(2)  ∂(Hu)/∂t + ∇·(Hu ⊗ u + (g/2)H²I) = gH∇h,

where H = h + η is the total water depth, described as the sum of the time-dependent bathymetry h and the free surface elevation η, u = (u, v) is the fluid velocity in the x and y horizontal directions, I is the 2×2 identity matrix and g is the acceleration due to gravity. Provided that H > 0, the system is strictly hyperbolic. In the wet/dry transition the system starts to become non-hyperbolic, since H = 0 in a dry region. To deal with this, an algorithm that solves the shoreline Riemann problem, developed by [11], is implemented in the code.

2.2 Spatial discretisation
A cell-centered Finite Volume (FV) numerical method is used for the spatial discretisation in VolnaOP2 [1]. Other tsunami codes based on the finite volume approach include Tsunami-HySEA [12] and GeoClaw [13]. The numerical flux implemented in a numerical algorithm has to ensure that some standard conservation and consistency properties are satisfied: the fluxes from adjacent control volumes sharing an interface must exactly cancel when summed, and the numerical flux with identical state arguments must reduce to the true flux of that state.
In VolnaOP2 the Harten-Lax-van Leer (HLL) numerical flux was selected to ensure these conditions are met [1]. The HLL approximate Riemann solver was proposed by Harten et al. in 1983 and assumes a two-wave configuration for the exact solution [14]. The wave speeds chosen according to [15] yield a very robust approximate Riemann solver. The Riemann solver models two waves that travel with speeds s_L and s_R, the larger signal velocity being represented by s_R and the smaller by s_L; three states are identified (Fig 1). The subscripts R and L are used to represent the right and left cell values respectively. The intermediate state is denoted by w*, where w = (H, Hu, Hv)ᵀ is a vector of the conserved variables. The numerical flux function of the scheme can be described by:

F(w_L, w_R) = F(w_L) if 0 ≤ s_L,   F* if s_L < 0 < s_R,   F(w_R) if s_R ≤ 0,

where w_L and w_R are the two interface states and F(w_L), F(w_R) denote the true fluxes at those states respectively. The right and left states are known. The intermediate state can be determined by applying the Rankine-Hugoniot conditions twice [1]. It then follows that:

F* = [s_R F(w_L) − s_L F(w_R) + s_L s_R (w_R − w_L)] / (s_R − s_L).
In VolnaOP2 the wave speeds are computed as:

s_L = min(u_L·n − c_L, u* − c*),   s_R = max(u_R·n + c_R, u* + c*),

where c_L = √(gH_L), c_R = √(gH_R), and u* and c* are equal to:

u* = ½(u_L + u_R)·n + c_L − c_R,   c* = ½(c_L + c_R) + ¼(u_L − u_R)·n,

where c_L and c_R are the gravity wave speeds for the left and right states of the system respectively and n denotes the unit normal vector to the shared face between the right and left states. The shortcoming of the HLL scheme is that it cannot resolve isolated contact discontinuities; it can thus become quite dissipative.
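To make the flux construction concrete, a minimal 1-D sketch of the HLL flux with the two-rarefaction wave-speed estimates is given below. This is an illustrative reduction, not code from VolnaOP2 (which operates on the faces of 2-D triangular cells), and the function names are hypothetical:

```python
import numpy as np

g = 9.81  # acceleration due to gravity [m/s^2]

def physical_flux(w):
    """True flux F(w) of the 1-D shallow water system, w = (H, Hu)."""
    H, Hu = w
    return np.array([Hu, Hu**2 / H + 0.5 * g * H**2])

def hll_flux(wL, wR):
    """HLL numerical flux between a left state wL and a right state wR."""
    HL, HuL = wL
    HR, HuR = wR
    uL, uR = HuL / HL, HuR / HR
    cL, cR = np.sqrt(g * HL), np.sqrt(g * HR)
    # Two-rarefaction estimates of the intermediate velocity and celerity [15]
    u_star = 0.5 * (uL + uR) + cL - cR
    c_star = 0.5 * (cL + cR) + 0.25 * (uL - uR)
    sL = min(uL - cL, u_star - c_star)   # smaller signal velocity
    sR = max(uR + cR, u_star + c_star)   # larger signal velocity
    if sL >= 0.0:
        return physical_flux(wL)
    if sR <= 0.0:
        return physical_flux(wR)
    FL, FR = physical_flux(wL), physical_flux(wR)
    # Intermediate flux from the Rankine-Hugoniot conditions
    dw = np.asarray(wR) - np.asarray(wL)
    return (sR * FL - sL * FR + sL * sR * dw) / (sR - sL)
```

With identical left and right states the function returns the true flux, which is precisely the consistency property required above.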
2.3 Temporal discretisation
A Strong Stability-Preserving (SSP) Runge-Kutta method is used for the temporal discretisation in VolnaOP2. In the current version of the code the optimal second-order, two-stage Runge-Kutta scheme SSPRK(2,2) is used, with optimal Courant-Friedrichs-Lewy (CFL) condition equal to 1. The scheme is given as follows:

w⁽¹⁾ = wⁿ + Δt L(wⁿ),
wⁿ⁺¹ = ½wⁿ + ½(w⁽¹⁾ + Δt L(w⁽¹⁾)),

where L is defined as the finite volume space discretisation operator. The stability of the scheme is guaranteed if the CFL condition is satisfied. The Runge-Kutta scheme is very robust, especially in handling discontinuities. However, the scheme is both dissipative and dispersive [16]. Dissipation causes a leak of energy from the system, while dispersion leads to either phase lag or phase lead. A full explanation of these phenomena is given in Section (4).
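The two-stage update above can be sketched in a few lines (illustrative only; the operator L here is any right-hand-side function, not the actual VolnaOP2 finite volume operator):

```python
import numpy as np

def ssprk22_step(w, dt, L):
    """One step of the optimal two-stage, second order SSP Runge-Kutta
    scheme SSPRK(2,2); L(w) is the space discretisation operator."""
    w1 = w + dt * L(w)                        # stage 1: forward Euler
    return 0.5 * w + 0.5 * (w1 + dt * L(w1))  # stage 2: averaged corrector
```

Applied to the linear test problem dw/dt = -w, a single step reproduces the first three terms of the exponential series, consistent with second order accuracy.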
2.4 Second order extension
The classical finite volume schemes are only first order accurate in space, which is insufficient for most modern computational simulations. Simulating a real tsunami case with a first order accurate scheme would require an unfeasibly fine mesh resolution to obtain meaningful results. So, in order to yield second order accuracy in space, a reconstruction technique is implemented. One must ensure that the scheme is total variation diminishing (TVD), i.e. no artificial maxima or minima are introduced. Thus, within VolnaOP2 a MUSCL (Monotone Upstream-centered Scheme for Conservation Laws) scheme is implemented. The second order spatial accuracy is achieved by reconstructing the conserved variables on the cell interfaces. The reconstruction relies on calculating the gradient of a conserved variable over a cell and projecting the reconstructed value onto the interface. Within VolnaOP2, a least squares gradient reconstruction and the Barth-Jespersen limiter [17] are implemented. The reconstructed values given below (3) are then used in the numerical flux calculation:

(3)  w_ij = w_i + α_i (∇w)_i · r_ij,

where w_i is the cell-averaged vector of conserved variables, w_ij is the conserved variable evaluated at the interface between cells i and j, α_i is the cell-specific conserved variable limiter, (∇w)_i is the cell-centred gradient and r_ij is a vector pointing from the cell centre to the face centre. To avoid large gradients being calculated in the wet/dry region of the domain, a threshold depth has been introduced to ensure that the code remains stable. When the depth falls below this threshold the scheme does not carry out a reconstruction. A default value of this threshold depth is set in the code; however, it has been found that a more conservative (larger) value can ensure greater stability, and different values were used for the benchmark of Section (3) and the real case of Section (5). At present these values are found through a trial-and-error approach, but more research is required on the optimisation of this threshold depth.
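A 1-D illustration of the limited MUSCL reconstruction is sketched below (a hypothetical simplification: uniform grid, central gradients and a Barth-Jespersen-type limiter; VolnaOP2 itself uses a least squares gradient on unstructured triangles):

```python
import numpy as np

def muscl_interface_values(w):
    """Reconstruct left/right face values of cell averages w with a
    Barth-Jespersen-type limiter so that no new extrema are created."""
    w = np.asarray(w, dtype=float)
    n = len(w)
    wL = w.copy()  # reconstructed value at the left face of each cell
    wR = w.copy()  # reconstructed value at the right face of each cell
    for i in range(1, n - 1):
        grad = 0.5 * (w[i + 1] - w[i - 1])    # central gradient times dx
        wmax = max(w[i - 1], w[i], w[i + 1])
        wmin = min(w[i - 1], w[i], w[i + 1])
        alpha = 1.0                           # cell-specific limiter
        for dw in (0.5 * grad, -0.5 * grad):  # projections to both faces
            if dw > 0.0:
                alpha = min(alpha, (wmax - w[i]) / dw)
            elif dw < 0.0:
                alpha = min(alpha, (wmin - w[i]) / dw)
        wR[i] = w[i] + alpha * 0.5 * grad
        wL[i] = w[i] - alpha * 0.5 * grad
    return wL, wR
```

On smooth (e.g. linear) data the limiter stays inactive and the reconstruction is second order, while across a step it switches the scheme back to first order and prevents over- and undershoots.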
2.5 Boundary Conditions: Wall/Solid Boundary
For a full explanation of the treatment of boundary conditions the reader is referred to [1]. However, the case of a wall/solid boundary is given here. For all boundary conditions a ghost cell technique is used. This approach allows one to reconstruct the conserved variables on the boundary and thus preserve second order accuracy. Values of the conserved variables on the ghost cells are defined based on the type of boundary condition needed. In the following, the interior cell is defined to be inside the domain and the ghost cell outside of it; the boundary of the domain is the common edge between the two cells.
For a wall/solid boundary, u·n = 0 must hold, where u is the flow velocity and n is the normal vector to the boundary edge. To ensure that this is satisfied, the tangential (u_t) and normal (u_n) velocities for the ghost cell are set to be equal and opposite to those of the interior cell respectively:

u_t^ghost = u_t^interior,   u_n^ghost = −u_n^interior.
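The ghost-cell construction for a wall can be sketched as a reflection of the normal velocity component (illustrative snippet; the function name and calling convention are assumptions, not the VolnaOP2 API):

```python
import numpy as np

def wall_ghost_state(H, u, n_hat):
    """Ghost-cell state for a wall/solid boundary: the depth is copied,
    the normal velocity is reversed and the tangential velocity is kept,
    so that the averaged face velocity satisfies u . n = 0."""
    n_hat = np.asarray(n_hat, dtype=float)
    n_hat = n_hat / np.linalg.norm(n_hat)   # unit normal to the boundary edge
    u = np.asarray(u, dtype=float)
    u_n = np.dot(u, n_hat)                  # normal velocity component
    u_ghost = u - 2.0 * u_n * n_hat         # reflect only the normal part
    return H, u_ghost
```

Averaging the interior and ghost velocities at the shared edge then gives exactly zero normal flow through the wall.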
3 Benchmark Test with Analytical Solution
The two-dimensional case of a radially symmetric paraboloid is implemented, following the analytic solution initially proposed by Thacker [18]. This solution is available in the SWASHES (Shallow-Water Analytic Solutions for Hydraulic and Environmental Studies) library [19]. The major aim of SWASHES is to aid numerical modellers in validating shallow water equation solvers. The oscillatory motion of the paraboloid is described by a periodic solution in which damping is assumed to be negligible. The morphology of the domain is a paraboloid of revolution given by:
(4)  z(r) = −h₀(1 − r²/a²),

where r = √((x − L/2)² + (y − L/2)²) for each (x, y) in the square domain [0, L] × [0, L]. Here L is the length of the domain, h₀ is the water depth at the central point of the domain when the shoreline elevation is zero and a is the horizontal distance from the central point to the shoreline with zero elevation (Fig 2).
The free surface elevation η and the velocity components u and v are then given by:

η(r, t) = h₀[√(1 − A²)/(1 − A cos ωt) − 1 − (r²/a²)((1 − A²)/(1 − A cos ωt)² − 1)],
u(x, y, t) = [ωA sin ωt / (2(1 − A cos ωt))](x − L/2),
v(x, y, t) = [ωA sin ωt / (2(1 − A cos ωt))](y − L/2),

where ω = √(8gh₀)/a is the frequency, r₀ is the distance from the central point of the domain to the initial shoreline location and A = (a² − r₀²)/(a² + r₀²). To model the solution we follow the values proposed in [19], where L = 4 m, h₀ = 0.1 m, a = 1 m, r₀ = 0.8 m and g = 9.81 m/s².
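For reference, the analytic free surface can be evaluated directly; the snippet below follows the form of the solution given in the SWASHES library [19] with the parameter values above (the exact coefficients should be treated as assumptions to be checked against the library):

```python
import numpy as np

g = 9.81                            # gravity [m/s^2]
h0, a, r0, L = 0.1, 1.0, 0.8, 4.0   # SWASHES parameters [19]
A = (a**2 - r0**2) / (a**2 + r0**2)
omega = np.sqrt(8.0 * g * h0) / a   # oscillation frequency

def thacker_eta(x, y, t):
    """Free surface elevation of Thacker's oscillating paraboloid;
    (L/2, L/2) is the central point of the domain."""
    r2 = (x - L / 2.0)**2 + (y - L / 2.0)**2
    D = 1.0 - A * np.cos(omega * t)
    return h0 * (np.sqrt(1.0 - A**2) / D - 1.0
                 - (r2 / a**2) * ((1.0 - A**2) / D**2 - 1.0))
```

The solution is periodic with period 2π/ω, which makes it convenient for checking the long-time dissipation and phase errors of a solver.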
We record the free surface elevation at three positions: the centre of the domain, an intermediate point between the centre and the shoreline, and the initial shoreline location. Fig (3) highlights a top-down view of the bathymetry and the locations of the wave gauges. In numerical simulations with VolnaOP2 we model the free surface elevation at these three positions with various spatial resolutions as a function of time. An analytic solution at a fixed initial time is chosen as the initial condition for the simulations, as it was found to be stable for all the mesh resolutions.
The finest mesh (0.003 m) yields the representation closest to the surface elevation given by the analytic solution. The discrepancies between the meshes can only be seen by zooming in on the plots (Fig 7). Focusing on the centre of the domain, we plot the absolute difference between the analytic and the numerical free surface elevation over time (Fig 8). As expected, the numerical error is always larger for the coarser meshes. In order to gain an idea of the numerical order of the scheme, a convergence study was carried out. The L2 norm of the error is calculated at t = 10 s for various mesh resolutions (0.048–0.003 m) and then plotted versus the characteristic mesh size (Fig 9).
The results shown in (Fig 8) and (Fig 9) highlight that the scheme is second order accurate in space. However, the role of numerical dispersion and dissipation has not yet been explored, and these effects should be accounted for when running long time tsunami simulations. We come back to this discussion in Section (4).
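The observed order quoted above can be estimated from any two meshes in the convergence study with a one-line helper (illustrative, not part of VolnaOP2):

```python
import numpy as np

def observed_order(h_coarse, e_coarse, h_fine, e_fine):
    """Observed order of accuracy from error norms on two mesh sizes."""
    return np.log(e_coarse / e_fine) / np.log(h_coarse / h_fine)
```

For errors that scale exactly as h² this returns 2, matching the slope seen in Fig 9.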
To check the temporal discretisation error we keep the mesh size constant and vary the time step by adjusting the CFL-type constant connecting the spatial and temporal resolutions. We choose three values of this constant within the stability limit and run the test for a longer time. The results of the simulations are shown in Figs 10 and 11.
Changes in time step within the stability limits, while keeping the same spatial resolution, have almost no effect on the accuracy of the solution (Fig 10). This is highlighted in Fig 11, where the plots of the numerical solution with various time steps overlap each other. However, comparing the subplots of Fig 11 as time goes on, one observes a damping of the numerical signal and a phase shift. The Runge-Kutta scheme implemented in the code is dissipative, so a leak of energy from the system, and hence the damping of the numerical signal, is expected. Reasons for the phase shift in the signal will be explained in Section (4). Overall, the results from this section show that the space discretisation has a strong influence on the solution, while reducing the time step has no visible effect.
4 Error analysis
The exact solution of the discretised equations satisfies a PDE which is generally different from the one being solved. The original equation ∂u/∂t + ∂f(u)/∂x = 0 is replaced with a modified equation, or, in other words,

∂u/∂t + ∂f(u)/∂x = Σ_{p≥2} c_p ∂ᵖu/∂xᵖ.

The even-order derivatives on the right-hand side produce an amplitude error, or numerical dissipation. The odd-order derivatives on the right-hand side produce a wavenumber-dependent phase error known as numerical dispersion. In long time simulations, the numerical behaviour of the scheme largely depends on the role played by the dispersive and dissipative effects, also known as "wiggles" (phase errors) and "smearing" (amplitude errors) respectively. A negative dispersion coefficient corresponds to phase lagging (i.e. harmonics travel too slowly), while positive dispersion coefficients yield phase leading, with spurious oscillations occurring ahead of the wave.
According to the Lax-Richtmyer Equivalence Theorem [20], if a scheme has a truncation error of order p and the scheme is stable, then the difference between the analytic solution and the numerical solution in an appropriate norm is of order p for all finite time. It has been observed numerically (see Fig 9) that the numerical solution is second order accurate. Taking into account that the time step is proportional to the spatial resolution, which we call Δ, we can write that the total error is of the order Δ². To analyse the role played by the dissipation and dispersion errors, we rewrite the error as

E(Δ, t) = Σ_{k≥2} C_k(t) Δᵏ,

where C_k incorporates all the constants included in the error formula. We choose the three leading terms of this expansion:

E(Δ, t) ≈ a(t)Δ² + b(t)Δ³ + c(t)Δ⁴.
The first term in this expansion corresponds to the truncation error (it is also the leading dissipation error). It is followed by the leading dispersion (Δ³) and the secondary dissipation (Δ⁴) error terms. We assume that the remaining terms are significantly smaller and can be neglected. Next we go back to the simulations with various spatial resolutions discussed in Section (3). For each of the three grids, we have the absolute error as a function of time. If we denote the finest spatial resolution by Δ, the two other grids have resolutions 2Δ and 4Δ. The system of equations has the form:

(5)  E(Δ) = aΔ² + bΔ³ + cΔ⁴,
     E(2Δ) = 4aΔ² + 8bΔ³ + 16cΔ⁴,
     E(4Δ) = 16aΔ² + 64bΔ³ + 256cΔ⁴,

and its solution yields the coefficients a, b and c at each instant in time.
The plots below show the composition of the total error for each of the grids (Fig 12 – Fig 14) , with a scaled surface elevation overlaid to give an idea on when the errors occur.
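Under the assumed truncated expansion with grids of sizes Δ, 2Δ and 4Δ, the coefficients can be recovered at each time instant by a small linear solve (illustrative sketch; the grid ratio of 2 is an assumption):

```python
import numpy as np

def decompose_error(E1, E2, E4, delta):
    """Solve E(h) = a h^2 + b h^3 + c h^4 for (a, b, c), given the
    measured errors E1, E2, E4 on grids of size delta, 2*delta, 4*delta."""
    hs = np.array([delta, 2.0 * delta, 4.0 * delta])
    M = np.column_stack([hs**2, hs**3, hs**4])   # 3x3 Vandermonde-like matrix
    return np.linalg.solve(M, np.array([E1, E2, E4]))
```

The three recovered terms aΔ², bΔ³ and cΔ⁴ correspond to the truncation (leading dissipation), leading dispersion and secondary dissipation contributions plotted in the error decompositions.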
In all the error decompositions, the components tend to cancel each other, which results in the total error being less than the individual components. However, the leading dispersion error is a dominant component of the total error in all cases. The leading dispersion error exhibits large negative spikes; this points towards phase lagging as the surface elevation changes from negative to positive at the centre of the domain. These large negative spikes in the leading dispersion error coincide with positive spikes in either the truncation or the secondary (Δ⁴) dissipative errors. For finer grids the truncation error (leading dissipation error) plays a dominant role. For the coarsest grid the negative spikes are balanced by the secondary dissipative error (Fig 12).
This error analysis is important when considering real cases – see section (5) – as any error could be dominated by either the leading dispersion or dissipation terms. However, this numerical dispersion term could compensate for the fact that physical dispersion is neglected in the nonlinear shallow water equations, as shown by [21].
5 Real cases
In this section we discuss an actual tsunami simulation performed with VolnaOP2 running on a general-purpose GPU cluster. The cluster node consists of two NVIDIA® Tesla® V100 cards, each with 5,120 CUDA cores and a maximum memory size of 16 GB.
The physical simulation time is 1 hour. The simulation corresponds to a hypothetical scenario of an edge volume collapse (765 km³) at the Rockall Bank Slide Complex. A geophysical study of the event, taking into account volumetric, rheological and multiphase collapse considerations, has been carried out under a different framework [22], where convergence using a simple approach of material with viscoplastic rheology sliding in one go is demonstrated. For this study we have chosen four gauges, marked by the red dots in Fig 15, to present the evolution of the tsunami as a function of time in Fig 16.
Overall, the numerical simulations with varying spatial discretisation behave similarly. Notable differences can be seen at gauge 4 (Fig 16d), where unresolved bathymetric features of the continental shelf play a role. To investigate the numerical differences we focus on the output at gauge 3 (Fig 16c). The reason for choosing this gauge is to minimise the effects of these unresolved bathymetric features, as the bathymetry between this gauge location and the landslide source is relatively flat. The maximum wave amplitude of the initial tsunami wave at gauge 3 (Fig 16c) has been highlighted above. As there is no true solution to compare with for this real case, we take the simulation results from the finest mesh (450 m) as the ground truth. Relative differences between this ground truth and the other simulations are presented in Fig 17 and Table 1. When comparing the signals (Fig 17), the coarser meshes exhibit phase lagging and/or damping of the signal, i.e. the maximum tsunami wave arrives later and its amplitude is diminished. This behaviour was highlighted in the previous error analysis (Section 4) and is thus expected.
It should be noted that the same error analysis findings hold for cases which utilise non-uniform mesh resolutions. If the mesh is non-uniform (i.e. the characteristic length scale of the cells varies across the domain), the analysis addresses the worst possible scenario by scaling all cells to the largest characteristic length for the error computations. Thus, one would expect to see similar behaviour regarding numerical dissipation and dispersion.
Table 1: Differences at gauge 3 relative to the finest (450 m) mesh.

Resolution [m] | Amplitude difference [m] | Time difference [s]
600            | 0.08                     | 2.8
1000           | 0.43                     | 2.8
2050           | 1.55                     | 14.8
Turning to the performance of VolnaOP2 on the GPU cluster, Table 2 and Fig 18 summarise the runtimes for the various mesh resolutions. One can see a close-to-linear speed-up of the runtimes when moving from one to two GPUs. Those interested in the scalability of the code on other HPC architectures are referred to [2]. This analysis of the relative errors (Table 1) and of the computational efficiency informs the user on what resolution will provide an acceptable level of accuracy within a given time constraint.
Table 2: Runtimes for the various mesh resolutions on one and two GPUs.

Resolution [m] | Number of cells | Total runtime, 1 GPU [s] | Total runtime, 2 GPUs [s]
2050           | 246,187         | 140.7                    | 84.3
1000           | 1,031,014       | 555.6                    | 296.1
600            | 2,872,156       | 1565.8                   | 826.0
450            | 5,109,001       | 2782.3                   | 1433.1
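As a quick check of the scaling behaviour, the speed-ups implied by Table 2 can be computed directly (illustrative snippet; runtimes copied from the table):

```python
# Runtimes [s] from Table 2, keyed by mesh resolution [m].
runtimes_1gpu = {2050: 140.7, 1000: 555.6, 600: 1565.8, 450: 2782.3}
runtimes_2gpu = {2050: 84.3, 1000: 296.1, 600: 826.0, 450: 1433.1}

def speedup(resolution):
    """Two-GPU speed-up over a single GPU for a given resolution."""
    return runtimes_1gpu[resolution] / runtimes_2gpu[resolution]

for res in (2050, 1000, 600, 450):
    print(f"{res:4d} m: speedup {speedup(res):.2f}x, "
          f"parallel efficiency {speedup(res) / 2.0:.0%}")
```

The speed-up grows towards the ideal factor of 2 as the mesh is refined, as expected when the per-GPU workload increases relative to the communication overheads.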
6 Concluding remarks
Based on the current study we can conclude that VolnaOP2 is a robust and efficient parallel solver for the NSWE. It is based on a finite volume scheme for spatial integration, implementing a MUSCL reconstruction, and uses the second order Runge-Kutta scheme SSPRK(2,2) for integration in time. The scheme is conditionally stable, with an experimentally confirmed CFL condition of 1.0. The code can handle complex geometries and simulate real-life cases.
The error analysis shows that the error scales quadratically when refining the spatial mesh, while reducing the time step does not have a visible effect on the error. The solution amplitude decays in time, and one can observe both damping and phase shifts. There is thus an energy leak from the system, which is expected due to the non-conservative Runge-Kutta scheme; the error can, however, be minimised by refining the spatial resolution. The phase error is mostly negative, which corresponds to phase lag and artificial wiggles behind the wave front. However, it changes sign occasionally, and this leads to phase lead.
The real case simulations show the efficiency and scalability of the code run on a GPU cluster. The 1-hour realistic-size tsunami event is simulated in 24 mins on the finest mesh (450 m resolution, 5,109,001 cells) using two GPUs. However, the simulations have shown that comparable results can be achieved using a coarser mesh and a reduced runtime. Therefore users should be familiar with the code's pros and cons before setting up their simulations, in order to obtain physically meaningful and numerically accurate results. This trade-off between an acceptable level of accuracy and runtime is an important consideration when using VolnaOP2 to perform comprehensive sensitivity analysis tests and uncertainty quantification.
To conclude the paper, we want to mention that VolnaOP2 is still under development. Therefore its analysis and benchmark testing play an important role in the code's continuous enhancement and improvement. The code is already an important tool for the tsunami modelling community. We hope that our work will help its wider adoption and will lead to a discussion on the most suitable algorithms and software platforms for realistic tsunami modelling and the prediction of its effects.
Acknowledgments
The Irish Centre of High End Computing (ICHEC) is acknowledged for their computing resources. Daniel Giles is supported by a Government of Ireland Postgraduate Scholarship from the Irish Research Council (GOIPG/2017/68). Eugene Kashdan would like to acknowledge NVIDIA for the donation, within the framework of the Professor Partnership Programme, of the Tesla K40 GPU which was used for this research. Serge Guillas acknowledges support from the Alan Turing Institute project “Uncertainty Quantification of multiscale and multiphysics computer models: applications to hazard and climate models” as part of the grant EP/N510129/1 made to the Alan Turing Institute by EPSRC.
References
 [1] D. Dutykh, R. Poncet, and F. Dias, The VOLNA code for the numerical modeling of tsunami waves. European Journal of Mechanics – B/Fluids, 30(6), (2011), 598–615.
 [2] I. Z. Reguly, D. Giles, D. Gopinathan, L. Quivy, J. H. Beck, M. B. Giles, S. Guillas, and F. Dias, The VOLNA-OP2 tsunami code (version 1.5). Geosci. Model Dev., 11(11), (2018), 4621–4635.
 [3] R. Poncet et al., A Study of the Tsunami Effects of Two Landslides in the St. Lawrence Estuary. In: D. C. Mosher et al. (Eds.), Submarine Mass Movements and Their Consequences. Springer, Netherlands, (2010), 755–764.
 [4] F. Dias et al., On the Modelling of Tsunami Generation and Tsunami Inundation. Procedia IUTAM, 10, (2014), 338–355.
 [5] D. Gopinathan, M. Venugopal, D. Roy, K. Rajendran, S. Guillas, and F. Dias, Uncertainties in the 2004 Sumatra–Andaman source through nonlinear stochastic inversion of tsunami waves. Proceedings of the Royal Society A: Mathematical, Physical and Engineering Sciences, 473(2205), (2017):20170353
 [6] T. S. Stefanakis et al., Can small islands protect nearby coasts from tsunamis? An active experimental design approach. Proceedings of the Royal Society A: Mathematical, Physical and Engineering Sciences, 470(2172), (2014), 20140575.
 [7] J. Beck and S. Guillas, Sequential design with mutual information for computer experiments (MICE): emulation of a tsunami model. SIAM/ASA Journal on Uncertainty Quantification, 4(1), (2016), 739–766.
 [8] S. Guillas, A. Sarri, S. J. Day, X. Liu, and F. Dias, Functional emulation of high resolution tsunami modelling over Cascadia. The Annals of Applied Statistics, 12(4), (2018), 2023–2053.
 [9] D. M. Salmanidou, S. Guillas, A. Georgiopoulou, and F. Dias, Statistical emulation of landslide-induced tsunamis at the Rockall Bank, NE Atlantic. Proceedings of the Royal Society A: Mathematical, Physical and Engineering Sciences, 473(2200), (2017).
 [10] X. Liu and S. Guillas, Dimension reduction for Gaussian process emulation: An application to the influence of bathymetry on tsunami heights. SIAM/ASA Journal on Uncertainty Quantification, 5(1), (2017) 787–812.
 [11] M. Brocchini et al., An efficient solver for nearshore flows based on the WAF method. Coastal Engineering, 43, (2001), 105–129.
 [12] J. Macias, M. J. Castro, S. Ortega, C. Escalante, and J. M. Gonzalez-Vida, Performance Benchmarking of Tsunami-HySEA Model for NTHMP's Inundation Mapping Activities. Pure Appl. Geophys., 173, (2017), 3147–3183.
 [13] M. J. Berger, D. L. George, R. J. LeVeque, and K. T. Mandli, The GeoClaw software for depth-averaged flows with adaptive refinement. Adv. Water Resour., 34, (2011), 1195–1206.
 [14] A. Harten, P. D. Lax, and B. van Leer, On upstream differencing and Godunovtype schemes for hyperbolic conservation laws. SIAM Review, 25, (1983), 35–61
 [15] J. G. Zhou et al., Numerical solutions of the shallow water equations with discontinuous bed topography. International Journal for Numerical Methods in Fluids, 38, (2002), 769–788.
 [16] P. J. van der Houwen and B. P. Sommeijer, Explicit Runge-Kutta (Nystrom) methods with reduced phase errors for computing oscillating solutions. SIAM J. Numer. Anal., 24, (1987), 595–617.
 [17] T.J. Barth and D.C. Jespersen, The design and application of upwind schemes on unstructured meshes. 27th Aerospace Sciences Meeting. (1989)
 [18] W. C. Thacker, Some exact solutions to the nonlinear shallow water wave equations. Journal of Fluid Mechanics, 107, (1981), 499–508.
 [19] O. Delestre, et al., SWASHES: a compilation of ShallowWater analytic solutions for hydraulic and environmental studies. International Journal for Numerical Methods in Fluids, 72, (2013), 269–300.

 [20] P. D. Lax and R. D. Richtmyer, Survey of the Stability of Linear Finite Difference Equations. Comm. Pure Appl. Math., 9, (1956), 267–293.
 [21] D. Burwell, E. Tolkova and A. Chawla, Diffusion and dispersion characterization of a numerical tsunami model. Ocean Modelling, 19, (2007), 10–30.
 [22] D. M. Salmanidou, A. Georgiopoulou, S. Guillas and F. Dias, Rheological considerations for the modelling of submarine sliding at the Rockall Bank, NE Atlantic Ocean. Physics of Fluids, 30, (2018), 030705.