A Weighted State Redistribution Algorithm for Embedded Boundary Grids

by   Andrew Giuliani, et al.

State redistribution is an algorithm that stabilizes cut cells for embedded boundary grid methods. This work extends the earlier algorithm in several important ways. First, state redistribution is extended to three spatial dimensions. Second, we discuss several algorithmic changes and improvements motivated by the more complicated cut cell geometries that can occur in higher dimensions. In particular, we introduce a weighted version with less dissipation. Third, we demonstrate that state redistribution can also stabilize a solution update that includes both advective and diffusive contributions. The stabilization algorithm is shown to be effective for incompressible as well as compressible reacting flows. Finally, we discuss the implementation of the algorithm for several exascale-ready simulation codes based on AMReX, demonstrating ease of use in combination with domain decomposition, hybrid parallelism and complex physics.



page 17

page 18

page 19

page 20

page 21


A two-dimensional stabilized discontinuous Galerkin method on curvilinear embedded boundary grids

In this work, we propose a state redistribution method for high order di...

A state redistribution algorithm for finite volume schemes on cut cell meshes

In this paper we develop a new technique, called state redistribution, t...

Numerical computation of the cut locus via a variational approximation of the distance function

We propose a new method for the numerical computation of the cut locus o...

The Second-Generation Shifted Boundary Method and Its Numerical Analysis

Recently, the Shifted Boundary Method (SBM) was proposed within the clas...

An extended MMP algorithm: wavefront and cut-locus on a convex polyhedron

In the present paper, we propose a novel generalization of the celebrate...

Almost Tight Lower Bounds for Hard Cutting Problems in Embedded Graphs

We prove essentially tight lower bounds, conditionally to the Exponentia...

Barrier Simulations and Experimental Calculations Using Cell Merging Method

We apply the cell merging method to a model shallow water problem with a...
This week in AI

Get the week's most popular data science and artificial intelligence research sent straight to your inbox every Saturday.

1 Introduction

Embedded boundary grids are useful for solving partial differential equations on complex engineering domains since mesh generation is robust and automatic. These grids are composed of regular Cartesian cells away from the boundary, covered cells that do not participate in the solution procedure, and irregular cut cells that intersect the embedded boundary. It is well known that the cut cells can be arbitrarily small, thus care must be taken when using explicit time stepping methods on these grids. Many different approaches that address the small cell problem have been proposed over the years. Most intuitive with a lot of recent progress is cell merging

[1]. The interesting work in [2] is dimensionally split, but not that accurate at the cut cells. -box methods [3] have nice theoretical properties but seem difficult to implement in three dimensions.

One technique that has been successfully used in three dimensions on complex geometries is called flux redistribution [4, 5]. The idea is that each cell takes its maximum stable time step, then excess flux is redistributed to neighboring cells to maintain conservation. This approach is attractive because it is simple to implement as a postprocessing step. However, the downside is that there is a loss of accuracy at the cut cells and the scheme is not linearity preserving. Recently, a new approach called state redistribution (SRD) was proposed for finite volume methods on two-dimensional cut cell grids [6]. This is a minimally invasive stabilization technique that is linearity preserving, conservative, and straightforward to implement in two dimensions for hyperbolic conservation laws. Inspired by flux redistribution, SRD postprocesses the numerical solution by accurately redistributing the solution states in a way that maintains conservation.

This work builds on [6] in several important ways. The work in [6] was in two space dimensions. As we show here, the extension to 3D was straightforward. We propose a weighted version of SRD with reduced dissipation, that smoothly activates depending on the volume fraction of a cut cell. We discuss some of the design choices made for easier implementation in a 3D parallel production code. Previous work on SRD focused on the Euler equations of gas dynamics. We show here that SRD is simple enough to apply to many types of equations. The extensions shown later include compressible reacting flow and low Mach number multiphase flow. Finally, the algorithm has been incorporated into AMReX-Hydro, a set of modules based on AMReX [7]. This is a software framework that supports the development of block-structured adaptive mesh refinement (AMR) algorithms for solving systems of partial differential equations on simple or complex geometries, using machines from laptops to exascale architectures. While all these examples use finite volume schemes on the embedded boundary mesh, we mention that SRD has also been extended to discontinuous Galerkin schemes in two dimensions [8], and research on higher order finite volume SRD schemes is in progress.

The rest of this paper is organized as follows. In section 2 we set the stage for the geometry representation, and outline the three dimensional version of the original algorithm in [6]. Section 3 discusses a generalization of SRD that smoothly activates as the volume fraction decreases below a threshold. Section 4 presents aspects of the parallel implementation. Section 5 discusses some particular choices that go into the extensions to other sets of equations. Computational examples are in section 6. Conclusions and directions for future research are in section 7.

2 Preliminaries

We briefly describe the the embedded boundary representation and geometry generation to give the reader an idea of the context in which State Redistribution is applied. We then present the extension to 3D of the original SRD algorithm, both the pre- and post- processing. This will provide a basis for comparison when the new weighted version is presented in section 3.

2.1 Embedded Boundary Data Structures

When an embedded boundary (EB) is present in the domain, there are three types of cells: regular, cut and covered. AMReX provides data structures for accessing EB information, which is precomputed and stored in a distributed database at the beginning of the calculation. It is available for AMR meshes at each level and for coarsened meshes in multigrid solvers. The information includes cell type, cell centroid, volume fraction, face area fractions and face centroids. For cut cells, the information also includes the centroid, normal and area of the EB face. There is at most one EB face per cell; this is enforced at the time the geometry is generated. Additionally, there is connectivity information between neighboring cells.

2.2 Geometry Generation

AMReX uses an implicit function approach for generating the geometry information. The implicit function describes the surface of the embedded object. It returns a positive value, a negative value or zero, for a given position inside the body, inside the fluid, or on the boundary, respectively. Implicit functions for various simple shapes such as boxes, cylinders, spheres, etc., as well as a spline based approach, are provided. Furthermore, basic operations in constructive solid geometry (CSG) such as union, intersection and difference are used to combine objects together. Geometry transformations (e.g., rotation and translation) can also be applied to these objects. In addition to an implicit function, an application code can also use its own approach to generate the geometric information and store it in AMReX’s EB database. In the current AMReX mesh generator, split cells and tunnel cells are not supported. Multivalued cells and edges with multiple cuts are not allowed.

2.3 3D SRD Algorithm

We describe here the original SRD algorithm extended to three dimensions. The algorithm comprises a mesh preprocessing step and a solution postprocessing step. The next section will introduce some improvements and generalizations.

2.3.1 Preprocessing

Before the time stepping portion of the finite volume solver begins, the mesh is preprocessed by associating a merging neighborhood with each cell in the base grid, both whole and cut. Small cut cells are merged with their neighbors until the volume of the neighborhood is at least

a threshold informed by results in [9]. This is illustrated in two space dimensions in Figure 1a, where cell merges with cell to form neighborhood , highlighted in green. Similarly, cell merges with cell to form ’s neighborhood, shown in Figure 1b. A cell with a volume larger than the threshold value does not merge with any neighboring cells, thus its merging neighborhood only contains itself, like cell in Figure 1c. Cell merges with cell , shown in Figure 1d.

The number of neighborhoods that contain cell , called the overlap count , is also indicated in Figure 1 in two space dimensions. Cell is contained in two other neighborhoods, that of cell and , plus it is in its own neighborhood by definition, so its overlap count is three (Figure 1c).

[width = 0.25]2-SRD-Overview/cutWithCounts_1


[width = 0.25]2-SRD-Overview/cutWithCounts_2


[width = 0.25]2-SRD-Overview/cutWithCounts_3


[width = 0.25]2-SRD-Overview/cutWithCounts_4

Figure 1: Merging neighborhoods (highlighted in green) and overlap counts of cells in the neighborhood of a corner geometry. a) in green, and because is not in any of the other neighborhoods; b) in green, and because is not in any of the other neighborhoods; c) in green, and because is in the neighborhoods shown in (b) and (d); d) in green, and because is in the neighborhood of as shown in (a).

There are several possible ways to choose which neighbors a cut cell merges with. In this example, a cell merges in the direction closest to the boundary normal. In the AMReX-Hydro implementation, if merging with the one cell in the direction closest to the normal does not result in a neighborhood with a volume greater than , the neighbor in the direction of the next largest component of the normal is added to the neighborhood.

In order to not create an “L-shaped” neighborhood we then automatically add the cell in the same plane as the two neighbors that defines the neighborhood as a 2 2 box. In 3D if this neighborhood is still not large enough, we then add the remaining four cells to have a 2 2 2 box.

This is done for aesthetic reasons; an L-shaped domain would still be stable.

In AMReX-Hydro we also made a design decision to preserve spatial symmetries whenever possible. For example, if the boundary normal in a cut cell points has identical components with we create a neighborhood that is larger than strictly necessary on volumetric considerations alone, to avoid directional bias in a possibly otherwise symmetric geometry and solution. In the above case the neighborhood would include all fluid cells in a 2 2 1 region in the cut cell’s merging neighborhood. If then the neighborhood would include all fluid cells in a 2 2 2 region in the cut cell’s merging neighborhood.

Once the neighborhoods have been identified, we define a neighborhood’s weighted volume using the standard cell volume by


and a weighted centroid


where is the standard cell centroid, and is the set of cells in the neighborhood of cell . In the above, cell volumes are weighted by the inverse of their overlap counts. A cell’s overlap count, as well as the weighted volume and centroid associated to neighborhoods are required during the state redistribution postprocessing step, which we describe below.

2.3.2 Postprocessing

State redistribution is implemented as a postprocessing step acting on cell averages updated by a finite volume scheme. (The finite volume scheme itself needs to be modified for the presence of cut cells to preserve accuracy, but not stability, since that is handled by SRD.) Thus, the first step of the algorithm is to compute a provisional, but possibly unstable cell update


where is a face area, is the numerical flux, and is the outward pointing normal. This update is computed on all cells using a time step that is proportional to the size of cells in the background Cartesian grid.

The next step is to compute a weighted solution average on each merging neighborhood


For second order accuracy, a gradient is reconstructed in all cells for which has at least two cells ( itself and at least one other). This results in a linear function for cell of the form


where is the gradient obtained with a least squares approach. For this step the least squares equations are centered on the weighted neighborhood centroids , not the original cell centroids .

In each coordinate direction, we compute the maximum distance between the weighted centroids of the neighborhoods in the reconstruction stencil to the weighted centroid of the neighborhood on which the reconstruction is centered. When the reconstruction stencil does not include points further than a threshold in any one coordinate direction, we grow the extent of the stencil in that coordinate direction as described in [6]. The threshold distances are defined as , , , in the , , and directions, respectively. For example, if the reconstruction stencil is initially the block, and the maximum distance in the -direction , for cells in the reconstruction stencil, is not larger than , then the stencil is increased to . We also point out that the block may not contain 27 cells as it could contain covered cells that are not included in the stencil.

We have also experimented with the less restrictive criterion of requiring the difference between the maximum and minimum centroid values in the stencil to be greater than half the mesh width. More experience in complicated geometries in three dimensions are necessary to ensure this is sufficiently stable. If necessary, a Barth-Jespersen-style limiting is performed in the neighborhood gradient reconstruction step as well.

The final step is to use the neighborhood polynomial to compute the final solution update


where is the set of neighborhood indices that include cell in their neighborhood. In words, the stabilized neighborhood solution replaces the unstable finite volume update in a conservative and accurate manner. In the above formula, the stabilized solution average on a cell in the base grid is obtained by averaging the centroid values of the overlapping merging neighborhood polynomials. As in the two-dimensional algorithm, SRD can be used with either a method of lines integration in which the SRD algorithm is applied at each stage or with second-order Godunov type approach that directly computes fluxes at the level.

This postprocessing procedure is also applied to the initial data, before any time steps have been taken. This is akin to pre-merging when using cell merging. It is also analogous to the procedure in incompressible flow, where the initial conditions have an initial projection before time stepping begins. We have found that our test cases have better monotonicity properties with this additional step.

3 Generalizing State Redistribution

We present here an improvement to the original algorithm, as just described, which addresses several issues. The original algorithm has a sharp cutoff when the volume fraction of a cut cell reaches 0.5 and the stencil abruptly shifts. A transition that gracefully shuts off the amount of redistribution would be smoother, and also introduce less numerical dissipation. The second issue is that the preprocessing step described earlier – which preserves symmetries and avoids “L-shaped” neighborhoods – can generate neighborhood volumes that are overly large, resulting in more redistribution than is necessary for stability. The modification described below addresses both these issues. The computational results in section 6 use this modification.

3.1 A Weighted SRD Algorithm

The main idea is that a large cut cell that is just below the threshold requires less stabilization from its neighbors. We use the notation , i.e. is the set of cells contained in the neighborhood of cell with the exception of cell itself. As before, is the set of indices containing cell in their neighborhood, i.e. so that cell is in We again set .

For the weighted version of SRD, we define two new scalars for each cell, and which are used as weights for the relative contributions of cell and its neighbors to the solution at In cells with volume we define


and then


The idea of is to control the contribution a cut cell gets from its merging neighborhood. If a cut cell has volume , we set . Equations (7) and (8) imply that .

As becomes larger, the weighted algorithm increases the dependence of the solution in cell on relative to the original SRD algorithm. Notice that this version is not equivalent to the original SRD algorithm and may have some different stability properties, especially for one-dimensional test cases that are not representative of Cartesian cut cell meshes in higher dimensions.

All the preprocessing formulas are now modified to pull the cut cell out of the expressions and weight it by , with the rest weighted by . Instead of Equation (1), we define a neighborhood’s weighted volume using the standard cell volume by


Instead of Equation (2) we use


to define the weighted centroid.

In the postprocessing step, instead of Equation (4), we define the weighted solution average on each merging neighborhood as


Slopes are computed as in the original algorithm, and we use Equation (5) to construct as before. The final step is to use the neighborhood polynomial to compute the final solution update. Instead of Equation (6) we use


We will show that this new variant retains the linearity and conservation properties of the original algorithm.

Figure 2: In this example, cell 3 is merged with cell 2, cell 6 is merged with cell 5, cell 7 is merged with cell 4, cell 8 is merged with cell 5 and cell 9 is merged with cells 5, 6 and 8. Cell 5 is contained in 3 other neighborhoods as well as its own, resulting in an overlap count of 4. Cell 1 is in only its own neighborhood.

To show conservation, define the matrix such that


is the vector of cell volumes and

is the vector of neighborhood volumes. To make it concrete, for the case considered in Figure 2, the matrices for the original and weighted algorithms are given by

In the entries are the inverse of cell ’s overlap count, if cell is in cell ’s neighborhood. In , the generalized weights correspond to the previously defined and . The key property of both matrices is that the columns sum to 1,

where is a vector of all 1’s. We then have that

Furthermore, if slopes are zeroed in Eq. (5) then

Conservation then follows from

Including the slopes in Eq. (5) redistributes mass to preserve linearity but does not alter the conservation properties of the method. Continuing, in matrix form using (5) the update with slopes can be written as

where are the centroids of the original cells, are the weighted centroids of the neighborhoods and are the slopes in the neighborhood reconstruction. We then note that

Although not pursued here, the structure suggests that any collection of non-negative weights that reach the target neighborhood volumes with can be used to define the SRD algorithm. This opens the possibility of more sophisticated approaches, including ones that might depend on the local solution.

3.2 Algorithm Comparisons

We compare the performance of flux redistribution, the original SRD algorithm and the improved weighted SRD algorithm in a simple two-dimensional example. We consider linear advection of a passive scalar in the presence of a ramp, with cut cells at the bottom of the mesh where it intersect the ramp. We consider several different angles of tilt (40, 45 and 50 degrees) and different combinations of volume fractions. The velocity vector in each numerical test is parallel to the ramp. The initial condition for the advected scalar is a discontinuous Heaviside function; the scalar has concentration 1 to the left of the discontinuity and 0 to the right.

In Figure 3, the solution in the cut cells is reconstructed linearly to the centroid of the embedded boundary face in each cut cell. Figure 3 (left) shows the solution on a 40 degree ramp, so that the merging neighborhood is in the vertical direction. Figure 3 (right) shows the solution for a 50 degree ramp, so that the merging neighborhood is in the horizontal direction. In both examples the results after 10 time steps are shown. None of the algorithms guarantee monotonicity. We see however that in both the 40 degree and 50 degree cases that FRD generates overshoots and undershoots while both the original and weighted SRD algorithm do not. The new weighted algorithm is seen to be less dissipative than the original.

Figure 3: One-dimensional profile for advection parallel to a slanted wall. Left figure has the wall at 40 degrees to the horizontal; on the right the wall is at 50 degrees. FRD shows undershoots and overshoots, whereas both SRD algorithms do not. The profiles are taken after 10 time steps.

In our second set of tests we compare the original and weighted SRD algorithm for flow along a ramp at exactly 45 degrees from the horizontal. If the ramp exactly bisects the Cartesian cells (assuming equal mesh widths), the cut cell volume fraction is exactly 0.5, so neither algorithm is invoked. If the bottom wall is shifted up by , the cut cells along that wall are either slightly below 0.5 or almost full. The central merging neighborhood for this case is shown in Figure 3(a). Recall that the original algorithm has a sharp cutoff when the volume fraction of a cut cell reaches 0.5 and the original algorithm abruptly activates, while the weighted version transitions gradually. In Figure 3(b) the maximum difference between the solutions is 0.132, whereas in Figure 3(c) it is , and the unstabilized and shifted solutions are visually indistinguishable. We note that for this example, if the original algorithm merges in the vertical direction the maximum difference is , while if it merges horizontally the maximum difference is .


(a) Merging neighborhood for shifted case.
(b) Orig. SRD changes by 0.132 with the shifted wall.
(c) Wtd. SRD changes by with the shifted wall.
Figure 4: Change in solution with central merging when a ramp is shifted vertically up by .

4 Parallel Implementation

The generalized SRD algorithm is implemented for two and three spatial dimensions in C++ in the AMReX111https://github.com/AMReX-Codes/AMReX, 2021software framework [10]

. It is publicly available in the open source

AMReX-Hydro222https://github.com/AMReX-Codes/AMReX-Hydro, 2021 collection of routines. AMReX-Hydro includes modules for flux redistribution as well as state redistribution; the redistribution modules are used by compressible and low Mach number flow codes and the examples given in this paper all use this implementation.

The fundamental data structure in AMReX contains multi-dimensional arrays on logically rectangular patches within the computational domain. AMReX distributes patches of data to different MPI ranks; communication between patches at the same refinement level occurs most frequently by filling ghost cells (also known as halo cells); the SRD algorithm exploits the native AMReX routines for doing so.

While the SRD algorithm can be used in a simulation code that computes the solution on an AMR hierarchy, the current implementation assumes that the coarse-fine boundary does not cross any cut cells. Thus the SRD algorithm is effectively used one level at a time, and is exactly the same as it would be in a code with uniform mesh resolution.

To implement the SRD algorithm on a set of rectangular patches that are all at the same level, we must know how many ghost cells need to be filled for each type of data. The version of the SRD algorithm implemented in AMReX starts with a set of neighborhoods, no neighbor is more than one cell away in any coordinate direction. If necessary, the slope computation can increase in a direction up to a stencil if doing so does not require additional ghost cells. We break the SRD algorithm into two sections for this purpose. In the preprocessing step of SRD, we compute the necessary geometric information such as neighborhood volumes, centroid locations, number of neighbors, etc. This information does not change between time steps unless a regridding operation is performed in an AMR simulation.





(a) and gradient needed to update cell .

(b) needed to compute gradient of neighborhood.

(c) needs and overlap count to form its neighborhood.

(d) needs to know neighborhood, which may need cell volume.
Figure 5: For illustrative purposes, we assume central merging neighborhoods are used for all cells. Each of the neighborhoods is illustrated.

The stencil width for the preprocessing step is determined by how many cells are required in postprocessing, so we count that first using only one direction. To update we may need the neighborhood value , for example with central merging. (Central merging uses a neighborhood around cell ). To obtain a slope on the neighborhood associated with cell we need the neighborhood average . This latter value may have needed , again if using central merging. So the stencil for postprocessing can be up to three cells in every direction.

To provide for the postprocessing step the term was needed (simplifying subscripts from now on). could depend on whether neighborhood overlaps cell . So we have to know the details of ’s neighborhood, which may depends on . This is the worst case stencil, depicted in Figure 5. This determines how many halo cells are needed to preprocess without any additional communication between patches during single time step. Of course in the simplest case, normal merging will suffice and fewer cells are required. There is also a trade-off between using fewer ghost cells but communicating more often. However in the example shown later we preprocess using this maximum number of ghost cells without additional communication.

5 SRD Applications

The SRD approach was originally applied to hyperbolic conservation laws in two dimensions. Here we describe how the SRD methodology is incorporated into a multi-physics solver with a more complex time stepping algorithm. We consider the compressible multi-component Navier-Stokes equations with chemical reactions, and low Mach number models where the flow evolves subject to a constraint. The computational results in section 6 will demonstrate some of these extensions in two geometries of interest.

5.1 Compressible Reacting Flows

For compressible flows, the following conservation equations for mass, species mass fractions, momentum, and energy with a finite rate evaluation of chemistry are solved [11]:


where is the density, is the velocity vector, is the pressure of the mixture, is the number of species, and is the mass fraction of the -th species. It is assumed that the species of the gaseous mixture are in thermal equilibrium, at a common temperature .

is the viscous stress tensor given by:


where is the shear viscosity and is the bulk viscosity. The diffusive transport flux of the -th species, , is approximated using a mixture-averaged diffusion process. is the chemical species reaction source term for the -th species. is the thermal conduction heat flux and is the specific enthalpy of species .

The system (13) includes both advective and diffusive fluxes and includes reactions that are potentially stiff on the time scale of advection and diffusion. The overall integration treats advection and diffusion explicitly and integrates the reactions using a stiff ODE solver. The time integrator for the compressible equations is a standard second order predictor-corrector approach with an optional fixed point iteration that can be used to tightly couple the reaction and transport terms in the equations, that is similar in spirit to spectral deferred corrections.

Let represent the cell update due to reactions, and represents the advective and diffusive update over a time step. The reactions terms are computed pointwise by integrating the ODE representing the reactions. Initially at the start of the simulation is obtained by integrating the reaction terms without advection or diffusion. For later steps, is initialized to the final value from the previous time step; i.e., as defined below. State redistribution is used to stabilize small cells for advection and diffusion as shown below.

Step 1: Initialize as discussed above.

Step 2: Compute advective and diffusive fluxes, at centroids of faces and evaluate preliminary update using Eq. (3) to obtain

Details of the flux computation are given below.

Step 3: Apply state redistribution and compute (but omitting the cell indices) using

Step 4: Define

Steps 5-7 are repeated times to couple advection, diffusion and reactions. Formal second order accuracy requires of at least 1.

Step 5: Evaluate fluxes at centroids using and form the flux divergence

and form

and apply state redistribution to obtain

Step 6: Update the reaction term

where evolves the reactions from to with initial condition and a constant source term and then computes the average change due to reactions over the time interval.

Step 7: Finally we compute the update

For the spatial discretization we perform a limited extrapolation of the characteristic variables to the centers of the cell faces using a standard van Leer MUSCL limiter. If one of the cells in the slope computation is completely covered then the one-sided slope in that direction is zeroed. A two-shock approximation Riemann solver is used at face centroids. A normal Riemann solver is used to compute the flux at the EB face.

For the diffusive fluxes, derivatives at cell faces are approximated with a centered second order finite-volume discretization. At cut faces with area fraction less than one, the stencil is modified to evaluate the fluxes at face centroids. At the embedded boundary face, we cast a ray into the interior of the domain and interpolate values onto that ray to compute a one-sided diffusive flux. By formulating the diffusion in a finite volume form, SRD stabilizes this term too.

5.2 Low Mach Number Flow

Next we show how SRD is applied to generic low Mach number flows of the form

evolving subject to the constraint


Here corresponds to some set of additional variable such as species mass fractions, enthalpy or an advected scalar, and represents additional source terms such as diffusion or reactions. The velocity satisfies an inhomogeneous constraint where is specified in terms of thermodynamic variables so that the system maintains thermodynamic pressure. The perturbational pressure, can be thought of as a Lagrange multiplier that ensures that the evolution of the momentum field is consistent with the constraint.

For this set of equations, SRD only needs to be applied to the convective terms, since the reaction terms are evaluated pointwise in each cell, and the diffusive term is treated implicitly. We give an overview of the time advance and indicate at which point SRD is applied.

The equations are discretized using a predictor-corrector method-of-lines approach with an approximate projection (see [12] for details). In the predictor we define provisional values at time by solving

Here is a discretization of , is a discretization of , and is a discretization of ; is an approximation to , and is an approximation to . After constructing , we perform an approximate projection to enforce the constraint (18) for and define .

In the corrector step, we define the solution at by solving

where the advective terms are defined using the results of the predictor step. We then apply another approximate nodal projection to enforce the constraint (18) for and define .

In both the predictor and corrector steps outlined above, we construct the advective terms as follows.

Step 1: Predict, and if necessary limit, the normal velocity on every cell face with non-zero area. For cut faces, this involves all components of the velocity gradient, since the cut face centroid is not coordinate-aligned with the cell centroid. Given left and right states at each face, solve the Riemann problem with upwinding based on the reconstructed velocities and .

Step 2: Project the normal velocities to satisfy the constraint (18). This defines .

Step 3: Predict all quantities on faces, similarly to Step 1. Here represents and all three velocity components. At each face solve a Riemann problem with upwinding based on

Step 4: For or , we construct

where ’s are the area fractions of the cut faces. If is one of the velocity components, an additional source, , is added to the right hand side. These formulas do not include an embedded boundary face since the flux is zero there.

Step 5: Apply state redistribution. To isolate the effect of SRD to the advective terms, first define

We then apply SRD to to give

and define

Time stepping then continues with the next predictor or corrector step.

As with the other examples, we also apply SRD to the initial data at the start of the simulation. In this case the initial data is simply replaced by the redistributed data.

6 Computational Examples

We present 3D results from fluid flow simulations in engineering geometries to demonstrate SRD in settings with advective, diffusive and reacting updates. The examples in sections 6.1 and 6.2 rely respectively on the compressible reacting flow and low Mach number flow models introduced in sections 5.1 and 5.2. The SRD algorithm is implemented in AMReX-Hydro as well as a number of other codes including the incompressible flow code IAMR333https://github.com/AMReX-Codes/IAMR, 2021, the Pele444https://github.com/AMReX-Combustion, 2021 suite of 3 codes, which includes a low-Mach reacting code, a compressible reacting code, and a physics library, and MFIX-EXA555https://github.com/AMReX-Codes/MFIX-Exa, a computational fluid dynamics–discrete element model (CFD-DEM) code for low Mach number reacting multiphase flows [12]. All of these codes, including the SRD implementation, run on hybrid architectures; the SRD implementation itself runs on both CPUs and GPUs.

6.1 Compressible Fuel Injection in a Piston-bowl Geometry

Figure 6: Schematic of the piston bowl geometry, based on a production turbocharged diesel engine.

To demonstrate the effectiveness of the state redistribution scheme for compressible flows, PeleC was used in a simulation of fuel injection in a piston bowl geometry. Complex geometries that concentrate and enhance the flow structure are typical in combustion engines. The geometry used in this work resembles a production turbocharged diesel engine, shown in Figure 6. The physical domain size is cm in the and directions and cm in the -direction. In compression ignition engines, a multi-hole injector is typically used to inject fuel at the end of the compression stroke. In this work, four discrete gas-phase fuel jets (methane) are injected at the top boundary into a high temperature oxidizer (oxygen and nitrogen). The inflow velocity conditions for the jets are taken from a turbulent pipe flow precursor simulation. The temperature of the fuel is 300K. The fuel is injected for s at the start of the simulation. The chemistry kinetics model in this simulation is DRM19 (21 species, 84 reactions) subset of the GRI-Mech 1.2 methane mechanism [13, 14]. The ideal gas model equation of state is used to close the equation system.

The simulation domain is discretized on the base level with cells, leading to a base level grid size of cm. The simulation is performed with 2 levels of refinement (effective grid resolution of cm) and, when the jets impact the piston bowl side walls, 11 million cells on the finest level. This simulation is not possible with the standard flux redistribution scheme as it leads to unphysical solution values (e.g. concentrations less than 0 or ). The smallest volume fraction allowed in the calculation was to avoid issues with the AMReX EB geometry generation. The SRD algorithm is a negligible portion of the total runtime, typically not exceeding .

(a) s.
(b) s.
(c) s.
(d) s.

Figure 7: Isosurface of methane mass fraction at 0.1 colored by the magnitude of velocity.

Simulation results are shown at different times in Figures 7 and 8. The velocity of the fuel is initially high as the jet penetrates into the domain. The different fluid densities of the oxidizer and fuel lead to characteristic Kelvin-Helmholtz roll-ups at the jet tips, Figures 6(a) and 7(a). As the fuel jets hit the piston bowl side walls, the fuel is redirected towards the top and the towards the bottom of the bowl, Figures 6(b) and 7(b). As can be seen in the mass fraction contours in Figure 7(b), the fuel concentration at the inlet has decreased since the fuel stops being injected starting at s. After impacting the side walls of the piston bowl and interacting with the sides walls, Figures 6(c) and 7(c), the jets break up and mix into the background fluid, Figures 6(d) and 7(d).

(a) s.
(b) s.
(c) s.
(d) s.
Figure 8: Pseudocolors of velocity magnitude (left) and methane mass fraction (right) at cm.

6.2 Circulating Fluidized Bed Geometry

Fluidization is the phenomenon by which solid particles are converted to a fluid-like phase through the introduction of gas. The resultant mixing of gas and particles provides favorable heat and mass transfer within the system. Such systems are commonly encountered in drying, granulation, coating, heating, and cooling, and over a wide range of industries such as food, agriculture, pharmaceutical, energy and mining. The circulating fluidized bed (CFB) is a type of fluidized bed system that utilizes a recirculating loop for even greater mixing efficiency between the particles and gas. One CFB configuration consists of four main sections – riser, standpipe, loop seal and cyclone. Fluidizing air is introduced primarily from the bottom of the riser. Secondary inflows at lower velocities are also introduced at the bottom of the loop seal and standpipe sections. The gas exits via the cyclone on the top after flowing over the dense particle phase near the bottom. The loop seal returns the particles to the bottom of the riser as they get collected in the standpipe.

MFIX-Exa [12] was used to simulate such a circulating fluidized bed (CFB) geometry, but without any particles since the focus of this study was only on the fluid phase. (See [12] for a discussion on how particles are treated in an embedded boundary setting.) In this three-dimensional configuration (Figure 9a), the riser, standpipe and loop seal sections have a rectangular cross section, whereas the cyclone has a cylindrical cross section.

The CFD model for the fluid uses an incompressible flow formulation and employs the SRD algorithm for velocity redistribution. No heat transfer or chemical reactions were considered. The domain of size 3.2m in direction, 1.6m in the direction and 8m in the direction is resolved by a 2.5cm mesh. Cells whose volume fractions are less than are treated as covered cells in this simulation, due to difficulties in the linear solver that will be the subject of future research. The gas inflow velocity in the riser is increased from 2.5m/s to 9m/s after 1s. The secondary inflows in the loop seal and standpipes are 0.25m/s and 0.15m/s respectively. Figure 9b shows an instantaneous snapshot of the gas velocity magnitude distribution along a vertical slice at = 0.8m at = 1.8s.





(a) CFB geometry


(b) Gas velocity distribution = 0.8m
Figure 9: Circulating fluidized bed simulation in MFIX-Exa.

The smallest cut cell in this computation had a volume fraction of . Zooms of the grid around the loop seal and the area below the cyclone are shown in Figure 10.

Figure 10: Zoom of grid cross-sections, with colorbar showing cut cell volume fractions.

7 Conclusions and Future Work

We have extended the state redistribution algorithm in several important ways. We introduced a framework to develop new state redistribution methods, and proposed a new variant that activates smoothly with cut cell volume fraction. Next, we showed that SRD can be used in the context of both compressible and incompressible flows, as well as when the PDEs contain reaction and diffusion terms. All of the presented numerical experiments were obtained using an implementation of SRD in the parallel exascale code AMReX-Hydro.

There are a number of interesting possible extensions of the SRD algorithm presented here. For example, one could incorporate knowledge of the local velocity in defining a local weighting scheme for low Mach number flows. This would be advantageous when the time step is constrained by a velocity away from the cut cells. One could adjust the target volume by the ratio of the local velocity and thus potentially eliminating the need to redistribute at all in low-speed regions. For low Mach number flows one could incorporate the pressure gradient as well. In the context of compressible reacting flows, we would like to formulate a density-weighted SRD algorithm, borrowing ideas used in flux redistribution algorithms [15]. We also plan to develop a multi-component slope limiting procedure for the neighborhood slopes to guarantee that chemically reacting species have mass fractions that always sum to 1.


AG was partially supported by the Natural Sciences and Engineering Research Council of Canada (NSERC) under Award No. PDF-546085-2020. AS and JB were supported by the Exascale Computing Project (17-SC-20-SC), a collaborative effort of the U.S. Department of Energy Office of Science and the National Nuclear Security Administration under contract DE-AC02-05CH11231. MH’s work was authored in part by the National Renewable Energy Laboratory, operated by Alliance for Sustainable Energy, LLC, for the U.S. Department of Energy (DOE) under Contract No. DE-AC36-08GO28308. Funding provided by U.S. Department of Energy Office of Science and National Nuclear Security Administration. The views expressed in the article do not necessarily represent the views of the DOE or the U.S. Government. The U.S. Government retains and the publisher, by accepting the article for publication, acknowledges that the U.S. Government retains a nonexclusive, paid-up, irrevocable, worldwide license to publish or reproduce the published form of this work, or allow others to do so, for U.S. Government purposes. This research was supported by the Exascale Computing Project (17-SC-20-SC), a collaborative effort of the U.S. Department of Energy Office of Science and the National Nuclear Security Administration. A portion of the research was performed using computational resources sponsored by the Department of Energy’s Office of Energy Efficiency and Renewable Energy and located at the National Renewable Energy Laboratory.


  • [1] R. Saye, “Implicit mesh discontinuous Galerkin methods and interfacial gauge methods for high-order accurate interface dynamics, with applications to surface tension dynamics, rigid body fluid-structure interaction, and free surface flow: Part I,” J. Comput. Phys., vol. 344, pp. 647–682, 2017.
  • [2] N. Gokhale, N. Nikiforakis, and R. Klein, “A dimensionally split Cartesian cut cell method for hyperbolic conservation laws,” J. Comput. Phys., vol. 364, pp. 186–208, 2018.
  • [3] M. J. Berger, C. Helzel, and R. J. LeVeque, “H-box methods for the approximation of one-dimensional conservation laws on irregular grids,” SIAM J. Numer. Anal., vol. 41, pp. 893–918, 2003.
  • [4] I.-L. Chern and P. Colella, “A conservative front tracking method for hyperbolic conservation laws,” LLNL Rep. No. UCRL-97200, Lawrence Livermore National Laboratory, 1987.
  • [5] P. Colella, D. T. Graves, B. J. Keen, and D. Modiano, “A Cartesian grid embedded boundary method for hyperbolic conservation laws,” J. Comput. Phys., vol. 211, no. 1, pp. 347–366, 2006.
  • [6] M. Berger and A. Giuliani, “A state redistribution algorithm for finite volume schemes on cut cell meshes,” J. Comput. Phys., vol. 428, p. 109820, 2021.
  • [7] W. Zhang, A. Myers, K. Gott, A. Almgren, and J. Bell, “AMReX: Block-structured adaptive mesh refinement for multiphysics applications,” Int. J. High Perform. Comput. Appl., June 12, 2021.
  • [8] A. Giuliani, “A two-dimensional stabilized discontinuous Galerkin method on curvilinear embedded boundary grids,” To appear in SIAM J. Sci. Comput., arXiv:2102.01857.
  • [9] M. Berger, “A note on the stability of cut cells and cell merging,” Appl. Numer. Math., vol. 96, pp. 180–186, 2015.
  • [10] W. Zhang, A. Myers, K. Gott, A. Almgren, and J. Bell, “AMReX: Block-structured adaptive mesh refinement for multiphysics applications,” Int. J. High Perform. Comput. Appl., vol. 35, no. 6, pp. 508–526, 2021.
  • [11] H. Sitaraman, S. Yellapantula, M. T. Henry de Frahan, B. Perry, J. Rood, R. Grout, and M. Day, “Adaptive mesh based combustion simulations of direct fuel injection effects in a supersonic cavity flame-holder,” Combust. Flame, vol. 232, p. 111531, 2021.
  • [12] J. Musser, A. S. Almgren, W. D. Fullmer, O. Antepara, J. B. Bell, J. Blaschke, K. Gott, A. Myers, R. Porcu, D. Rangarajan, M. Rosso, W. Zhang, and M. Syamlal, “MFIX-Exa: A path toward exascale CFD-DEM simulations,” Int. J. High Perform. Comput. Appl., 2021.
  • [13] M. Frenklach, H. Wang, M. Goldenberg, G. Smith, D. Golden, C. Bowman, R. Hanson, W. Gardiner, and V. Lissianski, “Tech. Rep. GRI-95/0058,” tech. rep., Gas Research Institute, 1995.
  • [14] J. B. Bell, M. S. Day, J. F. Grcar, M. J. Lijewski, J. F. Driscoll, and S. A. Filatyev, “Numerical simulation of a laboratory-scale turbulent slot flame,” Proc. Combust. Inst., vol. 31, no. 1, pp. 1299–1307, 2007.
  • [15] R. B. Pember, J. B. Bell, P. Colella, W. Y. Curtchfield, and M. L. Welcome, “An adaptive Cartesian grid method for unsteady compressible flow in irregular regions,” J. Comput. Phys., vol. 120, no. 2, pp. 278–304, 1995.