1 Introduction
We review methods that integrate medical image analysis algorithms with biophysical models. There are several benefits in integrating biophysical modeling with medical imaging. It can aid prognosis, diagnosis, and the design of new treatment protocols. It can aid our understanding of how biophysics relate to imaging information. Examples of medical image analysis algorithms include image segmentation, image registration, and parameter estimation for biomarkers.
The literature on the integration of biophysical (or biophysically inspired) models with image analysis algorithms is quite extensive. We will consider three problems: The first one is diffeomorphic image registration (see §2). This problem is paramount to many applications in medical imaging [63, 186]. It is about establishing a pointwise spatial correspondence , , between two images and of the same object so that the transformed template image becomes similar to the reference image , i.e,. for all , where denotes the function composition [63, 145]. We require that the map is a diffeomorphism, i.e., is a bijection, continuously differentiable, and has a continuously differentiable inverse. In our formulation, we do not directly invert for the map but introduce a pseudotime variable and invert for the velocity of . In the context of Lagrangian methods, the map at a particular location represents the end point of the characteristic at defined by the velocity (see [132]). In the present work, we use an Eulerian formulation instead. Here, the PDE constraint is, in its simplest form, given by a hyperbolic transport equation for the image intensities subjected to the velocity . The regularization model is a Sobolev norm that stipulates smoothness requirements on . If is sufficiently smooth it is guaranteed that gives rise to a diffeomorphism [20, 39, 209]. This formulation can be augmented by biophysics operators (hard constraints^{1}^{1}1In general, constraints can either be hard or soft. Hard constraints are a set of conditions that the variables are required to satisfy. Soft constraints are penalties for variables that appear in the objective functional; they penalize a deviation of the variables from a condition.
) to incorporate additional prior knowledge on the expected deformation map. One motivation for adding such constraints is that often, especially in longitudinal studies of the same patient, there is a real deformation by the realization of an actual physical phenomenon. A simple constraint, which nonetheless poses significant numerical challenges, is the incompressibility of tissue
[128, 129, 131]. Examples for more complex biophysical constraints are brain tumor growth models [99, 73, 174, 210, 211, 212] or cardiac motion models [188].We will discuss all of these constraints either in the context of diffeomorphic image registration or in the more generic context of data assimilation in medical imaging. This brings us to the second problem we are addressing in this article: Data assimilation in brain tumor imaging [70, 100, 114, 135, 138] (see §3). The PDE constraint is, in its simplest form, a nonlinear parabolic differential equation. The inversion variables are, e.g., the initial condition, the growth rate of the tumor, or a diffusion coefficient that controls the net migration of cancerous cells within brain parenchyma. The regularization model is in our case an penalty. The third problem is cardiac motion estimation (see §4).
1.1 Contributions

We discuss the implementation of a distributedmemory Newton–Krylov solver for two of these problems (see §5).
In this paper, we summarize our recent contributions in the field of diffeomorphic image registration [128, 129, 131, 132, 130] and data assimilation in brain tumor imaging [70, 71, 135, 127]. In particular, we will present the formulations described in [128, 129] and [70], the Newton–Krylov methods presented in [128, 130] and [70], and review the distributedmemory implementations described in [129, 71]. We have also added some new results, which have not been presented elsewhere (see §6.1).
1.2 Mathematical Setting
The problems we consider in this paper can be viewed as an inverse problem of estimating a parameter function (inversion field) based on observations , which we assume to be solutions of a system of PDEs with , defined on a domain , , . We, e.g., consider linear hyperbolic and nonlinear parabolic PDE constraints . We assume that the data is a nonlinear function of the parameters , i.e., the parametertoobservation map is given by , where is measurement noise. Here, is an observation operator, i.e., a projection of the state variable onto locations in (or ) to which the observation (or desired state) is associated. The problem of recovering from is often illposed due to the fact that the measured data is finite and perturbed by noise. A common approach to tackle this challenge is to solve for a smooth, locally unique solution of a nearby problem by imposing prior knowledge based on an adequate regularization scheme [59], for example a Tikhonov functional . Overall, we arrive at the following formulation
(1)  
The problem in (1) balances data fidelity (for simplicity, we only consider an distance) with regularity (controlled by the (regularization) parameter ). Designing efficient algorithms for PDEconstrained optimization problems remains a significant challenge [2, 27, 34, 95, 97, 120]. These problems typically involve an infinite number of unknowns, which, upon discretization, leads to highdimensional systems (in our case, e.g., unknowns for clinically relevant problems). Usually, the solver has to be tailored to the structure of the operators. We focus on derivative/adjoint based algorithms for PDEconstrained optimization. The evaluation of the objective functional and its derivatives involves expensive computations, which, in our examples, are the solution of PDEs. We tackle these challenges by designing computational methods that rigorously follow mathematical and physical principles, and are based on efficient, robust, highfidelity algorithms with a sound theoretical basis.
Remark
Typically, the model parameters are not
the final quantity of interest. More often, we need to predict some other future quantity (e.g., tumor infiltration to healthy tissue or the probability of tumor recurrence). Using a deterministic formulation, like the one in (
1), only provides point estimates. Due to uncertainties in the data (denoted above by ), the model parameters , the inversion algorithm, and the mathematical model, we require confidence intervals for the quantities of interest, not just point estimates. That is, we are interested in uncertainty propagation from the input to the quantity of interest. This can be achieved by using a Bayesian framework
[107, 192, 187]. Formulating the problem as a Bayesian inverse problem provides the posterior joint probability density of the parameters . Related quantities of interest can be computed by computing expectations with respect to the posterior. The solution of the deterministic inverse problem corresponds to the maximum a posteriori probability estimate. This connection between deterministic inversion and statistical inference can be used to devise efficient sampling strategies that exploit local problem structure [67, 68, 136, 162]. In what follows, we will focus on deterministic inversion.1.3 Limitations and Open Issues
Despite the success of biomechanical models, in many cases their clinical application is limited to qualitative analysis. Biomechanical simulators are hampered by imprecise and complex constitutive laws, and the uncertainties in boundary and initial conditions. Moreover, even assuming we had a complete knowledge of the models and their parameters, solving the underlying PDEs is computationally challenging; accounting for the uncertainty amplifies the computational difficulties.
The key challenges and limitations of biomechanically informed algorithms for medical image analysis can be summarized as follows:

Models are often phenomenological; many parameters in the equations do not have direct biological counterparts and can therefore not be directly obtained from data.

Boundary and initial conditions have significant uncertainties.

In a clinical setting, invivo medical imaging data has typically poor resolution and may contain significant amount of noise, which makes the calibration of complicated (multiparameter) models quite challenging.^{2}^{2}2We note that highresolution imaging technologies have emerged; an example is CLARITY imaging [44, 194, 115]; we revisit this exsitu imaging technique briefly in §2. Highresolution imaging techniques are not yet available in a clinical setting. Resolution levels for routinely collected imaging is between and along each spatial direction (depending on the imaging modality).

The imaging operator is extremely complicated. The question on how observations in imaging data map to model outputs is a difficult one and depends on the scanner. Oftentimes, pseudocorrespondences are used.

Complicated models result in illconditioned, multiphysics operators that are challenging to solve.

We obtain complex optimization problems that are highly nonlinear and timedependent.
We limit this review to problems we have considered in our past work. We note that PDEconstrained optimization appears in other contexts in medical imaging. For example, many image reconstruction algorithms involve physical models and the reconstruction phase could be formulated as a PDEconstrained optimization problem. However, this usually leads to overly complicated algorithms. Typically approximation and spectral analysis of the underlying operators are preferable and lead to direct reconstruction algorithms. Here, we briefly give some references for problems in which PDEconstrained optimization has been considered in practice. In elastography, the goal is to estimate elastic material properties using, e.g., ultrasound. The main PDE is wave propagation. Examples can be found in [72, 153, 155]. In optical diffusion tomography, the goal is to image optical properties of tissue using diffuse approximations of light scattering. The main PDEs are diffusion, radiationdiffusion, or highlydamped Helmholtz. Examples include [7, 8, 105, 168, 173].
1.4 Organization and Notation
We denote the standard inner product by and the norm by , both defined on the spatial domain , . We denote discretized quantities by upright boldface letters. That is, the discrete representation of is denoted by . Likewise, a superscript is added to the operators whenever we refer to discretized quantities.
We review the literature and present the formulation for the diffeomorphic registration problem in §2 and for the data assimilation problem in brain tumor imaging in §3. We review relevant literature for cardiac motion estimation in §4. We present the implementation of our solver and its deployment in highperformance computing platforms in §5. We showcase exemplary numerical results in §6. We conclude with §7. Additional details about our solver can be found in the appendix (§B and §C).
2 Diffeomorphic Registration
Let us start by defining more precisely what we mean by diffeomorphic image registration. In diffeomorphic image registration we require that the spatial transformation that maps the template image to the reference image is a diffeomorphism. A diffeomorphism is an invertible map, which is continuously differentiable (in particular a function) and maps onto itself. Formally, we require that does not change sign or vanish for to be locally diffeomorphic.
2.1 Literature Review
Providing a comprehensive review on image registration is impossible due to the large body of literature. We refer to [63, 144, 145, 186] for a general introduction to the problem, recent algorithmic developments, and applications. We consider the problem of diffeomorphic image registration as a hyperbolic optimal control problem [33, 39, 90, 128, 129, 130, 131]; i.e., problems with a transport or continuity equation as PDE constraint (we refer to §2.2 for details on the problem formulation).^{3}^{3}3Similar formulations can be found in other applications, e.g., geophysical sciences [65]. These types of formulations are based on conceptual ideas from computational fluid dynamics and flow control [78]. They share characteristics with traditional optical flow formulations [102, 108, 163, 172] and formulations for optimal mass transport [6, 22, 25, 85, 184, 197].
Viscous fluid formulations for diffeomorphic image registration are pioneered in [42, 43]. Here, the basic idea is to introduce a pseudotime variable and parametrize the deformation map by its velocity . This approach has been embedded into a variational framework in [20, 55, 140, 142, 195]; the problem formulation takes the form
subject to for and for , where is equivalent to , and . In the formulation above, we seek to find a time dependent, smooth, and compactly supported velocity field , where is a Sobolev space of certain regularity. The Sobolev norm in the problem formulation above guarantees that belongs to a space
of regular vector fields, which in turn guarantees (the regularity requirements are discussed in
[55, 195, 20]) that the solution to is a diffeomorphism. This approach embeds the diffeomorphisms in a Riemannian space; the time integral of the squared Sobolev norm of is a geodesic distance (a Riemannian metric) between the identity map at and the diffeomorphism at . The optimal defines the shortest path on the manifold of diffeomorphisms that connects with .As we will see below, we will use a PDEconstrained (Eulerian) formulation instead. A related formulation that includes additional hard and soft constraints on and can be found in [118, 119]. This is different to the PDEconstrained formulations in [33, 38, 39, 90, 128, 129, 130, 131]. Here, a linear advection equation is used to model the transport of the intensities of . The latter approach does not operate on the space of diffeomorphisms; does not appear explicitly, only does. This formulation has been augmented by hard or soft constraints on the divergence of [33, 39, 128, 129, 131, 172], rendering the deformation map (nearly) incompressible [79, p. 77ff.].^{4}^{4}4An interesting direction of research is to augment these PDE constraints by more complex biophysics operators [99, 73, 188, 210, 211, 212]. This results in additional parameters that need to be calibrated. In both approaches has to be sufficiently smooth to guarantee that the associated deformation map is a diffeomorphism. These smoothness requirements are typically stipulated by the regularization operator—a Sobolev norm, the order of which depends on the smoothness of the images, the particular form of the transport equation, and the existence of additional constraints (such as, e.g., a divergencefree velocity) [18, 20, 38, 39, 49, 55, 195] (see §2.2).
The differences between our formulation and PDEconstrained formulations for optimal mass transport [22, 25, 85, 184] are the constraint (a continuity equation to preserve mass^{5}^{5}5Applications for masspreserving registration can be found in [36, 132, 205]. versus a transport equation) and the regularization model (norm of scaled by the transported density versus Sobolevnorm of ). The conceptual differences between our formulation and traditional optical flow formulations [102, 108, 163, 172] are that we do not consider a time series of images and the transport equation is a hard constraint; it does not directly enter the objective functional. PDEconstrained formulations for optical flow, which are equivalent to our formulation, are described in [5, 18, 33, 39].
There has only been a limited amount of work devoted to the design of effective algorithms for velocitybased diffeomorphic image registration that exploit stateoftheart technology in scientific computing. There are several challenges for the design of efficient solvers. First and foremost, image registration is an illposed, highly nonlinear inverse problem. It is established that gradient descent schemes for numerical optimization (which are still predominantly used [13, 14, 15, 20, 39, 81, 90, 201]) only converge slowly when applied to nonlinear, illposed problems. Recently, several groups have developed Newtontype algorithms to address this issue [11, 25, 128, 129, 130, 131, 132, 184, 96]. In this context, the design of an effective preconditioner for the Hessian is essential for these methods to be competitive in terms of the timetosolution; different approaches have been discussed in [25, 128, 130, 132, 184, 96] (see [23] for an introduction to preconditioning of saddle point problems). Another issue is the size of the search space, in particular when considering 3D problems (in space).^{6}^{6}6The number of unknowns is the number of discretization points in space times the number of discretization points in time for the velocity times the dimensionality of the ambient space (we invert for a timedependent vector field). The size of the search space can be reduced by exploiting the fact that the momentum of the variational problem is constant in time (in Lagrangian coordinates). That is, the flow of the diffeomorphism can completely be encoded by the momentum at [11, 141, 201, 208, 209]. Another approach is to invert for a stationary velocity field [9, 10, 94, 129, 131, 132].^{7}^{7}7A stationary velocity field is a velocity field that is constant in time, as opposed to a nonstationary—i.e., timedependent or transient—velocity field. We note that stationary velocity fields do not cover the entire space of diffeomorphisms and do not provide a Riemannian metric on this space, something that may be desirable in certain applications [20, 140, 213]; this requires timedependent velocities. The work in [128] uses a Galerkin method to control the number of unknowns in time. It is shown experimentally that stationary and nonstationary velocities yield an equivalent registration quality in terms of data mismatch. In addition to that, we have to deal with an expensive parametertoobservation map, which, in our formulation, is a hyperbolic transport equation. Several strategies to efficiently solve these equations have been considered; they range from conditionally stable total variation diminishing [33], second order RungeKutta [128, 129], and highorder essentially nonoscillatory [90] schemes, to unconditionally stable implicit LaxFriedrich [25, 184], semiLagrangian [20, 39, 130, 131], and Lagrangian [132] schemes.
We note, that there exists a large body of literature that discusses effective parallel implementations of solvers for PDEconstrained optimization problems; examples include [1, 2, 28, 29, 30, 27, 26, 183]. There has been only little work on distributedmemory algorithms for diffeomorphic image registration. Realtime implementations for low dimensional (rigid) registration are, e.g., described in [180, 171]. Popular software packages for (in many cases) diffeomorphic image registration are described in [10, 15, 109, 143, 145, 156, 199, 200]. None of these uses a PDEconstrained formulation like the one discussed here. Most implementations use open multiprocessing [15, 109, 199, 200] or graphic processing units (GPUs) for acceleration [77, 82, 143, 197, 185, 151, 77, 58, 112, 198, 126]. A survey of registration algorithms that exploit GPUs for acceleration can be found in [64]. A literature survey on high performance computing in image registration can be found in [181, 57, 178]. Works on deploying solvers for diffeomorphic image registration to supercomputing platforms can be found in [82, 81, 80, 131, 71]. The work that is most closely related to ours [131, 71] is [82, 80] and [198]. In [82, 81, 80] a multiGPU accelerated implementation of the diffeomorphic image registration approach described in [106] is presented. The work in [198] discusses a GPU accelerated implementation of the diffeomorphic algorithm described in [10].
Recent advances in medical imaging [44, 194, 115] result in data sizes that pose the demand for deploying scalable, distributedmemory implementations.^{8}^{8}8The resolution of these datasets can reach resulting in of data (if stored with half precision) [194]. Overall, this results in 2.4 trillion discretization points in space. To the best of our knowledge, only our solver [131, 71] has been shown to scale up to similar problem sizes. Moreover, we are the first group that proposed an implementation of a globalized (Gauss–)Newton–Krylov solver for a PDEconstrained approach to diffeomorphic image registration that scales on supercomputing platforms [131, 71].
2.2 Formulation
We assume that the reference image and the template image are continuously differentiable functions, and compactly supported on , . In diffeomorphic image registration we seek to find a diffeomorphism such that the transformed template image becomes similar to the reference image , i.e., [144, 145]. In our formulation, we do not directly invert for the deformation map . We introduce a pseudotime variable and invert for a smooth velocity field (control variable) , , , where is a Sobolev space of order (see below); the velocity parameterizes the deformation map . We use an Eulerian formulation; in its simplest form, the PDE constraint in (1), i.e., the forward operator, is given by the transport equation [128, 90] equationparentequation
(2a)  
(2b) 
and periodic boundary conditions on . In our formulation, the map does not appear explicitly. This is different from mapbased formulations (i.e., formulations that directly invert for ) [36, 63, 133, 145] and Lagrangian formulations for velocitybased diffeomorphic image registration [15, 20, 132]. In our formulation, we model the deformed template image as the solution of the transport equation (2) at and denote it by .
In the inverse problem we seek to find a velocity such that the transported template image is equal to for all . As can be seen in (1), we use a squared distance to measure the discrepancy between and , where corresponds to and projects the state to the terminal time frame at , i.e., .^{9}^{9}9Other distance measures, such as mutual information, normalized gradient fields, or cross correlation can be used (see [144, 145] for an overview). In our formulation, changing the distance measure will affect the first term in the objective functional and the terminal condition of the adjoint equation (8) in §5.1.1. Notice, that the reference image as well as the initial condition of the forward problem in (2) are measured function, and, hence, may contain noise.
This formulation has been augmented by constraints on the divergence of [39, 128, 130, 131]. We obtain an incompressible diffeomorphism if is divergencefree [79, p. 77ff.], i.e., . In this case, is the space of divergencefree velocities with Sobolev regularity of order . We can relax the incompressibility assumption by introducing a masssource , i.e., [129]. This is equivalent to adding a penalty on the divergence of to the variational problem [33]. In [132] it has been suggested to replace (2a) with the continuity equation, e.g., used in optimal mass transport formulations [22, 25, 85, 184] to model the preservation of mass of .
A critical ingredient of the variational problem formulation is the regularization operator ; choosing an adequate Sobolev norm guarantees that the deformation map associated with is a diffeomorphism [20, 55, 140, 142, 195]. In general format, is given by
(3) 
Here, is a positive definite, selfadjoint differential operator. In [20] it is shown that we require (slightly more than) regularity for if we assume that and are functions. Accordingly, with , , has become a popular choice [20, 94, 213]. If we model the images as functions of bounded variation instead, we require that is an function [39]. This requirement can be relaxed to integrability if we add a diffusion operator to (2a) [18]. A detailed study of (near) incompressible flows can be found in [38, 39, 49]. The work in [33] considers an additional regularization term for . In [129, 131] is modeled as a stationary field; the integral in time in (3) can be dropped. Extensions to nonquadratic regularization models are discussed in [129, 163]. In particular, the work in [129] introduces a nonlinear regularization operator that enables a control (promote or penalize) of shear (the first variation yields a control equation with a viscosity that depends on the strain rate).
3 Data Assimilation in Brain Tumor Imaging
3.1 Literature Review
We view biophysics simulations as a powerful tool to augment morphological and functional medical imaging data. There is a long tradition in the design of mathematical models of cancer progression [21, 170, 202]. In many cases, the models are rather simplistic.^{10}^{10}10More sophisticated multispecies models that, e.g., account for hypoxia, necrosis and angiogenesis can be found in [92, 76, 166]. This is due to the fact that medical imaging only provides scarce information to drive the model (see Fig. 2); we have to limit the size of the parameter space. Most of the approaches used for data assimilation in medical imaging exclusively rely on observations derived from morphologic imaging data and consider continuum models; cancerous cells are not tracked individually but modeled as a density on , . In their simplest form, these models assume that cancer progression is dominated by two phenomena: cell division/mitosis and cell migration, resulting in reactiondiffusion type equations [150, 189, 190, 104].^{11}^{11}11Models that account for the mechanical interaction of the tumor with its surroundings have been described in [40, 45, 98, 146, 203, 206, 207]. Despite their simplicity and phenomenological character, these types of models have been successfully applied to capture the appearance of tumors in medical images [45, 113, 125, 127, 135, 167]. They have been used to (i) study tumor growth patterns in individual patients [45, 127, 135, 167, 189], (ii) extrapolate the physiological boundary of tumors [113, 148], (iii) or study the effects of clinical intervention [116, 117, 139, 164, 169, 191, 203] .
Most of the work mentioned above is concerned with brain tumor imaging; exceptions are, e.g., [203] (breast cancer) and [125] (pancreatic cancer). A key challenge is the design of methods that have the potential to generate patientspecific predictions. In general, this requires the solution of an inverse parameter identification problem. Several groups have addressed this problem. There has only been little work on modelbased image analysis in brain tumor imaging based on PDEconstrained optimization [47, 70, 99, 111, 110, 125, 135, 127, 165];^{12}^{12}12We note that there exists a large body of literature on optimal control problems with similar PDEs as constraints in other areas. Examples can, e.g., be found in [19, 50, 61, 62, 159]. the works in [47, 70, 99, 111, 110, 125, 165] consider adjoint based approaches for numerical optimization. Others use derivativefree optimization [40, 110, 114, 133, 127, 139, 206, 207], finitedifference approximations to the gradient [101], or tackle the parameter estimation problem within a Bayesian framework [116, 117, 138, 91, 154, 123, 48, 122] (see Rem. 1.2). These approaches have in common that they only require the evaluation of an objective functional (i.e., essentially a repeated solution of the forward problem for perturbed parameters). As such, they are easy to implement. However, it is established that they are in general not as efficient as gradient based optimization strategies (at least for highdimensional problems); they display slow convergence resulting in an excessive number of iterations/evaluations of the forward model. Another strategy that has been considered, is to exploit the asymptotic behavior of reactiondiffusion equations [191, 89, 114]. The basic idea is to consider the boundary of the abnormality visible in medical imaging as a traveling wave solution and convert it into parameter estimates for the growth/migration rate (see [89]). The work in [99] is to the best of our knowledge the first to consider adjoint equations in the context of data assimilation in brain tumor imaging. The adjoint equations are derived for the onedimensional case; they fall back to a derivativefree optimization strategy when solving the threedimensional problem. The work in [47] is based on a multispecies model. They derive adjoint equations for inverting for the vascularization. The work in [125] extends [99]. They derive firstorder optimality conditions for an advectionreactiondiffusion PDEconstrained optimization problem; there is no information on the numerical treatment of the problem.
In [165] a coupled reactiondiffusion system for the density of normal tissue, cancerous tissue, and ion concentration is considered. They derive adjoint equations for a scalar parameter controlling the excess ion concentration. The work in [111] models the tumor as a radially symmetric spheroid. The only work we are aware of (in the context of model based brain tumor image analysis) that derives Hessian information for numerical optimization is [70]. We describe this model in more detail in §3.2. Aside from numerical optimization, there has only been little work on the design of effective solvers (in terms of timetosolution and rate of convergence). Conditionally stable explicit time integrators are a common choice due to their simplicity [47, 203]. Using methods that require a large number of time steps can become computationally prohibitive in a threedimensional setting not only in terms of the timetosolution but also due to memory requirements.^{13}^{13}13A naive implementation may require storage of the time history of the state and adjoint fields; one remedy is the implementation of checkpointing/domain decomposition schemes [1, 75, 93]. The works in [99, 70, 133, 127] consider unconditionally stable schemes.
An open problem in data assimilation in brain tumor imaging is how to establish a physiologically meaningful correspondence between the model output (predicted state; typically a cell density) and the data (observed state; typically morphological imaging information, i.e., an abnormality in the images; see Fig. 2
for an example). As a consequence one has to rely on heuristics and pseudocorrespondences. One such example is to relate (manual) segmentations of imaging abnormalities to a detection thresholds for the computed population density of cancerous cells
[70, 89, 116, 117, 127, 133, 191]. Another approach is to derive cell density estimates from physiological imaging data [12, 101, 203]. Either of these techniques only provides scarce information to drive the parameter estimation, which in turn limits the size of the search space, and as such the complexity of the models.Considering parallel, distributedmemory implementations, we are not aware of any work in this area other than the one presented by our group [71].^{14}^{14}14We note that there certainly exist parallel implementations of solvers for similar PDEconstrained optimization problems. We refer to [1, 2, 28, 29, 30, 27, 26] for examples.
3.2 Formulation
The PDE constraint in (1), i.e., the forward operator, is given by equationparentequation
(4a)  
(4b) 
with Neumann boundary conditions on . The parameter
represents the growth rate; it controls the proliferation of cancerous cells modeled by a logistic function. The diffusion tensor
, , controls the net migration of cancerous cells . A common assumption in many models is that the rate of migration depends on the tissue composition; typically, it is modeled to be faster in white than in gray matter [116, 117, 189]. If we denote the probability maps for these tissue types by and , respectively, the associated model for is given by [127](5) 
with . We can augment this model by integrating the white matter tract architecture^{15}^{15}15The white matter fibre architecture can be estimated from data measured by socalled diffusion tensor imaging, a magnetic resonance imaging technique that measures the anisotropy of diffusion in the human brain. The result of this measurement is a tensor field that can directly be inserted into (4).
(i.e., information about neuronal pathways; see Fig.
3 for an illustration). This has, e.g., been considered in [45, 70, 113, 127, 135, 148].The input to our inversion are (i) estimates for and , (ii) a tumor free patient image obtained from medical imaging data [73, 137], and (iii) an estimate for the patient’s tumor given in terms of a probability map . Given the forward model (4) our parameter vector essentially consists of , (i.e., and in (5)) and . We assume that we have experimental data that provides estimates for and ; we only invert for . We parameterize the initial condition in (4b) using an dimensional space spanned by a Gaussian basis
we obtain . The regularization model for our problem formulation (see (1)) is the norm of .
4 Cardiac Motion Estimation
Medical imaging can help in the diagnosis of cardiac masses, cardiomyopathy, myocardial infarction, and valvular disease. One example is cineMRI, which is used routinely for diagnosing a variety of cardiovascular disorders [37, 179]. Given the cineMRI data we seek to reconstruct the motion of the cardiac wall, which in turn is used to evaluate clinical indices like ejection fraction and localized abnormalities of the heart function. Segmentation of the ventricles and the myocardium is the first step toward motion reconstruction for quantitative functional analysis of cineMRI data. Manual segmentation is time consuming [16] and inter and intraobserver variability can be high [32]. For this reason automatic segmentation algorithms are preferable. However, even today most of them are not robust enough for a fully automatic segmentation. Here we discuss methods that incorporate biophysical constraints and often result in PDEconstrained optimization algorithms.
One of the main thrusts in recent research has been the 4D motion estimation using biomechanical models. In [103], a piecewise linear composite biomechanical model was used to determine active forces and the strains in the heart based on tagged MRI information. In [157, 158] echocardiography and MR images were combined with biomechanical models for cardiac motion estimation. Interactive segmentation was combined with a Bayesian estimation framework that was regularized by an anisotropic, passive, linearly elastic, myocardium model. The authors recognized the importance of neglecting active contraction of the left ventricle. In [176, 177], the need for realistic simulations and the need for inversion and data assimilation was outlined. In [31], an incompressible deformable registration was used for global indicators. In [160], a regular grid with Bsplines was used with a mutual informationbased similarity measure;^{16}^{16}16Mutual information is a statistical distance measure that originates from information theory. As opposed to the squared distance used in the present work (see §2
), which is used for registering images acquired with the same modality, mutual information is used for the registration of images that are acquired with different modalities (e.g., the registration of computed tomography and magnetic resonance images). Mutual information assess the statistical dependence between two random variables (in our case image intensities; see
[144, 145, 186] for more details). however, the focus has been on intersubject registration [161] (without consideration of mechanics). In [182], a 4D deformable registration was proposed with a diffusive spacetime regularization that worked well for global motion measures. In [188], a biomechanicallyconstrained motion estimation algorithm that explicitly couples raw image information with a biomechanical model of the heart is described. The main features of that scheme are (i) a patientspecific imagebased inversion formulation for the active forces; (ii) a multigridaccelerated, octreebased, adaptive finiteelement forward solver that incorporates anatomicallyaccurate fiber information; (iii) an adjoint/Hessianbased inversion algorithm; and (iv) a 4D coupled inversion for all images . The method requires segmentation of the blood pool and myocardium at the initial frame to assign material properties and a deformable registration to map fiberorientations to the patient. More complex models are considered in [52]. The authors use an adjoint method for cardiacmotion estimation and contractility combining catheterized electrophysiology data and cineMRI sequences. They consider a gradient descent method that uses adjoint equations for the elastic deformation of the heart. A similar approach for cardiac motion estimation is described in [196]; the adjoints are derived using an automatic differentiation framework.5 Numerics
Here, we summarize the numerical strategies for solving the problems in §2 and §3, respectively. We discuss the discretization in space and time, the solvers, the computational kernels, and the parallel implementation. Overall, we use a globalized, preconditioned, inexact, reduced space Gauss–Newton–Krylov method for both problems. We will see that the optimality systems of the considered problems yield large, illconditioned operators that are challenging to solve in an efficient way. The associated PDE operators will determine the choices we make for the numerics.
5.1 Optimality Conditions
We use the method of Lagrange multipliers [124] to turn (1) into an unconstrained problem. From thereon we can consider two alternative approaches to tackle the optimization problem: optimizethendiscretize and discretizethenoptimize (see [78, 204, p. 57ff.] for a discussion; see also Rem. 5.1). We use the former strategy; we first compute variations of the Lagrangian functional with respect to the state, adjoint, and control variables, and then discretize them. We will assume that all variables fulfill the necessary regularity requirements to be able to carry out these computations; we refer the interested reader to, e.g., [18, 38, 39, 118, 119], for a rigorous treatment.
Remark
An optimizethendiscretize approach does not guarantee that the discretization of the gradient is consistent with the discretization of the objective functional. Further, it is not guaranteed that the discretization of the forward and adjoint operators are transposes of one another; similarly, it is not guaranteed that discretization of the Hessian is symmetric. These inconsistencies can have a negative effect on the convergence of the solver. For a discretizethenoptimize approach one can (by construction) guarantee that these operators are consistent, transposes, and symmetric. However, one may, e.g., obtain a different numerical accuracy for the forward and adjoint operators (see, e.g., [87, 54] for examples). For the optimizethendiscretize approach we can choose arbitrarily accurate solvers for the forward and adjoint problems. We refer to [78, 34] for more remarks on the discretization of optimization and control problems. For the PDEconstrained optimization problems considered in this study, we have experimentally observed that the discretization errors are below the tolerances relevant in practical application (e.g., a relative change of the gradient of ). If we are interested in solving the inverse problems more accurately (e.g., up to ) our solver may fail to converge. If more accurate solutions are required, we will have to consider PDE solvers that preserve the properties of the continuous operators. For the diffeomorphic registration problem, we present a discretizethenoptimize implementation in [132]; works of other authors that consider a discretizethenoptimize strategy can be found in [11, 25, 96].
5.1.1 Diffeomorphic Registration
We assume that is stationary and divergencefree. We introduce the Lagrange multipliers , , , to obtain the Lagrangian functional^{17}^{17}17We neglect the periodic boundary conditions for simplicity.
(6)  
We obtain the optimality system by taking variations with respect to the adjoint, state, and control variables, and applying integration by parts. We use a reduced space formulation. That is, we assume that the state and adjoint equation are fulfilled exactly (see below) and only iterate on the reduced space of the velocity . Accordingly, we require vanishing variations of the Lagrangian with respect to the control variable , i.e.,
This is the strong form of the control or decision equation (i.e., the reduced gradient) of our problem; is an integrodifferential operator of the form
(7) 
The operator originates from the regularization operator in (3); depending on its order we arrive at an elliptic [129], biharmonic [128] or triharmonic [39] integrodifferential control equation. The pseudodifferential operator is a projection onto the space of divergencefree velocities. It originates from the elimination of the constraint and the adjoint variable from our optimality system (see [128, 129]). The operator is given by (Leray projection). If we neglect the incompressibility constraint in our formulation becomes an identity operator. If we introduce a masssource map into our formulation this operator becomes more involved [129]. Evaluating the gradient (7) for a given trial requires us to find and . We can compute by solving the forward problem (2) forward in time. Once we have found the final state we can compute by solving the adjoint equation equationparentequation
(8a)  
(8b) 
with periodic boundary conditions on backward in time; (8) is a final value problem and models the transport of the residual difference between the final state and the reference image backward in time.^{18}^{18}18Instead of solving an advection or continuity equation as done in the Eulerian formulation we present here, it has been suggested to solve for the state and adjoint variables using the diffeomorphism
(which involves interpolation operations instead of the solution of a transport equation)
[201].5.2 Data Assimilation
The Lagrangian for our problem is given by^{19}^{19}19We neglect the Neumann boundary conditions for simplicity.
(9)  
with Langrange multiplier , and , , and . The unknowns of our problem are the diffusion coefficients and , the growth rate , and the initial condition (and its location). We can derive the adjoint equations for either or all of these variables. For simplicity of presentation, we will derive the optimality conditions for the parameterization of in (4b). A more complete picture can be found in [70]. The strong form of the control or decision equation (i.e., the reduced gradient of our problem) is given by the vanishing first variation of the Lagrangian functional in , where
(10) 
To obtain , we have to solve the adjoint equation equationparentequation
(11a)  
(11b) 
5.3 Discretization
We use the trapezoidal rule for numerical quadrature with respect to space and with respect to time. We assume that the functions in our formulation (including images) are periodic and continuously differentiable.^{20}^{20}20The data assimilation problem in §3 requires Neumann boundary conditions on , with . We use a penalty approach to approximate these boundary conditions. We apply periodic boundary conditions on and set the reaction and diffusion coefficients in (4a) to sufficiently small penalty parameters and outside of ; see, e.g., [70, 100, 127, 133]. We apply appropriate filtering operations and periodically extend or mollify the discrete data to meet these requirements. We use regular grids to discretize , , . The spatial grid consists of , , grid points , , , . We consider a nodal discretization in time, which results in discretization points. We use a spectral projection scheme for all spatial operations. That is, we approximate spatial functions as , where , , , is the grid index and are the spectral coefficients of . The mappings between the spatial and spectral coefficients are done using FFTs. We approximate spatial derivative operators by applying the appropriate weights in the spectral domain. This scheme allows us to efficiently, stably, and accurately apply differential operators and their inverses.^{21}^{21}21Note that the  condition for pressure spaces arising in finite element discretizations of Stokes problems [35, p. 200ff.] is not an issue with our scheme (incompressibility constraint in diffeomorphic registration problem).
5.4 Numerical Time Integration
The iterative solution of the optimality conditions of our problems requires a repeated solution of PDE operators of mixed type. Here we summarize the time integration schemes for the parabolic or hyperbolic PDEs that appear in our optimality systems. In general we opt for unconditionally stable schemes to reduce the number of timesteps. This will not only reduce the computational work load but also reduce the memory requirements; evaluating the Hessian operators that appear in our scheme requires us to store at least one spacetime field (the state variable).^{22}^{22}22A possible remedy is to employ checkpointing or domain decomposition strategies [1, 75, 93].
5.4.1 Parabolic PDEs
We use an unconditionally stable, secondorder Strangsplitting method to solve the parabolic equations (see also [70, 99]). We explain this for the forward problem (4) in Alg. 1. Let denote the tumor discretized distribution at time , . We apply an implicit Crank–Nicolson method for diffusion (lines 4 and 6 in Alg. 1) and solve the reaction part analytically (line 5 in Alg. 1). A similar splitting strategy is used in [92]. In other related work, conditionally [47, 203] and unconditionally stable [127, 133] explicit schemes were considered.
We use a preconditioned conjugate gradient (PCG) method with a fixed tolerance of to solve the diffusion steps. We use a preconditioner based on a constant coefficient approximation of given by , where is the average diffusion coefficient. The inversion and construction of has vanishing computational cost due to our spectral scheme; it only requires a pointwise multiplication in the Fourier domain and a single forward and backward FFT.
5.4.2 Hyperbolic PDEs
We employ a semiLagrangian scheme [60] to solve the hyperbolic transport equations of the form . In the context of diffeomorphic image registration, semiLagrangian schemes have, e.g., been used in [20, 39, 130, 131]. They are unconditionally stable. This allows us to keep the number of timesteps small and by that significantly reduces the memory requirements. We illustrate this scheme for the forward problem (homogeneous case) in Alg. 2; additional details can be found in [130]. The algorithm consists of two steps: In a first step we have to compute the characteristic by solving in with final condition at backward in time. The second step is to solve an ODE for the transported quantity along ; i.e., we have to solve . Both of these steps require the evaluation of scalar or vector fields along the characteristic , i.e., at locations that do not coincide with grid points ; this involves interpolation (see lines 3 and 6 in Alg. 2 for an example). We use an explicit, secondorder accurate Runge–Kutta scheme to integrate these ODEs in time, and a cubic interpolation model in space.
5.5 Numerical Optimization
We have derived the expressions for the optimality conditions in §5.1. Here, we revisit the generic formulation used in (1). The specific operators for the two problems can be found in §5.1 and §C.
In general, we require that the gradient with respect to the control variable vanishes; i.e., for an admissible solution to (1). We use a globalized, inexact [53, 56], preconditioned, matrixfree, reduced space^{23}^{23}23By reduced space we mean that we iterate only on the reduced space of the control variable of our problem as opposed to so called fullspace or allatonce methods (see, e.g., [24, 29, 30, 83, 95] for more details). Gauss–Newton–Krylov method for numerical optimization. The Newton step is (in general format) given by
(12) 
where .^{24}^{24}24The order of the optimality system depends on the problem. For the diffeomorphic registration case the control variable is given by the velocity field , i.e., ; for the tumor case is given by , i.e., . We compute the search direction by solving the reduced space Karush–Kuhn–Tucker (KKT) system iteratively using a PCG method. Here, , , , is a discrete representation of the Gauss–Newton approximation (see Rem. 5.5) to the reduced Hessian at Newton (or outer) iteration index . We globalize our scheme based on a backtracking line search subject to the Armijo condition (we use default parameters; see [152, algorithm 3.1, p. 37]).^{25}^{25}25Our implementation also features a trust region method.
Since we use a matrixfree Krylovsubspace method to iteratively solve for the search direction , we only require an expression for the application of the Hessian to a vector (Hessian matvec). In Newtontype methods, most work is spent on the (iterative) solution of the KKT system; it is the computational bottleneck of our solver. For both of our problems, each Hessian matvec involves the solution of two PDEs: the incremental state and adjoint equations; this is costly. Therefore, it is essential to design an effective preconditioner to reduce the number of Hessian matvecs we have to perform. We summarize the overall algorithm in Alg. 3 (Newton step) and the steps necessary to solve (12) in Alg. 4 (see §B). We provide expressions for the Hessian matvec for the two individual problems in §C, respectively.
The preconditioner for the reduced space Hessian of the tumor problem uses, likewise to the preconditioner for the forward problem described §5.4.1, a constant coefficient approximation for the diffusion operators (see [70]). For the registration problem we use the inverse of the regularization operator as a preconditioner (a common choice in PDEconstrained optimization [3, 128, 131, 132]). The preconditioned Hessian is a compact perturbation of identity; for smooth data and sufficient regularity of (i.e., we can resolve the problem), we observe a rate of convergence that is independent of the mesh size. In recent work, we have presented a 2level (multigrid inspired) preconditioner. The convergence for both preconditioners is not independent. Details for the 2D case can be found in [130]. The results presented in this study are in 3D and computed using as a preconditioner.
Remark
We use a Gauss–Newton approximation to the Hessian to ensure that the operator is positive definite far away from the optimum. The Gauss–Newton approximation is obtained by dropping all expressions in the evaluation of the Hessian that involve the adjoint variable . (We provide a precise definition of the individual operators in §C.) We expect that the rate of convergence of our scheme drops from quadratic to superlinear using this approximation. We recover local quadratic convergence as (i.e., the mismatch goes to zero, and we have solved the problem). We observe this for synthetic problems. For problems involving real data, the data will be perturbed by noise. In fact, we may not only have to deal with noise perturbations, but also with intensity drifts (which we can correct for in a preprocessing step). Based on these perturbations, we may not recover local quadratic convergence.
5.6 Implementation Details
Generally speaking, (interesting) images are functions of bounded variation. Our scheme cannot handle this type of discontinuities in the data. We assume that all fields that appear in our formulations are smooth, compactly supported functions. We ensure this numerically by presmoothing the data, and applying appropriate extensions to the data, if necessary. For the registration problem we estimate the regularization parameter based on a parameter continuation scheme [84, 86, 128]. For the tumor problem we use a similar approach—the Lcurve method [88] (see [70] for an example). We terminate the inversion if the relative change of the gradient of our problem reaches a user defined tolerance. We typically choose values between and for real world applications.
Remark
From a numerical optimization point of view, we want to drive the gradient to zero. In diffeomorphic image registration problems on real data (mainly multisubject neuroimaging applications) we have observed that, while the solution may still change, the mismatch seems to be quite stable after reducing the gradient by roughly one order of magnitude. Reducing the gradient further does, in our experience, not dramatically alter the mismatch (which is what practitioners mostly care about).
We have developed and presented prototype implementations of our nonlinear solver in Matlab. These solvers are described in detail in [128, 130, 70]. We summarize the Newton–Krylov scheme in §B. Our parallel code [129, 71] is implemented in C++, and uses the message passing interface (MPI) for parallelism. We use a 2D pencil decomposition for 3D FFTs [74, 51] to distribute the data among processors. We use PETSc for linear algebra operations [17], PETSc’s TAO package for the nonlinear optimization [149]
, and AccFFT for Fourier transforms
[69] (a library for parallel FFTs developed by our group). Our code implements the operations for evaluating the objective functional, the gradient, and the Hessian matvec, and the interfaces to set tolerances and control the stopping criteria. The main computational kernels of our scheme are the FFT used for spatial differentiation and the interpolations in the semiLagrangian scheme. These kernels, and their parallel implementation, are described in detail in [69, 131, 71].6 Numerical Experiments
We showcase some numerical experiments to illustrate the behavior of our solvers next.
6.1 Diffeomorphic Image Registration
A study for the performance of our Gauss–Newton–Krylov scheme and a comparison to (Sobolev) gradient descent type approaches can be found in [128]. We could demonstrate that our Gauss–Newton–Krylov scheme yields a significantly better rate of convergence. A study on registration quality as a function of different regularization operators can be found in [129]. In this work, we showed that we can generate highfidelity registration whilst maintaining wellbehaved , by constraining local volume change. A study of the performance of our semiLagrangian scheme in combination with a twolevel preconditioner be found in [130]. Here, we could achieve a speedup of up to 20x compared to our original solver [128]. All this work is limited to the 2D case and implemented in Matlab. In this section, we present results for the implementation of a memorydistributed solver for the 3D case. For a study on the scalability of this solver on supercomputing platforms we refer to [131, 71]. In [71] we achieved a 8x speedup compared to our original implementation [131]. With this work we pave the way for solving diffeomorphic image registration problems on real clinical data sets for clinically relevant problem sizes in realtime—with a time to solution of under two seconds for 50 million unknowns (a grid size of ) using 512 cores on TACC’s Lonestar 5 system (2socket Intel Xeon E52690 v3 (Haswell) with 12 cores/socket, memory per node). We were also able to solve an inverse problem of unprecedented scale with 200 million unknowns (a problem size, that is 64x bigger than in [131]) using 8192 cores on HLRS’s Hazel Hen system (2socket Intel Xeon E52690 v3 (Haswell) with 12 cores/socket, memory per node; we summarize these results in Tab. 1).
In what follows, we will showcase some new results for our algorithm for diffeomorphic registration [128, 129, 130, 131]. We will start by examining the capabilities of our solver of modeling large diffeomorphic deformations in 3D. We will then consider synthetic (smooth) and real brain imaging data. We will report results that showcase the performance of our solver on moderate computing platforms.
6.1.1 Default Parameters
We normalize the images to an intensity range of
prior to registration. If we consider data with sharp edges or real images, we apply a Gaussian smoothing operator with standard deviation of one voxel per spatial direction. We perform the registration in double precision. The number of time steps for the PDE solver is fixed to
. We invert for a stationary velocity field . We use a PCG method to solve for the search direction with a superlinear forcing sequence. We use the inverse of the regularization operator as a preconditioner. The results presented below are obtained on a single node of CACDS’s Opuntia cluster (2socket Intel Xeon E52680v2 at with 10 cores/socket and memory). We do not perform any grid, parameter, or scale continuation. All results reported in this study are computed on the original resolution of the images.6.1.2 SphereToBowl
We showcase results for a benchmark problem for large deformation diffeomorphic image registration algorithms.
Setup: We consider the synthetic registration problem proposed in [43]—a standard benchmark problem for diffeomorphic image registration. We illustrate the template and reference image in Fig. 4. The task is to register a sphere to a bowl. The images are discretized using a grid of size ( unknowns). We use an regularization model and invert for a stationary velocity field .
Results: We illustrate the results in Fig. 5. We show three orthogonal planes through the center of the 3D volumes in Fig. 4 (top to bottom: coronal, axial and sagittal views). Each column in Fig. 5 shows (from left to right) (i) the reference image, (ii) the template image, (iii) the mismatch between the reference image and the template image before registration, (iv) the mismatch between the template and the reference image after registration, (v) the determinant of the deformation gradient, i.e., the determinant of the Jacobian of the deformation map , and (vi) an illustration of the deformation pattern . The mismatch is large in black areas (corresponds to 100%) and small in white areas (corresponds to 0%). The color map for the determinant of the deformation gradient is illustrated in Fig. 5. Black corresponds to a , orange to and white to . In Fig. 6 we show the evolution of the deformation. That is, we solve the forward problem from to based on the found velocity field . We limit this visualization to the sagittal view. We display intermediate states for . In the top row, we show the mismatch between the state variable and the reference image . In the two bottom rows, we show the state variable itself and the state variable with a grid in overlay to illustrate the deformation. The configuration at is equivalent to the results visualized in Fig. 5.
Observations: The results reported in Fig. 5 and Fig. 6 illustrate that we can recover large deformations. We reduce the mismatch between the transported template image and the reference image significantly. The mismatch is almost zero everywhere after registration. Only at the sharp corners of the bowl and along its boundary we can observe residual differences between the deformed template image and the reference image (dark areas). In [132] (in a 2D setting) we experimentally observed that using timedependent velocities may help to further reduce this mismatch.
Careful visual inspection of the evolution of the deformation (see Fig. 6) suggests that we obtain a well behaved map . This is confirmed by the values for the determinant of the deformation gradient; the determinant of the deformation gradient does not change sign and is nonzero for all . The extremal values are (. We conclude that the deformation map is locally diffeomorphic (up to numerical accuracy) as judged by these values.
6.1.3 Smooth Synthetic Data
We report results for a smooth synthetic test problem to demonstrate the performance of our solver under idealized conditions (smooth data and smooth velocity field without the presence of noise).
Setup: The template image is given by . The velocity field is given by , , , . The reference image is computed by transporting the template image with . We solve the inverse problem on a grid size of ( unknowns). We set the tolerance for the relative reduction of the gradient to and the absolute tolerance for the gradient to (not reached). We use an regularization model with a regularization parameter of . We set the maximal number of Newton iterations to 50 and the maximal number of Krylov iterations to 100.
Results: We report the convergence behaviour of our solver in Fig. 7 (top row). We show (from left to right) the trend of the relative values (normalized with respect to the values obtained at the first iteration) for the mismatch, the norm of the gradient, and objective values with respect to the Gauss–Newton iteration index. The last graph shows the residual of the PCG per Gauss–Newton iteration (from Gauss–Newton iteration 1 (light blue) to Gauss–Newton iteration 5 (dark blue)), respectively.
Observations: Our solver converges after six Gauss–Newton iterations. Overall, we require 73 Hessian matrix vector products and a total of 168 PDE solves. We require 1, 1, 4, 9, 23, and 35 PCG iterations per Gauss–Newton iteration, respectively (see also Fig. 7; right column at the top). We reduce the gradient by to an absolute value of (the initial norm of the gradient is ). The runtime of our solver is ( ) on one node with 20 cores. We reduce the mismatch from to . Considering the plots in Fig. 7, we can observe that we mismatch, the gradient, and the objective drop rapidly. We converged to a stationary value of the objective functional and the mismatch after only 4 iteration. The relative reduction of the gradient at iteration 4 is (). During the last three iterations the gradient is still significantly reduced, but the mismatch and the objective value only change marginally. Considering the residual for the PCG iterations, we can see that the residual not only drops fast, but also that at each new Gauss–Newton iteration, we start almost at the same residual we terminated the former iteration with. That is, by using a superlinear forcing sequence we do not seem to oversolve for the Gauss–Newton step for this particular test problem. This changes when we move to real data.
6.1.4 Real Brain Imaging Data
We report results for real brain imaging data in a challenging multisubject registration problem (i.e., the registration of brain imaging data of different individuals).
Setup: The images are taken from the nonrigid image registration evaluation project [41]. This repository contains 16 rigidly aligned T1weighted MRI brain data sets of different individuals with a grid size of ( unknowns).^{26}^{26}26Additional information on the data sets, the imaging protocol, and the preprocessing can be found in [41] and at http://www.nirep.org/. We show one dataset of this repository in Fig. 7 (dataset na01 (reference image) and dataset na03 (template image)). We consider the datasets na01 through na05 for these experiments; na01 serves as the reference image for all experiments.
We consider an regularization model in combination with a penalty on the divergence of the velocity field (div regularization model). The regularization parameter for the Sobolev norm for the velocity is set to . The regularization parameter for the penalty on the divergence of the velocity is set to . These values were chosen by experimentation. We set the maximal number of Newton iterations to 50 and the maximal number of Krylov iterations to 100. We use a tolerance of for the relative change of the gradient and set the absolute tolerance for the gradient to (not reached).
Remark
We use tighter tolerances since we found by experimentation that decreasing the gradient further, significantly increases the runtime. The reduction in mismatch associated with this increase in runtime does, from our perspective, not justify the use of smaller tolerances for the reduction of the gradient.
Results: We report the convergence behaviour of our solver in Fig. 7 (top row). We show results for four registrations (na0, , to na01). We display (from left to right) the trend of the relative values (normalized with respect to the values obtained at the first iteration) for the mismatch, the norm of the gradient, and objective values, with respect to the Gauss–Newton iteration index. The last graph shows the residual of the PCG per Gauss–Newton iteration (from Gauss–Newton iteration 1 (light blue) to Gauss–Newton iteration 5 (dark blue)), respectively. The last plot is only for the registration of the dataset na02 to the dataset na01. We provide additional measures of performance in Tab. 2. We illustrate results for an exemplary dataset (na03 to na01) in Fig. 8.
dataset  matvecs  PDE solves  mismatch  runtime  
na02  6  143  308  ()  
na03  9  255  563  ()  
na04  8  203  434  ()  
na05  6  137  296  () 
Observations: Our solver converges after 6 to 9 Gauss–Newton iterations for these datasets for a tolerance of for the relative reduction of the gradient. We require between 296 and 563 PDE solves (this includes the solves for the evaluation of the objective functional, the gradient, and the Hessian matvecs) and 137 up to 255 Hessian matvecs (see Tab. 2). The mismatch is reduced by a factor of up to . We can see that the mismatch drops quickly during the first few iterations. After about 6 iterations the reduction stagnates. The same is true for the objective functional. From these plots we can also see that further reducing the gradient will not lead to a significant reduction in the mismatch (which is what we care about from a practical perspective). For some of the runs we can also see that the reduction in the gradient starts to stagnate. If we use a smaller tolerance for the gradient, the runtime will increase significantly. Overall, we obtain a runtime of 15 minutes up to 30 minutes. This is not competitive with certain existing methods for diffeomorphic registration ([200] is a prominent example). We note, that these methods do, in general, not monitor the reduction of the gradient. They just perform a fixed number of iterations. Another key difference is, that these approaches perform a grid continuation, something we have not considered here. We expect that this will significantly reduce the runtime, given our solver has a complexity of , where is the number of unknowns.
In Fig. 7 we can also see that for the considered dataset the residual for the PCG iterations drops steadily, but slowly (these results are for the registration of the dataset na02 to dataset na01). We require 14, 17, 18, 21, 30, and 43 PCG iterations per Gauss–Newton iteration, respectively. Qualitatively, we can observe that the template and reference image are in good agreement (see Fig. 8). The deformation map is locally diffeomorphic as judged by the value for the determinant of the deformation gradient. The visualization of the deformed grid also illustrates a well behaved deformation map.
6.2 Data Assimilation
We study the performance of our formulation for synthetic test problems, for varying noise perturbations and detection thresholds.
Setup: We apply different levels of noise and consider partial observations of the computed states. We assume observations at two time points ( and ). We consider different thresholds for the partial observations since there is no consensus in the literature on how to model th
Comments
There are no comments yet.