The total variation (TV) functional is popular as a regularizer in imaging and inverse problems; see for instance [RudinOsherFatemi1992, ChanGolubMulet1999, BachmayrBurger2009, Langer2017] and [Vogel2002, Chapter 8]. It is most commonly applied to functions with values in or . In the companion paper [BergmannHerrmannHerzogSchmidtVidalNunez2019:1_preprint], we introduced the total variation of the normal vector field along smooth surfaces :
In contrast to the setting of real- or vector-valued functions, the normal vector field is manifold-valued with values in the sphere . In (1), denotes the derivative (push-forward) of , and is an arbitrary orthonormal basis (w.r.t. the Euclidean inner product in the embedding ) of the tangent spaces along . Finally, denotes the norm induced by a Riemannian metric on . It was shown in [BergmannHerrmannHerzogSchmidtVidalNunez2019:1_preprint] that (1) can be alternatively expressed as
where and are the principal curvatures of the surface.
In this paper, we discuss a discrete variant of (1) tailored to piecewise flat surfaces , where (1) does not apply. In contrast with the smooth setting, the total variation of the piecewise constant normal vector field is concentrated in jumps across edges between flat facets. We therefore propose the following discrete total variation of the normal,
Here denotes an edge of length between facets, and is the geodesic distance between the two neighboring normal vectors.
We investigate (2) in Section 2. It turns out to coincide with the discrete total mean curvature known in discrete differential geometry. Subsequently, we discuss the utility of this functional as a prior in shape optimization problems cast in the form
Here denotes the solution of the problem specific partial differential equation (PDE), which depends on the unknown domain . Moreover,
represents a loss function, such as a least squares function. In particular, (3) includes geometric inverse problems, where one seeks to recover a shape representing, e.g., the location of a source or inclusion inside a given, larger domain, or the geometry of an inclusion or a scatterer. Numerical experiments confirm that , as a shape prior, can help to identify polyhedral shapes.
Similarly as for the case of smooth surfaces discussed in [BergmannHerrmannHerzogSchmidtVidalNunez2019:1_preprint], solving discrete shape optimization problems (3) is challenging due to the non-trivial dependency of on the vertex positions of the discrete surface , as well as the non-smoothness of . We therefore propose in Section 3 a version of the split Bregman method proposed in [GoldsteinOsher2009], an algorithm from the alternating direction method of multipliers (ADMM) class in which the jumps in the normal vector are treated as a separate variable. The particularity here is that the normal vector has values in and thus the jump, termed , is represented by a logarithmic map in the appropriate tangent space. An outstanding feature of the proposed splitting is that the two subproblems, the minimization w.r.t. the vertex coordinates representing the discrete surface and w.r.t. , are directly amenable to numerical algorithms.
Although many optimization algorithms have been recently generalized to Riemannian manifolds, see, e.g., [Bacak2014, BergmannPerschSteidl2016, BergmannHerzogTenbrinckVidal-Nunez2019_preprint], the Riemannian split Bregman method for manifolds proposed in this and the companion paper [BergmannHerrmannHerzogSchmidtVidalNunez2019:1_preprint] is new to the best of our knowledge. Its detailed investigation will be postponed to future work. For a general overview of optimization on manifolds, we refer the reader to [AbsilMahonySepulchre2008]. We anticipate that our method can be applied to other non-smooth problems involving manifold-valued total variation functionals as well. Examples falling into this class have been introduced for instance in [LellmannStrekalovskiyKoetterCremers2013, BergmannTenbrinck2018]. An alternative splitting scheme, the so-called half-quadratic minimization, was introduced by [BergmannChanHielscherPerschSteidl2016].
The structure of the paper is as follows. In the following section we provide an analysis of the discrete total variation of the normal (2) and its properties. We also compare it to geometric functionals appearing elsewhere in the literature. In particular, we provide a numerical comparison between (2) and surface regularization for a mesh denoising problem. Section 3 is devoted to the formulation of an ADMM method which generalizes the split Bregman algorithm to the manifold-valued problem (3). In section 4, we describe an inclusion detection problem of type (3), motivated by geophysical applications. We also provide implementation details in the finite element framework . Corresponding numerical results are presented in Section 5.
2. Discrete Total Variation of the Normal
From this section onwards we assume that is a piecewise flat, compact, orientable surface without boundary, which consists of a finite number of flat facets with straight sided edges between facets. Consequently, can be thought of as a mesh consisting of polyhedral cells with a consistently oriented outer unit normal. We also assume this mesh to be geometrically conforming, i.e., there are no hanging nodes. A frequent situation is that is the boundary mesh of a geometrically conforming volume mesh with polyhedral cells, representing a volume domain . In our numerical example in Section 5, we will utilize a volume mesh consisting of tetrahedra, whose surface mesh consists of triangles; see Figure 1.
Since the surface is non-smooth, the definition (1) of the total variation of the normal proposed in the companion paper [BergmannHerrmannHerzogSchmidtVidalNunez2019:1_preprint] for smooth surfaces does not apply. Since the normal vector field is piecewise constant here, its variation is concentrated in spontaneous changes across edges between facets, rather than gradual changes expressed by the derivative . We therefore propose to replace (1) by
where denotes an edge of Euclidean length between facets. Each edge has an arbitrary but fixed orientation, so that its two neighboring facets can be addressed as and . The normal vectors, constant on each facet, are denoted by and . Moreover,
denotes the geodesic distance on , i.e., the angle between the two unit vectors and ; see also Figure 6.
To motivate the definition (4), consider a family of smooth approximations of the piecewise flat surface . The approximations are supposed to be of class such that the flat facets are preserved up to a collar of order , and smoothing occurs in bands of width around the edges. Such an approximation can be constructed, for instance, by a level-set representation of by means of a signed distance function . Then a family of smooth approximations can be obtained as zero level sets of mollifications for sufficiently small . Here is the standard Friedrichs mollifier in 3D and denotes convolution. A construction of this type is used, for instance, in [GomezHernandezLopez2005, BonitoDemlowNochetto2019_preprint]. An alternative to this procedure is the so-called Steiner smoothing, where is taken to be the boundary of the Minkowski sum of with the ball ; see for instance [Sullivan2008, Section 4.4].
Let denote a family of smooth approximations of obtained by mollification, with normal vector fields . Then
Let us denote the vertices in by and its edges by . Since mollification is local, the normal vector is constant in the interior of each facet minus its collar, which is of order . Consequently, changes in the normal vector are confined to a neighborhood of the skeleton. We decompose this area into the disjoint union . Here are the transition regions around edge where the normal vector is modified due to mollification, and are the regions around vertex . On , we can arrange the basis to be aligned and orthogonal to so that
holds, which can be easily evaluated as an iterated integral. In each stripe in perpendicular to , changes monotonically along the geodesic path between and , so that the integral along this stripe yields the constant . Since the length of parallel to is up to terms of order , we obtain
The contributions to from integration over are of order since is of order and the area of is of order . This yields the claim. ∎
2.1. Comparison with Prior Work for Discrete Surfaces
The functional (4) has been used previously in the literature. We mention that it fits into the framework of total variation of manifold-valued functions defined in [GiaquintaMucci2007, LellmannStrekalovskiyKoetterCremers2013]. Specifically in the context of discrete surfaces, we mention [Sullivan2006] where the term appears as the total mean curvature of the edge . Here is the exterior dihedral angle, which agrees with , see (5). Consequently, (4) can be written as . Moreover, (4) appears as a regularizer in [WuZhengCaiFu2015] within a variational model for mesh denoising but the geodesic distances are approximated for the purpose of numerical solution. We also mention the recent [PellisKilianDellingerWallnerPottmann2019] where (4) appears as a measure of visual smoothness of discrete surfaces. Particular emphasis is given to the impact of the mesh connectivity. In our study, the mesh connectivity will remain fixed and only triangular surface meshes are considered in the numerical experiments.
In addition, we are aware of [ZhangWuZhangDeng2015, ZhongXieWangLiuLiu2018], where
was proposed in the context of variational mesh denoising. Notice that in contrast to (4), (7) utilizes the Euclidean as opposed to the geodesic distance between neighboring normals and is therefore an underestimator for (4).
2.2. Properties of the Discrete Total Variation of the Normal
In this section we investigate some properties of the discrete total variation of the normal. As can be seen directly from (4), a scaling in which is replaced by for some yields
This is the same behavior observed, e.g., for the total variation of scalar functions defined on two-dimensional domains. Consequently, when studying optimization problems involving (4), we need to take precautions to avoid that degenerates to a point. This can be achived either by imposing a constraint, e.g., on the surface area, or by considering tracking problems in which an additional loss term appears.
2.2.1. Simple Minimizers of the Discrete Total Variation of the Normal
In this section, we investigate minimizers of subject to an area constraint. More precisely, we consider the following problem. Given a triangulated surface mesh consisting of vertices , edges and facets , find the mesh with the same connectivity, which
To the best of our knowledge, a precise characterization of the minimizers of (8) is an open problem and the solution depends on the connectivity; compare the observations in [PellisKilianDellingerWallnerPottmann2019, Section 4]. That is, different triangulations of the same (initial) mesh, e.g., a cube, may yield different minimizers. We also refer the reader to [AlexaWardetzky2011] for a related observation in discrete mean curvature flow.
We do have, however, the following partial result. For the proof, we exploit that (4) coincides with the discrete total mean curvature and utilize results from discrete differential geometry. The reader may wish to consult [MeyerDesbrunSchroederBarr2003, Polthier2005, Wardetzky2006, BobenkoSpringborn2007, CraneDeGoesDesbrungSchroeder2013]. The icosahedron and the cube with crossed diagonals are stationary for (8) within the class of triangulated surfaces of constant area and identical connectivity.
Let us consider the Lagrangian associated with (8),
Here denote the coordinates of vertex and is the total number of vertices of the triangular surface mesh. Notice that the normal vectors , edge lengths and facet areas depend on these coordinates. The gradient of (9) w.r.t. can be represented as
see for instance [CraneDeGoesDesbrungSchroeder2013]. Here denotes the index set of vertices adjacent to vertex . For any , denotes the edge between vertices and . Moreover, and are the angles as illustrated in Figure 3.
For the icosahedron with surface area , all edges have length . Moreover, since all facets are unilaterial triangles, holds. Finally, the exterior dihedral angles are all equal to . Consequently, the Lagrangian is stationary for the Lagrange multiplier .
We remark that (4) and thus (10) is not differentiable when one or more of the angles are zero.
This is the case for the cube with crossed diagonals, see Figure 3.
However, the right hand side in (10) still provides a generalized derivative of in the sense of Clarke.
In contrast to the icosahedron, the cube has two types of vertices. When is the center vertex of one of the lateral surfaces, then and for all . Moreover, since holds, is an element of the generalized (partial) differential of at w.r.t. , independently of the value of the Lagrange multiplier . Now when is a vertex of corner type, we need to distinguish two types of edges. Along the three edges leading to neighbors of the same type, we have a exterior dihedral angle of , length and . Along the three remaining edges leading to surface centers, we have and . Thus for vertices of corner type, it is straightforward to verify that belongs to the generalized (partial) differential of at w.r.t. if
holds, which is true for the obvious choice of . ∎
Numerical experiments indicate that the icosahedron as well as the cube are not only stationary points, but also local minimizers of (8). We can thus conclude that the discrete objective (4) exhibits different minimizers than its continuous counterpart (1) for smooth surfaces. In particular, (4) admits and promotes piecewise flat minimizers such as the cube. This is in accordance with observations made in [PellisKilianDellingerWallnerPottmann2019, Section 3.2] that optimal meshes typically exhibit a number of zero dihedral angles. This property sets our functional apart from other functionals previously used as priors in shape optimization and geometric inverse problems. For instance, the popular surface area prior is well known to produce smooth shapes; see the numerical experiments in section 2.2.3 below.
|number of edges||12||6||30|
|exterior dihedral angle|
2.2.2. Comparison of Discrete and Continuous Total Variation of the Normal
In this section we compare the values of (1) and (4) for a sphere , and a sequence of discretized spheres . For comparison, we choose to have the same surface area as the cube in the previous section, i.e., we use as the radius. It is easy to see that since the principal curvatures of a sphere of radius are , (1) becomes
which amounts to for the sphere under consideration.
To compare this to the discrete total variation of the normal, we created a sequence of triangular meshes of this sphere with various resolutions using and evaluated (4) numerically. The results are shown in Table 2. They reveal a factor of approximately between the discrete and continuous functionals for the sphere. To explain this discrepancy, recall that the principal curvatures of the sphere are . This implies that the derivative map has rank two everywhere. Discretized surfaces behave fundamentally different in the following respect. Their curvature is concentrated on the edges, and one of the principal curvatures (the one in the direction along the edge) is always zero. So even for successively refined meshes, e.g., of the sphere, one is still measuring only one principal curvature at a time. We are thus led to the conjecture that the limit of (4) for sucessively refined meshes is the anisotropic, yet still intrinsic measure , whose value for the sphere in Table 1 is . The factor can thus be attributed to the ratio between the - and -norms of the vector . This observation is in accordance with the findings in [PellisKilianDellingerWallnerPottmann2019, Section 1.2].
One could consider an isotropic version of (4) in which the dihedral angles across all edges meeting at any given vertex are measured jointly. These alternatives will be considered elsewhere.
2.2.3. Discrete Total Variation Compared to Surface Area Regularization
In this section we consider a specific instance of the general problem (3) and compare our discrete TV functional with the surface area regularizer. We begin with a triangular surface mesh of a boxtimes the average edge length. We denote the noisy vertex positions as and utilize a simple least-squares functional as our loss function and consider the following mesh denoising problem,
Here the sum runs over the vertices of . For comparison, we also consider a variant
where we utilize the total surface area as prior.
A numerical approach to solve the non-smooth problem (11) will be discussed in section 3. By contrast, problem (12) is a fairly standard smooth discrete shape optimization problem and we solve it using a simple shape gradient descent scheme. The details how to obtain the shape derivative and shape gradient are the same as described in section 4.2 for problem (11).
Figure 4 shows the numerical solutions of (11) and (12) for various choices of the regularization parameters and , respectively. The initial guess for both problems is a sphere with the same connectivity as . We can clearly see that our functional (4) achieves a very good reconstruction of the original shape for a proper choice of . By contrast, the surface area regularization requires a relatively large choice of in order to reasonably reduce the noise, which in turn leads to a significant shrinkage of the surface and a rounding of the sharp features.
3. Discrete Split Bregman Iteration
In this section, we develop an optimization scheme to solve the non-smooth problem (2). To this end, we adapt the well-known split Bregman method to our setting. This leads to a discrete realization of the approach presented in [BergmannHerrmannHerzogSchmidtVidalNunez2019:1_preprint, section 4]. Recall that combining (2) with (3) results in the problem
where are the edges of the unknown part of the boundary . We will consider a concrete example in Section 4.1.
Notice that the second term in the objective in (13) is non-differentiable whenever occurs on at least one edge. Following the classical split Bregman approach, we introduce a splitting in which the variation of the normal vector becomes an independent variable. Since this variation is confined to edges, where the normal vector jumps (without loss of generality) from to , this new variable becomes
Here denotes the logarithmic map, which specifies the unique tangent vector at the point such that the geodesic departing from in that direction will reach at unit time. The logarithmic map is well-defined whenever . Moreover, holds; see (27) for more details.
The vectors and are simply the collections of their entries , three components per edge . Hence, since the tangent space changes between shape updates, the respective quantities have to be parallely transported, which is a major difference to ADMM methods in Euclidean or Hilbert spaces.
We state the split Bregman iteration in Section 3.
Split Bregman method for (13)
Parallely transport the multiplier estimateon each edge from to along the geodesic from to
We now address the individual steps of section 3 in more detail, i.e., the successive minimization with respect to the unknown vertices of and , followed by an explicit update for the multiplier .
Step 4 is the minimization of (15) with respect to the unknown vertex positions of . To this end, we employ a gradient descent scheme, where we compute the sensitivities with respect to those node positions discretely, see Section 4.2 for more details. Following [GoldsteinOsher2009], an approximate minimization suffices, and thus only a certain number of steepest descent steps are performed. After has been updated to , the quantity has to be parallely transported into the new tangent space , see step 5, which is detailed in (28) for more details.
Step 6 is the optimization of (15) with respect to , which is a non-smoooth problem. It can be solved explicitly by one vectorial shrinkage operation per edge . Given the data and associated normal field , as well as multiplier parallely transported into , the minimizer of (15) is given by
for each edge . Notice that (16) is independent of the previous value and thus a parallel transport of into the updated tangent space is not necessary.
Step 7 is the multiplier update for , which is done explicitly via
for each edge .
4. An EIT Model Problem and its Implementation in
In this section we address some details concerning the implementation of Section 3 in the finite element framework (version 2018.2.dev0), [LoggMardalWells2012:1, AlnaesBlechtaHakeJohanssonKehletLoggRichardsonRingRognesWells2015]. For concreteness, we elaborate on a particular reduced loss function where the state arises from a PDE modeling a geological electrical impedance tomography (EIT) problem with Robin-type far field boundary conditions. We introduce the problem under consideration first and discuss implementation details and derivative computations later on.
4.1. EIT Model Problem
Electrical impedance tomography (EIT) problems are a prototypical class of inverse problems. Common to these problems is the task of reconstructing the internal conductivity inside a volume from boundary measurements of electric potentials or currents. These problems are both nonlinear and severely ill-posed and require appropriate regularization; see for instance [SantosaVogelius1990, CheneyIsaacsonNewell1999, ChungChanTai2005].
Traditionally, EIT problems are modeled with Neumann (current) boundary conditions and the internal conductivity is an unknown function across the entire domain. In order to focus on the demonstration of the utility of (2) as a regularizer in geometric inverse problems, we consider a simplified situation in which we seek to reconstruct a perfect conductor inside a domain of otherwise homogeneous electrical properties.
Consequently, the unknowns are the vertex positions of the interface of the inclusion. As a perfect conductor shields its interior from the electric field, there is no necessity to mesh and simulate the interior of the inclusion. However, we mention that our methodology can be extended also to interface problems, non-perfect conductors and other geometric inverse problems.
The perfect conductor is modeled via a homogenous Neumann condition on the unknown interior boundary of the domain . To overcome the non-uniqueness of the electric potential, we employ Robin boundary conditions on the exterior boundary . The use of homogeneous Robin boundary conditions to model the far field is well-established for geological EIT problems; see, e.g., [Helfrich-Schkarbanenko2011]. We use them here also for current injection.
The geometry of our model is shown in Figure 5, where is the unknown boundary of the perfect conductor and is a fixed boundary where currents are injected and measurements are taken. We assume that experiments are conducted, each resulting in a measured electric potential , the finite element space consisting of piecewise linear, globally continuous functions on the outer boundary . Experiment # is conducted by applying the right hand side source
, which is the characteristic function of one of the colored regions shown inFigure 5. Here, denotes the space of piecewise constant functions. We then seek to reconstruct the interface of the inclusion by solving the following regularized least-squares problem of type (3),
with respect to the vertex positions of . Here is the computed electric field for source . Hence, the problem features PDE constraints with identical operator but different right hand sides.
As detailed in Section 4.2, we compute the shape derivative of the least-squares objective and the PDE constraint separately from the shape derivative of the regularization term. To evaluate the former, we utilize a classical adjoint approach. To this end, we consider the Lagrangian
for . The differentiation w.r.t. leads to the following adjoint problem for :
The above adjoint PDE was implemented by hand. Since all forward and adjoint problems are governed by the same differential operator, we assemble the associated stiffness matrix once and solve the state and adjoint equations via an ILU-preconditioned conjugate gradient method.
Provided that and solve the respective state and adjoint equations, the directional derivative of coincides with the partial directional derivative of , both with respect to the vertex positions. In practice, we evaluate the latter using the coordinate derivative functionality of as described in the following subsection.
4.2. Discrete Shape Derivative
We now focus on computing the sensitivity of finite element functionals, when mesh vertices of are moved in accordance to with . As discussed in [HamMitchellPaganiniWechsung2018_preprint], a convenient way to compute this within the finite element world is by tapping into the transformation of the reference element to the physical one. Hence, we use the symbol for this object and we obtain it using the coordinate derivative functionality, first introduced in release 2018.2.dev0.
Our split Bregman scheme requires the shape derivative of (15), which is given by
originates from the splitting approach (15). Because our design variable is only, we introduce the extension of to the volume
by padding with zeros. Furthermore, a reduction to boundary only sensitivities can also be motivated from considering shape derivatives in the continuous setting, see[BergmannHerrmannHerzogSchmidtVidalNunez2019:1_preprint, Section 3].
The term is computed via the adjoint approach as explained above,
In order to employ this AD functionality, (21) needs to be given as a UFL form, a domain specific language based on Python, which forms the native language of the framework, see [AlnaesLoggOlgaardRognesWells2014]. Such a UFL representation is easy to achieve if all mathematical expressions are finite element functions. Notice that and in (21) are constant functions on the edges of the boundary mesh representing . We can thus represent them in the so called HDivTrace space of lowest order in .
From the directional derivatives (20), we pass to a shape gradient on the surface w.r.t. a scaled scalar product by solving a variational problem. This problem involves the weak form of a Laplace–Beltrami operator with potential term and it finds such that
holds for all test functions .
The previous procedure provides us with a shape gradient on the surface alone. In order to propagate this information into the volume , we solve the following mesh deformation equation: find such that
for all test functions with zero Dirichlet boundary conditions, where is subject to the Dirichlet boundary condition on and on . Subsequently, the vertices of the mesh are moved in the direction of .
4.3. Intrinsic Formulation Using Co-Normal Vectors
We recall that our functional of interest (2) is formulated in terms of the unit outer normal of the oriented surface . This leads to the term (21) inside the augmented Lagrangian (15). In order to utilize the differentiation capability of w.r.t. vertex coordinates, we need to represent (21) in terms of an integral. Since the edges are the interior facets of the surface mesh for , and and can be represented as constant on edges as explained above, (21) can indeed be written as an integral w.r.t. the interior facet measure dS on . Then, however, the outer normal vectors appearing in the term are not available. We remedy the situation by observing that the geodesic distance between two normal vectors and on the two triangles and sharing the edge can also be expressed via the co-normal (or in-plane normal) vectors , , as is shown in Figure 6. Indeed, one has
Since the co-normal vectors are intrinsic to the surface , they are available on while and are not.
5. Numerical Results
In this section we present numerical results obtained with Section 3 for the geological impedance tomography model problem described in the previous section. The data of the problem are given in Table 3 and the initial guess of the inclusion , as well as the true inclusion, are shown in Figure 7. The state and adjoint state were discretized using piecewise linear, globally continuous finite elements on a tetrahedral grid of minus the volume enclosed by . The mesh has vertices and tetrahedra. Regarding the shape optimization problem of Section 3, we perform gradient steps per split Bregman iteration combined with an Armijo linesearch with starting step size of . Also, we stop the whole algorithm, i.e., the outer Bregman iteration, when the initial gradient of the above mentioned shape optimization problem has a norm below in the sense of (22).
|measurement boundary||boundary of|
|boundary of true inclusion||boundary of|
|initial guess for||boundary of|
|number of measurements|
|Robin coefficient in (17)|
|split Bregman parameter|
|standard deviation of noise||or|
|regularization parameter …|
|for total variation regularization|
|for surface area regularization||,|
|shape step size|
In Figure 8, we show the results obtained in the noise-free setting (top row) and with noise (bottom row). In the latter case, normally distributed random noise is added with zero mean and standard deviation
per degree of freedom ofon for each of the simulations of the forward model (17). The amount of noise is considerable when put in relation to the average range of values for the simulated states, which is
Due to mesh corruption, we have to remesh at some point in the cases with noise. Afterwards, we start again Section 3 with the remeshed as new initial guess.
For comparison, we also provide results obtained for a related problem in Figure 8, using the popular surface area regularization with the same data otherwise. For the surface area regularization, is replaced by , where are the facets of . Because the problem is smooth in this case, we apply a shape gradient scheme directly rather than a split Bregman scheme and terminate as soon as the norm of the gradient falls below . The regularization parameters and are selected by hand in each case. Automatic parameter selection strategies can clearly be applied here as well, but this is out of the scope of the present paper.
As is expected and well known, the use of surface area regularization leads to results in which the identified inclusion is smoothed out. This can be explained by the observation that the gradient based minimization of the surface area yields a mean curvature flow. By contrast, our novel prior (4) allows for piecewise flat shapes and thus the interface is closely reconstructed in the noise-free situation. Even in the presence of noise, the reconstruction is remarkably good. In particular, the flat lateral surfaces and sharp edges can be identified quite well.
In this paper we introduced a discrete analogue of the total variation prior for the normal vector field as shown in [BergmannHerrmannHerzogSchmidtVidalNunez2019:1_preprint].