In classical solid mechanics, a set of constraints and conservation laws, such as the equations of compatibility and equilibrium, describe the deformation of material bodies Oden2011 . These equations result from fundamental physical principles and hold for any material body irrespective of its material properties. To close the associated boundary value problem, constraints and conservation laws must be accompanied by constitutive equations that relate the quantities of the phase space, such as stresses and strains. In classical solid mechanics, constitutive equations are provided by phenomenological material models that can be calibrated via observational data. The design, mathematical formulation and numerical solution of material models has been an important research field since the beginnings of solid mechanics Simo2006 ; De2011 and remains the subject of extensive ongoing research to date.
In contrast to balance laws that have axiomatic character, material models are empirical and therefore constitute a source of error and uncertainty, especially when the material behavior is complex. Given that experimental data on the constitutive behavior is scarce, material modeling seems to be an inevitable step. Over the last few years, however, materials science has been undergoing a remarkable transition from a data-starved to a data-rich field, with an increasing number of scenarios where an abundance of data is available to characterize the constitutive behavior of materials. This is mainly due to technological advances in the field of experimental measurements, data storage and data processing among others. In this context, Data-Driven Computational Mechanics, a new computing paradigm originally initiated by Kirchdoerfer and Ortiz Kirchdoerfer2016 , is currently emerging. Its idea is to reformulate classical boundary value problems of elasticity and inelasticity in such a way that empirical material models are replaced by experimental material data described in the phase space. Its overarching goal is to eliminate the modeling error and uncertainty of phenomenological material models and instead directly exploit the available wealth of experimental data in its entirety. However, a new source of error appears that is related to the measurements in themselves and to the measuring chain. Up to this point, it is not clear, which of the error sources is the most harmful.
At this early stage of development, one can differentiate two basic approaches in Data-Driven Computational Mechanics. Following Kirchdoerfer and Ortiz Kirchdoerfer2016 , one class of methods results in a data-driven solver that seeks to assign to each material point a point in the phase space, which besides satisfying compatibility and equilibrium is closest to the data set provided. Its formulation is based on a discrete-continuous optimization problem that minimizes the distance between the material data set and the subspace of compatible strain fields and stress fields in equilibrium. First attempts exist that extend this approach beyond linear elasticity, for instance to geometrically nonlinear elasticity Nguyen2018 , general elasticity Conti2018 , elastodynamics Kirchdoerfer2018 , and inelasticity Eggersmann2019
. The current discrete-continuous approach, however, is computationally expensive and missing robustness in certain situations. In particular, it exhibits strong sensitivity to scattering of the data set, and solution by meta-heuristic optimization techniques leads to relatively poor convergence. First attempts have been published that try to mitigate these drawbacks,e.g.,
a maximum entropy scheme that increases robustness with respect to outliersKirchdoerfer2017 , or the mathematically well-behaved formulation of the discrete-continuous optimization problem as a computationally tractable mixed-integer quadratic optimization problem Kanno2019 .
The second class of methods follows the data-driven inverse approach of Cueto, Chinesta and collaborators Ibanez2018a
who seek to reconstruct a constitutive manifold from data, using manifold learning methods. In the case of elasticity, the goal is to use data to identify a suitable approximation of the strain energy density functional, whose first and second derivatives result in the stress tensor and the elastic tangent, respectively. In a broader contextIbanez2017 , it is proposed to identify the locally linear behavior and for non-convergent cases, it is proposed to find the intersection between the equilibrium and constitutive manifolds. And recently, this approach has been successfully applied to the setting of “General Equation for Non-Equilibrium Reversible-Irreversible Coupling” Gonzalez2019 . Although this approach has a number of advantages, such as straightforward reassurance of thermodynamic consistency, the transfer of data-intensive computations in an off-line step, the potential for nonintrusive implementation in standard codes, it also entails a number of significant limitations. Most importantly, the method relies in the assumption of constitutive manifolds with a special functional structure and is thus limited to the explicit definition of stress, at least in its incremental form.
The goal of the current work is to explore a synergistic compromise between these two classes of methods that combines their strengths and mitigates some of their main weaknesses. Our main focus is the formulation of an approximate nonlinear optimization problem. On the one hand, it improves computational efficiency and robustness with respect to the current discrete-continuous optimization approach of the first class of data-driven solvers. In particular, our approximate nonlinear optimization problem can be solved locally with standard Newton type nonlinear optimization methods, without the need to resort to more exotic options such as meta-heuristic methods that lack well-matured mathematical foundations when compared to gradient-based methods. On the other hand, it does not rely on special assumptions of the functional structure of the reconstructed constitutive manifold as the second class of data-riven solvers that potentially re-introduce modeling errors and uncertainty. Additional benefits of our optimization approach include the natural incorporation of kinematic constraints and the possibility to operate with implicitly defined stress-strain relations, which enlarges the range of material behavior that can be addressed. As our primary goal is a proof of concept for our new approach, we will use artificial rather than real measurements when required. We showcase the advantages of our approach for the case of a data-driven geometrically exact beam element that makes use of all components of our methodology.
The structure of the article is as follows: Section 2 provides a concise review of some pertinent elements of computational nonlinear solid mechanics and fixes our notation and terminology that we will use in the remainder of the article. In Section 3, we first describe the full discrete-continuous optimization problem in its global format. We then describe in detail our new approach based on an approximate nonlinear optimization problem. In particular, we discuss the approximated implicitly defined constitutive manifold, the associated Lagrangian functional and the Karush–Kuhn–Tucker (KKT) conditions, the linearization of such KKT conditions and the resulting KKT matrix in explicit format. We note that the formulations presented in Section 3 are general and only assume a discretization of the problem in the context of the finite element method. In Section 4, we first illustrate our optimization approach for a simple truss element and then proceed to the more complex case of the geometrically exact beam element. Section 5 presents numerical examples for the geometrically exact beam that demonstrate the advantages of our approach. We close with a summary and the main conclusions in Section 6.
2 Framework of computational nonlinear solid mechanics
2.1 Continuous setting
Let us consider a continuum body characterized by two reference sets denoted by , the original configuration, and , the current configuration, both open sets of ( with the standard Euclidean structure), see Figure 1. Here is a smooth regular motion from the original configuration to the current one, i.e., its inverse and derivatives are well defined everywhere, and the symbol denotes a suitable composition rule. The parameter is a smooth parameter intended for the indexation of all possible successive configurations. Here, a chart is given by the pair , with a subset of and the mapping function
. The configuration of the system is described by the vector field, and the static problem can be thoroughly formulated by means of the following (non-variational) set of equations:
Here and indicate that a rank- tensor is times covariant and times contravariant, respectively.
In the first equation (1a), the displacement-based Green-Lagrange strain tensor, an element of with
indicating the skew-symmetric part of the tensor considered, is given in the curvilinear setting by
The pullback of the metric tensor at through the regular motion , i.e., , is defined as
where are the components of the Euclidean metric tensor and denotes the outer product. The elements of the contravariant basis are defined as , where with from to contains the elements of the standard orthonormal basis in , i.e., the space of column vectors. The elements of the covariant basis are defined such that . The admissible variation of the displacement-based Green-Lagrange strain tensor is denoted by and, as stated previously, this is intrinsically related to .
In the second equation (1b), the so-called balance equation, we use the property
to define , the vector density of internal forces. The angle brackets denote an appropriate dual pairing, in which is a vector space (whose elements are called vectors) and is its dual space (whose elements are called covectors or one-forms), and the double brackets represent an appropriate double dual pairing. denotes any admissible variation of the configuration vector. Moreover, is the Jacobian matrix of , a finite-dimensional field of integrable constraints specified in the third equation (1c). These constraints can be used for instance to impose boundary conditions. represents the corresponding Lagrange multipliers, and denotes the vector density of external forces.
Furthermore, it is worthwhile to note that the Green-Lagrange strain tensor, , is to be separately accounted for as a primal field. The specific strain measure chosen is accompanied by the second Piola-Kirchhoff stress tensor, an element of , which in the case of general hyperelastic materials is given by , where denotes the associated strain energy. We recall that a three-field variational principle, for instance, requires this particular functional structure. In the context of the present work, we do not rely on the standard functional structure between the strain and stress tensors. If one instead considers the existence of a strain energy function and strongly imposes the compatibility condition, which can be done by direct substitution of the strain tensor (i.e., replacing by everywhere when needed), then, the governing equations can be reverted to a fully variational set of equations.
2.2 Discrete setting
By using, for instance, isoparametric finite elements, the governing equations for a single element can be discretely approximated as
The superscript “
” denotes that the field is discretized in space by means of the interpolation of nodal or elemental discrete variables and the matrices, and contain the admissible test functions chosen. In addition, the strain and stress tensor fields are represented by an adequate notation that takes advantage of their symmetries, i.e., and . As a direct consequence, the discrete version of can be written as and the discrete version of as . Moreover, , where the approximated displacement field is computed in terms of a finite dimensional set of generalized coordinates, i.e., . spans the master domain symbolized as , and is the discrete version of the volume element. Finally, the integrals are computed by means of a given integration scheme, for instance Gauss quadrature, yielding the final format of the discrete governing equations that will be used in the remainder of this article.
3 Optimization problems for Data-Driven Computational Mechanics
In the context of Data-Driven Computational Mechanics, the simplest scalar cost function to be minimized can be defined as Kirchdoerfer2016
Here comprises continuous strain and stress variables and , respectively, the given data set contains finitely many strain and stress measurements , (with ) is a symmetric positive-definite weight matrix with inverse , and and are norms derived from the inner product. At this point, we do not specify the dimension because it depends strongly on the model considered. For instance, we have for a Reissner-Simó beam Romero2004 and for a Kirchhoff rod Romero2019 , and therefore, we leave this detail out of the current discussion. The cost function (6) has to be minimized under the following constraints111As this section focuses on the overall optimization problem, we skip details on domain integration, quadrature scheme chosen, etc. Further details on such aspects can be consulted elsewhere.: i) the compatibility equation that enforces the equivalence between strain variables and displacement-based strains,
in which is the vector of generalized coordinates employed to describe the system kinematics, with standing for the configuration manifold; ii) the balance equation that establishes the static equilibrium,
in which is the Jacobian matrix of the displacement-based strains, is the Jacobian matrix of the kinematic constraints, is the corresponding vector of Lagrange multipliers, and represents the vector of generalized external loads (the superscript “” has been dropped); and, iii) the kinematic constraints
a finite set of integrable restrictions that belongs to . In the finite element context, the operator and the Jacobi matrix are typically linear in , which simplifies the computation of related higher-order derivatives enormously. We will see this later on in Section 4. A counterexample to this is, for instance, a geometrically exact beam element in which the rotations are parametrized with the Cartesian rotation vector Cardona1988 . To avoid problems caused by overdetermination and singular KKT matrices in the subsequent optimization problems, we eliminate the Lagrange multipliers from the balance equation. This goal can be achieved by several techniques, such as projection onto a space orthogonal to the constraint forces, reduction by means of coordinate partitioning, or redefinition of the derivative on the configuration manifold (connection), see for instance Heard2006 . Here, we choose the first approach, known as the null-space approach, which requires a null-space basis matrix for with
, the system’s number of degrees of freedom, and, such that
Then the balance equation adopts the form
and further on, the (unique) vector of Lagrange multipliers is
and the unique orthogonal null-space projector can be expressed as
Additionally, an infinite number of non-orthogonal (oblique) null-space projectors exist. In practice, it is often more efficient to obtain the vectors and from an implicit representation of than forming explicitly. Details can be found in standard textbooks on numerical linear algebra or on optimization, such as Golub-van_Loan:1989 ; Nocedal-Wright:2006 . In the context of slender structures, the reader may refer to Gebhardt2017 ; Hente2019 and references therein for the efficient calculation of the null-space basis.
3.1 A discrete-continuous nonlinear optimization problem
Employing directly the strain and stress measurements, a discrete-continuous nonlinear optimization problem (briefly called a DCNLP here) can be stated as
Note that the discrete variables appear only in the cost function. In (14) there is no mathematical structure that relates these discrete variables to each other; a mathematically well-behaved formulation with a binary selection variable for each pair would be called a mixed-integer nonlinear optimization problem (MINLP). For fixed , the problem becomes a smooth nonlinear optimization problem (NLP), referred to as NLP, with associated Lagrangian function
where are Lagrange multipliers of the compatibility equation, are Lagrange multipliers of the balance equation premultiplied by the null-space basis matrix, and are Lagrange multipliers of kinematic constraints. For a fixed measurement pair , any solution of NLP provides a set of values that locally minimizes the cost function under the given constraints. To derive the corresponding first-order optimality conditions, we calculate the variation of as
This variation has to vanish for any choice of the varied quantities, and hence we obtain the following primal-dual system of equations, which are known as the KKT conditions in optimization:
Notice that and can be eliminated by direct substitution of and . However, in order to study the overall problem we are not going to eliminate anything unless strictly necessary.
Now, the second term of the first equation can be evaluated as
The linearization of the variation of can be expressed as usual as
with the primal-dual NLP variable vector
The KKT matrix is symmetric indefinite. It can be written as
Even though explicit expressions for these definitions depend on the kinematic description adopted, the format proposed here is generally valid. As the KKT matrix is non-singular, all local minima of NLP are strict minima. Hence, standard nonlinear optimization methods that reduce to a full-step Newton iteration in the local convergence area are highly suited for solving NLP.
The overall DCNLP is often treated by meta-heuristic methods. However, since it has no useful structure with respect to the discrete variables , a mathematically rigorous solution requires enumeration, that is, finding the minimal value over all measurements by solving every NLP globally. Alternatively, a well-behaved MINLP formulation could be solved by rigorous mathematical methods, but this is highly expensive already for relatively “easy” special cases such as the one discussed in Kanno2019 . Therefore we suggest a different approach: to add suitable structure that enables us to replace the DCNLP with a single approximating NLP.
3.2 An approximate nonlinear optimization problem
An approximate NLP can be stated as
The new strain and stress variables and are parameters that describe an underlying constitutive manifold in its reconstructed version (an approximation), which is implicitly defined as
and that satisfies
for some accuracy tolerance . Additionally, physical consistency requires that implies and implies . The idea here is to replace the measurement data set by enforcing the state to belong to the reconstructed constitutive manifold that has a precise mathematical structure and that is derived from the same data set. The underlying assumption is, of course, that such a constitutive manifold exists and that we can reconstruct a (smooth) implicit representation whose linearization takes the simple form
which can be interpreted as a sort of hidden constraint. For materials with symmetric properties, a further condition can be stated as
requiring the regularity of and . The reconstructed constitutive manifold will enormously facilitate the task of the data-driven solver, avoiding the cost of solving a DCNLP, either by enumeration or by heuristic or meta-heuristic methods. The latter can in general only provide approximate solutions that strongly depend on the initial guess and whose convergence properties are inferior when compared to gradient-based methods.
The reconstruction of the hyperelastic energy function is very attractive due to three main reasons: i) it ensures the thermomechanical consistency and then, all symmetries are retained; ii) reconstructing a single scalar function represents a smaller computational expense than reconstructing all components of the elastic tensor; and, iii) no reformulation of the finite element technology is necessary. However, in the case of new composite materials or meta materials that exhibit non-convex responses, the reconstruction of the energy function may not be very convenient. More importantly, in some cases the formulation of an energy function may not even be possible. Thus, we adopt the constitutive manifold as introduced previously without assuming any special functional structure of the constitutive constraint . Further specializations are possible and should be instantiated for specific applications of the proposed formulation.
The Lagrangian function of the approximate NLP is then given by
where are Lagrange multipliers that correspond to the enforcement of the strain and stress states to remain on the constitutive manifold.
Once again, to find the first-order optimality conditions, the variation of is calculated as:
Setting this to zero for any choice of the varied quantities, we obtain the following KKT conditions:
The linearization of the variation of can be expressed as
The explicit form of the KKT matrix is
Again the KKT matrix is non-singular, hence all local minima are strict and the approximate NLP can be solved robustly. In the following section, we provide an example that demonstrates how to derive the concrete approximate NLP for a given finite element formulation.
4 Application of the proposed approach
In this section, we describe two structural models that are reformulated within the proposed setting of Data-Driven Computational Mechanics. The first model is a data-driven truss element that serves as a starting point; this element type was already successfully investigated in Kirchdoerfer2016 ; Nguyen2018 . In contrast to the approaches available in the literature, we apply the framework developed in the previous section to the truss element, unveiling details of its global format that to the best of our knowledge have not yet been published elsewhere. Therefore we call this an “illustrative example”. The second model, a main innovation of the present work, is a data-driven geometrically exact beam that is given in a frame-invariant path-independent finite element formulation. This model relies on a kinematically constrained approach, where the orientation of the cross section is described by means of three vectors that are constrained to be mutually orthonormal. Both examples have favorable mathematical structures that are exploited to derive the required finite element machinery analytically.
4.1 Data-driven truss element (illustrative example)
The position of any point belonging to a truss element can be written as
with and in being the positions of both ends, in which is the reference length and is a spatial variable. By defining
we can compute the axial Green-Lagrange strain as
where is a symmetric matrix and the subscript “” indicates the stress-free configuration.
The operator , which relates the variation of displacement-based strains with the variation of the kinematic fields through the relation , has the explicit form
and its derivative with respect to is
For simplicity, we consider constraints that are linear in ,
where is a reference offset vector and is the constant Jacobian matrix. The null-space matrix is also constant and can be determined by inspection once the constraints are specified. For this case the cost function to be minimized adopts the very simple form
where the scalar is just a weight factor. Finally, by putting all pieces together, we obtain the KKT conditions of NLP: