Enabling the next generation of robots that can, for example, enter a kitchen and prepare dinner requires new control algorithms capable of navigating complex environments and performing dexterous manipulation of real-world objects, as illustrated in Figure 1. Machine learning-based approaches hold the promise of unlocking this capability, but require large amounts of data to work effectively [Levine et al., 2018]. For many tasks, gathering this data from the real world may be inefficient, or impractical due to safety concerns. In contrast, simulation is relatively inexpensive, safe, and has been used to learn and transfer behaviors such as walking and jumping to real robots [Tan et al., 2018; Sadeghi et al., 2017]. Extending transfer learning beyond locomotion to a wider range of behaviors requires the robust and efficient simulation of richer environments incorporating multiple physical models [Heess et al., 2017].
We believe the computer graphics community is uniquely placed to address the simulation needs of robotics. One area that computer graphics has studied extensively is two-way coupled simulation of rigid and deformable objects. Such algorithms are necessary to simulate tasks involving dexterous manipulation of soft objects, or even soft robots themselves. An example of the latter is found in the PneuNet gripper [Ilievski et al., 2011] as shown in Figure 2. This design incorporates a flexible elastic gripper, an inextensible paper layer, and a chamber that is pressurized to cause the finger to curve. An additional example is shown in Figure 3 where a deformable ball is manipulated by a humanoid robot hand made of rigid parts. This type of multiphysics scenario is challenging for simulation because the internal forces may be large, and they must interact strongly with contact to allow robust grasping.
Methods for simulating multi-body systems in the presence of contact and friction most commonly formulate a linear complementarity problem (LCP) that is solved in one of two ways: relaxation methods such as projected Gauss-Seidel (PGS), or direct methods such as Dantzig’s pivoting algorithm. Relaxation methods are popular due to their simplicity, but suffer from slow convergence for poorly conditioned problems [Erleben, 2013]. In contrast, direct methods can achieve greater accuracy, but are typically serial algorithms that scale poorly with problem size. Finding methods that combine the simplicity and robustness of relaxation methods with the accuracy of direct methods remains a challenge in computer graphics and robotics. While solving LCPs efficiently is still the subject of active research, the LCP model itself may not capture all of the dynamics we wish to simulate. For example, hyperelastic materials have highly nonlinear forces that significantly affect behavior compared to linear models [Smith et al., 2018]. In addition, contact models themselves may be nonlinear, particularly when considering compliance and deformation [Li and Kao, 2001].
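As a concrete illustration of the relaxation approach, the following sketch implements projected Gauss-Seidel for a small LCP; it is a generic textbook version of the method, not the solver developed in this work, and the matrix and iteration count are arbitrary.

```python
import numpy as np

def projected_gauss_seidel(A, b, iterations=200):
    # Solve the LCP: find z with  0 <= z  perp  (A z + b) >= 0.
    # Assumes A has a strictly positive diagonal (e.g. SPD), as is
    # typical for contact systems.
    z = np.zeros_like(b)
    for _ in range(iterations):
        for i in range(len(b)):
            r = b[i] + A[i] @ z                  # row residual at current iterate
            z[i] = max(0.0, z[i] - r / A[i, i])  # unilateral projection
    return z

# two coupled contacts; the exact solution is z = (1/3, 1/3)
A = np.array([[2.0, 1.0],
              [1.0, 2.0]])
b = np.array([-1.0, -1.0])
z = projected_gauss_seidel(A, b)
w = A @ z + b
```

For well-conditioned matrices this loop converges quickly; for high mass-ratio problems the same iteration can require orders of magnitude more sweeps, which is exactly the conditioning issue discussed above.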
In this work we develop a framework based on Newton’s method that solves the underlying nonlinear complementarity problems (NCPs) arising in multi-body dynamics. We combine a nonlinear contact model, articulated rigid-body model, and a hyperelastic material model as a system of semi-smooth equations, and show how it can be solved efficiently. A key advantage of our Newton-based approach is that it allows the use of off-the-shelf linear solvers as the fundamental computational building block. This flexibility means we can choose to use accurate direct solvers, or take advantage of highly-optimized iterative solvers available for parallel architectures such as graphics-processing units (GPUs). In summary, our main contributions are:
A formulation of smooth isotropic Coulomb friction in terms of non-smooth complementarity functions.
A new complementarity preconditioner that significantly improves convergence for contact problems.
A generalized compliance formulation that supports hyperelastic material models.
A simple approximation of geometric stiffness to improve robustness without changing system dynamics.
2. Related Work
The seminal work of Jean and Moreau introduced an implicit time-stepping scheme for contact problems with friction. This work was further popularized by Stewart and Trinkle, who linearize the Coulomb friction cone and solve LCPs using Lemke’s method in a fixed-point iteration to handle nonlinear forces. We also make use of a fixed-point iteration, but in contrast to their work we use non-smooth functions to model nonlinear friction cones. Kaufman et al. proposed a method for implicit time-stepping of contact problems by solving two separate QPs for normal and frictional forces in a staggered manner. In our work we do not stagger our system updates, but solve for contact and friction forces in a combined system, and without using a linearized cone. In addition, rather than QP solves, our method requires only the solution of a symmetric linear system as a building block, allowing for the use of off-the-shelf linear solvers.
Relaxation methods such as projected Gauss-Seidel are popular in computer graphics thanks to their simplicity and robustness [Bender et al., 2014]. While robust, these methods suffer from slow convergence for poorly conditioned problems, e.g., those with high mass-ratios [Erleben, 2013]. In addition, Gauss-Seidel iteration suffers from order dependence and is challenging to parallelize, while Jacobi methods require modifications to ensure convergence [Tonge et al., 2012]. Daviet et al. used a change of variables to restate the Coulomb friction cone as a self-dual complementarity cone problem, followed by a modified Fischer-Burmeister reformulation to obtain a local non-smooth root-finding problem. In comparison, we work directly with friction cones expressed as limit surfaces, as we believe this provides more modeling freedom, e.g., for anisotropic or non-symmetric friction cones. Another difference is that we solve for all contacts simultaneously, rather than one-by-one.
Otaduy et al. presented an implicit time-stepping scheme for deformable objects that solves a mixed linear complementarity problem (MLCP) using a nested Gauss-Seidel relaxation over primal and dual variables. Prior work on simulating smooth friction models has used proximal-map projection operators [Jourdan et al., 1998; Jean, 1999; Erleben, 2017], which project contact forces onto the friction cone one contact at a time until convergence. There has been considerable work to address the slow convergence of relaxation methods: Mazhar et al. use a convexification of the frictional contact problem [Anitescu and Hart, 2004] to obtain a cone complementarity problem (CCP) and solve it using an accelerated version of projected gradient descent; Silcowitz et al. [2009; 2010] developed a method for solving LCPs based on non-smooth nonlinear conjugate gradient (NNCG) applied to a PGS iteration; and Francu et al. proposed an improved Jacobi method based on a projected conjugate residual (CR) method. We also make use of Krylov-space linear solvers; however, our Newton-based iteration is decoupled from the underlying linear backend, allowing the application of matrix-splitting relaxation methods, or even direct solvers.
Early work on Newton methods for contact problems used a formulation based on a generalized projection operator and an augmented Lagrangian approach to unilateral constraints [Alart and Curnier, 1991; Curnier and Alart, 1988]. Newton-based approaches found in libraries such as PATH [Dirkse and Ferris, 1995] have proved successful in practice, and have been applied to smooth Coulomb friction models in fiber assemblies in computer graphics [Bertails-Descoubes et al., 2011; Daviet et al., 2011; Kaufman et al., 2014]. These approaches formulate the complementarity problem in terms of non-smooth functions, and solve them with a generalized version of Newton’s method. This approach can yield quadratic convergence, although with a higher per-iteration cost than relaxation methods.
Todorov observed that when solving a nonlinear time-stepping problem, e.g., one arising from an implicit time-discretization, it makes little sense to perform a friction cone linearization, since the smooth contact model can be treated simply as an additional set of nonlinear equations in the system. This observation is at the heart of our method, but in contrast to their work we formulate friction in terms of arbitrary NCP-functions, and extend our framework to handle deformable bodies. Our approach is based on Newton’s method, and combined with our proposed preconditioner we show that it can handle considerably more ill-conditioned problems than relaxation methods, while naturally accommodating nonlinear friction models. For a review of numerical methods for linear complementarity problems we refer to the book by Niebe and Erleben. For a review of non-smooth methods applied to dynamics problems we refer to the book by Acary and Brogliato.
2.2. Coupled Systems
There has been considerable work in computer graphics on coupling between rigid and deformable bodies. Shinar et al. proposed a method for the coupled simulation of rigid and deformable bodies through a multi-stage update that performs collisions, contacts, and stabilization in separate passes. In contrast, we formulate a system update that includes elastic and contact dynamics in a single phase. Duriez showed real-time control of soft robots using a co-rotational FEM model and a Gauss-Seidel based constraint solver. Servin et al. introduced a compliant version of elasticity that fits naturally inside constrained rigid body simulators. Their work was extended by Tournier et al., who include a geometric stiffness term to improve stability. They use temporal coherence of Lagrange multipliers to build the system Jacobian and compute a Cholesky decomposition, followed by a projected Gauss-Seidel solve for contact. For smaller problems our approach is compatible with direct solvers; however, we avoid the requirement of dense matrix decompositions by using a diagonal geometric stiffness approximation inspired by the work of Andrews et al. This improves stability and allows us to apply iterative methods.
In this work we propose a generalized view of compliance, and give a recipe for constructing the compliance form of an arbitrary material model given its strain-energy density function in terms of principal stretches, strains, or another parameterization. As an example we show how to formulate the stable Neo-Hookean material proposed by Smith et al. Liu et al. propose a quasi-Newton method for hyperelastic materials based on Projective Dynamics [Bouaziz et al., 2014] and model contact through stiff penalty forces. In contrast, we model contact through complementarity constraints that naturally fit into existing multi-body simulations.
Table 1. Symbols used in this paper.
- Generalized system coordinates
- First time derivative of generalized coordinates
- Second time derivative of generalized coordinates
- Predicted or inertial system configuration
- Generalized force function
- Generalized velocity vector
- System mass matrix
- Geometric stiffness matrix
- Friction force basis
- Friction compliance matrix
- Kinematic mapping transform
- Material stiffness matrix
- Strain energy density function
- Bilateral constraint vector
- Normal contact constraint vector
- Frictional contact constraint vector
- Separation distance for a contact
- Normal contact NCP constraint vector
- Frictional contact NCP constraint vector
- Jacobian of bilateral constraints
- Jacobian of contact normal NCP constraints
- Jacobian of contact friction NCP constraints
- Lagrange multiplier vector for bilateral constraints
- Lagrange multipliers for normal contact constraints
- Lagrange multipliers for friction contact constraints
- Number of system degrees of freedom
- Number of bilateral constraints
- Number of contacts
At the most general level, the dynamics of the systems we wish to simulate can be described through the following second-order differential equation:

$$M(\mathbf{q})\,\ddot{\mathbf{q}} = \mathbf{f}(\mathbf{q}, \dot{\mathbf{q}}), \tag{1}$$

where, for a system with $n$ degrees of freedom, $M \in \mathbb{R}^{n \times n}$ is the system mass matrix, $\mathbf{f}$ is a generalized force vector, and $\mathbf{q}$ is a vector of generalized coordinates describing the system configuration, with $\dot{\mathbf{q}}$ and $\ddot{\mathbf{q}}$ its first and second time derivatives respectively. We refer the reader to Table 1 for a list of symbols used in this paper.
3.1. Hard Kinematic Constraints
We impose hard bilateral (equality) constraints on the system configuration through constraint functions of the form $\mathbf{b}(\mathbf{q}) = \mathbf{0}$. Using d’Alembert’s principle [Lanczos, 1970], the force due to such a constraint is of the form $\nabla \mathbf{b}^T \boldsymbol{\lambda}$, where $\nabla \mathbf{b} = \partial \mathbf{b} / \partial \mathbf{q}$ is the constraint’s gradient with respect to the system coordinates, and $\boldsymbol{\lambda}$ is a Lagrange multiplier variable used to enforce the constraint. Combining these algebraic constraints with the differential equation (1) gives the following differential algebraic equation (DAE):

$$M(\mathbf{q})\,\ddot{\mathbf{q}} = \mathbf{f}(\mathbf{q}, \dot{\mathbf{q}}) + \nabla \mathbf{b}^T \boldsymbol{\lambda}, \tag{2}$$
$$\mathbf{b}(\mathbf{q}) = \mathbf{0}. \tag{3}$$

For a system with $m$ equality constraints we define the constraint vector $\mathbf{b}(\mathbf{q}) \in \mathbb{R}^m$, its gradient $\nabla \mathbf{b} \in \mathbb{R}^{m \times n}$ with respect to system coordinates, and the vector of Lagrange multipliers $\boldsymbol{\lambda} \in \mathbb{R}^m$. In general, bilateral constraints may depend on velocity and time; we assume scleronomic constraints for the remainder of the paper to simplify the exposition.
3.2. Soft Kinematic Constraints
In addition to hard constraints, we make extensive use of the compliance form of elasticity [Servin et al., 2006]. This may be derived by considering a quadratic energy potential defined in terms of a constraint function $\mathbf{b}(\mathbf{q})$ and stiffness parameter $k$:

$$U(\mathbf{q}) = \frac{k}{2}\,\mathbf{b}(\mathbf{q})^T \mathbf{b}(\mathbf{q}).$$

The force due to this potential is then given by

$$\mathbf{f} = -\nabla U^T = -k\,\nabla \mathbf{b}^T \mathbf{b}(\mathbf{q}) = \nabla \mathbf{b}^T \boldsymbol{\lambda}.$$

Here we have introduced the variable $\boldsymbol{\lambda}$, defined as

$$\boldsymbol{\lambda} = -k\,\mathbf{b}(\mathbf{q}).$$

We can move all terms to one side to write this as a constraint equation,

$$\mathbf{b}(\mathbf{q}) + \alpha \boldsymbol{\lambda} = \mathbf{0},$$

where $\alpha = 1/k$ is the compliance, or inverse stiffness coefficient. We can incorporate this into our equations of motion by defining the diagonal matrix $\tilde{\alpha} = \mathrm{diag}(\alpha_1, \ldots, \alpha_m)$ and updating (3) as follows,

$$\mathbf{b}(\mathbf{q}) + \tilde{\alpha} \boldsymbol{\lambda} = \mathbf{0}.$$

This form is mathematically equivalent to including quadratic energy potentials defined in terms of a stiffness. The benefit of the compliance form is that it remains numerically stable even as $k \to \infty$ [Tournier et al., 2015]. We describe how to extend this model to handle general energy potentials in Section 9.
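To illustrate why the compliance form stays well behaved in the stiff limit, the following hypothetical one-dimensional sketch solves for the constraint multiplier through a Schur-complement style scalar system; the system shape used here is our illustrative assumption, not the paper's exact assembly.

```python
# Hypothetical 1D illustration: a particle of mass m is pulled toward the
# origin by a constraint b(q) = q with stiffness k. With alpha = 1/k, a
# Schur-complement style solve for the multiplier takes the scalar form
#     (h^2 * J * M^-1 * J^T + alpha) * lam = -b(q),   J = db/dq = 1,
# which remains well-defined as k -> infinity (alpha -> 0).
m, h, q = 1.0, 0.01, 0.5    # mass, time-step, current constraint error

def solve_lambda(alpha):
    return -q / (h * h / m + alpha)

lams = []
for k in (1e2, 1e6, 1e12, float('inf')):
    alpha = 0.0 if k == float('inf') else 1.0 / k
    lams.append(solve_lambda(alpha))
# the multiplier stays bounded even in the perfectly rigid limit,
# whereas the raw stiffness k itself overflows
```

In the stiffness-based formulation the same limit requires dividing by a vanishing compliance, which is exactly the numerical problem the compliance form avoids.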
We treat contacts as inelastic and prevent interpenetration between bodies through inequality constraints of the form

$$c(\mathbf{q}) = \mathbf{n}^T (\mathbf{x}_a - \mathbf{x}_b) - s \geq 0.$$

Here $\mathbf{n}$ is the contact plane normal; we use the normalized direction vector between closest points of triangle-mesh features as the normal vector. The points $\mathbf{x}_a$ and $\mathbf{x}_b$ may be points defined in terms of a rigid body frame, or particle positions in the case of a deformable body. The constant $s$ is a thickness parameter that represents the separation distance along the contact normal we wish to maintain. Non-zero values of $s$ can be used to model shape thickness, as illustrated in Figure 4. The normal force for a contact is given by $\nabla c^T \lambda$, with the additional Signorini-Fichera complementarity condition

$$0 \leq c(\mathbf{q}) \perp \lambda \geq 0,$$

which ensures contact forces only act to separate objects [Stewart, 2000]. We treat the contact normal as fixed in world-space. For a system with $n_c$ contacts, we define the vector of contact constraints as $\mathbf{c}(\mathbf{q}) \in \mathbb{R}^{n_c}$, their gradient $\nabla \mathbf{c}$, and the associated Lagrange multipliers as $\boldsymbol{\lambda}_n \in \mathbb{R}^{n_c}$. In our implementation contact constraints are created when body features come within some fixed distance of each other. This approach works well for reasonably small time-steps, but can lead to over-constrained configurations. More sophisticated non-interpenetration constraints can be formulated to avoid this problem [Williams et al., 2017].
We derive our friction model from a principle of maximal dissipation, which requires that frictional forces remove the maximum amount of energy from the system while having their magnitude bounded by the normal force [Stewart, 2000]. For each contact we define a two-dimensional basis $D = [\mathbf{d}_1, \mathbf{d}_2]$ spanning the contact plane. The contact basis vectors are lifted from spatial to generalized coordinates by the transform $\Gamma$, using the notation of Kaufman et al. The generalized frictional force for a single contact is then $D \boldsymbol{\lambda}_f$, where $\boldsymbol{\lambda}_f$ is the solution to the following minimization:

$$\boldsymbol{\lambda}_f = \underset{\|\mathbf{z}\| \leq \mu \lambda_n}{\arg\min}\; \mathbf{u}^T D \mathbf{z}. \tag{11}$$

Here $\mu$ is the coefficient of friction, and $\lambda_n$ is the Lagrange multiplier for the normal force at this contact, which for the moment we assume is known. This minimization defines an admissible cone in which the total contact force must lie, as illustrated in Figure 5. The Lagrangian associated with this minimization is

$$\mathcal{L}(\boldsymbol{\lambda}_f, \beta) = \mathbf{u}^T D \boldsymbol{\lambda}_f + \beta \left( \|\boldsymbol{\lambda}_f\| - \mu \lambda_n \right),$$

where $\beta \geq 0$ is a slack variable used to enforce the Coulomb constraint that the friction force magnitude is bounded by $\mu$ times the normal force magnitude. When $\mu \lambda_n > 0$ the problem satisfies Slater’s condition [Boyd and Vandenberghe, 2004] and we can use the first-order Karush-Kuhn-Tucker (KKT) conditions for (11), given by

$$D^T \mathbf{u} + \beta \frac{\boldsymbol{\lambda}_f}{\|\boldsymbol{\lambda}_f\|} = \mathbf{0}, \tag{13}$$
$$0 \leq \beta \perp \mu \lambda_n - \|\boldsymbol{\lambda}_f\| \geq 0. \tag{14}$$

Equation (13) requires that the frictional force act in a direction opposite to any relative tangential velocity. Equation (14) is a complementarity condition that describes the sticking and slipping behavior characteristic of dry friction. In the case that $\mu \lambda_n = 0$ Slater’s condition is not satisfied; however, in this case the only feasible point is $\boldsymbol{\lambda}_f = \mathbf{0}$, which we handle explicitly. We now make a transformation to remove the slack variable $\beta$. To do this we observe that the derivative of a vector’s length is the vector normalized, so we have $\partial \|\boldsymbol{\lambda}_f\| / \partial \boldsymbol{\lambda}_f = \boldsymbol{\lambda}_f / \|\boldsymbol{\lambda}_f\| = \hat{\boldsymbol{\lambda}}_f$, which by definition is a unit vector. Taking norms in (13) then gives $\beta = \|D^T \mathbf{u}\|$, and we can use this fact to express the optimality conditions in terms of $\boldsymbol{\lambda}_f$ and $\mathbf{u}$ only:

$$D^T \mathbf{u} + \|D^T \mathbf{u}\|\, \hat{\boldsymbol{\lambda}}_f = \mathbf{0}, \tag{15}$$
$$0 \leq \mu \lambda_n - \|\boldsymbol{\lambda}_f\| \perp \|D^T \mathbf{u}\| \geq 0. \tag{16}$$

Finally, we combine the frictional basis vectors and multipliers for all contacts in the system into a single matrix and a single vector.
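The maximal-dissipation principle has a simple closed form for the isotropic cone: the optimal force directly opposes the relative tangential velocity at the Coulomb bound. The sketch below (illustrative inputs and helper name, not the paper's solver) verifies this against sampled admissible forces.

```python
import numpy as np

def max_dissipation_force(u_t, mu, lam_n):
    # Closed-form minimizer of u_t . f over the disc ||f|| <= mu * lam_n:
    # friction opposes the relative tangential velocity and saturates
    # the Coulomb bound while sliding (isotropic cone sketch).
    bound = mu * lam_n
    speed = np.linalg.norm(u_t)
    if speed == 0.0:
        return np.zeros_like(u_t)   # sticking: the force is not unique
    return -bound * u_t / speed

u_t = np.array([0.3, -0.4])         # relative tangential velocity
f = max_dissipation_force(u_t, mu=0.8, lam_n=2.0)

# brute-force check: no admissible force on the bound dissipates more
rng = np.random.default_rng(0)
g = rng.normal(size=(1000, 2))
g *= 1.6 / np.linalg.norm(g, axis=1, keepdims=True)   # ||g|| = mu * lam_n
```

The sticking branch corresponds to the case handled by the complementarity condition above, where the force lies strictly inside the cone.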
3.5. Governing Equations
Assembling the above components, our continuous equations of motion are given by the following nonlinear system of equations:
where the active set contains all contact indices at which the normal contact force is active, and its complement contains the rest. This combination of a differential equation with equality and complementarity conditions is known as a differential variational inequality (DVI) [Stewart, 2000]. The coupling between the normal and frictional complementarity problems makes the problem non-convex, and in the case of implicit time-integration leads to an NP-hard optimization problem [Kaufman et al., 2008]. In the next section we develop a practical method to solve this problem.
4. Nonlinear Complementarity
One successful approach [Ferris and Munson, 2000] to solving nonlinear complementarity problems is to reformulate the problem in terms of an NCP-function $\phi$ whose roots satisfy the original complementarity conditions, i.e., functions where the following equivalence holds:

$$\phi(a, b) = 0 \iff a \geq 0, \quad b \geq 0, \quad ab = 0.$$

Combined with an appropriate time-discretization, such an NCP-function turns our DVI problem into a root-finding one. In general the functions $\phi$ are non-smooth, but they allow us to apply a wider range of numerical methods [Munson et al., 2001].
4.1. Minimum-Map Formulation
The first such function we consider is the minimum-map, defined as

$$\phi_{\min}(a, b) = \min(a, b).$$
The equivalence of this function to the original NCP can be verified by examining the values associated with each conditional case [Cottle, 2008]. We now show how this reformulation applies to our unilateral contact constraints. Recall that the complementarity condition associated with a contact constraint and its associated Lagrange multiplier is

$$0 \leq c(\mathbf{q}) \perp \lambda \geq 0.$$

We can write this in the equivalent minimum-map form as

$$\phi_{\min}(c(\mathbf{q}), \lambda) = 0,$$

which has the following derivatives,

$$\frac{\partial \phi_{\min}}{\partial \mathbf{q}} = \begin{cases} \nabla c & c(\mathbf{q}) < \lambda \\ \mathbf{0} & \text{otherwise} \end{cases}, \qquad \frac{\partial \phi_{\min}}{\partial \lambda} = \begin{cases} 0 & c(\mathbf{q}) < \lambda \\ 1 & \text{otherwise.} \end{cases}$$
From these cases we can see that the minimum-map gives rise to an active-set style method, where a contact is considered active if $c(\mathbf{q}) < \lambda$. Active contacts are treated as equality constraints, while for inactive contacts the minimum-map enforces that the constraint’s Lagrange multiplier is zero.
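The case analysis above can be verified mechanically; the short script below (with our own `min_map` and `satisfies_ncp` helper names) checks on a grid of values that roots of the minimum-map coincide with solutions of the complementarity conditions.

```python
def min_map(a, b):
    # minimum-map NCP function:
    # min_map(a, b) == 0  exactly when  a >= 0, b >= 0 and a * b == 0
    return a if a < b else b

def satisfies_ncp(a, b, tol=1e-12):
    return a >= -tol and b >= -tol and abs(a * b) <= tol

# exhaustive check on a small grid
vals = [x * 0.5 for x in range(-4, 5)]
agree = all((abs(min_map(a, b)) < 1e-12) == satisfies_ncp(a, b)
            for a in vals for b in vals)
```

The branch taken inside `min_map` mirrors the active-set interpretation: the first argument (the constraint) wins when the contact is active, the second (the multiplier) when it is inactive.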
4.2. Fischer-Burmeister Formulation
An alternative NCP-function is given by Fischer and Burmeister, who observe that the roots of the following equation satisfy complementarity:

$$\phi_{FB}(a, b) = a + b - \sqrt{a^2 + b^2}.$$

The Fischer-Burmeister function is interesting because, unlike the minimum-map, it is smooth everywhere apart from the point $(a, b) = (0, 0)$. For $(a, b) \neq (0, 0)$ the partial derivatives of the Fischer-Burmeister function are given by:

$$\frac{\partial \phi_{FB}}{\partial a} = 1 - \frac{a}{\sqrt{a^2 + b^2}}, \qquad \frac{\partial \phi_{FB}}{\partial b} = 1 - \frac{b}{\sqrt{a^2 + b^2}}.$$
At the point $(0, 0)$ the derivative is set-valued (see Figure 8). For Newton methods it suffices to choose any value from this subgradient. Erleben compared how the choice of derivative at the non-smooth point affects convergence for LCP problems and found no overall best strategy. Thus, for simplicity, we make an arbitrary choice from the subgradient.
For a contact constraint $c(\mathbf{q})$ with Lagrange multiplier $\lambda$ we may then define our contact function alternatively as

$$\phi_{FB}(c(\mathbf{q}), \lambda) = 0,$$

with derivatives given by

$$\frac{\partial \phi_{FB}}{\partial \mathbf{q}} = \left( 1 - \frac{c}{\sqrt{c^2 + \lambda^2}} \right) \nabla c, \qquad \frac{\partial \phi_{FB}}{\partial \lambda} = 1 - \frac{\lambda}{\sqrt{c^2 + \lambda^2}}.$$
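A direct transcription of the Fischer-Burmeister function and its derivatives follows; the subgradient element returned at the origin is our own arbitrary pick, since, as noted above, any element suffices.

```python
import math

def fischer_burmeister(a, b):
    # phi(a, b) = a + b - sqrt(a^2 + b^2); zero iff 0 <= a perp b >= 0
    return a + b - math.hypot(a, b)

def fb_gradient(a, b):
    # partial derivatives away from the origin; at (0, 0) the gradient
    # is set-valued, and we return one arbitrary subgradient element
    # (this specific choice is our assumption, not the paper's)
    r = math.hypot(a, b)
    if r == 0.0:
        return (1.0, 0.0)
    return (1.0 - a / r, 1.0 - b / r)
```

Note that away from the origin both partial derivatives lie in [0, 2], which keeps the generalized Newton iteration well scaled.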
4.3. Frictional Constraints
We now formulate our friction model in terms of non-smooth functions. Our first step is to rewrite the optimality conditions (15)-(16) in terms of an NCP-function. For a single contact with $\mu \lambda_n > 0$ the following must hold:

$$\phi\left( \mu \lambda_n - \|\boldsymbol{\lambda}_f\|,\; \|D^T \mathbf{u}\| \right) = 0,$$

where $\phi$ may be either the minimum-map or the Fischer-Burmeister function. We now introduce a new quantity with the goal of linearizing the relationship between the frictional multipliers $\boldsymbol{\lambda}_f$ and the relative tangential velocity $D^T \mathbf{u}$, such that upon convergence of a fixed-point iteration the initial complementarity problem is solved. We do this by defining a fixed-point iteration based on the relationship between $\boldsymbol{\lambda}_f$ and $D^T \mathbf{u}$ given by (15).
By construction, a fixed-point of this iteration satisfies the original complementarity condition. We use these expressions as replacements for in (31), by defining the following coefficient,
Inserting this into (31) allows us to write it in the compact form,
Here the coefficient can be interpreted as a compliance term that works to project the frictional force back onto the friction cone, as illustrated in Figure 6, and the derivatives of the frictional constraint follow by treating this coefficient as a constant. Replacing the complementarity condition by a fixed-point iteration ensures that we obtain a symmetric system of equations in the following section. In Appendix A we derive the exact form of this coefficient for both minimum-map and Fischer-Burmeister functions.
We note that conical equivalents of the Fischer-Burmeister function exist and have been used to model smooth isotropic friction [Fukushima et al., 2002; Daviet et al., 2011]. One advantage of our method being based on a fixed-point iteration is that it can be extended to arbitrary friction surfaces, e.g., anisotropic or even non-symmetric friction cones.
5. Implicit Time-Stepping
The continuous equations of motion are expressed purely in terms of our generalized coordinates, their derivatives, and the Lagrange multipliers. Although it is possible to discretize and solve these equations for the coordinates directly, it is convenient to introduce an additional re-parameterization in terms of a generalized velocity vector $\mathbf{u}$:

$$\dot{\mathbf{q}} = G(\mathbf{q})\,\mathbf{u}.$$

Here $G$ is referred to as the kinematic map, and its components depend on the choice of system coordinates. For simple particles $G$ is an identity transform; in Section 9.4 we discuss the mapping of a rigid body’s angular velocity to the corresponding quaternion time-derivative. We use the kinematic map to obtain the corresponding generalized mass matrix $G^T M G$, and define the bilateral constraint Jacobian with respect to the generalized velocity as $J_b = \nabla \mathbf{b}\, G$.
We group the normal and friction NCP-functions for all contacts into two vectors, and likewise define their Jacobians $J_n$ and $J_f$ with respect to the generalized velocity.
Using a first-order backward time-discretization of the kinematic map gives the following update for the system’s generalized coordinates in terms of generalized velocities over a time-step of length $h$:

$$\mathbf{q}^{t+1} = \mathbf{q}^t + h\, G\, \mathbf{u}^{t+1}.$$

The superscripts $t+1$ and $t$ represent a variable’s value at the end and beginning of the time-step respectively. Discretizing our continuous equations of motion from the previous section gives the implicit time-stepping equations, in which the velocities and multipliers at the end of the time-step are the unknowns. The unconstrained velocity, which includes the external and gyroscopic forces integrated explicitly, is treated as a constant. Observe that through this time-discretization the original force-level model of friction has changed into an impulsive model.
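As a minimal illustration of this velocity-level update, consider a one-dimensional particle with a linear spring force integrated with backward Euler; the constants are toy values, not the paper's full multi-body system.

```python
# Sketch of the implicit update q_next = q + h * u_next for a linear
# spring force f = -k * q, with the force evaluated at the end of step:
#     m (u_next - u) / h = -k (q + h * u_next)
# =>  (m + h^2 k) u_next = m u - h k q
m, k, h = 1.0, 100.0, 0.1
q, u = 1.0, 0.0
for _ in range(1000):
    u = (m * u - h * k * q) / (m + h * h * k)
    q = q + h * u
# backward Euler is dissipative but unconditionally stable: the state
# decays toward rest, whereas explicit Euler would diverge at this
# stiffness and step size
```

The implicit solve is what allows the large, stable time-steps that make impulsive contact resolution over a single step practical.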
We highlight a few differences from common formulations. First, the equality and inequality constraints have not been linearized through an index reduction step [Hairer and Wanner, 2010]. Index reduction is a common practice that reduces the order of the DAE by solving only for the constraint time-derivatives, e.g., enforcing $\dot{\mathbf{b}} = \mathbf{0}$ in place of $\mathbf{b} = \mathbf{0}$. Index reduction results in a linear problem, but requires additional stabilization terms to combat drift and move the system back to the constraint manifold. These stabilization terms are often applied as the equivalent of penalty forces [Ascher et al., 1995] and are a known source of instability and tuning issues. Work has been done to add post-stabilization passes [Cline and Pai, 2003]; however, these require solving nonlinear projection problems, which gives up the primary benefit of performing the index reduction step. Our discrete equations of motion are also nonlinear, but they require no additional stabilization terms or projection passes.
A second point we highlight is that the friction cone defined through (11) has not been linearized into a faceted cone as is commonly done [Stewart and Trinkle, 2000; Kaufman et al., 2008]. The faceted cone approximation provides a simpler linear complementarity problem (LCP), but increases the number of Lagrange multipliers required (one per-facet), and introduces an approximation bias where frictional effects are stronger in some directions than others. In the next section we propose a method that solves the NCP corresponding to smooth isotropic friction using a fixed number of Lagrange multipliers.
The implicit time-discretization above treats contact forces as impulsive, meaning they are able to prevent sliding and interpenetration instantaneously (over a single time-step in the discrete setting). This avoids the inconsistency raised by Painlevé in the continuous setting.
6. Non-Smooth Newton
We now develop a method to solve the discretized equations of motion (46)-(50). Our starting point is Newton’s method, which we briefly review here. Given a set of nonlinear equations $F(\mathbf{x}) = \mathbf{0}$ in terms of a vector-valued variable $\mathbf{x}$, Newton’s method gives a fixed-point iteration of the following form:

$$\mathbf{x}^{k+1} = \mathbf{x}^k - A^{-1} F(\mathbf{x}^k), \tag{51}$$

where $k$ is the Newton iteration index. Newton’s method specifically chooses $A$ to be the matrix of partial derivatives evaluated at the current system state, i.e.,

$$A = \frac{\partial F}{\partial \mathbf{x}} \bigg|_{\mathbf{x}^k}.$$
Although Newton’s method is most commonly used for solving systems of smooth functions, it may also be applied to non-smooth functions by generalizing the concept of a derivative [Pang, 1990]. Qi and Sun showed that Newton’s method will converge for non-smooth functions provided $A$ is taken from the generalized Jacobian of $F$ at $\mathbf{x}$ defined by Clarke. Intuitively, this is the convex hull of all directional derivatives at the non-smooth point, as illustrated in Figure 8. The derivatives of our NCP-functions given in the previous section belong to this subgradient, and we can use them in the fixed-point iteration of (51) directly. Algorithms of this type are sometimes referred to as semismooth methods; we refer the reader to the article by Hintermüller for a more mathematical introduction and the precise definition of semismoothness. In many cases $A$ is not inverted directly, and the linear system for the Newton step may only be solved approximately. When this is the case we refer to the method as an inexact Newton method. Additionally, when the partial derivatives in $A$ are also approximated we refer to it as a quasi-Newton method. In our method we make use of both approximations.
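To make the non-smooth Newton iteration concrete, the following self-contained sketch applies it to a hypothetical one-dimensional contact problem, using the minimum-map NCP function and selecting a generalized Jacobian branch at the kink; the model and variable names are ours for illustration, not the paper's system assembly.

```python
def solve_contact(m, h, q, u_pred, iters=20):
    # Semi-smooth Newton on a toy 1D contact problem:
    #   F1 = m * (u - u_pred) - lam    momentum balance, impulse lam
    #   F2 = min(q + h * u, lam)       minimum-map contact NCP
    # The generalized Jacobian picks whichever branch of min is active.
    u, lam = u_pred, 0.0
    for _ in range(iters):
        c = q + h * u
        F1 = m * (u - u_pred) - lam
        F2 = min(c, lam)
        if abs(F1) < 1e-12 and abs(F2) < 1e-12:
            break
        if c < lam:
            a21, a22 = h, 0.0     # constraint branch active
        else:
            a21, a22 = 0.0, 1.0   # multiplier branch active
        # 2x2 Newton system [[m, -1], [a21, a22]] @ delta = -F (Cramer)
        rhs0, rhs1 = -F1, -F2
        det = m * a22 + a21
        du = (rhs0 * a22 + rhs1) / det
        dlam = (m * rhs1 - a21 * rhs0) / det
        u, lam = u + du, dlam + lam
    return u, lam

# particle resting on the ground (q = 0), falling at speed 2: the contact
# impulse lam = 2 exactly cancels the momentum, giving zero velocity
u, lam = solve_contact(m=1.0, h=0.1, q=0.0, u_pred=-2.0)
```

Because the equations are piecewise linear, the iteration converges in a couple of steps here; the same structure scales to the full multi-body system where the inner 2x2 solve becomes the large linear solve discussed below.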
6.1. System Assembly
The geometric stiffness matrix arising from the spatial derivatives of constraint forces is discussed in Section 8.3. The lower-diagonal blocks of the system are the derivatives of our contact functions with respect to their Lagrange multipliers.
Grouping the sub-block components, we can write the system compactly, with the right-hand side given by a vector of constraint errors. Here the left-hand side matrix should be considered as acting block-wise on the constraint error. Since the frictional constraints are measured at the velocity level, they are not scaled by the time-step like the positional constraints.
The mass block matrix is evaluated at the beginning of the time-step, while the geometric stiffness matrix is evaluated each Newton iteration as described in Section 8.3. The friction compliance block is also evaluated at each Newton iteration, using the current Lagrange multipliers. This means the frictional forces are bounded by the normal force from the previous Newton iteration. This is similar to the fixed-point iteration described by Stewart and Trinkle, but with a smooth friction cone. It is also similar in principle to Kaufman et al.’s Staggered Projections, with the difference that we update both normal and frictional forces during a single Newton iteration, rather than solving two separate minimizations. For inactive contacts we conceptually disable the frictional constraint equation rows by removing them from the system. In practice this can be performed by zeroing the corresponding rows, to avoid changing the system matrix structure.
The system (56) is a saddle-point problem [Benzi et al., 2005] that is indefinite and possibly singular. To obtain a reduced positive semi-definite system, we take the Schur-complement with respect to the mass block.
After solving this reduced system for the Lagrange multipliers, the velocity change may be evaluated directly and the system updated accordingly, scaled by a step-size determined by line-search or other means (see Section 8). We refer to (59) as the Newton compliance formulation. Under certain conditions we could alternatively have taken the Schur-complement of (56) with respect to the compliance block instead of the mass block. This transformation is only possible if the compliance block is non-singular, and it leads to what we call the Newton stiffness formulation.
This form is closely related to that of Projective Dynamics [Bouaziz et al., 2014], and arises from a linearization of the elastic forces due to a quadratic energy potential. An invertible compliance block corresponds to having compliance everywhere, or in other words, no perfectly hard constraints. For stiff materials, where the compliance block is nearly singular, this approach leads to numerical problems when forming its inverse. However, if the system has fewer degrees of freedom than constraints, this transformation can result in a smaller system. In this work we are interested in methods that combine stiff constraints with deformable bodies, so we use the compliance form, which accommodates both.
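As an illustration of this reduction, the sketch below compares a direct solve of a small compliant saddle-point system against its Schur-complement reduction; the block layout and sign conventions are generic assumptions for illustration and need not match (56) and (59) exactly.

```python
import numpy as np

# Generic compliant saddle-point system and its mass-block reduction:
#   [ M   J^T ] [ du  ]   [ f ]       (J M^-1 J^T + C) lam = J M^-1 f - b
#   [ J   -C  ] [ lam ] = [ b ]  =>   du = M^-1 (f - J^T lam)
rng = np.random.default_rng(1)
M = np.diag([2.0, 3.0, 5.0])        # SPD mass block
J = rng.normal(size=(2, 3))         # constraint Jacobian
C = np.diag([1e-3, 1e-3])           # compliance block
f = rng.normal(size=3)
b = rng.normal(size=2)

# full indefinite solve
K = np.block([[M, J.T], [J, -C]])
x_full = np.linalg.solve(K, np.concatenate([f, b]))

# reduced symmetric positive-definite solve
Minv = np.linalg.inv(M)
S = J @ Minv @ J.T + C              # Schur complement of the mass block
lam = np.linalg.solve(S, J @ Minv @ f - b)
du = Minv @ (f - J.T @ lam)
```

The reduced matrix is symmetric positive definite whenever the compliance block is, which is what permits the iterative solvers discussed later.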
7. Complementarity Preconditioning
An interesting property of the non-smooth complementarity formulations is that their solutions are invariant up to a constant positive scale applied to either of the arguments: a solution to $\phi(a, b) = 0$ is also a solution to the scaled problem $\phi(a, r\,b) = 0$ for any $r > 0$. Alart presented an analysis of the optimal choice of this scaling in the context of linear elasticity in a quasistatic setting. Erleben also explored this free parameter in the context of proximal-map operators for rigid bodies.
In this section we propose a new complementarity preconditioning strategy that improves convergence by choosing the scaling factor based on the system properties and discrete time-stepping equations. To motivate our method, we make the observation that the two sides of a complementarity condition typically have different physical units. For example, a contact NCP function takes a constraint value measured in meters as its first argument and a force measured in Newtons as its second. This mismatch can lead to poor convergence, in a manner similar to the effect of row-scaling in traditional iterative linear solvers.
Inspired by the use of diagonal preconditioners for linear solvers, our idea is to use the effective system mass and time-stepping equations to put both sides of the complementarity condition into the same units. The appropriate scaling factor comes from the relation between the Lagrange multipliers and our constraint functions, given through the equations of motion. Intuitively, for a unilateral constraint the scaling is the time-step scaled effective mass of the system, which relates how a change in Lagrange multiplier affects the corresponding constraint value. This has the effect of making both sides of the complementarity equation have the same slope, as illustrated in Figure 9. For a friction constraint the appropriate scaling instead follows from the velocity-force relationship. When using a Jacobi preconditioner the diagonal of the system matrix is already computed, so applying the scaling to the NCP function incurs little overhead. We discuss the effect of preconditioning strategies in Section 10.2.
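The legitimacy of this rescaling rests on the invariance property noted at the start of this section, which can be checked directly with a small script; the `min_map` helper name is our own.

```python
def min_map(a, b):
    # minimum-map NCP function (local helper)
    return a if a < b else b

# rescaling one argument by any positive factor leaves the root set
# unchanged: min(a, r * b) == 0 iff min(a, b) == 0 for all r > 0, so
# the scaling factor is free to be chosen for numerical benefit
cases = [(0.0, 3.0), (2.0, 0.0), (0.0, 0.0),
         (-1.0, 2.0), (1.0, 1.0), (2.0, -1.0)]
invariant = all((min_map(a, r * b) == 0.0) == (min_map(a, b) == 0.0)
                for r in (0.5, 1.0, 250.0) for a, b in cases)
```

The preconditioner therefore changes the local slope seen by the Newton iteration without moving the solution.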
8.1. Line Search and Starting Iterate
We implemented a back-tracking line search based on a merit function defined as the norm of our residual vector. For frictionless contact problems we found this worked well to globalize the solution. However, for frictional problems line search would often cause the iteration to stall and make no progress. We believe this is because the frictional problem is non-convex and our search direction is not necessarily a descent direction. Instead, all our results use a simple damped Newton strategy that accepts a constant fraction of the full Newton step, scaled by a factor $\alpha$. This strategy may cause a temporary increase in the merit function, but can result in better overall convergence [Maratos, 1978]. Watchdog strategies [Nocedal and Wright, 2006] may be employed to make stronger convergence guarantees; however, we did not find them necessary. We use the same value of $\alpha$ for all examples unless otherwise specified. This value is ad hoc, but we have not found our method to be particularly sensitive to its setting.
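A damped Newton loop of this kind can be sketched as follows. The residual/Jacobian callbacks and the dense solve stand in for the paper's system assembly, and the default step fraction is a placeholder, not the paper's value:

```python
import numpy as np

def damped_newton(residual, jacobian, x0, alpha=0.5, iters=20):
    """Damped Newton: accept a constant fraction alpha of each full
    Newton step instead of performing a back-tracking line search."""
    x = x0.astype(float)
    for _ in range(iters):
        r = residual(x)
        # Full Newton step from the linearized system J(x) dx = -r(x).
        dx = np.linalg.solve(jacobian(x), -r)
        x += alpha * dx  # fixed step fraction; no line search
    return x
```

With alpha < 1 the merit function may rise temporarily, but the iteration avoids the stalling that back-tracking can cause on non-convex frictional problems.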
Starting iterates have a strong effect on most optimization methods, and ours is no different. Although it is common to initialize solvers with the unconstrained velocity at the start of the time-step, we found the most robust method was to use the zero-velocity solution as a starting iterate, i.e., $\mathbf{v}^{0} = \mathbf{0}$, $\boldsymbol{\lambda}^{0} = \mathbf{0}$.
This choice is robust in reinforcement-learning settings, where large random external torques applied to bodies can lead to initial points far from the solution. Warm-starting the constraint forces is possible, and our tests indicate it can give a good improvement in efficiency; however, due to the additional book-keeping we have not used warm-starting in our reported results.
8.2. Preconditioned Conjugate Residual
A key advantage of our non-smooth formulation is that it allows black-box linear solvers to be used for nonlinear complementarity problems. Nevertheless, the choice of solver is still an important factor that affects performance and robustness. A common issue we encountered is that even simple contact problems may lead to singular Newton systems. Consider, for example, a tessellated cylinder resting flat on the ground. In a typical simulator a ring of contact points would be created, leading to an under-determined system, i.e., there are infinitely many contact force magnitudes that are valid solutions. This indeterminacy results in a singular Newton system and causes problems for many common linear solvers. The ideal numerical method should be insensitive to such poorly conditioned problems.
In addition, we seek a method that allows solving each Newton system only inexactly. Since the linearized system is only an approximation, it would be wasteful to solve it accurately far from the solution. Thus, another trait we seek is a method that smoothly and monotonically decreases the residual, so that it can be terminated early, e.g., after a fixed computational time budget. Finally, since our application often requires real-time updates, we also look for a method that is amenable to parallelization.
The conjugate gradient (CG) method [Hestenes and Stiefel, 1952] is a popular method for solving linear systems in computer graphics. It is a Krylov subspace method that minimizes the quadratic energy $E(\mathbf{x}) = \frac{1}{2}\mathbf{x}^T\mathbf{A}\mathbf{x} - \mathbf{b}^T\mathbf{x}$. One side effect of this structure is that the residual $\|\mathbf{b} - \mathbf{A}\mathbf{x}\|$ is not monotonically decreasing. This behavior leads to problems with early termination since, although a given iterate may be closer to the solution, the true residual may actually be larger. This manifests as constraint error changing unpredictably between iterations.
A related method that does not suffer from this problem is the conjugate residual (CR) algorithm [Saad, 2003]. It is a Krylov subspace method similar to CG; however, unlike CG, each CR iteration minimizes the residual
$$\|\mathbf{r}_k\| = \|\mathbf{b} - \mathbf{A}\mathbf{x}_k\|$$
within the Krylov subspace $\mathcal{K}_k(\mathbf{A}, \mathbf{b})$. A remarkable side effect of this minimization property is that CR is monotonically decreasing in the residual norm $\|\mathbf{r}_k\|$ and monotonically increasing in the solution norm $\|\mathbf{x}_k\|$ [Fong and Saunders, 2012]. These are exactly the properties we would like for an inexact Newton method. From a computational viewpoint CR is nearly identical to CG, requiring one matrix-vector multiply and two vector inner products per iteration. CR requires an additional vector multiply-add per iteration, but each stage is fully parallelizable, and in our experience we did not observe any performance difference from standard CG.
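A minimal CR solver for symmetric systems, following Saad's formulation, might look like the following sketch (not the paper's implementation):

```python
import numpy as np

def conjugate_residual(A, b, x0=None, iters=50, tol=1e-10):
    """Conjugate residual (CR) for a symmetric matrix A.

    Unlike CG, the residual norm ||b - A x|| decreases monotonically,
    which makes early termination after a fixed budget safe."""
    x = np.zeros_like(b) if x0 is None else x0.astype(float)
    r = b - A @ x
    p = r.copy()
    Ar = A @ r
    Ap = Ar.copy()
    rAr = r @ Ar
    for _ in range(iters):
        denom = Ap @ Ap
        if denom < tol:          # converged (or breakdown): stop
            break
        alpha = rAr / denom
        x = x + alpha * p
        r = r - alpha * Ap
        Ar = A @ r               # the single mat-vec per iteration
        rAr_new = r @ Ar
        beta = rAr_new / rAr
        rAr = rAr_new
        p = r + beta * p
        Ap = Ar + beta * Ap      # recurrence avoids a second mat-vec
    return x
```

Each iteration performs a single matrix-vector product (the update of `Ap` reuses `A @ r`), and the residual norm never increases, so the loop can be cut off at any fixed iteration or time budget.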
In Figure 11 we compare the convergence of four common linear solvers: Jacobi, Gauss-Seidel, preconditioned conjugate gradient (PCG), and preconditioned conjugate residual (PCR). We use a diagonal preconditioner for both PCR and PCG. Our surprising finding is that PCR often achieves an order of magnitude lower residual for the same iteration count compared to the other methods. A similar result was reported by Fong and Saunders [2012] in the numerical optimization literature. For symmetric positive definite (SPD) systems CR generates the same iterates as MINRES; however, it does not handle semidefinite or indefinite problems in general. In practice we observed that CR will converge on near-singular contact systems where CG and even direct Cholesky solvers may fail. We evaluate and discuss the effect of the linear solver on a number of test cases in Section 10.4.
8.3. Geometric Stiffness
The upper-left block of the system matrix consists of the mass matrix $\mathbf{M}$ and a second term $\mathbf{K}$, referred to as the geometric stiffness of the system. This is defined as:
$$\mathbf{K} = \frac{\partial\left(\mathbf{J}^T\boldsymbol{\lambda}\right)}{\partial\mathbf{x}} = \sum_i \lambda_i \frac{\partial^2 C_i}{\partial\mathbf{x}^2}.$$
$\mathbf{K}$ is not a physical quantity in the sense that it does not appear in the continuous or discrete equations of motion. It appears only as a side effect of the numerical method, in this case the linearization performed at each Newton iteration. In practice $\mathbf{K}$ is often dropped from the system matrix, leading to a quasi-Newton method. Tournier et al. [2015] observed that ignoring geometric stiffness can lead to instability: the Newton search direction is free to move the iterate away from the constraint manifold in the transverse direction, eventually causing the iteration to diverge. However, including $\mathbf{K}$ directly is also problematic, because the mass block is no longer simple block-diagonal, and in addition, the possibly indefinite constraint Hessian restricts the numerical methods that can be applied. Inspired by the work of Andrews et al. [2017], we introduce an approximation of geometric stiffness that improves the robustness of our Newton method without these drawbacks.
| Example | Bodies (#) | Joints (#) | Tetra (#) | Contacts # (island) | Newton Iters (#) | Linear Iters (#) | Assembly Avg (ms) | Assembly Std. Dev. (ms) | Solve Avg (ms) | Solve Std. Dev. (ms) |
|---|---|---|---|---|---|---|---|---|---|---|
| Allegro DexNet | 21 | 21 | 0 | 104 (104) | 8 | 20 | 0.4 | 0.1 | 10.0 | 1.5 |
| Allegro Ball | 21 | 21 | 427 | 409 (409) | 6 | 50 | 0.35 | 0.05 | 11.4 | 1.2 |
| Fetch Tomato | 29 | 28 | 319 | 371 (371) | 4 | 50 | 0.45 | 0.05 | 12.2 | 1.25 |
| Fetch Beam | 42 | 42 | 0 | 300 (300) | 6 | 50 | 0.2 | 0.1 | 18.6 | 2.25 |
| Table Pile | 54 | 0 | 0 | 936 (736) | 4 | 20 | 0.1 | 0.05 | 3.7 | 0.95 |
| Humanoid Run | 5200 | 4800 | 0 | 10893 (33) | 4 | 10 | 0.8 | 0.05 | 8.9 | 0.25 |
| Yumi Cabinet | 6400 | 6600 | 0 | 4082 (55) | 5 | 25 | 1.8 | 0.25 | 56.0 | 6.4 |
We now present a method for approximating $\mathbf{K}$ that is related to the method of Broyden [1965], which builds the Jacobian iteratively through successive finite differences. We apply this idea to build a diagonal approximation to the geometric stiffness matrix. Using the last two Newton iterates, we can define the following first-order approximation:
$$\tilde{\mathbf{K}}\,\Delta\mathbf{x} = \Delta\mathbf{f}, \quad (71)$$
where $\tilde{\mathbf{K}}$ is the unknown matrix we wish to find, $\Delta\mathbf{x}$ is the known difference between the last two iterates, and $\Delta\mathbf{f}$ is the corresponding change in constraint force. This problem is under-determined, since $\tilde{\mathbf{K}}$ has $n^2$ entries while (71) only provides $n$ equations; however, if we assume $\tilde{\mathbf{K}}$ has the special form
$$\tilde{\mathbf{K}} = \operatorname{diag}(k_1, \ldots, k_n),$$
a diagonal approximation to $\mathbf{K}$, then the individual entries of $\tilde{\mathbf{K}}$ are easily determined from (71) by examining each entry:
$$k_i = \frac{\Delta f_i}{\Delta x_i}.$$
Each Newton iteration we then update the mass block's diagonal entries to form our approximate system matrix as
$$\tilde{\mathbf{M}} = \mathbf{M} - \Delta t^2\,\tilde{\mathbf{K}}.$$
For most models $\mathbf{M}$ is block-diagonal, e.g., 3×3 blocks for rigid body inertias, so its inverse may be easily computed to form the Schur complement. To ensure the mass block remains positive-definite we clamp the shift to be positive. This diagonal approximation is inspired by the method presented by Andrews et al. [2017]; however, our approach has several advantages. First, we do not have to derive or explicitly evaluate the constraint Hessians for different constraint types, meaning our method works automatically for contacts and complex deformable models. Second, we do not need to track the Lagrange multipliers from the previous frame, as our method updates the geometric stiffness at each Newton iteration. Finally, since our method does not change the solution to the problem, it does not introduce any additional damping, provided the solver is run to sufficient convergence. Our approach bears considerable resemblance to an L-BFGS method [Nocedal, 1980]; however, the use of a diagonal update means the system matrix structure is constant, allowing us to use the efficient block-diagonal inverse for the Schur complement.
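A sketch of this diagonal secant update is shown below. The function name is illustrative, and the sign convention (the mass block is shifted by a clamped non-negative amount so it stays positive-definite) is our reading of the text:

```python
import numpy as np

def mass_diagonal_shift(x_prev, x_curr, f_prev, f_curr, dt, eps=1e-8):
    """Broyden-style diagonal estimate of geometric stiffness.

    Returns a per-DOF shift for the mass-block diagonal, computed from
    the last two Newton iterates and clamped to be non-negative."""
    dx = x_curr - x_prev
    df = f_curr - f_prev
    k = np.zeros_like(dx)
    mask = np.abs(dx) > eps          # avoid division by near-zero steps
    k[mask] = df[mask] / dx[mask]    # secant slope k_i ~= df_i / dx_i
    # Clamp so the shifted mass block remains positive-definite.
    return np.maximum(-dt * dt * k, 0.0)
```

Because only the diagonal changes, the sparsity structure of the system matrix is constant across iterations, which keeps the block-diagonal inverse used for the Schur complement cheap.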
Due to contact and constraint redundancy it is often the case that the system matrix is singular; e.g., for a chair resting with four legs on the ground the contact constraints are linearly dependent, meaning the system is underdetermined and the Newton matrix becomes singular. To improve conditioning we optionally apply an identity shift $\epsilon\mathbf{I}$ to the system matrix (59). This type of regularization bears some resemblance to implicit penalty methods; however, unlike penalty-based approaches it does not change the underlying physical model. The regularization only applies to the error at each Newton iteration, and the solution approaches the original one as the Newton iterations progress. For the examples shown here we set $\epsilon$ to a small positive constant, which is equivalent to adding a small amount of compliance to the system matrix.
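The identity-shift regularization amounts to solving a slightly perturbed system at each Newton iteration; a minimal sketch, where the value of eps is illustrative:

```python
import numpy as np

def regularized_solve(A, b, eps=1e-8):
    """Solve (A + eps*I) x = b.

    Usable even when A itself is singular, e.g. due to redundant
    contact constraints; the perturbation adds a small compliance."""
    return np.linalg.solve(A + eps * np.eye(A.shape[0]), b)
```

For a consistent but singular system the perturbed solution still satisfies the original equations to within O(eps), which is why the shift does not alter the converged physical solution.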
In the following sections we show how to generalize the compliance form of elasticity and discuss some of the physical models we have used in our examples.
9.1. Generalized Compliance
To extend our compliance form of elasticity to arbitrary material models and forces, we reinterpret the compliance transformation as a force factorization. Specifically, given a force $\mathbf{f}$ we factor it as
$$\mathbf{f}(\mathbf{x}) = -\mathbf{k}(\mathbf{x})\,\mathbf{c}(\mathbf{x}),$$
where $\mathbf{k}$ and $\mathbf{c}$ may be scalar, vector, or matrix valued functions. The only requirements we place on the factors are that