Augmenting Differentiable Simulators with Neural Networks to Close the Sim2Real Gap

07/12/2020 ∙ by Eric Heiden, et al. ∙ Google

We present a differentiable simulation architecture for articulated rigid-body dynamics that enables the augmentation of analytical models with neural networks at any point of the computation. Through gradient-based optimization, identification of the simulation parameters and network weights is performed efficiently in preliminary experiments on a real-world dataset and in sim2sim transfer applications, while poor local optima are overcome through a random search approach.


I Introduction and Related Work

Simulators are crucial tools for planning and control algorithms to tackle difficult real-world robotics problems. In many cases, however, such models diverge from reality in important ways, leading to algorithms that work well in simulation but fail in reality. Closing this sim2real gap has gained significant interest, and various dynamics modeling approaches have been proposed (Figure 2 left).

Various methods learn system dynamics from time series data of a real system. Such “intuitive physics” models often use deep graph neural networks to discover constraints between particles or bodies [5, 34, 14, 30, 28, 7, 24, 31]. We propose a general-purpose hybrid simulation approach that combines analytical models of dynamical systems with data-driven residual models that learn parts of the dynamics unaccounted for by the analytical simulation models.

Originating from traditional physics engines [8, 33, 23], differentiable simulators have been introduced that leverage automatic, symbolic or implicit differentiation to calculate parameter gradients through the analytical physics models for rigid-body dynamics [11, 6, 10, 22, 17, 16], light propagation [29, 15], and other phenomena [18, 25].

Residual physics models [35, 2, 20, 12] augment physics engines with learned models to reduce the sim2real gap. Most of them apply residual learning to the output of the physics engine, whereas we propose a more fine-grained approach, similar to Hwangbo et al. [20], where data-driven models are introduced only at certain parts of the simulator. While in [20] the network for actuator dynamics is trained through supervised learning, our end-to-end differentiable model allows backpropagation of gradients from high-level states to any part of the computation graph, including neural network weights, so that these parameters can be optimized efficiently, for example from end-effector trajectories.

Fig. 1: Left: Trajectories from rigid body contact simulation of a cube thrown to the right. Starting with poor model parameters, the box falls through the ground (blue). After optimizing Equation 1, our simulation (orange) closely matches the target trajectory (green). Right: After system identification of a real double pendulum [4], the sim2real gap is strongly reduced.
Fig. 2: Left: comparison of various model architectures (cf. Ajay et al. [2]). Right: augmentation of differentiable simulators with our proposed neural scalar type, where a variable becomes the combination of an analytical model and a neural network that can take any other simulation variables as inputs.

II Approach

We propose a technique for hybrid simulation that leverages differentiable physics models and neural networks to allow for efficient system identification, design optimization, and gradient-based trajectory planning. By enabling any part of the simulation to be replaced or augmented by neural networks, we can learn unmodeled effects from data. Through template meta-programming, our open-source C++ implementation (https://github.com/google-research/tiny-differentiable-simulator) allows any variable participating in the simulation to be augmented by neural networks that accept input connections from any other variable. In the simulation code, such neural scalars (Figure 2 right) are assigned a unique name, so that separate experiment code can define a "neural blueprint" that declares the neural network architectures and sets the network weights. We compute gradients of the weights and analytical simulation parameters using the multi-dimensional dual-number implementation from Ceres [1], and we support many other automatic differentiation libraries.

| Approach | Analytical | Data-driven | End2end | Hybrid |
|---|---|---|---|---|
| Physics engine [8, 33, 23] | ✓ | | | |
| Residual physics [20, 2, 35, 12] | ✓ | ✓ | | ✓ |
| Learned physics [32, 5, 24, 21] | | ✓ | ✓ | |
| Differentiable sim. [11, 18, 6, 10] | ✓ | | ✓ | |
| Ours | ✓ | ✓ | ✓ | ✓ |

Table I: Comparison of dynamics modeling approaches (only selected works) along the axes of analytical and data-driven modeling, end-to-end differentiability, and hybrid approaches.

III System Identification

Given the state trajectory {s̄_t}_{t=1,…,T} from the target system, we optimize the following loss for system identification:

    L(θ) = Σ_{t=1}^{T} ‖f_θ(s_{t−1}) − s̄_t‖² + λ‖θ_NN‖²,    (1)

where f_θ is the discrete dynamics function mapping from the previous simulation state s_{t−1} to the current state s_t, implemented by our physics engine given the parameter vector θ = [θ_A, θ_NN], which consists of the parameters θ_A of the analytical model plus the parameters θ_NN that correspond to the weights of the neural networks in the simulation. To ensure the residual dynamics learned by the neural networks remain minimal, we regularize the network weights by a factor λ, which penalizes large state contributions.

IV Overcoming Local Optima

We solve the nonlinear least squares problem from Equation 1 using the Levenberg-Marquardt algorithm (LMA). Such a gradient-based optimization method quickly finds local optima, but due to the highly nonconvex loss landscapes commonly encountered in system identification problems for nonlinear dynamics, the resulting parameter estimates often fit the real-world data poorly. To escape such poor local minima, we adopt a random search strategy, parallel basin hopping (PBH) [27], which, in our instantiation, runs multiple LMA solver and simulation instances in parallel while continuously randomizing the initial parameters from which the local solvers are restarted once convergence criteria, time limits, or maximum iteration counts are met.

V Results

We present preliminary results for sim2sim transfer to match the hybrid dynamics model to richer analytical simulations. Additionally, we demonstrate our system identification approach on a real-world dataset.

In our first experiment, we transfer rigid-body contact dynamics from a target simulation that uses a velocity-level contact model formulated as a linear complementarity problem [3]. Our hybrid simulator uses a point-based nonlinear-spring contact model in which the normal force is computed analytically through the Hunt-Crossley model [19] and the friction force is learned by a neural network that receives the relative velocities, the contact normal force, and the penetration depth as inputs. Before the analytical and neural model parameters are optimized, the trajectories of a cube thrown horizontally on the ground differ dramatically. After system identification using PBH applied to Equation 1, given trajectories of positions and velocities from the target system, the gap is significantly reduced (Figure 1 left).

In the next experiment, we apply our approach to a real-world dynamical system. Given joint position trajectories from the double-pendulum dataset provided by Asseman et al. [4], we optimize inertia, masses, and link lengths of our simulated model and achieve a minimal sim2real gap (Figure 1 right).

VI Conclusion

We have demonstrated a simulation architecture that allows us to insert neural networks at any place in a differentiable physics engine to augment analytical models with the ability to learn dynamical effects from data. In our preliminary experiments, efficient gradient-based optimizers quickly converge to simulations that closely follow the observed trajectories from the target systems, while poor local minima are overcome through a random search strategy.

Future research is directed towards more automated ways to identify where such extra degrees of freedom are needed to close the sim2real gap given a few trajectories from the real system. Our loss function in Equation 1 regularizes the contributions of the neural networks to the overall system dynamics. Nonetheless, this approach does not prevent violations of basic laws of physics, such as conservation of energy and momentum. Hamiltonian [13] and (Deep) Lagrangian neural networks [26, 9] explicitly constrain the solution space to remain consistent with such principles, but they need to be further investigated in the context of residual models in hybrid simulators.

References

  • [1] S. Agarwal, K. Mierle, et al. Ceres solver. http://ceres-solver.org. Cited by: §II.
  • [2] A. Ajay, J. Wu, N. Fazeli, M. Bauza, L. P. Kaelbling, J. B. Tenenbaum, and A. Rodriguez (2018) Augmenting Physical Simulators with Stochastic Neural Networks: Case Study of Planar Pushing and Bouncing. In IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Cited by: Fig. 2, §I, Table I.
  • [3] M. Anitescu and F. A. Potra (1997) Formulating dynamic multi-rigid-body contact problems with friction as solvable linear complementarity problems. Nonlinear Dynamics 14 (3), pp. 231–247. Cited by: §V.
  • [4] A. Asseman, T. Kornuta, and A. Ozcan (2018) Learning beyond simulated physics. In Modeling and Decision-making in the Spatiotemporal Domain Workshop, External Links: Link Cited by: Fig. 1, §V.
  • [5] P. Battaglia, R. Pascanu, M. Lai, D. J. Rezende, et al. (2016) Interaction networks for learning about objects, relations and physics. In Advances in Neural Information Processing Systems, pp. 4502–4510. Cited by: §I, Table I.
  • [6] J. Carpentier and N. Mansard (2018) Analytical derivatives of rigid body dynamics algorithms. In Robotics: Science and Systems, Cited by: §I, Table I.
  • [7] T. Q. Chen, Y. Rubanova, J. Bettencourt, and D. K. Duvenaud (2018) Neural ordinary differential equations. In Advances in Neural Information Processing Systems, pp. 6571–6583. Cited by: §I.
  • [8] E. Coumans et al. (2013) Bullet physics library. Open source: bulletphysics.org 15 (49), pp. 5. Cited by: §I, Table I.
  • [9] M. Cranmer, S. Greydanus, S. Hoyer, P. Battaglia, D. Spergel, and S. Ho (2020) Lagrangian neural networks. External Links: 2003.04630 Cited by: §VI.
  • [10] F. de Avila Belbute-Peres, K. Smith, K. Allen, J. Tenenbaum, and J. Z. Kolter (2018) End-to-end differentiable physics for learning and control. In Advances in Neural Information Processing Systems 31, S. Bengio, H. Wallach, H. Larochelle, K. Grauman, N. Cesa-Bianchi, and R. Garnett (Eds.), pp. 7178–7189. External Links: Link Cited by: §I, Table I.
  • [11] M. Giftthaler, M. Neunert, M. Stäuble, M. Frigerio, C. Semini, and J. Buchli (2017) Automatic differentiation of rigid body dynamics for optimal control and estimation. Advanced Robotics 31 (22), pp. 1225–1237. External Links: Document Cited by: §I, Table I.
  • [12] F. Golemo, A. A. Taiga, A. Courville, and P. Oudeyer (2018) Sim-to-real transfer with neural-augmented robot simulation. In Proceedings of The 2nd Conference on Robot Learning, A. Billard, A. Dragan, J. Peters, and J. Morimoto (Eds.), Proceedings of Machine Learning Research, Vol. 87, pp. 817–828. External Links: Link Cited by: §I, Table I.
  • [13] S. Greydanus, M. Dzamba, and J. Yosinski (2019) Hamiltonian neural networks. arXiv preprint arXiv:1906.01563. Cited by: §VI.
  • [14] S. He, Y. Li, Y. Feng, S. Ho, S. Ravanbakhsh, W. Chen, and B. Póczos (2019-07) Learning to predict the cosmological structure formation. Proceedings of the National Academy of Sciences of the United States of America 116 (28), pp. 13825–13832 (en). Cited by: §I.
  • [15] E. Heiden, Z. Liu, R. K. Ramachandran, and G. S. Sukhatme (2020) Physics-based simulation of continuous-wave LIDAR for localization, calibration and tracking. In International Conference on Robotics and Automation (ICRA), Cited by: §I.
  • [16] E. Heiden, D. Millard, and G. S. Sukhatme (2019) Real2Sim transfer using differentiable physics. R:SS Workshop on Closing the Reality Gap in Sim2real Transfer for Robotic Manipulation. Cited by: §I.
  • [17] E. Heiden, D. Millard, H. Zhang, and G. S. Sukhatme (2019) Interactive differentiable simulation. CoRR abs/1905.10706. External Links: Link, 1905.10706 Cited by: §I.
  • [18] Y. Hu, L. Anderson, T. Li, Q. Sun, N. Carr, J. Ragan-Kelley, and F. Durand (2020) DiffTaichi: differentiable programming for physical simulation. ICLR. Cited by: §I, Table I.
  • [19] K. H. Hunt and F. R. E. Crossley (1975) Coefficient of restitution interpreted as damping in vibroimpact. Journal of Applied Mechanics 42 (2), pp. 440–445. Cited by: §V.
  • [20] J. Hwangbo, J. Lee, A. Dosovitskiy, D. Bellicoso, V. Tsounis, V. Koltun, and M. Hutter (2019) Learning agile and dynamic motor skills for legged robots. Science Robotics 4 (26), pp. eaau5872. Cited by: §I, Table I.
  • [21] Y. Jiang and C. K. Liu (2018) Data-augmented contact model for rigid body simulation. CoRR abs/1803.04019. External Links: Link, 1803.04019 Cited by: Table I.
  • [22] T. Koolen and R. Deits (2019-05) Julia for robotics: simulation and real-time control in a high-level programming language. In International Conference on Robotics and Automation, pp. . Cited by: §I.
  • [23] J. Lee, M. X. Grey, S. Ha, T. Kunz, S. Jain, Y. Ye, S. S. Srinivasa, M. Stilman, and C. K. Liu (2018) DART: dynamic animation and robotics toolkit. Journal of Open Source Software 3 (22), pp. 500. External Links: Document, Link Cited by: §I, Table I.
  • [24] Y. Li, J. Wu, R. Tedrake, J. B. Tenenbaum, and A. Torralba (2019) Learning particle dynamics for manipulating rigid bodies, deformable objects, and fluids. In International Conference on Learning Representations, External Links: Link Cited by: §I, Table I.
  • [25] J. Liang, M. Lin, and V. Koltun (2019) Differentiable cloth simulation for inverse problems. In Advances in Neural Information Processing Systems, pp. 771–780. Cited by: §I.
  • [26] M. Lutter, C. Ritter, and J. Peters (2019) Deep Lagrangian networks: using physics as model prior for deep learning. arXiv preprint arXiv:1907.04490. Cited by: §VI.
  • [27] S. L. McCarty, L. M. Burke, and M. McGuire (2018) Parallel monotonic basin hopping for low thrust trajectory optimization. In 2018 Space Flight Mechanics Meeting, pp. 1452. Cited by: §IV.
  • [28] D. Mrowca, C. Zhuang, E. Wang, N. Haber, L. Fei-Fei, J. B. Tenenbaum, and D. L. K. Yamins (2018-06) Flexible neural representation for physics prediction. Advances in Neural Information Processing Systems. Cited by: §I.
  • [29] M. Nimier-David, D. Vicini, T. Zeltner, and W. Jakob (2019-11) Mitsuba 2: a retargetable forward and inverse renderer. Transactions on Graphics (Proceedings of SIGGRAPH Asia) 38 (6). External Links: Document Cited by: §I.
  • [30] M. Raissi, H. Babaee, and P. Givi (2018) Deep learning of turbulent scalar mixing. Technical report Cited by: §I.
  • [31] A. Sanchez-Gonzalez, J. Godwin, T. Pfaff, R. Ying, J. Leskovec, and P. W. Battaglia (2020) Learning to simulate complex physics with graph networks. External Links: 2002.09405 Cited by: §I.
  • [32] A. Sanchez-Gonzalez, J. Godwin, T. Pfaff, R. Ying, J. Leskovec, and P. W. Battaglia (2020) Learning to simulate complex physics with graph networks. arXiv preprint arXiv:2002.09405. Cited by: Table I.
  • [33] E. Todorov, T. Erez, and Y. Tassa (2012) Mujoco: a physics engine for model-based control. In 2012 IEEE/RSJ International Conference on Intelligent Robots and Systems, pp. 5026–5033. Cited by: §I, Table I.
  • [34] Z. Xu, J. Wu, A. Zeng, J. B. Tenenbaum, and S. Song (2019-06) DensePhysNet: learning dense physical object representations via multi-step dynamic interactions. Robotics: Science and Systems. Cited by: §I.
  • [35] A. Zeng, S. Song, J. Lee, A. Rodriguez, and T. Funkhouser (2019) TossingBot: learning to throw arbitrary objects with residual physics. Cited by: §I, Table I.