
Data-driven discovery of non-Newtonian astronomy via learning non-Euclidean Hamiltonian

Incorporating the Hamiltonian structure of physical dynamics into deep learning models provides a powerful way to improve the interpretability and prediction accuracy. While previous works are mostly limited to the Euclidean spaces, their extension to the Lie group manifold is needed when rotations form a key component of the dynamics, such as the higher-order physics beyond simple point-mass dynamics for N-body celestial interactions. Moreover, the multiscale nature of these processes presents a challenge to existing methods as a long time horizon is required. By leveraging a symplectic Lie-group manifold preserving integrator, we present a method for data-driven discovery of non-Newtonian astronomy. Preliminary results show the importance of both these properties in training stability and prediction accuracy.





1 Introduction


Figure 1: One planet’s orbit around a star: rigid body correction results in a precession, i.e. slow rotation of the orbital axis. Our method, ‘Lie T2’, learns from data and predicts a trajectory that matches the ground truth with the rigid body potential.

Deep Neural Networks (DNNs) have been demonstrated to be effective tools for learning dynamical systems from data. One important class of systems to be learned have dynamics described by physical laws, whose structure can be exploited by learning the Hamiltonian of the system instead of the vector field Greydanus et al. (2019); Bertalan et al. (2019). An appropriately learned Hamiltonian can endow the learned system with properties such as superior long-time prediction accuracy Chen and Tao and applicability to chaotic systems Choudhary et al. (2020); Chen and Tao; Han et al. (2021).

To learn continuous dynamics from discrete data, one important step is to bridge the continuous and discrete times. Seminal work initially approximated the time derivative via finite differences and then matched it with a learned (Hamiltonian) vector field Greydanus et al. (2019); Bertalan et al. (2019). Recent efforts avoid the inaccuracy of finite differences by numerically integrating the learned vector field. Especially relevant here is SRNN Chen et al. (2019), which uses a symplectic integrator to ensure the learned dynamics are symplectic (a necessity for Hamiltonian systems). Although SRNN only demonstrated learning separable Hamiltonians, a breakthrough in symplectic integration of arbitrary Hamiltonians Tao (2016) was used to extend SRNN Xiong et al. (2020). Further efforts on improving the time integration error have also been made DiPietro et al. (2020); David and Méhats (2021); Mathiesen et al. (2022). Meanwhile, alternative approaches based on learning a symplectic map instead of the Hamiltonian have also demonstrated efficacy Jin et al. (2020); Chen and Tao, although these approaches have not been extended to non-Euclidean problems.
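To make the idea of a symplectic recurrent block concrete, here is a minimal sketch (ours, not SRNN's actual code) of a leapfrog (Störmer-Verlet) step for a separable Hamiltonian $H = p^2/(2m) + V(q)$; in SRNN, `grad_V` would be the gradient of a learned potential network, and the step would be unrolled like an RNN:

```python
def leapfrog_step(q, p, grad_V, h, m=1.0):
    """One symplectic (Stormer-Verlet) step for H = p^2/(2m) + V(q)."""
    p = p - 0.5 * h * grad_V(q)  # half kick from the potential
    q = q + h * p / m            # full drift from the kinetic energy
    p = p - 0.5 * h * grad_V(q)  # half kick
    return q, p

def rollout(q, p, grad_V, h, n_steps):
    """Unroll the step like a recurrent block to produce a predicted trajectory."""
    traj = [(q, p)]
    for _ in range(n_steps):
        q, p = leapfrog_step(q, p, grad_V, h)
        traj.append((q, p))
    return traj
```

For the harmonic oscillator (grad_V(q) = q), the energy along the rolled-out trajectory stays within O(h^2) of its initial value for arbitrarily many steps, which is what makes such a block suitable for long rollouts.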

In fact, one relatively under-explored area is learning Hamiltonian dynamics on manifolds such as the Lie group family (we note extensions to include holonomic constraints in Finzi et al. (2020) and to handle contact in Zhong et al. (2021a)). One important member of this family is SE(3), which describes isometries in $\mathbb{R}^3$ and is important for, e.g., dynamical astronomy. The evolution of celestial bodies corresponds to a mechanical system on this manifold, and the 2- and 3-body problems have been a staple in works on learning Hamiltonians (e.g., Greydanus et al. (2019); Chen et al. (2019); Chen and Tao; Lu et al. (2022)); however, the Newtonian (point-mass) gravity considered there is already well understood. Practical problems in planetary dynamics are complicated by higher-order physics such as planet spin-orbit interaction, tidal dissipation, and general relativistic correction. While it is unclear what a perfect scientific model for these effects would be, planetary rotation is a necessary component to account for spin-orbit interaction and tidal forcing, creating an SO(3) component of the configuration space per body. To learn these physics from data, we need to learn on the Lie group.

Rigid body dynamics also play important roles in other applications such as robotics. In a seminal work, Duong and Atanasov (2021) use Hamiltonian dynamics on SE(3) to learn rigid body dynamics for a quadrotor. In that work, a Runge-Kutta 4 integrator is used; consequently, the method is applicable only to short time horizons (see Sec. 3 and the last paragraph of Sec. 2).

For our problem of learning non-Newtonian astronomy, the time horizon has to be long. Hence, we use a different approach by leveraging a Lie-group preserving symplectic integrator. Structure-preserving integration of dynamical systems on manifolds has been extensively studied in the literature, for example for Lie groups Iserles et al. (2000); Bou-Rabee and Marsden (2009); Celledoni et al. (2014); Chen et al. (2021); Celledoni et al. (2022) and, more broadly, geometric integration Haier et al. (2006); Leimkuhler and Reich (2004); Blanes and Casas (2017); Sanz-Serna and Calvo (2018).

In summary, we propose a deep learning methodology for performing data-driven discovery of non-Newtonian astronomy. By leveraging a symplectic, Lie-group manifold preserving integrator, we show how a non-Euclidean Hamiltonian can be learned for accurate prediction of non-Newtonian effects. Moreover, we provide insights that show the importance of both symplecticity and exact preservation of the Lie-group manifold for training stability.

2 Method

Given observations of a dynamically evolving system, our goal is to learn from the data the physics that governs its evolution. Denote by $\mathcal{D}$ a dataset of snapshots of continuous-time trajectories of a system of $n$ interacting rigid bodies. That is,

$\mathcal{D} = \{ (x(t_k), p(t_k), R(t_k), \pi(t_k)) \}_k,$

where $(x, p, R, \pi)(\cdot)$ is a solution of some latent Hamiltonian ODE to be learned, corresponding to mechanical dynamics on $T^*(\mathbb{R}^{3n} \times SO(3)^n)$. Here $\Delta t = t_{k+1} - t_k$ is a (possibly large) observation timestep, $R = (R_1, \dots, R_n)$ is the rotational configuration of the rigid bodies, and $\pi = (\pi_1, \dots, \pi_n)$ denotes each body's angular momentum in its respective body frame.

Importantly, since the configuration space is not flat, the mechanical dynamics are not given by $\dot{q} = \partial H / \partial p$, $\dot{p} = -\partial H / \partial q$ for some Hamiltonian $H$ that depends on the generalized coordinates $q$ and generalized momentum $p$. Instead, the equations of motion can be derived via either Lagrange multipliers Chen et al. (2021); Hairer et al. or a Lie group variational principle Chen et al. (2021); Lee et al., which gives

$\dot{x}_i = p_i / m_i,$  (1a)
$\dot{p}_i = -\partial V / \partial x_i + f_i,$  (1b)
$\dot{R}_i = R_i \, \widehat{(\mathbb{I}_i^{-1} \pi_i)},$  (1c)
$\dot{\pi}_i = \pi_i \times \mathbb{I}_i^{-1} \pi_i + \big( (\partial V / \partial R_i)^{\top} R_i - R_i^{\top} \, \partial V / \partial R_i \big)^{\vee} + \tau_i,$  (1d)

assuming a physical Hamiltonian $H = \sum_i \big( \| p_i \|^2 / (2 m_i) + \tfrac{1}{2} \pi_i^{\top} \mathbb{I}_i^{-1} \pi_i \big) + V(x, R)$ that sums total (translational and rotational) kinetic energy and the interaction potential $V$, where $m_i$ and $\mathbb{I}_i$ denote the mass and inertia tensor of the $i$-th body, and $f_i, \tau_i$ are forcing terms that model nonconservative forces. Here $\pi_i$ is a vector in $\mathbb{R}^3$, $\widehat{\cdot}$ is the map from $\mathbb{R}^3$ to $\mathfrak{so}(3)$, and $(\cdot)^{\vee}$ is its inverse (see Chen et al. (2021) for more details). By learning the potential $V$, external forcing $f$ and torque $\tau$, we can learn the physics of the system.
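For concreteness, the hat and vee maps can be sketched as follows (illustrative helpers of ours, not the paper's code):

```python
import numpy as np

def hat(v):
    """Map v in R^3 to the skew-symmetric matrix v^ in so(3),
    defined so that hat(v) @ w equals the cross product v x w."""
    return np.array([[0.0, -v[2], v[1]],
                     [v[2], 0.0, -v[0]],
                     [-v[1], v[0], 0.0]])

def vee(A):
    """Inverse of hat: recover v from a skew-symmetric matrix A = hat(v)."""
    return np.array([A[2, 1], A[0, 2], A[1, 0]])
```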

2.1 Machine Learning Challenges Posed by Dynamical Astronomy

We study this setup because it helps answer scientific questions like: what physics governs the motions of celestial bodies, such as planets in a planetary system? The leading order physics is of course already well known, namely these bodies can be approximated by point masses interacting through a gravitational potential. However, planets are not point masses, and their rotations matter because they shape planetary climates Quarles et al. (2022); Chen et al. (2022) and even feed back to their orbits Millholland and Laughlin (2019). This already starts to alter the dynamics even if one only considers classical gravity. For example, the gravitational potential for interacting bodies of finite sizes should be the full pairwise integral

$V = -\sum_{i < j} \int_{B_i} \int_{B_j} \frac{G \, \rho_i(r_i) \, \rho_j(r_j)}{\| r_i - r_j \|} \, dr_i \, dr_j,$

where $\rho_i$ denotes the mass density of body $B_i$. Working with the full potential is complicated since the density is not known and the integral is not analytically available. Can we directly learn $V$ from time-series data?

Classical gravity (i.e., Newtonian physics) is not the only driver of planetary motion: tidal forces and general relativity (GR) matter too. The former provides a dissipation mechanism and plays critical roles in altering planetary orbits Mardling and Lin; Naoz et al. (2011); the latter needs little introduction and has been demonstrated by, e.g., Mercury's precessing orbit Clemence (1947). Tidal forces depend on celestial bodies' rotations Hut (1981) and thus are a function of both the positional and the rotational variables. GR's effects cannot be fully characterized with classical coordinates, but post-Newtonian approximations based purely on these coordinates are popular Blanchet (2014). Can we learn both purely from data if we did not have theories for either?

In addition to the scientific questions, there are also significant machine learning challenges:

Multiscale dynamics. The rigid-body correction, tidal force, and GR correction are all much smaller forces compared to point-mass gravity. Consequently, their effects do not manifest until long times. Thus, one challenge for learning them is that the dynamical system exhibits different behaviors over multiple timescales. It is reasonable to require long time-series data for the small effects to be learned; meanwhile, when observations are expensive to make, the observation timestep can be much longer than the smallest timescales. Can we still learn the physics in this case? We will leverage a symplectic integrator and its mild growth of error over long times Hairer et al.; Tao (2016) to provide a positive answer.
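The error-growth gap is visible already on a toy problem. The following sketch (ours, for illustration only) compares long-horizon energy drift of nonsymplectic explicit Euler against symplectic leapfrog on a harmonic oscillator:

```python
def euler(q, p, h, n):
    """Explicit Euler for H = (q^2 + p^2)/2; energy grows by (1 + h^2) per step."""
    for _ in range(n):
        q, p = q + h * p, p - h * q
    return q, p

def leapfrog(q, p, h, n):
    """Symplectic leapfrog; energy error stays O(h^2) uniformly in time."""
    for _ in range(n):
        p -= 0.5 * h * q
        q += h * p
        p -= 0.5 * h * q
    return q, p

def energy(q, p):
    return 0.5 * (q * q + p * p)

h, n = 0.01, 100_000  # horizon T = 1000, far longer than the period
drift_euler = abs(energy(*euler(1.0, 0.0, h, n)) - 0.5)
drift_leapfrog = abs(energy(*leapfrog(1.0, 0.0, h, n)) - 0.5)
```

Here drift_euler is enormous (Euler multiplies the energy by 1 + h^2 every step, i.e., exponential growth in the horizon), while drift_leapfrog stays tiny, mirroring the linear-vs-exponential error bounds cited above.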

Respecting the Lie group manifold. However, even a symplectic integrator is not enough, because the position variable of the latent dynamics (i.e., the truth) stays on $SO(3)^n$. If the integrated solution falls off this manifold, such that $R_i^{\top} R_i = I$ no longer holds, it is not only incorrect but likely misleading for the learning of the potential. Popular integrators such as forward Euler, Runge-Kutta 4 (RK4) and leapfrog Greydanus et al. (2019); Chen et al. (2019); Gruver et al. (2022) unfortunately do not maintain the manifold structure.
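To make this failure mode concrete, the following sketch (ours) integrates the attitude kinematics $\dot{R} = R \hat{\omega}$ with a fixed body angular velocity: the explicit Euler update drifts off SO(3), while the exact exponential-map update preserves orthogonality to machine precision.

```python
import numpy as np

def hat(v):
    """v in R^3 to the skew-symmetric matrix in so(3)."""
    return np.array([[0.0, -v[2], v[1]],
                     [v[2], 0.0, -v[0]],
                     [-v[1], v[0], 0.0]])

def expm_so3(w):
    """Rodrigues' formula: exp(hat(w)), an exact rotation matrix."""
    th = np.linalg.norm(w)
    if th == 0.0:
        return np.eye(3)
    K = hat(w / th)
    return np.eye(3) + np.sin(th) * K + (1.0 - np.cos(th)) * (K @ K)

omega, h, n = np.array([0.3, -0.2, 0.9]), 0.1, 200
R_euler, R_exp = np.eye(3), np.eye(3)
for _ in range(n):
    R_euler = R_euler @ (np.eye(3) + h * hat(omega))  # leaves the manifold
    R_exp = R_exp @ expm_so3(h * omega)               # stays on SO(3)

def manifold_error(R):
    return np.linalg.norm(R.T @ R - np.eye(3))
```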

2.2 Learning with Lie Symplectic RNNs

Our method can be viewed as a Lie-group generalization of the seminal SRNN Chen et al. (2019), in which an integrator that is both symplectic and Lie-group preserving is employed as the recurrent block.

Lie T2: A Symplectic Lie-Group Preserving Integrator. To construct an integrator that achieves both properties, we borrow from Chen et al. (2021) the idea of Lie-group and symplecticity preserving splitting, and split our Hamiltonian as $H = H_1 + H_2 + H_3$, containing the axial-symmetric kinetic energy, the potential energy, and the asymmetric kinetic energy correction terms. This enables computing the exact flow of each term (see App. B for details). We then construct a 2nd-order symplectic integrator, Lie T2, by applying the Strang composition scheme. To account for non-conservative forces, the corresponding non-conservative momentum update is inserted in the middle of the composition Chen et al. (2021). This gives Lie T2 for stepsize $h$ as



Figure 2: Inputs are fed through a recurrent layer with Lie T2. Prediction error is used as the training loss.
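The Strang construction itself is generic: given the exact flows of the pieces of a splitting, composing a half-step, a full step, and another half-step yields a second-order symmetric method, and composing exact symplectic flows keeps the result symplectic. A minimal sketch (ours; the paper composes three exact flows with the forcing update in the middle, here we show two pieces for brevity):

```python
def strang(flow_a, flow_b):
    """Second-order Strang composition of two exact flows:
    state -> flow_a(h/2) -> flow_b(h) -> flow_a(h/2)."""
    def step(state, h):
        state = flow_a(state, 0.5 * h)
        state = flow_b(state, h)
        state = flow_a(state, 0.5 * h)
        return state
    return step

# Exact flows for the splitting H = q^2/2 + p^2/2 of a harmonic oscillator:
kick = lambda s, h: (s[0], s[1] - h * s[0])   # flow of the potential q^2/2
drift = lambda s, h: (s[0] + h * s[1], s[1])  # flow of the kinetic p^2/2
step = strang(kick, drift)  # this recovers the classic leapfrog scheme
```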

A Recurrent Architecture for Nonlinear Regression. Given the simplicity of the kinetic energy, we assume it is known and learn the potential $V$ and the forcing terms $f, \tau$ with multi-layer perceptrons (MLPs), without assuming any pairwise structure (see App. C for discussion). We then use Lie T2, with the learned networks in place of the true $V$, $f$ and $\tau$, to integrate the dynamics forward. However, when the temporal spacing $\Delta t$ between observations is large, using a single step will result in large errors for the fast-timescale dynamics. Instead, we compose the integrator $M$ times, which determines the integration stepsize $h = \Delta t / M$. We perform training by minimizing the following empirical loss over random minibatches:


$\mathcal{L} = \frac{1}{|B|} \sum_{k \in B} \big\| \Phi_h^{M}(y_k) - y_{k+1} \big\|^2, \qquad y_k = (x, p, R, \pi)(t_k),$

where $\Phi_h$ denotes one step of Lie T2 with the learned networks and $B$ is the minibatch. Note that we do not assume access to the true time derivatives used in the loss function of some works Greydanus et al. (2019); Greydanus and Sosanya (2022); Cranmer et al. (2020). Our training process is summarized in Fig. 2 (see App. C for details).
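A sketch of this training loss (ours, with a toy one-dimensional Hamiltonian standing in for the learned Lie T2 block; `step`, `predict`, and `empirical_loss` are illustrative names, not the paper's API):

```python
def step(y, h):
    """Toy stand-in for the recurrent integrator block:
    leapfrog for H = (q^2 + p^2)/2."""
    q, p = y
    p = p - 0.5 * h * q
    q = q + h * p
    p = p - 0.5 * h * q
    return (q, p)

def predict(y, dt, M):
    """Bridge one (possibly large) observation gap dt with M substeps of size dt/M."""
    h = dt / M
    for _ in range(M):
        y = step(y, h)
    return y

def empirical_loss(batch, dt, M):
    """Mean squared one-snapshot-ahead prediction error over a minibatch of
    (y_k, y_{k+1}) pairs; no true time derivatives are needed."""
    total = 0.0
    for y_k, y_next in batch:
        q, p = predict(y_k, dt, M)
        total += (q - y_next[0]) ** 2 + (p - y_next[1]) ** 2
    return total / len(batch)
```

On exact oscillator data, increasing M shrinks the loss of the true model; during training, gradients of this loss flow through all M composed steps, which is why the choice of integrator matters.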

Benefit. Learning an accurate potential requires accurate numerical simulation, which in turn yields a trainable model. Without preservation of the manifold structure, training can take 'shortcuts' outside the manifold that seemingly match the data but completely mislead the learning. Symplecticity also plays a vital role in controlling the long-time integration error: under reasonable conditions, an $r$-th order symplectic integrator has a linear error bound, whereas an $r$-th order nonsymplectic one has an exponential error bound Hairer et al.; Tao (2016). While these bounds do not matter for small time horizons, they are significant for multiscale problems where the horizon is macroscopic but the stepsize $h$ is microscopic. Consequently, improving the error of a nonsymplectic integrator by reducing $h$ makes the RNN exponentially deep; this often renders training difficult Pascanu et al. (2013) and is not desirable.

3 Results


Figure 3: Results on TRAPPIST-1 with short data separation. Top: manifold error. Bottom: Hamiltonian error over the integrated trajectory. Only Lie T2 achieves low errors in both metrics.

We aim to answer two questions. Q1: Can we learn multiscale physics? Q2: How important are symplecticity and Lie-group preservation for learning? The closest baseline for our problem is the work of Duong and Atanasov (2021), which learns short-timescale rigid-body Hamiltonian dynamics for robotics. Placed in our framework, their work corresponds to using RK4 for the recurrent block, which is neither symplectic nor Lie-group preserving. Therefore, to investigate Q2, we vary the choice of integrator in our framework as follows. Neither property: explicit Euler, RK4. Symplectic only: Verlet. Lie-group preserving only: Lie RK2 (CF2) and Lie RK4 (CF4) Celledoni et al. (2022). We leave the precise details to Apps. C and D.

Toy Two-Body Problem. We consider an illustrative two-body problem to demonstrate the effects of the rigid-body correction. In Fig. 1, 'Point' and 'Rigid' denote exact solutions for a point-mass and a rigid-body potential, and 'Lie T2' the prediction of our method based on a potential learned from data. Compared to 'Point', 'Rigid' induces an apsidal precession (rotation of the orbital axis) due to spin-orbit coupling. Our method successfully predicts this interaction and matches the trajectory of 'Rigid'.

We next test our method by learning the dynamics of the TRAPPIST-1 system Gillon et al. (2016), which consists of seven Earth-sized planets and is notable for potential habitability for terrestrial forms of life.

TRAPPIST-1, Large $\Delta t$. To answer Q1, we choose a large data timestep $\Delta t$. The closest planet has a short orbital period, while the rigid body correction, tidal force and GR correction act on much longer timescales. Only Lie T2 successfully trains; all other methods diverge during training, despite attempts at stabilization with techniques such as normalization (LayerNorm Ba et al. (2016), GroupNorm Wu and He (2018)). Reducing the stepsize $h$ improves integration accuracy, but increases the RNN depth and makes training more unstable. We compare with the solution for the point-mass potential only (No Corrections). Our method reduces the error by up to two orders of magnitude in measures of trajectory error and potential gradients (Table 1). See App. E for column definitions.

Table 1: TRAPPIST-1 with long data separation, comparing No Corrections, explicit Euler, RK4, Verlet, Lie RK2, Lie RK4, and Lie T2 (ours). All methods except for Lie T2 diverge during training.
Table 2: TRAPPIST-1 with short data separation.

TRAPPIST-1, Small $\Delta t$. To gain more insight on Q2, we shrink $\Delta t$ until almost all methods converge, and only consider conservative forces (i.e., no tidal force or GR). The mean errors in the predicted trajectory and in the derivatives of the learned potential are shown in Table 2. All methods achieve small errors in position-related terms. Verlet has a large rotational error since it does not integrate on the rotation manifold. The Lie-group methods achieve lower rotational errors but are worse elsewhere. Lie T2, being both symplectic and Lie-group preserving, achieves the lowest errors on both fronts.

4 Broader Impact

Our work presents an approach for learning multiscale, higher-order physics on a Lie-group manifold in the context of non-Newtonian astronomy. This research, though directly applicable to astronomy, can also be applied to data-driven discovery of multiscale phenomena on Lie groups in other fields.


  • Greydanus et al. [2019] Samuel Greydanus, Misko Dzamba, and Jason Yosinski. Hamiltonian neural networks. Advances in neural information processing systems, 32, 2019.
  • Bertalan et al. [2019] Tom Bertalan, Felix Dietrich, Igor Mezić, and Ioannis G Kevrekidis. On learning hamiltonian systems from data. Chaos: An Interdisciplinary Journal of Nonlinear Science, 29(12):121107, 2019.
  • [3] Renyi Chen and Molei Tao. Data-driven prediction of general hamiltonian dynamics via learning exactly-symplectic maps. In Marina Meila and Tong Zhang, editors, Proceedings of the 38th International Conference on Machine Learning, volume 139 of Proceedings of Machine Learning Research, pages 1717–1727. PMLR. URL
  • Choudhary et al. [2020] Anshul Choudhary, John F Lindner, Elliott G Holliday, Scott T Miller, Sudeshna Sinha, and William L Ditto. Physics-enhanced neural networks learn order and chaos. Physical Review E, 101(6):062207, 2020.
  • Han et al. [2021] Chen-Di Han, Bryan Glaz, Mulugeta Haile, and Ying-Cheng Lai. Adaptable hamiltonian neural networks. Physical Review Research, 3(2):023156, 2021.
  • Chen et al. [2019] Zhengdao Chen, Jianyu Zhang, Martin Arjovsky, and Léon Bottou. Symplectic recurrent neural networks. arXiv preprint arXiv:1909.13334, 2019.
  • Tao [2016] Molei Tao. Explicit symplectic approximation of nonseparable hamiltonians: Algorithm and long time performance. Physical Review E, 94(4):043303, 2016.
  • Xiong et al. [2020] Shiying Xiong, Yunjin Tong, Xingzhe He, Shuqi Yang, Cheng Yang, and Bo Zhu. Nonseparable symplectic neural networks. arXiv preprint arXiv:2010.12636, 2020.
  • DiPietro et al. [2020] Daniel DiPietro, Shiying Xiong, and Bo Zhu. Sparse symplectically integrated neural networks. Advances in Neural Information Processing Systems, 33:6074–6085, 2020.
  • David and Méhats [2021] Marco David and Florian Méhats. Symplectic learning for hamiltonian neural networks. arXiv preprint arXiv:2106.11753, 2021.
  • Mathiesen et al. [2022] Frederik Baymler Mathiesen, Bin Yang, and Jilin Hu. Hyperverlet: A symplectic hypersolver for hamiltonian systems. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 36, pages 4575–4582, 2022.
  • Jin et al. [2020] Pengzhan Jin, Zhen Zhang, Aiqing Zhu, Yifa Tang, and George Em Karniadakis. Sympnets: Intrinsic structure-preserving symplectic networks for identifying hamiltonian systems. Neural Networks, 132:166–179, 2020.
  • Finzi et al. [2020] Marc Finzi, Ke Alexander Wang, and Andrew G Wilson. Simplifying hamiltonian and lagrangian neural networks via explicit constraints. Advances in neural information processing systems, 33:13880–13889, 2020.
  • Zhong et al. [2021a] Yaofeng Desmond Zhong, Biswadip Dey, and Amit Chakraborty. Extending lagrangian and hamiltonian neural networks with differentiable contact models. Advances in Neural Information Processing Systems, 34:21910–21922, 2021a.
  • Lu et al. [2022] Yupu Lu, Shijie Lin, Guanqi Chen, and Jia Pan. Modlanets: Learning generalisable dynamics via modularity and physical inductive bias. In International Conference on Machine Learning, pages 14384–14397. PMLR, 2022.
  • Duong and Atanasov [2021] Thai Duong and Nikolay Atanasov. Hamiltonian-based neural ode networks on the se (3) manifold for dynamics learning and control. arXiv preprint arXiv:2106.12782, 2021.
  • Iserles et al. [2000] Arieh Iserles, Hans Z Munthe-Kaas, Syvert P Nørsett, and Antonella Zanna. Lie-group methods. Acta numerica, 9:215–365, 2000.
  • Bou-Rabee and Marsden [2009] Nawaf Bou-Rabee and Jerrold E Marsden. Hamilton–pontryagin integrators on lie groups part i: Introduction and structure-preserving properties. Foundations of computational mathematics, 9(2):197–219, 2009.
  • Celledoni et al. [2014] Elena Celledoni, Håkon Marthinsen, and Brynjulf Owren. An introduction to lie group integrators–basics, new developments and applications. Journal of Computational Physics, 257:1040–1061, 2014.
  • Chen et al. [2021] Renyi Chen, Gongjie Li, and Molei Tao. Grit: A package for structure-preserving simulations of gravitationally interacting rigid bodies. The Astrophysical Journal, 919(1):50, 2021.
  • Celledoni et al. [2022] Elena Celledoni, Ergys Çokaj, Andrea Leone, Davide Murari, and Brynjulf Owren. Lie group integrators for mechanical systems. International Journal of Computer Mathematics, 99(1):58–88, 2022.
  • Haier et al. [2006] Ernst Haier, Christian Lubich, and Gerhard Wanner. Geometric numerical integration: structure-preserving algorithms for ordinary differential equations. Springer, 2006.
  • Leimkuhler and Reich [2004] Benedict Leimkuhler and Sebastian Reich. Simulating hamiltonian dynamics. Number 14. Cambridge university press, 2004.
  • Blanes and Casas [2017] Sergio Blanes and Fernando Casas. A concise introduction to geometric numerical integration. CRC press, 2017.
  • Sanz-Serna and Calvo [2018] Jesus-Maria Sanz-Serna and Mari-Paz Calvo. Numerical hamiltonian problems. Courier Dover Publications, 2018.
  • [26] E. Hairer, Christian Lubich, and Gerhard Wanner. Geometric Numerical Integration: Structure-Preserving Algorithms for Ordinary Differential Equations. Number 31 in Springer Series in Computational Mathematics. Springer, 2nd ed edition.
  • [27] Taeyoung Lee, N Harris McClamroch, and Melvin Leok. A lie group variational integrator for the attitude dynamics of a rigid body with applications to the 3D pendulum. In Proceedings of 2005 IEEE Conference on Control Applications, 2005. CCA 2005., pages 962–967. IEEE. doi: 10.1109/CCA.2005.1507254. URL
  • Quarles et al. [2022] Billy Quarles, Gongjie Li, and Jack J Lissauer. Milankovitch cycles for a circumstellar earth-analogue within centauri-like binaries. Monthly Notices of the Royal Astronomical Society, 509(2):2736–2757, 2022.
  • Chen et al. [2022] Renyi Chen, Gongjie Li, and Molei Tao. Low spin-axis variations of circumbinary planets. Monthly Notices of the Royal Astronomical Society, 515(4):5175–5184, 2022.
  • Millholland and Laughlin [2019] Sarah Millholland and Gregory Laughlin. Obliquity-driven sculpting of exoplanetary systems. Nature Astronomy, 3:424–433, March 2019. doi: 10.1038/s41550-019-0701-7.
  • [31] Rosemary A. Mardling and D. N. C. Lin. Calculating the Tidal, Spin, and Dynamical Evolution of Extrasolar Planetary Systems. 573(2):829–844. ISSN 0004-637X, 1538-4357. doi: 10.1086/340752. URL
  • Naoz et al. [2011] Smadar Naoz, Will M Farr, Yoram Lithwick, Frederic A Rasio, and Jean Teyssandier. Hot jupiters from secular planet–planet interactions. Nature, 473(7346):187–189, 2011.
  • Clemence [1947] Gerald Maurice Clemence. The relativity effect in planetary motions. Reviews of Modern Physics, 19(4):361, 1947.
  • Hut [1981] P Hut. Tidal evolution in close binary systems. Astronomy and Astrophysics, 99:126–140, 1981.
  • Blanchet [2014] Luc Blanchet. Gravitational radiation from post-newtonian sources and inspiralling compact binaries. Living reviews in relativity, 17(1):1–187, 2014.
  • Gruver et al. [2022] Nate Gruver, Marc Anton Finzi, Samuel Don Stanton, and Andrew Gordon Wilson. Deconstructing the inductive biases of hamiltonian neural networks. In International Conference on Learning Representations, 2022. URL
  • Greydanus and Sosanya [2022] Sam Greydanus and Andrew Sosanya. Dissipative hamiltonian neural networks: Learning dissipative and conservative dynamics separately. arXiv preprint arXiv:2201.10085, 2022.
  • Cranmer et al. [2020] Miles Cranmer, Sam Greydanus, Stephan Hoyer, Peter Battaglia, David Spergel, and Shirley Ho. Lagrangian neural networks. arXiv preprint arXiv:2003.04630, 2020.
  • Pascanu et al. [2013] Razvan Pascanu, Tomas Mikolov, and Yoshua Bengio. On the difficulty of training recurrent neural networks. In International conference on machine learning, pages 1310–1318. PMLR, 2013.
  • Gillon et al. [2016] Michaël Gillon, Emmanuël Jehin, Susan M Lederer, Laetitia Delrez, Julien de Wit, Artem Burdanov, Valérie Van Grootel, Adam J Burgasser, Amaury HMJ Triaud, Cyrielle Opitom, et al. Temperate earth-sized planets transiting a nearby ultracool dwarf star. Nature, 533(7602):221–224, 2016.
  • Ba et al. [2016] Jimmy Lei Ba, Jamie Ryan Kiros, and Geoffrey E Hinton. Layer normalization. arXiv preprint arXiv:1607.06450, 2016.
  • Wu and He [2018] Yuxin Wu and Kaiming He. Group normalization. In Proceedings of the European conference on computer vision (ECCV), pages 3–19, 2018.
  • Marsden and Ratiu [2013] Jerrold E Marsden and Tudor S Ratiu. Introduction to mechanics and symmetry: a basic exposition of classical mechanical systems, volume 17. Springer Science & Business Media, 2013.
  • Holm et al. [2009] Darryl D Holm, Tanya Schmah, and Cristina Stoica. Geometric mechanics and symmetry: from finite to infinite dimensions, volume 12. Oxford University Press, 2009.
  • Bradbury et al. [2018] James Bradbury, Roy Frostig, Peter Hawkins, Matthew James Johnson, Chris Leary, Dougal Maclaurin, George Necula, Adam Paszke, Jake VanderPlas, Skye Wanderman-Milne, and Qiao Zhang. JAX: composable transformations of Python+NumPy programs, 2018. URL
  • Hennigan et al. [2020] Tom Hennigan, Trevor Cai, Tamara Norman, and Igor Babuschkin. Haiku: Sonnet for JAX, 2020. URL
  • Elfwing et al. [2018] Stefan Elfwing, Eiji Uchibe, and Kenji Doya. Sigmoid-weighted linear units for neural network function approximation in reinforcement learning. Neural Networks, 107:3–11, 2018.
  • Lee et al. [2019] Juho Lee, Yoonho Lee, Jungtaek Kim, Adam Kosiorek, Seungjin Choi, and Yee Whye Teh. Set transformer: A framework for attention-based permutation-invariant neural networks. In International conference on machine learning, pages 3744–3753. PMLR, 2019.
  • Loshchilov and Hutter [2017] Ilya Loshchilov and Frank Hutter. Decoupled weight decay regularization. arXiv preprint arXiv:1711.05101, 2017.
  • [50] Yaofeng Desmond Zhong, Biswadip Dey, and Amit Chakraborty. Symplectic ODE-Net: Learning Hamiltonian Dynamics with Control. URL
  • Zhong et al. [2021b] Yaofeng Desmond Zhong, Biswadip Dey, and Amit Chakraborty. Benchmarking energy-conserving neural networks for learning dynamics from data. In Learning for Dynamics and Control, pages 1218–1229. PMLR, 2021b.
  • [52] Elena Celledoni, Arne Marthinsen, and Brynjulf Owren. Commutator-free Lie group methods. 19(3):341–352. ISSN 0167739X. doi: 10.1016/S0167-739X(02)00161-9. URL

Appendix A Rigid Body Equations of Motion

We present two derivations from Chen et al. [2021]. The first derivation derives a constrained Hamiltonian system via Lagrange multipliers, while the second uses a variational principle for mechanics on Lie groups. We also note that the same equations can be derived via the use of the Port-Hamiltonian framework as in Duong and Atanasov [2021], though they choose to express the linear momentum coordinates in the moving body frame rather than the inertial frame as is done in this work.

a.1 Constrained Hamiltonian System

Since the configuration manifold is a product space of the individual bodies' manifolds, we consider a single body for brevity and drop its index. Furthermore, each body's manifold can be seen as the product of a translational part and $SO(3)$, of which the former is unconstrained, so we focus our attention on the latter. We can view $R \in SO(3)$ as embedded in the Euclidean space $\mathbb{R}^{3 \times 3}$ and use $R^{\top} R = I$ as a holonomic constraint. Using Lagrange multipliers Hairer et al. for the constraint gives us the following Lagrangian



denotes the nonstandard moment of inertia

Chen et al. [2021], Lee et al.




is a 6-dimensional symmetric matrix of Lagrange multipliers. Performing the Legendre transform for (A5) gives us the conjugate rotational momentum as


and corresponding Hamiltonian


The constraint for the momentum can be obtained by taking the time derivative of the position constraint according to Haier et al. [2006]. Hence, our equations of motion so far look like


on the manifold


Let $\omega$ denote the body's angular velocity in the inertial frame, where $\hat{\omega}$ denotes the cross-product operation $\hat{\omega} v = \omega \times v$. Then, it can be written using $R$ as


Taking the time derivative of gives


Since our goal is to represent the dynamics of the body-frame angular momentum, we convert the above to the body frame using $R$. Using properties of the hat map [Chen et al., 2021, Appendix A.1] gives us


Taking the time derivative of the above then gives


where the symmetric vanishes on the last line.

Since , applying properties of the hat map [Chen et al., 2021, Appendix A.1] further simplifies the above as


Thus, applying the vee map and converting back to the original variables gives


Hence, the rotational equations of motion on $SO(3)$ are


a.2 Variational Principle for Mechanics on the Lie Group

We can also apply the Euler-Lagrange equations for Hamilton's variational principle on a Lie group, a topic that has been well studied (e.g., Marsden and Ratiu [2013], Holm et al. [2009], Chen et al. [2021]), and we summarize the results for the special case of rigid bodies from the expository part of Lee et al.

Denote the infinitesimally varied rotation by $R^{\epsilon} = R \exp(\epsilon \hat{\eta})$, with $\eta \in \mathbb{R}^3$ and $\epsilon \in \mathbb{R}$, where $\exp$ is the exponential map from $\mathfrak{so}(3)$ to $SO(3)$. The varied angular velocity is


Consider the action


Taking the variation of the action , we have


Using Hamilton’s Principle, we have , i.e.,


for any . Hence, must be skew-symmetric, giving us


Thus, by definition of and applying the vee map, we recover the same update


Appendix B Details of the Lie splitting integrator

The full Hamiltonian takes the form


We borrow from Chen et al. [2021] the idea of Lie-group and symplecticity preserving splitting and split our Hamiltonian as $H = H_1 + H_2 + H_3$, where


where we assume that the inertia tensor is axis-symmetric, i.e.,


and the axial-symmetric and residual terms of the rotational kinetic energy are defined such that their sum recovers the full rotational kinetic energy, i.e.,


Then, each of the three terms can be integrated exactly.

Exact integration of the axial-symmetric kinetic term

For this term, we have the equations of motion (using (1a)-(1d) but with the remaining terms of the Hamiltonian dropped)


The equation for (B39d) is the Euler equation for a free rigid body Chen et al. [2021]. It is exactly solvable with a simple expression for axial-symmetric bodies, since in this case it simplifies as


Consequently, the component of the angular momentum along the symmetry axis is constant, meaning that we can express the above as the linear differential equation


which has the solution


where $R_3(\cdot)$ denotes the rotation matrix around the body's symmetry axis. Taking the above back to (B39c) then gives us the solution for the rotation as well, giving the flow of this term as


Exact integration of the potential term

For the potential term, the equations of motion are Chen et al. [2021]


Since the positions and rotations stay constant, the momenta $p$ and $\pi$ change at constant rates. Hence, the flow of this term is given by


Exact integration of the asymmetric kinetic correction

Finally, the equations of motion of this term are given by (again adapting (1a)-(1d))


which can be solved to obtain the flow of this term as


where .

Having obtained analytical solutions for each of the three flows, we then combine them with the non-conservative momentum update induced by the non-conservative forcing terms, with flow


Consequently, the full Lie T2 integrator is obtained by applying the Strang composition scheme to obtain


Appendix C Training Details

We implement our method using JAX Bradbury et al. [2018] and the Haiku framework Hennigan et al. [2020] for constructing the deep neural networks (both repositories are licensed under Apache-2.0). In all our experiments, we use multilayer perceptrons (MLPs) with the SiLU activation Elfwing et al. [2018] for all networks. Each method is run until convergence. Other common training hyperparameters used are summarized in Table 3.

Structure of the learned networks

Note that we do not assume prior knowledge of the pairwise structure of the potential function or of the forcing terms. More specifically, since the true gravitational potential (2) only acts pairwise between rigid bodies, the rigid body correction potential also acts pairwise and has the structure


For the forcing terms, the tidal forcing term also acts pairwise, with coupled effects on the momenta and angular momenta due to the relationship between forces and torques, while the post-Newtonian general-relativity correction term acts on each planet individually and affects only the momenta. However, in this work, we assume that none of this prior knowledge is available and aim to learn everything purely from data. Hence, we choose to learn the high-dimensional forms of the potential and forcing terms. The fact that we were able to obtain improvements despite the high dimensionality of the input space ( planets each with