I Introduction
The connection between dynamical systems and neural network models has been widely studied in the literature; see, for example, [4, 11, 16, 25, 24]. In general, neural networks can be considered as discrete dynamical systems whose basic dynamics at each step is a linear transformation followed by a componentwise nonlinear (activation) function. In [5] the neural ODE is introduced as a continuous-depth model instead of specifying a discrete sequence of hidden layers. Even before the introduction of neural ODEs, a series of models with similar architectures had already been proposed to learn the hidden dynamics of a dynamical system in [30, 14, 27]. Theoretical results on the discovery of dynamics are established and enriched in [38], where inverse modified differential equations are introduced to understand the true dynamical system that is learned when the time derivatives are approximated by numerical schemes. The methods above do not assume a specific form of the equation a priori; however, as physical systems usually possess intrinsic prior properties, other approaches also take the prior information of the systems into account. For many physical systems from classical mechanics, the governing equation can be expressed in terms of Hamilton's equations. We denote the $d \times d$ identity matrix by $I_d$, and let
$$J = \begin{pmatrix} 0 & I_d \\ -I_d & 0 \end{pmatrix},$$
which is an orthogonal, skew-symmetric real matrix, so that $J^{-1} = J^{T} = -J$. The canonical Hamiltonian system can be written as
(1) $\dot{y} = J^{-1}\nabla H(y)$,
where $y \in \mathbb{R}^{2d}$, and $H: \mathbb{R}^{2d} \to \mathbb{R}$ is the Hamiltonian, typically representing the energy of the system. It is well known that the phase flow of a Hamiltonian system is symplectic. Based on that observation, several numerical schemes which preserve symplecticity have been proposed to solve the forward problem in [12, 17, 26]. In recent works [3, 15, 29, 32, 6, 34], the primary focus has been to solve the inverse problem, i.e., identifying Hamiltonian systems from data, using structured neural networks. For example, HNNs [15] use a neural network to approximate the Hamiltonian in (1), then learn $H$
by reformulating the loss function. Based on HNNs, other models were proposed to tackle problems in generative modeling
[29, 34] and continuous control [37]. Another line of approach is to learn the phase flow of the system directly, while encoding the physical prior of symplecticity in the flow. Recently, we introduced symplectic networks (SympNets) with theoretical guarantees that they can approximate arbitrary symplectic maps [19]. Other networks of this type are presented in [10, 36]. In practice, requiring a dynamical system to be Hamiltonian could be too restrictive, as the system has to be described in canonical coordinates. In [7] Lagrangian Neural Networks (LNNs) were introduced, which allow the system to be expressed in Cartesian coordinates. In [13]
the models of HNNs and LNNs were generalized to Constrained Hamiltonian Neural Networks (CHNNs) and Constrained Lagrangian Neural Networks (CLNNs), enabling them to learn constrained mechanical systems written in Cartesian coordinates. In other developments, autoencoder-based HNNs (AE-HNNs)
[15] and Hamiltonian Generative Networks (HGNs) [34] were proposed to learn and predict the images of mechanical systems, which can be seen as Hamiltonian systems on manifolds embedded in high-dimensional spaces. Theoretically, Hamiltonian systems on manifolds written in non-canonical coordinates are equivalent to an important class of dynamical systems, namely, the Poisson systems. To wit, the Poisson systems take the form of
$$\dot{y} = B(y)\nabla H(y),$$
where $y \in \mathbb{R}^{n}$, and $n$ is not necessarily an even number; $H$ is the Hamiltonian of the Poisson system, and the matrix-valued function $B(y)$ plays the role of $J^{-1}$ in (1), inducing a general Poisson bracket as defined in Section II. The Darboux–Lie theorem states that a Poisson system can be turned into a Hamiltonian system by a local coordinate transformation. As a consequence, structure-preserving numerical schemes for Poisson systems are normally developed by finding the coordinate transformation manually, then applying symplectic integrators to the transformed systems; see [33, 17].
Inspired by the Darboux–Lie theorem, we propose a novel neural network architecture, the Poisson neural network (PNN), to learn the phase flow of an arbitrary Poisson system. In other words, PNNs can learn Hamiltonian systems expressed in unknown, diffeomorphically transformed coordinates. The coordinate transformation and the phase flow of the transformed Hamiltonian system are parameterized by structured neural networks with physical priors. Specifically, in the general setting, we use invertible neural networks (INNs) [8, 9] to represent the coordinate transformation. If all the data reside on a submanifold of dimension $2d < n$, autoencoders (AEs) can be applied as an alternative choice to approximate the coordinate transformation from the coordinates in $\mathbb{R}^{n}$ to local coordinates. This strategy is similar to [31], which learns dynamics in the latent space discovered through an autoencoder. Compared to LNNs, PNNs are able to work in a more general coordinate system. Moreover, INN-based PNNs are able to learn multiple trajectories of a Poisson system on the whole data space simultaneously, while AE-HNNs, HGNs, CHNNs and CLNNs are only designed to work on low-dimensional submanifolds of $\mathbb{R}^{n}$. Further, our work lays a solid theoretical background for all the aforementioned models, suggesting that they are learning a Poisson system explicitly or implicitly.
Another intriguing property of PNNs is that they are not only capable of learning Poisson systems, but are also able to approximate an unknotted trajectory of an arbitrary autonomous system. We present related theorems, which indicate the great expressivity of PNNs. We demonstrate through computational experiments that PNNs can be practically useful for learning high-dimensional autonomous systems, long-time prediction, as well as frame interpolation. PNNs also enjoy all the advantages listed in
[19], as they are implemented based on SympNets in this work. However, PNNs as a high-level architecture can also employ other modern symplectic neural networks. The rest of the paper is organized as follows. Section II introduces the necessary notation, terminology and fundamental theorems. The learning theory for PNNs is presented in Section III. Section IV presents the experimental results for several Poisson systems and autonomous systems. A summary is given in the last section. Supporting materials, including the detailed implementation of PNNs, are included in the appendix.
II Preliminaries
The material required for this work is based on the mathematical background of Hamiltonian systems and their non-canonical form, i.e., Poisson systems. We refer the readers to [17] for more details.
II-A Hamiltonian and Poisson systems
First, we formally present the definitions of the Hamiltonian system and the Poisson system. We assume that all the functions or maps involved in this paper are as smooth as needed.
Definition 1.
The canonical Hamiltonian system takes the form
(2) $\dot{y} = J^{-1}\nabla H(y)$,
where $H: U \to \mathbb{R}$ is the Hamiltonian, typically representing the energy of the system, defined on the open set $U \subset \mathbb{R}^{2d}$.
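As a small numerical aside, the sketch below illustrates the canonical form (2) on a pendulum (a hypothetical toy example, not one of the paper's experiments): it integrates the system with the symplectic Euler scheme and checks that the energy error stays bounded, which is the structure-preservation property discussed throughout this paper.

```python
import numpy as np

def hamiltonian(p, q):
    # pendulum energy: H(p, q) = p^2 / 2 - cos(q)
    return 0.5 * p**2 - np.cos(q)

def symplectic_euler(p, q, h, steps):
    # the p-update uses the old q and the q-update the new p; this makes
    # the one-step map symplectic, unlike the fully explicit Euler scheme
    for _ in range(steps):
        p = p - h * np.sin(q)
        q = q + h * p
    return p, q

p0, q0 = 0.0, 1.0
p1, q1 = symplectic_euler(p0, q0, h=0.01, steps=10000)
energy_drift = abs(hamiltonian(p1, q1) - hamiltonian(p0, q0))
```

For a symplectic integrator the energy error oscillates but does not accumulate, so `energy_drift` stays small even after $10^4$ steps.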
System (2) can also be written in a general form by introducing the Poisson bracket.
Definition 2.
Let $U \subset \mathbb{R}^{n}$ be an open set. A Poisson bracket is a binary operation $\{\cdot, \cdot\}$ on smooth functions, satisfying

(anticommutativity)
$\{F, G\} = -\{G, F\}$,

(bilinearity)
$\{aF + bG, K\} = a\{F, K\} + b\{G, K\}$,

(Leibniz's rule)
$\{FG, K\} = F\{G, K\} + G\{F, K\}$,

(Jacobi identity)
$\{\{F, G\}, K\} + \{\{G, K\}, F\} + \{\{K, F\}, G\} = 0$,

for smooth functions $F, G, K: U \to \mathbb{R}$ and $a, b \in \mathbb{R}$.
Consider the bracket
(3) $\{F, G\}(y) = \nabla F(y)^{T} J^{-1} \nabla G(y)$,
where $y \in U \subset \mathbb{R}^{2d}$. One can check that (3) is indeed a Poisson bracket. Then system (2) can be written as
$$\dot{y}_i = \{y_i, H\}(y), \quad i = 1, \dots, 2d,$$
where $y_i$ in the bracket denotes the coordinate map $y \mapsto y_i$ by a slight abuse of notation. Now we extend the bracket (3) to a general form as
(4) $\{F, G\}(y) = \nabla F(y)^{T} B(y) \nabla G(y)$,
where $B: U \to \mathbb{R}^{n \times n}$ is a smooth matrix-valued function. Note that here we do not require $n$ to be an even number. As many crucial properties of Hamiltonian systems rely uniquely on the conditions in Definition 2, we naturally expect the bracket (4) to be a Poisson bracket.
Lemma 1.
The bracket defined in (4) is anticommutative and bilinear, and satisfies Leibniz's rule as well as the Jacobi identity, if and only if $B(y) = (b_{ij}(y))$ is skew-symmetric, i.e., $b_{ij}(y) = -b_{ji}(y)$, and
$$\sum_{l=1}^{n}\left( \frac{\partial b_{ij}(y)}{\partial y_l}\, b_{lk}(y) + \frac{\partial b_{jk}(y)}{\partial y_l}\, b_{li}(y) + \frac{\partial b_{ki}(y)}{\partial y_l}\, b_{lj}(y) \right) = 0$$
for all $i, j, k$ and all $y \in U$.
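Both conditions of Lemma 1 can be checked numerically for a candidate structure matrix. The sketch below uses the classical rigid-body Poisson structure on $\mathbb{R}^3$ as an illustrative $B(y)$ (an assumed example, not one of the paper's experiments); skew-symmetry is checked directly, and the sum condition via central finite differences.

```python
import numpy as np

def B(y):
    # rigid-body (Lie-Poisson) structure matrix on R^3: B(y) is the
    # skew matrix of the vector y, a standard textbook example
    return np.array([[0.0, -y[2], y[1]],
                     [y[2], 0.0, -y[0]],
                     [-y[1], y[0], 0.0]])

def jacobi_residual(B, y, eps=1e-6):
    # worst-case residual of the sum condition of Lemma 1,
    # with dB/dy_l approximated by central differences
    n = len(y)
    dB = np.zeros((n, n, n))        # dB[l] ~ dB/dy_l
    for l in range(n):
        e = np.zeros(n); e[l] = eps
        dB[l] = (B(y + e) - B(y - e)) / (2 * eps)
    By = B(y)
    worst = 0.0
    for i in range(n):
        for j in range(n):
            for k in range(n):
                s = sum(dB[l, i, j] * By[l, k]
                        + dB[l, j, k] * By[l, i]
                        + dB[l, k, i] * By[l, j] for l in range(n))
                worst = max(worst, abs(s))
    return worst

y = np.array([0.3, -1.2, 0.7])
skew_err = np.max(np.abs(B(y) + B(y).T))
jac_err = jacobi_residual(B, y)
```

Both residuals vanish (up to floating-point noise) for this $B$, confirming that it defines a Poisson bracket of rank 2.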
Lemma 1 provides verifiable equivalent conditions for (4) to be a Poisson bracket. We can then give the definition of the Poisson system, which is a generalization of the Hamiltonian system.
Definition 3.
The Hamiltonian system and the Poisson system are unified as
(5) $\dot{y} = B(y)\nabla H(y)$,
for $B(y)$ satisfying the conditions of Lemma 1; the system is called a Poisson system, and it becomes the canonical Hamiltonian system when $B(y) \equiv J^{-1}$.
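As a minimal sketch of the unified form (5), the following code integrates a Lotka–Volterra system written as $\dot{y} = B(y)\nabla H(y)$ and checks that the Hamiltonian is numerically conserved along the flow. The specific $B$ and $H$ are the standard textbook choices and are assumptions of this example.

```python
import numpy as np

def B(y):
    # Poisson structure of the Lotka-Volterra system
    u, v = y
    return np.array([[0.0, u * v], [-u * v, 0.0]])

def grad_H(y):
    # H(u, v) = u - ln u + v - 2 ln v, an invariant of the flow
    u, v = y
    return np.array([1.0 - 1.0 / u, 1.0 - 2.0 / v])

def H(y):
    u, v = y
    return u - np.log(u) + v - 2.0 * np.log(v)

def f(y):
    # Poisson form y' = B(y) grad H(y), equivalent to u' = u(v-2), v' = v(1-u)
    return B(y) @ grad_H(y)

def rk4_step(y, h):
    k1 = f(y)
    k2 = f(y + 0.5 * h * k1)
    k3 = f(y + 0.5 * h * k2)
    k4 = f(y + h * k3)
    return y + h / 6.0 * (k1 + 2 * k2 + 2 * k3 + k4)

y = np.array([1.0, 1.8])
H0 = H(y)
for _ in range(2000):          # integrate to t = 10 with step 0.005
    y = rk4_step(y, 0.005)
H_drift = abs(H(y) - H0)
```

Because $B$ is skew-symmetric, $\dot{H} = \nabla H^{T} B \nabla H = 0$ along exact trajectories, so the small accumulated drift here is purely the integrator's error.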
II-B Symplectic maps and Poisson maps
The study of the phase flows of the Hamiltonian and Poisson systems focuses on the symplectic map and the Poisson map.
Definition 4.
A transformation $\varphi: U \to \mathbb{R}^{2d}$ (where $U$ is an open set in $\mathbb{R}^{2d}$) is called a symplectic map if its Jacobian matrix satisfies
$$\left(\frac{\partial \varphi}{\partial y}\right)^{T} J \left(\frac{\partial \varphi}{\partial y}\right) = J.$$
Definition 5.
A transformation $\varphi: U \to \mathbb{R}^{n}$ (where $U$ is an open set in $\mathbb{R}^{n}$) is called a Poisson map with respect to the Poisson system (5) if its Jacobian matrix satisfies
$$\left(\frac{\partial \varphi}{\partial y}\right) B(y) \left(\frac{\partial \varphi}{\partial y}\right)^{T} = B(\varphi(y)).$$
In fact, the phase flow of a Hamiltonian system is a symplectic map, while the phase flow of a Poisson system is a Poisson map, i.e., the flows satisfy Definitions 4 and 5, respectively. Based on these facts, we naturally expect numerical methods and learning models for Hamiltonian and Poisson systems to preserve these intrinsic properties of the flow. So far the numerical techniques for Hamiltonian systems have been well developed [12, 17, 26]; however, research on Poisson systems is ongoing due to their complexity.
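The symplecticity condition of Definition 4 can be verified numerically by finite-differencing the Jacobian of a map. The sketch below checks it for the exact flow of a harmonic oscillator (an assumed toy example whose flow is a rotation of the phase plane).

```python
import numpy as np

J = np.array([[0.0, 1.0], [-1.0, 0.0]])   # canonical structure matrix, d = 1

def flow(y, t):
    # exact phase flow of the harmonic oscillator H = (p^2 + q^2)/2:
    # a rotation of the (p, q)-plane
    c, s = np.cos(t), np.sin(t)
    return np.array([[c, -s], [s, c]]) @ y

def jacobian(phi, y, eps=1e-6):
    # central finite differences, column by column
    n = len(y)
    A = np.zeros((n, n))
    for k in range(n):
        e = np.zeros(n); e[k] = eps
        A[:, k] = (phi(y + e) - phi(y - e)) / (2 * eps)
    return A

y = np.array([0.4, -0.9])
A = jacobian(lambda z: flow(z, 0.7), y)
sympl_err = np.max(np.abs(A.T @ J @ A - J))
```

The same routine, with the condition of Definition 5 instead, can be used to test whether a learned map is (approximately) a Poisson map.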
II-C Coordinate changes and the Darboux–Lie theorem
The main idea in studying Poisson systems is to find the connection to Hamiltonian systems, which are easier to deal with. In fact, a Poisson system expressed in arbitrary new coordinates is again a Poisson system, hence we naturally tend to simplify a given Poisson structure as much as possible by coordinate transformation.
Theorem 1 (Darboux 1882, Lie 1888).
Suppose that the matrix $B(y)$ defines a Poisson bracket and is of constant rank $2d = n - q$ in a neighbourhood of $y_0 \in \mathbb{R}^{n}$. Then, there exist functions $P_1(y), \dots, P_d(y)$, $Q_1(y), \dots, Q_d(y)$, and $C_1(y), \dots, C_q(y)$ satisfying
$$\{P_i, P_j\} = 0, \quad \{P_i, Q_j\} = -\delta_{ij}, \quad \{P_i, C_l\} = 0,$$
$$\{Q_i, Q_j\} = 0, \quad \{Q_i, C_l\} = 0, \quad \{C_k, C_l\} = 0$$
on a neighbourhood of $y_0$, where $\delta_{ij}$ equals 1 if $i = j$ and 0 otherwise. The gradients of $P_i, Q_i, C_l$ are linearly independent, so that $y \mapsto (P_i(y), Q_i(y), C_l(y))$ constitutes a local change of coordinates to canonical form.
Corollary 1 (Transformation to canonical form).
Let us denote the transformation of Theorem 1 by $\theta: y \mapsto (P(y), Q(y), C(y))$. With this change of coordinates, the Poisson system $\dot{y} = B(y)\nabla H(y)$ becomes
$$\dot{P}_i = -\frac{\partial K}{\partial Q_i}, \quad \dot{Q}_i = \frac{\partial K}{\partial P_i}, \quad \dot{C}_l = 0,$$
where $K = H \circ \theta^{-1}$. Writing $z = (P, Q, C)$, this system becomes
$$\dot{z} = B_0 \nabla K(z), \qquad B_0 = \begin{pmatrix} J^{-1} & 0 \\ 0 & 0 \end{pmatrix}.$$
Corollary 1 reveals the connection between Poisson systems and Hamiltonian systems via coordinate changes. For the forward problem, i.e., solving a Poisson system by numerical integration, transformations are available for many well-known systems to perform structure-preserving calculations [33, 17], but there is no general method to search for the new coordinates of an arbitrary Poisson system, which remains an open research issue. However, the inverse problem, i.e., learning an unknown Poisson system from data, is an easier task.
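Corollary 1 can be illustrated concretely on a Lotka–Volterra system, where a logarithmic change of variables plays the role of the Darboux–Lie transformation; the specific vector field and the transformed Hamiltonian $K$ below are standard textbook choices and are assumptions of this example.

```python
import numpy as np

def f_original(y):
    # Lotka-Volterra vector field: u' = u(v - 2), v' = v(1 - u)
    u, v = y
    return np.array([u * (v - 2.0), v * (1.0 - u)])

def f_canonical(z):
    # canonical Hamiltonian field J^{-1} grad K in the new coordinates
    # (p, q) = (ln v, ln u), with K(p, q) = e^p - 2p + e^q - q
    p, q = z
    grad_K = np.array([np.exp(p) - 2.0, np.exp(q) - 1.0])
    J_inv = np.array([[0.0, -1.0], [1.0, 0.0]])
    return J_inv @ grad_K

u, v = 0.7, 1.5
z = np.array([np.log(v), np.log(u)])       # (p, q) = (ln v, ln u)
# chain rule: (p', q') = (v'/v, u'/u)
fo = f_original(np.array([u, v]))
transformed_field = np.array([fo[1] / v, fo[0] / u])
err = np.max(np.abs(transformed_field - f_canonical(z)))
```

The transformed vector field coincides with a canonical Hamiltonian field, which is exactly the mechanism PNNs exploit: learn the change of coordinates and a symplectic map in the latent space.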
III Learning theory for Poisson systems and trajectories of autonomous systems
Assume that there is a dataset $\mathcal{T} = \{(x_i, y_i)\}_{i=1}^{N}$ ($x_i, y_i \in \mathbb{R}^{n}$) from an unknown autonomous dynamical system, which may or may not be a Poisson system, satisfying $y_i = \phi_h(x_i)$ for a time step $h$ and phase flow $\phi_h$. We aim to discover the dynamics using learning models, so that we can make predictions into the future or perform other computational tasks. To set the stage, we first give the definition of the extended symplectic map.
Definition 6.
A transformation $\varphi: U \to \mathbb{R}^{n}$ (where $U$ is an open set in $\mathbb{R}^{n}$) is called an extended symplectic map with latent dimension $2d \le n$ if it can be written as
$$\varphi(x, c) = \left(\tilde{\varphi}(x, c),\, c\right), \quad x \in \mathbb{R}^{2d},\; c \in \mathbb{R}^{n - 2d},$$
where $\tilde{\varphi}$ is differentiable and $\tilde{\varphi}(\cdot, c)$ is a symplectic map for each fixed $c$. Note that $\varphi$ degenerates to a general symplectic map when $n = 2d$.
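A minimal sketch of an extended symplectic map in the sense of Definition 6, with a hand-picked shear as the symplectic part (the map and its sizes are illustrative assumptions):

```python
import numpy as np

def ext_symplectic(y, d=1):
    # (x, c) -> (phi_c(x), c): a shear in (p, q) whose strength depends on
    # the latent remainder c; symplectic in (p, q) for each fixed c,
    # while c is passed through unchanged
    p, q, c = y[:d], y[d:2 * d], y[2 * d:]
    h = 0.1 * (1.0 + np.sum(c ** 2))
    return np.concatenate([p - h * q, q, c])

y = np.array([0.5, -0.2, 0.8])
out = ext_symplectic(y)
c_preserved = abs(out[2] - y[2])
```

For each fixed $c$ the Jacobian in $(p, q)$ is a unit-determinant shear, hence symplectic; the extra coordinates are exactly the Casimir-like directions of Corollary 1.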
III-A Poisson neural networks
We propose a high-level network architecture, the Poisson neural network (PNN), to learn Poisson systems or autonomous flows based on the Darboux–Lie theorem. Theorem 1 and Corollary 1
indicate that any Poisson system in $n$-dimensional space can be transformed into a "piecewise" Hamiltonian system, where the latent dimension $2d$ is determined by the rank of $B(y)$. The architecture is composed of three parts: (1) a coordinate transformation, (2) an extended symplectic map, and (3) the inverse of the transformation, denoted by $\psi$, $\varphi$, and $\psi^{-1}$, respectively. For the construction of extended symplectic neural networks, an important part of this work, we refer the readers to Appendix C-A. The transformations $\psi$ and $\psi^{-1}$ can be implemented using two alternative approaches.
Primary architecture. We model the coordinate transformation $\psi$ as an invertible neural network, as in [8, 9], to automatically obtain its inverse $\psi^{-1}$. Then, we learn the data by optimizing the mean-squared-error loss
$$MSE = \frac{1}{N}\sum_{i=1}^{N}\left\| \psi^{-1}\circ\varphi\circ\psi(x_i) - y_i \right\|^{2},$$
where $\varphi$ is an extended symplectic neural network with latent dimension $2d$.
Alternative architecture. We exploit an autoencoder to parameterize $\psi$ and $\psi^{-1}$ with two different neural networks, an encoder $\psi_e$ and a decoder $\psi_d$. Then, the loss is designed as
$$\frac{1}{N}\sum_{i=1}^{N}\left\| \psi_d\circ\varphi\circ\psi_e(x_i) - y_i \right\|^{2} + \lambda\,\frac{1}{N}\sum_{i=1}^{N}\left\| \psi_d\circ\psi_e(x_i) - x_i \right\|^{2},$$
where $\varphi$ is a symplectic neural network and $\lambda$
is a hyperparameter to be tuned. Note that in this case the composition of encoder and decoder
is not intrinsically equivalent to the identity map. Basically, this architecture only learns the Poisson map restricted to a submanifold embedded in the whole phase space. In both cases we perform predictions by iterating the learned map, so that $\psi^{-1}\circ\varphi^{k}\circ\psi$ gives the $k$-th step. We prefer the invertible neural networks because the reconstruction loss disappears compared to the autoencoder, and we can also impose further prior information on the transformation. For example, in Section IV-C, we adopt a volume-preserving network as the invertible neural network, yielding the so-called volume-preserving Poisson neural network (VP-PNN), which achieves better generalization than the non-volume-preserving Poisson neural network (NVP-PNN), since the original Poisson system has a volume-preserving phase flow. More crucially, autoencoder-based PNNs are unable to learn data lying on the whole space $\mathbb{R}^{n}$, rather than on a $2d$-dimensional submanifold, when $2d < n$. However, autoencoder-based PNNs can perform better in some situations, such as the numerical case in Section IV-E. Intuitively, the alternative architecture outperforms the primary architecture when $2d \ll n$. An illustration of PNNs is presented in Fig. 1.
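The forward pass of the primary architecture can be sketched in a few lines. The maps below are hand-picked stand-ins for the trained networks (an additive-coupling map for the invertible transformation and a symplectic shear for the latent flow), purely illustrative and not the actual implementation.

```python
import numpy as np

def psi(y):
    # additive coupling: invertible by construction, as in INN architectures
    a, b = y
    return np.array([a, b + np.tanh(a)])

def psi_inv(z):
    a, c = z
    return np.array([a, c - np.tanh(a)])

def phi(z):
    # symplectic shear in the latent coordinates (unit-determinant 2x2 map)
    p, q = z
    return np.array([p - 0.1 * q, q])

def pnn_step(y):
    # one step of the composition psi^{-1} ∘ phi ∘ psi
    return psi_inv(phi(psi(y)))

def predict(y, k):
    # k-step rollout: psi^{-1} ∘ phi^k ∘ psi, applied step by step
    for _ in range(k):
        y = pnn_step(y)
    return y

y0 = np.array([0.5, -0.3])
roundtrip_err = np.max(np.abs(psi_inv(psi(y0)) - y0))
y5 = predict(y0, 5)
```

Because `psi_inv` is the exact inverse of `psi`, no reconstruction loss is needed; this is precisely the advantage of the primary architecture over the autoencoder variant.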
III-B Learning Poisson systems
Consider the case where the dataset $\mathcal{T}$ consists of data points from a Poisson system. Next, we present the approximation properties of PNNs.
Theorem 2.
Suppose that (i) the extended symplectic neural networks are universal approximators within the space of extended symplectic maps, (ii) (primary architecture) the invertible neural networks are universal approximators within the space of invertible differentiable maps, and (iii) (alternative architecture) the encoder and decoder networks are universal approximators within the space of continuous maps. Then, the corresponding Poisson neural networks are universal approximators within the space of Poisson maps for the primary architecture, and are able to approximate arbitrary Poisson maps (restricted to submanifolds) for the alternative architecture. The approximations are considered on compact sets. Notice that in the alternative architecture, the PNN itself is not intrinsically a Poisson map; however, by iterating the learned map for multi-step prediction, it can also preserve the geometric structure and enjoy stable long-term performance in practice.
III-C Learning trajectories of autonomous systems
Now consider the case where $\mathcal{T}$ consists of a series of data points on a single trajectory (possibly not from a Poisson system), i.e., $\mathcal{T} = \{(x_i, x_{i+1})\}_{i=0}^{N-1}$, where $x_{i+1} = \phi_h(x_i)$ for a time step $h$, and $\phi_h$ is the phase flow of an unknown autonomous system $\dot{y} = f(y)$. Unlike the theory for learning Poisson systems, the use of PNNs to learn autonomous flows is quite novel. The theory is motivated by the observation that symplectic neural networks can learn a trajectory from a non-Hamiltonian system; see Section IV-A for details. The next theorem reveals the internal mechanism.
Theorem 3.
Suppose that $U \subset \mathbb{R}^{2}$ is a simply connected open set, and the periodic solution $y(t) \in U$ is from an autonomous dynamical system
$$\dot{y} = f(y);$$
then there exists a Hamiltonian $H: U \to \mathbb{R}$ such that $y(t)$ also satisfies the Hamiltonian system
$$\dot{y} = J^{-1}\nabla H(y).$$
Proof.
The proof can be found in Appendix B-A. ∎
Based on the above theorem, one may apply symplectic neural networks to an arbitrary periodic solution of an autonomous system in $\mathbb{R}^{2}$. Naturally, we tend to explore similar results in higher-dimensional spaces. We intuitively expect to transform any high-dimensional periodic solution of an autonomous system into one lying on a plane with the help of coordinate changes, so that the original trajectory can be learned via a PNN. In fact, this conjecture is almost correct, except for the case when the orbit of the considered motion is a nontrivial 1-knot in $\mathbb{R}^{3}$. For the theory of knots we refer to [23, 2, 28], and we briefly present the basic concepts in Appendix A.
Theorem 4.
Suppose that $U \subset \mathbb{R}^{n}$ is a contractible open set, the periodic solution $y(t) \in U$ is from an autonomous dynamical system
$$\dot{y} = f(y),$$
and the orbit of $y(t)$ is unknotted. Then, there exist a Hamiltonian $H$ and a $B(y)$ satisfying Lemma 1 with rank 2, such that $y(t)$ also satisfies the Poisson system
$$\dot{y} = B(y)\nabla H(y).$$
Note that nontrivial knots exist only in $\mathbb{R}^{3}$.
Proof.
The proof can be found in Appendix B-B. ∎
Up to now, we have shown that PNNs can be used to learn almost any periodic solution of an autonomous system, with the latent dimension fixed at 2. The symplectic structure embedded in PNNs endows the predictions with long-term stability and higher accuracy for periodic solutions. Nevertheless, there is still a limitation of this method: PNNs can learn only a single trajectory per training run. Basically, this limitation is inevitable, as we have already dropped most of the constraints on the vector field $f$, which is in fact a trade-off between data and systems. In spite of this, we still expect to develop a strategy that relaxes the requirements of "single" and "periodic" by increasing the latent dimension.
Conjecture 1.
Suppose that $U \subset \mathbb{R}^{n}$ is a contractible open set, $f$ is a vector field on $U$, and $M \subset U$ is a smooth trivial knot embedded in $U$. If $f$ is a smooth tangent vector field on $M$, then there exist a smooth single-valued function $H$ and a smooth matrix-valued function $B(y)$ satisfying Lemma 1 with constant rank, such that $f(y) = B(y)\nabla H(y)$ on $M$.
The conjecture provides a more general insight into the above theoretical results on a single trajectory. If it holds, one may learn several trajectories lying on a higher-dimensional trivial knot simultaneously in one training run, with a higher latent dimension. Unfortunately, the proof for the 1-knot case cannot be easily extended to the general case, since the fact that a solenoidal vector field on $\mathbb{R}^{2}$ is exactly the field of a Hamiltonian system does not hold in higher-dimensional spaces. A more thorough investigation of this conjecture is left for future work.
IV Simulation results
In this section, we present several simulation cases to verify our theoretical results and to indicate the potential application of PNNs in the field of computer vision. All the hyperparameters for the detailed architectures and training settings are shown in Appendix C and Table III. For each Poisson system involved, we obtain the ground truth and data using a high-order symplectic integrator with its corresponding coordinate transformation, as listed in Appendix D.
IV-A Lotka–Volterra equation
The Lotka–Volterra equation can be written as
(6) $\dot{u} = u(v - 2), \quad \dot{v} = v(1 - u),$
where $y = (u, v)$ with $u, v > 0$. As a Poisson system, its underlying symplectic structure can be discovered using PNNs. The data consist of three trajectories starting at three different initial points. We generate 100 training points with a fixed time step for each trajectory. Besides the PNN, we also use a SympNet to learn the three trajectories simultaneously, as well as to learn one single trajectory. We perform predictions for 1000 steps starting at the end points of the training trajectories, and the results of the three cases are presented in Fig. 2. As shown in the left figure, the PNN successfully learns the system and achieves a stable long-time prediction, compared to the classical Runge–Kutta integrator (RK45). Meanwhile, the SympNet [19] fails to fit the three trajectories simultaneously, since the data points are not from a Hamiltonian system, as shown in the middle figure. However, the right figure reveals that the SympNet is indeed able to learn a single trajectory, even though it is not from a Hamiltonian system, which is consistent with Theorem 3.
IV-B Extended pendulum system
We test the performance of a PNN on odd-dimensional Poisson systems. The motion of the pendulum is governed by
$$\dot{p} = -\sin q, \quad \dot{q} = p,$$
where $y = (p, q)$. This is a canonical Hamiltonian system, and we subsequently extend it to three-dimensional space:
$$\dot{p} = -\sin q, \quad \dot{q} = p, \quad \dot{c} = 0,$$
where $y = (p, q, c)$. To make the data more difficult to learn, a nonlinear transformation is applied to the extended phase space. The governing equation for the transformed system takes the form
(7) $\dot{y} = B(y)\nabla H(y)$
for a suitable matrix $B(y)$ and Hamiltonian $H$. One may readily verify that this $B(y)$ satisfies Lemma 1; therefore (7) is a Poisson system.
Three trajectories are simulated from three different initial conditions with a fixed time step. We use the data points obtained in the first 100 steps as our training set. Then we perform predictions for 1000 steps starting at the end points of the training set; the results are shown in Fig. 3. From the left figure, it can be seen that the predictions made by the PNN match the ground truth perfectly, remaining on the true trajectories even after long times. Meanwhile, the PNN is able to recover the underlying structure of the system, as shown on the right-hand side. The trajectories of the system in the latent space are recovered as trajectories on parallel planes, which matches the fact that the trajectories are generated from several different two-dimensional symplectic submanifolds.
IV-C Charged particle in an electromagnetic potential
We consider the dynamics of a charged particle in an electromagnetic potential, governed by the Lorentz force
$$m\ddot{x} = q\left(\dot{x} \times B(x) + E(x)\right),$$
where $m$ is the mass, $x \in \mathbb{R}^{3}$ denotes the particle's position, $q$ is the electric charge, $B = \nabla \times A$ denotes the magnetic field, and $E = -\nabla U$ is the electric field, with $A$ and $U$ being the potentials. Let $v = \dot{x}$ be the velocity of the charged particle; then the governing equations of the particle's motion can be expressed as the six-dimensional Poisson system
(8) $\dot{y} = B(y)\nabla H(y), \quad y = (x, v),$
where the structure matrix $B(y)$ is determined by $m$, $q$ and the magnetic field, and $H$ is the total energy. Here we test the dynamics with fixed values of the mass and the charge, and with specific choices of the potentials $A$ and $U$.
The initial state is chosen such that the system degenerates into four-dimensional dynamics, i.e., the motion of the particle is always on a plane, and we study the dimension-reduced system. We then generate a trajectory of 1500 training points followed by 300 test points with a fixed time step. Subsequently, a volume-preserving PNN (VP-PNN) is trained on the training set, and we perform predictions for 2000 steps starting at the end point of the training set, as shown in Fig. 4. It can be seen that the VP-PNN predicts the trajectory without deviation. Furthermore, we also train a non-volume-preserving PNN (NVP-PNN) and a volume-preserving neural network (VPNN) for comparison. After sufficient training, the three models make predictions starting at the initial state to reconstruct the trajectories, as shown in Fig. 5. As one can see, the VP-PNN performs slightly better than the NVP-PNN, while the NVP-PNN is much better than the VPNN. The quantitative results shown in Table I also support this observation. Although the VP-PNN has larger training MSE and one-step test MSE than the NVP-PNN, its long-time test MSE is smaller, which is not surprising because NVP-PNNs and VPNNs possess the prior information of symplectic structure and volume preservation, respectively, while VP-PNNs have both. Note that the considered dimension-reduced system of (8) is source-free, hence its phase flow is intrinsically volume-preserving on the four-dimensional space.
Table I: Training MSE, one-step test MSE, and long-time test MSE for the VP-PNN, NVP-PNN, and VPNN.
IV-D Nonlinear Schrödinger equation
We consider the nonlinear Schrödinger equation
$$\mathrm{i}\,\frac{\partial \psi}{\partial t} = -\frac{\partial^{2} \psi}{\partial x^{2}} - 2\,|\psi|^{2}\psi,$$
where $\psi(x, t)$ is a complex field and the boundary condition is periodic, i.e., $\psi(x + L, t) = \psi(x, t)$ for all $x$. An interesting space discretization of the nonlinear Schrödinger equation is the Ablowitz–Ladik model
$$\mathrm{i}\,\dot{u}_k = -\frac{1}{\Delta x^{2}}\left(u_{k+1} - 2u_k + u_{k-1}\right) - |u_k|^{2}\left(u_{k+1} + u_{k-1}\right),$$
with $u_k(t) \approx \psi(k\Delta x, t)$, $k = 1, \dots, N$. Letting $u_k = p_k + \mathrm{i}\, q_k$, we obtain a real system of dimension $2N$. With $y = (p, q)$, this system can be written as
(9) $\dot{y} = B(y)\nabla H(y)$,
where $B(y)$ is built from the diagonal matrix with entries $1 + \Delta x^{2}\left(p_k^{2} + q_k^{2}\right)$, and $H$ is the corresponding discrete Hamiltonian. We thus get a Poisson system. In the experiment, we impose the periodic boundary condition and set $N = 20$, hence (9) is a Poisson system of dimension 40. We then generate 500 training points followed by 100 test points with a fixed time step. That is, the solution of this equation over the training interval is treated as the training set; we then learn the data using a PNN and predict the solution over the subsequent test interval. The result is shown in Fig. 6: both the real and imaginary parts match the ground truth well.
IV-E Pixel observations of the two-body problem
We consider pixel observations of the two-body problem, i.e., images of two balls on a black background, as shown in Fig. 7. The time series of images forms a movie of the motion of two balls governed by gravitation. Here we intend to learn the phase flow on a coarse time grid while making predictions on a finer time grid, to forecast and smooth the movie simultaneously. To achieve this goal, a simple recurrent training scheme is applied to our method. Similar treatments can be found in [1].
Suppose the training dataset is sampled on a coarse time grid; our goal is to make predictions on a grid refined by a factor $s$. Denote the PNN to be trained by $\Phi$. We then train $\Phi$
to approximate the flow over the refined time step, and $s$ is set to 2 in this case. Since we are learning a single trajectory on a submanifold of a high-dimensional space, autoencoders are used to approximate the coordinate transformation. The hyperparameter $\lambda$ in the corresponding loss function is chosen to be 1. After training, $\Phi$ is used to generate predictions on the fine grid.
A single trajectory of the system is generated with a fixed time step, as shown in Fig. 7. The training dataset contains images of a fixed size. We compare the ground truth with the predictions made by the PNN both at the grid points and at the middle points of the grid intervals: the test loss on grids is calculated as the mean squared error at the grid points, while the test loss on middle points is the mean squared error at the middle points. We use a definition similar to that in [35] to compute the valid prediction time. Given the ground truth $x_n$ and the prediction $\hat{x}_n$, starting from $n = 0$, let the root mean square error (RMSE) be
$$\mathrm{RMSE}(n) = \sqrt{\left\langle \left(\hat{x}_n - x_n\right)^{2} \right\rangle},$$
where $\langle \cdot \rangle$ stands for spatial average. The valid prediction time (VPT) is defined to be
$$\mathrm{VPT} = \min\{\, n : \mathrm{RMSE}(n) > \epsilon \,\},$$
where $\epsilon$ is a hyperparameter to be chosen.
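A sketch of the VPT computation on synthetic data follows; the normalization of the RMSE by the mean square of the ground truth is an assumed choice adapted from the definition above, and the sine-wave data are purely illustrative.

```python
import numpy as np

def vpt(truth, pred, dt, eps):
    # valid prediction time: first time the normalized RMSE exceeds eps
    # (normalization by the mean square of the truth is an assumption)
    norm = np.mean(truth ** 2)
    for n in range(truth.shape[0]):
        rmse = np.sqrt(np.mean((truth[n] - pred[n]) ** 2) / norm)
        if rmse > eps:
            return n * dt
    return truth.shape[0] * dt    # never exceeded within the horizon

t = np.linspace(0.0, 10.0, 1001)
truth = np.sin(t)[:, None]
pred = truth + 0.08 * t[:, None]   # prediction error growing linearly in time
t_valid = vpt(truth, pred, dt=0.01, eps=0.5)
```

With a linearly growing error the normalized RMSE crosses the threshold part-way through the horizon, so `t_valid` lands strictly between 0 and the full interval length.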
Low error is obtained both at the grid points and at the middle points, as shown in Table II, which indicates that PNNs can handle prediction and interpolation simultaneously. The VPT is much longer than the time scale of the training window, further suggesting that PNNs are good at long-time prediction and intrinsically structure-preserving. It can be seen in Fig. 7 that the prediction matches the ground truth even long after the training window.
Note that, according to Theorem 4, a periodic solution of an autonomous dynamical system can always be learned by PNNs when its orbit is unknotted. Since this trajectory of the two-body system is periodic and the image at step $n+1$ is uniquely determined by the image at step $n$, we can assume without loss of generality that the pixel observations of the two-body system form a periodic solution of an autonomous dynamical system, regardless of whether the internal mechanism is Hamiltonian.
Table II: Training MSE, test MSE at grid points, test MSE at middle points, and VPT (6308).
V Summary
The main contribution of this paper is a novel high-level network architecture, the PNN, to learn the phase flow of an arbitrary Poisson system. Since a single periodic solution of an autonomous system can be proven to be a solution of a Poisson system if the orbit is unknotted, PNNs can be directly applied to a much broader class of systems without modification. From this perspective, theoretical results regarding the approximation ability of PNNs are presented. Several simulations, including the Lotka–Volterra equation, an extended pendulum system, a charged particle in an electromagnetic potential, a nonlinear Schrödinger equation and a trajectory of the two-body problem, support our theoretical findings and illustrate the advantages of PNNs for long-time prediction and frame interpolation. Even though not explicitly mentioned in the paper, PNNs can easily be extended to learn phase flows from irregularly sampled data. Interested readers may refer to [19] for more details. PNNs can also learn Hamiltonian systems on low-dimensional submanifolds, or constrained Hamiltonian systems, which can be expressed as Poisson systems in local coordinates [17, Chapter VII.1].
Despite the great expressivity, stability and interpretability of PNNs, an open issue is whether one can use them to infer and make predictions on multiple trajectories of an autonomous system without a generalized Poisson structure under certain circumstances. We conjectured in the paper that a solution of an arbitrary autonomous system lying on a smooth trivial knot matches a solution of a Poisson system. If this holds, the use of PNNs on multiple trajectories of a general autonomous system would be theoretically justified. We leave the proof, or counterexamples, of this conjecture as future work.
Appendix A Introduction to knots
Consider the embeddings of $S^{k}$ in $\mathbb{R}^{n}$, $k < n$. Two embeddings $f_1, f_2$ are equivalent if there is a homeomorphism $h$ of $\mathbb{R}^{n}$ such that $h \circ f_1 = f_2$. An embedding is unknotted if it is equivalent to the trivial knot defined by the standard embedding of $S^{k}$ in $\mathbb{R}^{n}$. In fact, the embedding is always unknotted when $n \ge k + 3$. Therefore a $k$-knot is usually defined as an embedding of $S^{k}$ in $\mathbb{R}^{k+2}$. For convenience, let us make a little change in the terminology: here a $k$-knot means an embedding of $S^{k}$ in $\mathbb{R}^{n}$ for $n \ge k + 2$, since we are more concerned with the trivial knot in this work.
Appendix B Proofs for theorems
B-A Proof of Theorem 3
In two-dimensional space, consider the system
with boundary condition
where $\Gamma$ is the orbit of $y(t)$ and $f$ is the given vector field. [22, p. 60, Theorem 1] shows that there exists a solution to the system above given a suitable single-valued function and any continuous boundary data satisfying
a compatibility condition involving the exterior normal with respect to the domain inside $\Gamma$. Choosing the data accordingly, we have
that there exists a vector field $w$ agreeing with $f$ on $\Gamma$.
Note that $w$ is a solenoidal vector field, which leads to the existence of a stream function on the simply connected set $U$.
The $H$ defined above is exactly the Hamiltonian we are looking for.
B-B Proof of Theorem 4
Since the orbit of $y(t)$ is unknotted, there exist a periodic planar solution $z(t)$ and a homeomorphism of the ambient space deforming the orbit of $y(t)$ into the orbit of $z(t)$; this is exactly the required coordinate transformation. Theorem 3 shows that $z(t)$ is also a solution of a Hamiltonian system; hence the extended solution satisfies the Poisson system
(10) $\dot{z} = B_0 \nabla H(z)$
for a single-valued function $H$, where $B_0$ is the constant block matrix of Corollary 1. In fact, a Poisson system expressed in arbitrary new coordinates immediately becomes a new Poisson system with a new $B$ whose rank equals that of the original one [17, p. 265]. Therefore, $y(t)$ is also a solution of a Poisson system, obtained by expressing system (10) in the original coordinates via the transformation. Furthermore, the two systems share the same latent dimension of 2.
Appendix C Implementation of the architecture
Problem  Section IV-A  Section IV-B  Section IV-C  Section IV-D  Section IV-E  
PNN  SympNet  PNN  VPNN  
Type  NVP    NVP  VP/NVP  VP  NVP  AE  
Partition  1    2  2  2  20    
Layers  3    3  10  20  3  2  
Sublayers  2    2  3  3  2    
Width  30    30  50  50  100  50  
Type  G  G  E  G    G  LA  
Layers  3  6  3  10    10  3  
Sublayers              2  
Width  30  30  30  50    100   
C-A Extended symplectic neural networks
We adopt SympNets [19] as the universal approximators for symplectic maps. The architecture of SympNets is based on three modules, i.e.,

Linear modules.
$$\mathcal{L}^{k}\begin{pmatrix} p \\ q \end{pmatrix} = \begin{pmatrix} I & 0 \\ S^{k} & I \end{pmatrix} \cdots \begin{pmatrix} I & S^{2} \\ 0 & I \end{pmatrix} \begin{pmatrix} I & 0 \\ S^{1} & I \end{pmatrix} \begin{pmatrix} p \\ q \end{pmatrix} + b,$$
where the $S^{i}$ are symmetric matrices and $b$ is the bias, while the unit upper triangular symplectic matrices and the unit lower triangular symplectic matrices appear alternately. In this module, the $S^{i}$ (each represented in practice by symmetrizing an unconstrained parameter matrix) and $b$ are the parameters to learn. In fact, $\mathcal{L}^{k}$ can represent any linear symplectic map for $k$ sufficiently large [18].

Activation modules.
$$\mathcal{N}_{\mathrm{up}}\begin{pmatrix} p \\ q \end{pmatrix} = \begin{pmatrix} p + a \odot \sigma(q) \\ q \end{pmatrix}, \qquad \mathcal{N}_{\mathrm{low}}\begin{pmatrix} p \\ q \end{pmatrix} = \begin{pmatrix} p \\ q + a \odot \sigma(p) \end{pmatrix},$$
where $p, q \in \mathbb{R}^{d}$. Here $\odot$ is the element-wise product, $\sigma$ is the activation function, and $a \in \mathbb{R}^{d}$ is the parameter to learn.

Gradient modules.
$$\mathcal{G}_{\mathrm{up}}\begin{pmatrix} p \\ q \end{pmatrix} = \begin{pmatrix} p + K^{T}\left(a \odot \sigma(Kq + b)\right) \\ q \end{pmatrix}, \qquad \mathcal{G}_{\mathrm{low}}\begin{pmatrix} p \\ q \end{pmatrix} = \begin{pmatrix} p \\ q + K^{T}\left(a \odot \sigma(Kp + b)\right) \end{pmatrix},$$
where $p, q \in \mathbb{R}^{d}$. Here $K \in \mathbb{R}^{l \times d}$ and $a, b \in \mathbb{R}^{l}$ are the parameters to learn, and $l$ is a positive integer regarded as the width of the module.
The SympNets are compositions of the above three modules. In particular, we use two classes of SympNets: the LA-SympNets, composed of linear and activation modules, and the G-SympNets, composed of gradient modules. Notice that both LA- and G-SympNets are universal approximators for symplectic maps, as shown in [19]. For convenience, we clarify the terminology for describing a detailed LA-(G-)SympNet: an LA-SympNet of $k$ layers with $m$ sublayers means the composition of $k$ linear and activation modules, where linear and activation modules appear alternately like the layers of a fully-connected neural network, and each linear module is composed of $m$ alternated triangular symplectic matrices; a G-SympNet of $k$ layers with width $l$ means it is composed of $k$ alternated gradient modules, with the width defined as above.
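A G-SympNet building block can be checked for symplecticity numerically. The sketch below implements an "up" gradient module with random parameters and verifies the condition of Definition 4 by finite differences; the sizes and the tanh activation are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)
d, width = 2, 10                    # half-dimension and module width
K = rng.normal(size=(width, d))
a = rng.normal(size=width)
b = rng.normal(size=width)

def grad_module_up(y):
    # "up" gradient module: p <- p + K^T (a ⊙ tanh(K q + b)), q unchanged
    p, q = y[:d], y[d:]
    return np.concatenate([p + K.T @ (a * np.tanh(K @ q + b)), q])

def jacobian(phi, y, eps=1e-5):
    # central finite differences, column by column
    n = len(y)
    A = np.zeros((n, n))
    for k in range(n):
        e = np.zeros(n); e[k] = eps
        A[:, k] = (phi(y + e) - phi(y - e)) / (2 * eps)
    return A

Jm = np.block([[np.zeros((d, d)), np.eye(d)],
               [-np.eye(d), np.zeros((d, d))]])
y = rng.normal(size=2 * d)
A = jacobian(grad_module_up, y)
sympl_err = np.max(np.abs(A.T @ Jm @ A - Jm))
```

The module is symplectic for any parameter values because its $p$-update is the gradient of a scalar function of $q$, so the Jacobian is a unit triangular block matrix with a symmetric off-diagonal block.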
In this work we further develop the extended symplectic neural networks by extending the gradient modules.

Extended modules.
$$\mathcal{E}_{\mathrm{up}}\begin{pmatrix} p \\ q \\ c \end{pmatrix} = \begin{pmatrix} p + K_1^{T}\left(a \odot \sigma(K_1 q + K_2 c + b)\right) \\ q \\ c \end{pmatrix},$$
together with an analogous "low" version acting on $q$, where $p, q \in \mathbb{R}^{d}$ and $c \in \mathbb{R}^{n-2d}$. Here $K_1 \in \mathbb{R}^{l \times d}$, $K_2 \in \mathbb{R}^{l \times (n-2d)}$, and $a, b \in \mathbb{R}^{l}$ are the parameters to learn, and $l$ is a positive integer regarded as the width of the module.
The extended symplectic neural networks (E-SympNets) are compositions of extended modules. As noted, $2d$ is the latent dimension of the E-SympNets. The terminology for describing the architecture of E-SympNets is the same as that for G-SympNets, hence we do not repeat it here. It is worth mentioning that the approximation property of E-SympNets is still unknown, which is left as future work.