Quantum-Hamiltonians
Accompanying code for the paper "Learning Potentials of Quantum Systems using Deep Neural Networks" by Sehanobish et al.
Machine learning has wide applications in a broad range of subjects, including physics. Recent works have shown that neural networks can learn classical Hamiltonian mechanics. The results of these works motivate the following question: can we endow neural networks with inductive biases coming from quantum mechanics and provide insights into quantum phenomena? In this work, we address this question by investigating possible approximations for reconstructing the Hamiltonian of a quantum system given one of its wave–functions. Instead of handcrafting the Hamiltonian and a solution of the Schrödinger equation, we design neural networks that aim to learn the Hamiltonian directly from our observations. We show that our method, termed Quantum Potential Neural Networks (QPNN), can learn potentials in an unsupervised manner with remarkable accuracy for a wide range of quantum systems, such as the quantum harmonic oscillator, a particle in a box perturbed by an external potential, the hydrogen atom, the Pöschl–Teller potential, and a solitary wave system. Furthermore, in the case of a particle perturbed by an external force, we also learn the perturbed wave–function in a joint, end-to-end manner.
Neural Networks (NNs) are universal function approximators, and as such they are remarkably good at learning and generalizing from data. They are widely used in tasks such as Natural Language Processing Torfi et al. (2020), Image Classification Kolesnikov et al. (2019), and Video Captioning Sun et al. (2019); Du and Narasimhan (2019); Higgins et al. (2016). Recent works have shown the capabilities of neural networks in symbolic reasoning and mathematical problem solving Lample and Charton (2019). Naturally, one may wonder whether or not NNs may be able to learn physics. In this respect, several works have been reported Toth et al. (2019); Greydanus et al. (2019); Cranmer et al. (2020); Tong et al. (2020) in which different authors have used Hamilton's equations of motion to generate trajectories that obey energy conservation and the laws of classical physics. The encouraging results presented in these papers motivate the use of neural networks as powerful tools to gain insight into the laws of physics that govern the behavior of complicated natural phenomena.

Unlike classical physics, in quantum physics objects have characteristics of both particles and waves (wave–particle duality), for which the concept of trajectory is no longer defined, nor can their position and momentum both be measured simultaneously J. and D. (1995); Robinett (1997); Feynman et al. (1965); Robinett and Robinett (2006). Quantum phenomena may be described by the wave–function obtained from solving the Schrödinger equation J. and D. (1995); Robinett (1997); Feynman et al. (1965); Robinett and Robinett (2006). However, in many cases not only may solving this equation be difficult, but the correct construction of this equation also requires knowledge about the form of the potential operator, a function that contains all the physical effects and constraints relevant for each particular quantum phenomenon, which in many cases are vast and not completely known.
The inverse Schrödinger equation Nakatsuji (2002); Chadan and Sabatier (2012); Zakhariev and Suzko (2012); Jensen and Wasserman (2018) presents an alternative for describing quantum phenomena by reformulating the description of quantum mechanical systems as solutions of an inverse problem Aster et al. (2018); Groetsch and Groetsch (1993); Vogel (2002), i.e., from observations (quantum observables), identify the causal factors (physical laws and events) that generated the observed outcomes.
In this work, instead of handcrafting potential functions to describe quantum phenomena as solutions of the inverse Schrödinger equation, we design neural networks called Quantum Potential Neural Networks (QPNN) that aim to learn them directly from our observations. This method was developed based on the underlying formalism for the inverse solution of the Schrödinger equation. Our work opens the possibility of generating simpler, more succinct functions that can be used as effective Hamiltonians for the description of quantum systems using only some of the available information known for the system. These effective Hamiltonians can be generalized to obtain other observables and may provide useful predictions for complicated physical phenomena.
The behaviour of matter in the quantum realm often seems peculiar, and its consequences are difficult to understand. Quantum mechanical concepts frequently conflict with common–sense notions derived from the laws of classical physics. Contrary to classical physics, in quantum mechanics the result of an experiment always takes the form of a probability distribution for each possible set of outcomes. Routinely, comparisons between theory and experiment involve inferring probability distributions from many repeated experiments and their measured observables.
The mathematical description of a quantum system typically takes the form of complex functions of spatial and time coordinates called wave–functions J. and D. (1995); Robinett (1997); Feynman et al. (1965); Robinett and Robinett (2006). There is a lot of debate about what, exactly, a wave–function represents: a real physical object, or just a mathematical expression of our knowledge (or lack thereof) regarding the underlying state of a particular quantum experiment. In either case, the probability of finding an outcome is not given directly by the wave–function $\Psi$ but by the probability density $|\Psi|^2$ and the expectation values of the observables. In many scenarios, wave–functions are obtained as solutions of the time–dependent Schrödinger equation,

$$i\hbar\,\frac{\partial\Psi(x,t)}{\partial t} = \hat{H}\,\Psi(x,t) \qquad (1)$$
where $\hbar$ is the reduced Planck constant, $x$ the position coordinate, $t$ the time coordinate, and $\hat{H}$ is the Hamiltonian for the system. The Hamiltonian in this case is a Hermitian operator acting on an infinite dimensional space of functions. Thus, $\hat{H}$ need not be compact and, as such, may not have any eigenvalues. When $\hat{H}$ is time–independent, equation 1 may be reduced to the Schrödinger equation for stationary states, or the time–independent Schrödinger equation,

$$\hat{H}\,\psi(x) = E\,\psi(x) \qquad (2)$$
For many cases, the Schrödinger equation dictates the evolution of the wave–function and the physical information contained within the system under study.
The Hamiltonian operator ($\hat{H}$) is fundamental in many formulations of quantum theory. This operator is expressed as the sum of the kinetic ($\hat{T}$) and potential ($\hat{V}$) energy operators for all particles in the quantum system,

$$\hat{H} = \hat{T} + \hat{V} \qquad (3)$$
All the physical laws that govern the behavior of the system under any physical variation are contained in $\hat{H}$. However, finding the appropriate or complete form of $\hat{H}$ and solving the Schrödinger equation for general physical systems are not trivial tasks. Generally, the kinetic energy operator contained in $\hat{H}$ depends only on the second derivatives of the wave–function with respect to its spatial coordinates,

$$\hat{T} = -\frac{\hbar^2}{2m}\,\nabla^2 \qquad (4)$$

whereas the potential energy operator depends on the physical circumstances imposed onto the system, and varies from system to system. Thus, the problem of finding the $\hat{H}$ that characterizes a given phenomenon can be reduced to formulating the potential operator that contains all the physical descriptors of the events causing the phenomenon.
Another formulation of quantum dynamics may be given by the Wigner function Curtright et al. (1998); Chen et al. (2019). The Wigner function, $W(x,p,t)$, is a phase space distribution function whose marginals behave like position and momentum distribution functions B. (2008). Unlike wave–functions, Wigner functions are real valued and bounded. However, contrary to probability distributions, $W$ can take negative values. Thus, the Wigner distribution is termed a quasi–probability distribution, and so in a sense loses some of its classical appeal. Using the Schrödinger equation (equation 1) and a Taylor expansion, the time evolution of the Wigner function is given by an infinite order partial differential equation called the Wigner–Moyal equation B. (2008),

$$\frac{\partial W}{\partial t} = -\frac{p}{m}\,\frac{\partial W}{\partial x} + \sum_{s=0}^{\infty} \frac{(-1)^s\,(\hbar/2)^{2s}}{(2s+1)!}\,\frac{\partial^{2s+1} V}{\partial x^{2s+1}}\,\frac{\partial^{2s+1} W}{\partial p^{2s+1}} \qquad (5)$$
The usual method for describing systems in quantum mechanics is to obtain the wave–function of the system as a solution of the Schrödinger equation. Thus, wave–functions strongly depend on the Hamiltonian, and in particular on the definition of the potential used to describe the system. However, one could also describe a quantum phenomenon through the solution of the inverse problem, i.e., finding an effective potential or function that contains all the important physical constraints that generated the observed outcomes. Inverse problems like this one are common in quantum mechanics; for example, Density Functional Theory (DFT) Jensen and Wasserman (2018); Parr and Yang (1995); Burke et al. (2005) has, at its core, this type of inverse problem. Furthermore, a great amount of what is known about the structure of matter has come from solving scattering problems, which are mathematically described as inverse problems Zakhariev and Suzko (2012); Jensen and Wasserman (2018); Vogel (2002). Finding the Hamiltonian operator that generated a given wave–function, or rather an effective potential that can generate the wave–function for a quantum system, may be related to the famous question posed by the mathematician Mark Kac Kac (1966): "Can one hear the shape of a drum?" In the case of sound waves, the answer to this question is no in all cases except the trivial one, where the shape of a string is determined by its length Gordon et al. (1992); Beals and Greiner (2009). In the case of quantum waves, despite the fact that one can get a lot of geometrical and topological information from the spectrum or even its asymptotic behavior, this information is not complete even for quantum systems as simple as the ones defined along a finite interval. The Hamiltonian used to define a given wave–function cannot be reconstructed from that single wave–function, for the same reasons a single vector cannot be used to reconstruct the whole Hilbert space. One may easily visualize the challenges involved in finding a potential from a single wave–function by rephrasing Kac's question as follows: can different drum shapes make the same sound? The answer to this question is yes. This is what makes the problem of finding the potential of a quantum system by inverting the wave–function challenging: different potentials may be found for the same wave–function unless we have prior knowledge of the system and can impose certain constraints.
In this work, we propose to learn a new parametric function $V_\theta$, which corresponds to the effective potential that describes the quantum system. We achieve this by implementing a loss that enforces the Schrödinger equation. For time–independent systems, this loss function reads,

$$\mathcal{L}_{TI} = \left\lVert D\!\left(-\frac{1}{2}\,\frac{\nabla^2\psi(x)}{\psi(x)} + V_\theta(x)\right)\right\rVert_F \qquad (6)$$

where $D$ is the total derivative of a multivariate function and $\lVert\cdot\rVert_F$ is the Frobenius norm. Thus, energy conservation is effectively demanded for time–independent systems. On the other hand, for time–dependent systems, our formulation of the time–dependent Schrödinger loss reads,

$$\mathcal{L}_{TD} = \left\lVert i\,\frac{\partial\Psi(x,t)}{\partial t} + \frac{1}{2}\,\nabla^2\Psi(x,t) - V_\theta(x)\,\Psi(x,t)\right\rVert_F \qquad (7)$$
In the case of the Wigner function, our neural network was trained by implementing a truncated Wigner–Moyal loss,

$$\mathcal{L}_{W} = \left\lVert \frac{\partial W}{\partial t} + \frac{p}{m}\,\frac{\partial W}{\partial x} - \sum_{s=0}^{N} \frac{(-1)^s\,(\hbar/2)^{2s}}{(2s+1)!}\,\frac{\partial^{2s+1} V_\theta}{\partial x^{2s+1}}\,\frac{\partial^{2s+1} W}{\partial p^{2s+1}} \right\rVert_F \qquad (8)$$

where the series is truncated at a fixed order $N$ for all our experiments. The case $N=0$ is known as the Liouville equation. However, we note that equation 6 and equation 8 determine $V_\theta$ only up to a constant. Thus, an initial condition depending on each individual system was added.
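As a concrete illustration, the time–independent loss of equation 6 can be sketched in PyTorch. This is a minimal sketch, not the paper's released code; the names `V_net` and `psi` are our own placeholders.

```python
import torch

def ti_schrodinger_loss(V_net, psi, x):
    """Sketch of the time-independent Schrodinger loss (equation 6).

    With hbar = m = 1, the local energy
        E(x) = -0.5 * psi''(x) / psi(x) + V(x)
    must be constant for an eigenstate, so we penalize the norm of
    its derivative with respect to x.
    """
    x = x.clone().requires_grad_(True)
    p = psi(x)
    # first and second derivatives of psi via autograd
    dp = torch.autograd.grad(p.sum(), x, create_graph=True)[0]
    d2p = torch.autograd.grad(dp.sum(), x, create_graph=True)[0]
    local_energy = -0.5 * d2p / p + V_net(x)
    # derivative of the (ideally constant) local energy
    d_energy = torch.autograd.grad(local_energy.sum(), x, create_graph=True)[0]
    return d_energy.norm()
```

For the harmonic oscillator ground state $\psi(x) = e^{-x^2/2}$, the loss vanishes at the true potential $V(x) = x^2/2$, since the local energy is then the constant $1/2$.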
In the formulation of all our neural networks, information about the explicit form of the wave–function is always considered. Therefore, the kinetic energy for each system is always computed in an exact manner. As a consequence, the energy of the system is learned at no additional computational cost.
The use of deep learning for understanding physical phenomena has been an active field of development. In particular, the amount of literature in which authors have endowed neural networks with classical Hamiltonian mechanics has increased considerably Toth et al. (2019); Greydanus et al. (2019); Tong et al. (2020); Iten et al. (2020); Bondesan and Lamacraft (2019); Zhong et al. (2019); Chmiela et al. (2017). Conservation of energy and irreversibility in time are the key features of such networks. There are recent reports extending these results to damped pendula, i.e., systems where there is dissipation of energy Zhong et al. (2020). However, the application of deep learning to quantum mechanics is still in its early stages Torfi et al. (2020); Raissi et al. (2017a, b, 2019); Dai et al. (2020); Carleo et al. (2019); Amabilino et al. (2019); Unke and Meuwly (2019); Schmitz et al. (2019); Schmidt et al. (2017); Hibat-Allah et al. (2020). Most of these works focus on either solving the Schrödinger equation or predicting the trends of specific observables such as the energy of the system. In Cranmer et al. (2019), two methods for estimating the density matrix of a quantum system are introduced: Quantum Maximum Likelihood (QML) and Quantum Variational Inference (QVI). The authors of that work used a flow based method Toth et al. (2019); Rezende and Mohamed (2015) to increase the expressivity of their variational family of density matrices. However, they only validate their work on the harmonic and anharmonic quantum oscillators. To the best of our knowledge, there is no work that uses deep learning to solve inverse problems, i.e., to systematically estimate potentials from observations. Our method also shows that neural networks can be used to tackle (often difficult) higher order PDEs, which are commonly estimated with numerical methods.

The performance of our proposed Quantum Potential Neural Network is validated on seven different quantum systems, four of which have exact analytical solutions for the time–independent Schrödinger equation, see Table 1. For all the systems, our neural network is a 3–layer feedforward network with a residual connection between the first and the second layers and a Tanh non–linearity in the first 2 layers. However, the non–linearity in the last layer varied across the systems. The networks were trained with the Adam optimizer Kingma and Ba (2014) and a fixed learning rate. The quantitative analysis of our results is reported in Table 2. More information about the training and implementation of the neural networks can be found in the supplementary material. Additional figures of the wave–functions, probability distributions, and Wigner functions can also be found in the supplementary material. Our code is available at https://github.com/arijitthegame/Quantum-Hamiltonians.

System | Potential | wave–function | Energy
---|---|---|---
Harmonic Oscillator | $\tfrac{1}{2}x^2$ | $\psi_n(x) \propto H_n(x)\,e^{-x^2/2}$ | $E_n = n + \tfrac{1}{2}$
Pöschl–Teller potential | $-\tfrac{\lambda(\lambda+1)}{2}\,\mathrm{sech}^2 x$ | Legendre functions $P_\lambda^\mu(\tanh x)$ | $E_\mu = -\tfrac{\mu^2}{2}$
Radial Hydrogen atom | $-\tfrac{1}{r}$ | $\psi_{2p}(r) \propto r\,e^{-r/2}$ | $E_2 = -\tfrac{1}{8}$
2D Harmonic Oscillator | $\tfrac{1}{2}(x^2+y^2)$ | $\psi_{n,m} \propto H_n(x)H_m(y)\,e^{-(x^2+y^2)/2}$ | $E_{n,m} = n+m+1$
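A minimal PyTorch sketch of the architecture just described; the hidden width of 64 and the identity default for the final activation are our assumptions, since the paper varies the final activation per system.

```python
import torch
import torch.nn as nn

class QPNN(nn.Module):
    """3-layer feedforward potential network with a residual
    connection between the first and second layers and Tanh
    non-linearities in the first two layers."""
    def __init__(self, in_dim=1, hidden=64, final_act=None):
        super().__init__()
        self.fc1 = nn.Linear(in_dim, hidden)
        self.fc2 = nn.Linear(hidden, hidden)
        self.fc3 = nn.Linear(hidden, 1)
        # final activation varies per system; identity by default
        self.final_act = final_act if final_act is not None else nn.Identity()

    def forward(self, x):
        h1 = torch.tanh(self.fc1(x))
        # residual connection between the first and second layers
        h2 = torch.tanh(self.fc2(h1)) + h1
        return self.final_act(self.fc3(h2))
```

A typical setup would then be `model = QPNN()` with `torch.optim.Adam(model.parameters(), lr=1e-3)`; passing `final_act=nn.Sigmoid()` reproduces the bounded-output variants listed in Table 3.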
All computational implementations were written in Python and PyTorch. Derivatives were computed using the PyTorch autograd function. However, in certain cases, higher derivatives were approximated by forward and backward differences.
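For instance, a third derivative can be approximated with a central difference stencil; the stencil and step size below are illustrative choices, not the paper's exact scheme.

```python
import torch

def third_derivative_fd(f, x, h=1e-2):
    """Central finite-difference estimate of f'''(x) with O(h^2) error.

    A fallback of this kind is useful when the higher-order autograd
    graph becomes costly; h trades truncation against round-off error.
    """
    return (f(x + 2 * h) - 2 * f(x + h) + 2 * f(x - h) - f(x - 2 * h)) / (2 * h ** 3)
```

Because the stencil divides by $2h^3$, double precision inputs are advisable to limit cancellation error.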
In this section we consider some simple one–dimensional time–independent systems. Wave–functions, potential energies and energy levels can be found in Table 1. We report our learned potentials in figure 1 and show that our models obey energy conservation laws in figure 2. For more details about the experiments and figures of the wave–functions, please refer to the supplementary material. For the derivation of these wave–functions and general properties of these systems, please see J. and D. (1995). For simplicity, $\hbar$, $m$ and $\omega$ were set equal to $1$.
Quantum Harmonic Oscillator: The wave–functions in this case are given by Hermite polynomials $H_n(x)$. One of these wave–functions is chosen as the input to our model. Since $V_\theta$ is determined by a differential equation (equation 6), one needs to impose an initial condition to get a unique solution. However, constraining the output of the network removes the need for the initial condition. Figure 1 and figure 2 (left) show the learned potential and energy of the system.
The Hydrogen Atom (2p case): The general radial wave–functions are given by generalized Laguerre polynomials, which in this case simplify to a function proportional to $r\,e^{-r/2}$. This wave–function was used as the input to our model, together with an initial condition and an auxiliary loss function built from equation 6. Figure 1 and figure 2 (middle) show the learned potential and energy of the system.
Pöschl–Teller potential: The wave–functions generated by this potential are defined by Legendre functions. For simplicity, the potential parameter was fixed. One of these wave–functions was chosen as the input to our model, an initial condition was imposed, and a similar auxiliary loss function as above was used. Figure 1 and figure 2 (right) show the learned potential and energy of the system.
Now, we turn our attention to a quantum system where the Schrödinger equation cannot be solved exactly, but can be formulated in an approximate manner using perturbation theory. In perturbation theory, the Hamiltonian of a system is defined as the sum of the unperturbed Hamiltonian ($\hat{H}_0$) and the perturbation ($\hat{H}'$), $\hat{H} = \hat{H}_0 + \lambda\hat{H}'$, whereas the wave–functions are expressed in terms of powers of $\lambda$. Here we use the wave–function corrected only up to first order for the particle in a box perturbed by an external potential; the explicit form of the perturbation is given in the supplementary material. For this system we were not only able to predict the potential, but we were also able to learn the perturbed distribution. Fig 3 shows our results on this system. Energy appears not to be conserved for this system, but this is merely an artifact of our approximations. For more details about the perturbed wave–function and the experiment, see the supplementary material.
Our work also scales easily and quickly to higher dimensions. The wave–function here is a product of two of the Hermite polynomial wave–functions defined above. This product is chosen as the input to our model, and the output is constrained as before. Thus our loss function is exactly as in the 1D Harmonic Oscillator case. Fig 4 shows our results for this system, and the middle figure shows that our learned energy is a good approximation to the total energy.
A solitary wave is a wave which propagates without any temporal evolution in shape or size when viewed in the reference frame moving with the group velocity of the wave Wazwaz (2009). Solitary waves arise in many contexts, including the elevation of the surface of water and the intensity of light in optical fibers, and are particularly important in Bose–Einstein condensation theory. A soliton is a nonlinear solitary wave with the additional property that it retains its permanent structure even after interacting with another soliton. Solitons form a special class of solutions of model equations, including the Korteweg–de Vries (KdV) and the Nonlinear Schrödinger (NLS) equations. In our particular case, the soliton satisfies the following differential equation:
$$i\,\frac{\partial\psi}{\partial t} + \frac{1}{2}\,\frac{\partial^2\psi}{\partial x^2} + |\psi|^2\,\psi = 0 \qquad (9)$$

and the loss function is given by equation 7. A standard single–soliton solution of this equation was chosen as the input to our model. Fig 4 (right) shows our results for this system.
Harmonic Oscillator: The Wigner functions for the harmonic oscillator have the following form B. (2008):

$$W_n(x,p) = \frac{(-1)^n}{\pi}\,L_n\!\left(2(x^2+p^2)\right)e^{-(x^2+p^2)}$$

where $L_n$ is the $n$th Laguerre polynomial. Since the potential is quadratic, all third and higher derivatives of $V$ vanish, and the Wigner–Moyal equation in this case degenerates to the classical Liouville equation. The Wigner function is the input to the model, an initial condition is imposed, and our loss function is given by equation 8. Fig 5 (left) shows the potential learned by the model.
Pöschl–Teller potential: The Wigner function in this case Chen et al. (2018) is given by:

$$W(x,p) = \frac{1}{\pi}\int_{-\infty}^{\infty} \psi^*(x+y)\,\psi(x-y)\,e^{2ipy}\,dy \qquad (10)$$

Using the mathematical properties of Wigner functions, we approximate the above integral by:

$$W(x,p) \approx \frac{2}{\pi}\int_{0}^{Y} \psi(x+y)\,\psi(x-y)\,\cos(2py)\,dy \qquad (11)$$

for a suitable truncation threshold $Y$. The potential is $-\tfrac{\lambda(\lambda+1)}{2}\,\mathrm{sech}^2 x$, which is infinitely differentiable. The Wigner function is chosen as the input to our model. In this experiment we attempt to approximate the infinite order PDE (equation 5) by equation 8. One then cannot assume that any non–steady state solution predicted by the truncated Wigner function is immediately valid, as it can be shown that higher order quantum corrections are responsible for quantum mechanical phase space behavior. The finite order truncation matches the potential only in a small neighborhood. Figure 5 summarizes some of our findings; for more details on approximation procedures and their challenges for the infinite order Wigner–Moyal PDE, see the supplementary material.
In most experiments, it is not possible to determine the actual wave–function, as one can only observe the probability distribution, which contains much less information than the wave–function itself (see the middle sub–figure in Figure 6). However, given that we know only the probability distribution, and not the wave–function, is it still possible to learn something about the potential of the system? We made some initial progress towards answering this question for the case of the quantum harmonic oscillator. For more details about this experiment, please see the supplementary material.
System | RMSE between True and Learned Potentials | RMSE between True and Learned Energies
---|---|---
Harmonic Oscillator | |
Pöschl–Teller potential | |
Radial Hydrogen atom | |
2D Harmonic Oscillator | |
Particle in a Box | |
Soliton | | -
Harmonic Oscillator from Wigner | | -
In this work, we presented a new class of neural networks called Quantum Potential Neural Networks. This new type of neural network is capable of learning the effective potential for a large variety of quantum systems using only data inferred from wave–functions or Wigner functions. To the best of our knowledge, this is the first attempt to systematically investigate solutions of inverse quantum problems using neural networks. Moreover, compared with other numerical techniques used in inverse quantum problems, our approximation requires neither prior information about the nature of the system nor information about the magnitudes of its expectation values. The encouraging results obtained for the different reported experiments motivate the further development of Quantum Potential Neural Networks for cases where the data is directly obtained from experimental probability distributions. One can also easily use our method and equation 6 to solve for wave–functions in the time–dependent Schrödinger equation, for which there are a limited number of non–trivial examples. Generally, one has to start with the Schrödinger equation and come up with a numerical approximation for a wave–function van Dijk et al. (2017). Similarly, there is no closed formula for a time–dependent Wigner function except in some trivial cases. Furthermore, the development of better approximations for the Wigner functions is a very active field of research in physics. In addition to quantum systems with approximate solutions, our methods can be extended to systems where energy is not conserved. Finally, we would like to point out that our models are easy to implement and easy to train, allowing for future explorations of more complex systems. We hope that this work will be beneficial to the broader community of physicists and will motivate mathematicians to use neural networks to approximate complicated higher order PDEs for which no exact solutions are known.
An important future direction of our work will be to learn the potential just from the observed probabilities and not from the full wave–function or the Wigner function. We will also systematically extend our work to solve higher order Wigner–Moyal equations, for which the interpretation is still not clear. We hope that our work will shed some light on interpreting them. Another potential extension of this work is to apply our method to plasma and high energy physical systems Gonoskov et al. (2019).
We envision this work to be beneficial to a broader community, since we hope it will encourage researchers to use deep learning when trying to solve various complicated differential equations. We are, however, limited by the curse of dimensionality, as it would be significantly more difficult to run these experiments on a CPU.
Quantum mechanics has been one of the most successful models for describing the physical world. However, quantum mechanical systems are generally hard to solve and exact solutions only exist for simple systems. As such, by leveraging the power of neural networks we have aimed to improve the practical use of quantum mechanics and thus potentially contribute to the understanding of our world. One caveat is that our method provides only an approximation for the description of quantum phenomena, and thus the possibility of incorrect predictions cannot be precluded.
The first author would like to thank Neal Ravindra, Emanuele Zappala and Olivier Trottier for helpful suggestions and interesting conversations.
We present a brief explanation of our time–independent Schrödinger loss function. The Hamiltonian $\hat{H}$ is the sum of the kinetic energy $\hat{T}$ and the potential energy $\hat{V}$. The kinetic energy is given by the Laplacian operator,
$$\hat{T}\,\psi = -\frac{\hbar^2}{2m}\,\nabla^2\psi \qquad (12)$$

For the time–independent case, the Schrödinger equation boils down to

$$\hat{H}\,\psi = E\,\psi \qquad (13)$$

where $E$ is the energy of the system. For simplicity, let $\hbar = m = 1$. Using equation 12, we can write

$$-\frac{1}{2}\,\nabla^2\psi + V\,\psi = E\,\psi \qquad (14)$$

Dividing the above equation by $\psi$ throughout, we get

$$-\frac{1}{2}\,\frac{\nabla^2\psi}{\psi} + V = E \qquad (15)$$

Since the energy is constant, the derivative of the left hand side of equation 15 is $0$, and the norm of this derivative is our time–independent Schrödinger loss function.
The Wigner function in this case [14] is given by:

$$W(x,p) = \frac{1}{\pi}\int_{-\infty}^{\infty} \psi^*(x+y)\,\psi(x-y)\,e^{2ipy}\,dy \qquad (16)$$

The Wigner function is a real–valued bounded function. Thus, by breaking the integral in equation 16 into real and imaginary parts, we only focus on the real part. Using Euler's formula, we get the following:

$$W(x,p) = \frac{1}{\pi}\int_{-\infty}^{\infty} \psi(x+y)\,\psi(x-y)\,\cos(2py)\,dy \qquad (17)$$

Note that the integral in equation 17 is invariant under the change of variable $y \to -y$. This implies that in order to calculate $W$, we only have to integrate from $0$ to $\infty$ and multiply that integral by $2$. Our final simplification comes from studying the decay properties of the Wigner functions: the integrand in equation 17 decays rapidly as $y \to \pm\infty$. We therefore picked a decay threshold to truncate the integral on the positive real axis to a bounded interval $[0, Y]$, which gives the following form:

$$W(x,p) \approx \frac{2}{\pi}\int_{0}^{Y} \psi(x+y)\,\psi(x-y)\,\cos(2py)\,dy \qquad (18)$$
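The truncated integral above can be evaluated with elementary quadrature. This sketch uses the trapezoidal rule; the cutoff `y_max` and grid size are our illustrative choices, not the paper's threshold.

```python
import numpy as np

def truncated_wigner_integral(integrand, y_max=8.0, n=2001):
    """Approximate 2 * integral_0^{y_max} of integrand(y) dy by the
    trapezoidal rule, exploiting that the integrand is even in y and
    decays rapidly beyond the chosen threshold y_max."""
    y = np.linspace(0.0, y_max, n)
    vals = integrand(y)
    h = y_max / (n - 1)
    # composite trapezoidal rule
    return 2.0 * h * (0.5 * vals[0] + vals[1:-1].sum() + 0.5 * vals[-1])
```

For a Gaussian integrand $e^{-y^2}$ this recovers $\int_{-\infty}^{\infty} e^{-y^2}\,dy = \sqrt{\pi}$ to high accuracy, since the tail beyond the cutoff is negligible.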
This integral is used in our experiments to approximate the potential. The Wigner method to study the time–frequency properties of dynamical systems involves taking the partial derivatives with respect to time of the Wigner function. These derivatives on the Wigner function yield what is known as the Wigner–Moyal equation. The physical interpretations, numerical difficulties and approximations of the Wigner–Moyal equation have been widely discussed in the literature, thus for information about the mathematical challenges associated with the Wigner–Moyal equation, we recommend readers to consult these references [14, 11, 24, 30, 19, 38, 3, 25].
A particle with no spin, of mass $m$, was placed in a one dimensional square box of length $L$. The particle was then subjected to an external perturbation $\hat{H}'$. The wave–function for the perturbed system was approximated by considering first order corrections to the unperturbed particle–in–a–box wave–function,

$$\psi_n \approx \psi_n^{(0)} + \lambda \sum_{k\neq n} \frac{H'_{kn}}{E_n^{(0)} - E_k^{(0)}}\,\psi_k^{(0)} \qquad (19)$$

where $\psi_k^{(0)}$ and $E_k^{(0)}$ are the unperturbed particle–in–a–box $k$th state wave–function and its energy, whereas $H'_{kn}$ indicates the following integral

$$H'_{kn} = \int_{0}^{L} \psi_k^{(0)}(x)\,\hat{H}'\,\psi_n^{(0)}(x)\,dx \qquad (20)$$

For our computations, the wave–function $\psi_n^{(0)}$, obtained as a solution of the Schrödinger equation for the particle in a box model, reads

$$\psi_n^{(0)}(x) = \sqrt{\frac{2}{L}}\,\sin\!\left(\frac{n\pi x}{L}\right) \qquad (21)$$

and the energy of the system is given by

$$E_n^{(0)} = \frac{n^2\pi^2\hbar^2}{2mL^2} \qquad (22)$$
In this experiment we learn not only the potential but also the perturbed wave–function. We use two neural networks: one to learn the potential and the other to learn the perturbed wave–function. The perturbed wave–function was learned in a supervised manner, whereas the potential was learned in an unsupervised manner. If $f_\phi$ is the neural network learning the perturbed wave–function $\psi$, then our auxiliary loss function becomes

$$\mathcal{L} = \lVert f_\phi - \psi \rVert^2 + \mathcal{L}_{TI} \qquad (23)$$

where $\mathcal{L}_{TI}$ is the time–independent Schrödinger loss given by equation 6 in the main text, used to learn the potential, and is calculated using the perturbed wave–function.
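A sketch of this auxiliary loss, with the supervised and unsupervised terms combined; the relative weight `alpha` is our assumption, since the paper does not state one.

```python
import torch

def joint_loss(psi_net, potential_loss, x, psi_target, alpha=1.0):
    """Supervised MSE on the perturbed wave-function plus the
    unsupervised time-independent Schrodinger loss used for the
    potential network."""
    supervised = torch.mean((psi_net(x) - psi_target) ** 2)
    return supervised + alpha * potential_loss(x)
```

Here `psi_net` plays the role of $f_\phi$ and `potential_loss` evaluates the Schrödinger loss of the potential network on the same batch.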
In this section, we revisit the inverse problem from the main text, where we only know the probability distribution $|\psi|^2$ and not the wave–function $\psi$. As is customary in quantum physics, we start with an approximate wave–function. At this point, physicists try to use their knowledge of the system to recreate the wave–function and then use the Schrödinger equation to find the potential. The figure (right) in the main text shows the potential learned by the system for the second excited state when the wave–function is taken to be such an approximation. Thus, the natural question is: can one do better? What other inductive biases can one use in our system? In the case of the harmonic oscillator, we know that the potential is an even function, so we can create a new auxiliary loss that enforces this symmetry on top of the time–independent Schrödinger loss of equation 6. Using this auxiliary loss, one can get an improvement over the model that only uses the time–independent Schrödinger loss. Even though we learned the potential with remarkable accuracy, we fall quite short of estimating the true energy of the system. However, without any additional optimization process or knowledge about the system, this is a very difficult problem.
Our neural network is a 3–layer feedforward network with a residual connection between the first and the second layers. The activation and the scaling in the final layer varied from experiment to experiment. Our main motivation for scaling and using different activations is to show that an appropriate architecture can learn the correct potential without an initial condition. All models were trained for a fixed number of epochs. Table 3 shows the activation, scaling and the size of the training data for each of the studied systems. All the training data was randomly sampled from the appropriate domains and trained in a minibatch fashion with a fixed batch size.
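The training procedure can be sketched as follows; the epoch count, batch size, domain bounds, and learning rate here are illustrative stand-ins for the per-system values summarized in Table 3.

```python
import torch

def train_qpnn(V_net, loss_fn, domain=(-5.0, 5.0), epochs=1000,
               batch_size=128, lr=1e-3):
    """Minibatch training with Adam on points sampled uniformly
    at random from the chosen domain."""
    opt = torch.optim.Adam(V_net.parameters(), lr=lr)
    lo, hi = domain
    for _ in range(epochs):
        # fresh random minibatch from the domain each step
        x = lo + (hi - lo) * torch.rand(batch_size, 1)
        opt.zero_grad()
        loss = loss_fn(V_net, x)
        loss.backward()
        opt.step()
    return V_net
```

Here `loss_fn` would be one of the Schrödinger losses defined in the main text, evaluated on the network and the sampled batch.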
System | Final Layer Activation | Final Layer Scaling | Size of training data
---|---|---|---
Harmonic Oscillator | Sigmoid | |
Pöschl–Teller potential | None | None |
Radial Hydrogen atom | None | None |
2D Harmonic Oscillator | Sigmoid | None |
Potential for Particle in a Box | Sigmoid | 10 |
Perturbation for Particle in a Box | None | None |
Soliton | None | None |
Harmonic Oscillator from Wigner | None | Sigmoid |
Harmonic Oscillator from Distribution | Sigmoid | |
Pöschl–Teller from Wigner | None | None |
Some training details and model hyperparameters
Some additional figures of the wave–functions and Wigner functions used in our experiments.