I Introduction
The simulation of physical systems is both valuable for basic science and has technological applications across a diverse range of industries from materials design to pharmaceutical development. Relative to classical computers, quantum computers have the potential to provide an exponentially more efficient means of simulating quantum mechanical systems. Quantum hardware has progressed substantially in recent years google2019supremacy; google2020observation
. However, despite continual progress, we remain in the ‘noisy intermediatescale quantum’ (NISQ) era in which the available hardware is limited to relatively small numbers of qubits and prone to errors. Simulation algorithms designed for faulttolerant quantum computers, such as Trotterization methods
lloyd1996universal; sornborger1999higher, qubitization methods low2019hamiltonian, and Taylor series methods berry2015simulating, require deeper circuits than viable given the short coherence times of current hardware. Thus alternative approaches are needed to successfully implement useful simulations on NISQ hardware.Variational quantum algorithms cerezo2020variationalreview; endo2020hybrid; bharti2021noisy; VQE; farhi2014quantum; mcclean2016theory; QAQC; larose2019variational; VCH; cerezo2020variationalfidelity; Li; endo2020variational; yao2020adaptive; heya2019subspace; cirstoiu2020variational; VHD, where a classical computer optimizes a cost function measured on a quantum computer, show promise for NISQ quantum simulations. An early approach introduced an iterative method, where the state is variationally learned on a stepbystep basis using action principles trout2018simulating; endo2020variational; yao2020adaptive. Subsequently, a generalization of the variational quantum eigensolver VQE was developed for simulations in low lying energy subspaces heya2019subspace. Very recently, quantumassisted methods have been proposed that perform all necessary quantum measurements at the start of the algorithm instead of employing a classicalquantum feedback loop bharti2020quantum; lau2021quantum; haug2020generalized.
In this work, we improve upon a recently proposed variational quantum algorithm known as Variational Fast Forwarding (VFF) cirstoiu2020variational. VFF allows long time simulations to be performed using a fixed depth circuit, thus enabling a quantum simulation to be ‘fast forwarded’ beyond the coherence time of noisy hardware. The VFF algorithm requires finding a full diagonalization of the short time evolution operator of the system of interest. Once found, the diagonalization enables any initial state of that system to be fast forwarded. However, for practical purposes, one is often interested in studying the evolution of a particular fixed initial state of interest. In that case a full diagonalization of is overkill. Instead, it suffices to find a diagonal compilation of that captures its action on the given initial state. Here, we show that focusing on this commonly encountered but less exacting task can substantially reduce the resources required for the simulation.
Specifically, we introduce the fixed state VFF algorithm (fsVFF) for fast forwarding a fixed initial state beyond the coherence time of a quantum computer. This approach is tailored to making dynamical simulation more suitable for NISQ hardware in two key ways. First, the cost function requires half as many qubits as VFF. This not only allows larger scale simulations to be performed on current resourcelimited hardware, but also has the potential to enable higher fidelity simulations since larger devices tend to be noisier. Second, fsVFF can utilize simpler ansätze than VFF both in terms of the depth of the ansatz and the number of parameters that need to be learnt. Thus, fsVFF can reduce the width, depth, and total number of circuits required to fast forward quantum simulations, hence increasing the viability of performing simulations on nearterm hardware.
We demonstrate these advantages by implementing longtime high fidelity quantum simulations of the 2qubit XY spin chain on Rigetti’s and IBM’s quantum computers. Specifically, while the iterated Trotter approach has a fidelity of less than 0.9 after 4 time steps and has completely thermalized by 25 time steps, with fsVFF we achieve a simulation fidelity greater than 0.9 for over 600 time steps.
In our analytical results, we prove the faithfulness of the fsVFF cost function by utilizing the newly developed NoFreeLunch theorems for quantum machine learning
poland2020no; sharma2020reformulation. We also provide a proof of the noise resilience of the fsVFF cost function, specifically the optimal parameter resilience sharma2019noise. Finally, we perform an analysis of simulation errors under fastforwarding.The diagonalizations obtained using fsVFF may further be useful for determining the eigenstates and eigenvalues of the Hamiltonian on the subspace spanned by the initial state. This can be done using a time series analysis, by using fsVFF to reduce the depth of the quantum phase estimation (QPE) algorithm, or using a simple sampling method. We demonstrate on IBM’s quantum computer that, while standard QPE fails on real hardware, fsVFF can be used to obtain accurate estimates of the spectrum.
Ii Background
Before presenting our fsVFF algorithm, let us first review the original VFF algorithm from Ref. cirstoiu2020variational. Consider a Hamiltonian on a dimensional Hilbert space (i.e., on qubits) evolved for a short time with the simulation unitary , and let (larger than ) denote the desired simulation time. Then the VFF algorithm consists of the following steps:

Approximate with a singletimestep Trotterized unitary denoted .

Variationally search for an approximate diagonalization of by compiling it to a unitary with a structure of the form
(1) where
is a vector of parameters. Here,
is a parameterized unitary that will (after training) encode the eigenvalues of , whileis a parameterized unitary matrix that will consist of the corresponding eigenvectors
cirstoiu2020variational. The compilation is performed using the local HilbertSchmidt test QAQC to find the parameters and that minimize the local HilbertSchmidt cost. 
Use the compiled form to simulate for time using the circuit
(2)
VFF has proven effective for providing a fixed quantum circuit structure with which to fastforward beyond the coherence time of current noisy quantum devices. However, the algorithm requires a full diagonalization of over the entire Hilbert space. The local HilbertSchmidt test used to find this diagonalization requires qubits. Additionally, the ansatz must be sufficiently expressible to diagonalize the full unitary to a high degree of approximation Sukin2019Expressibility; Nakaji2020Expressibility; Holmes2021Expressibility. This typically requires a large number of parameters and a reasonably deep circuit. These overheads limit VFF’s utility on current hardware.
In what follows, we introduce a more NISQfriendly refinement to VFF that reduces these overheads when one is interested in fastforwarding a fixed initial state , rather than any possible initial state. The fixed state VFF algorithm (fsVFF) is summarised in Fig. 1.
We note that VFF, like the standard iterated Trotter approach to quantum simulation, necessarily incurs a Trotter error by approximating with . This Trotter error may be removed using the Variational Hamiltonian Diagonalization algorithm (VHD), which directly diagonalizes the Hamiltonian VHD. However, VHD is yet more resource intensive than VFF on current hardware, so we focus here on refining VFF.
Iii Fixed State Variational Fast Forwarding Algorithm
iii.1 Cost function
In fsVFF, instead of searching for a full diagonalization of over the entire Hilbert space, we search for a diagonal compilation of that captures the action of on the initial state and its future evolution, . Here, we introduce a cost function tailored to this task.
To make precise what is required of the cost for fsVFF, let us first note that as the state evolves, it remains within its initial energy subspace. This can be seen by expanding the initial state in terms of the energy eigenbasis (the eigenbasis of ) as
(3) 
where , and noting that
(4) 
Thus it follows that if spans energy eigenstates of , so does for all future times. Therefore to find a compilation of that captures its action on (for all times ) it suffices to find a compilation of on the dimensional subspace spanned by . We stress that the eigenstates need not be ordered, and therefore the subspace spanned by the subset is not necessarily low lying in energy.
A NoFreeLunch Theorem for quantum machine learning introduced in Ref. poland2020no proves that to perfectly learn the action of a unitary on a dimensional space requires training pairs. In the context of fsVFF, we are interested in learning the action of a unitary on an dimensional subspace. Since the unitary is block diagonal, one can directly apply this NFL theorem to the subspace of interest. Therefore training pairs are required to learn the unitary’s action on this subspace. (Note, we assume here that the training states are not entangled with an additional register. It was shown in Ref. sharma2020reformulation that using entangled training data can reduce the required number of training states. In fact, this more powerful method is used by the VFF algorithm. However, producing such entangled training data requires additional qubits and twoqubit gates and therefore is less NISQfriendly.)
The NoFreeLunch theorem therefore implies that states are required to learn on (assuming leakage due to Trotter error is negligible). In general these states may be freely chosen from the subspace spanned by . Here a convenient choice in training states would be and its Trotter evolutions, that is the set . Motivated by these observations, we define our cost function for fsVFF as
(5) 
where similarly to VFF we use a diagonal ansatz . This cost quantifies the overlap between the initial state evolved under for time steps, , and the initial state evolved under the trained unitary for time steps, , averaged over time steps. Assuming we have access to the unitary that prepares the state , the state overlaps can be measured using qubits, via a circuit that performs a Loschmidt echo sharma2019noise. Therefore can be evaluated using only qubits. This is half as many as standard VFF, opening up the possibility of performing larger simulations on current hardware.
It is important to note that while the exact timeevolved state is perfectly confined to the initial subspace, the approximate evolution induced by allows for leakage from the initial subspace sahinoglu2020hamiltonian. Thus the subspace spanned by in general does not perfectly overlap with . However, by reducing and considering higher order Trotter approximations, this leakage can be made arbitrarily small. In Appendix A, we prove that in the limit that leakage from the initial subspace is negligible, is faithful. That is, we show that the cost vanishes, , if and only if the fidelity of the fastforwarded simulation is perfect,
(6) 
for all times . Note, that the reverse direction is trivial. If for all , then .
Similar to the VFF cost, the fsVFF cost is noise resilient in the sense that incoherent noise should not affect the global optimum of the function. This is proven for a broad class of incoherent noise models using the results of Ref. sharma2019noise in Appendix B.
Nonetheless, it is only possible to measure if the unitary can be implemented comfortably within the coherence time of the QC. Additionally, the number of circuits required to evaluate in general scales with . Given these two restrictions, fsVFF is limited to simulating quantum states spanning a nonexponential number of eigenstates. Consequently, we advocate using fsVFF to simulate states with . Crucially these states need not be low lying and therefore our approach is more widely applicable than the Subspace Variational Quantum Simulator (SVQS) algorithm heya2019subspace, which simulates fixed low energy input states.
While was motivated as a natural choice of cost function to learn the evolution induced by a target unitary on a fixed initial state, it is a global cost cerezo2020cost and hence it encounters what is known as a barren plateau for large simulation sizes mcclean2018BP; cerezo2020cost; cerezo2020impact; arrasmith2020effect; Holmes2020BP; Holmes2021Expressibility; volkoff2021large; Sharma2020BP; pesah2020absence; uvarov2020barren; marrero2020entanglement; patti2020entanglement. In Appendix C we suggest an alternative local version of the cost to mitigate such trainability issues.
iii.2 Calculating
To evaluate , it is first necessary to determine , the number of energy eigenstates of the system Hamiltonian spanned by . In this section, we present an algorithm to calculate .
Our proposed algorithm utilizes the fact that the number of energy eigenstates spanned by is equivalent to the number of linearly independent states in the set where with . The subspace spanned by is known as the Krylov subspace associated with the operator and vector Krylov1931. Therefore is equivalently the dimension of the Krylov subspace .
To determine the dimension of we can utilize the fact that the determinant of the Gramian matrix of a set of vectors (i.e., the matrix of their overlaps) is zero if and only if the vectors are linearly dependent. The Gramian corresponding to is given by
(7) 
If , then the vectors in are linearly independent and therefore span at least a dimensional subspace. Conversely, if , the set contains linear dependencies and the subspace they span is less than dimensional. Therefore, if we can find , the smallest such that , then (noting that is a dimensional matrix) we know that
is the largest number of linearly independent vectors spanned by
. That is, is the dimension of and so we have that .The overlaps for any and can be measured using the Hadamard Test, shown in Fig. 1, and thus the Hadamard test can be used to determine on quantum hardware. Since the Gramian here contains two symmetries, hermiticity and the invariance , we only have to calculate the first row of the matrix on the quantum computer.
In summary, our proposed algorithm to determine consists of the following loop. Starting with ,

Construct using the Hadamard test.

Calculate (classically) .
If , terminate the loop and conclude that .
If , increase and return to step 1.
This is shown schematically in Fig. 1.
We remark that in the presence of degeneracies in the spectrum of , the eigenvectors corresponding to degenerate eigenvalues are not unique. Therefore, in this case, the number of states spanned by depends on how the eigenvectors corresponding to degenerate eigenvalues are chosen. However, as detailed in Appendix A, to learn the action of on , what matters is the number of eigenstates spanned by corresponding to unique eigenvalues. This is equivalent to the dimension of the Krylov subspace . Consequently, the algorithm detailed above can also be used in this case.
iii.3 Ansatz
The fsVFF algorithm, similarly to VFF, employs an ansatz of the form
(8) 
to diagonalize the initial Trotter unitary . Here is a quantum circuit that approximately rotates the standard basis into the eigenbasis of , and is a diagonal unitary that captures the (exponentiated) eigenvalues of . A generic diagonal operator can be written in the form
(9) 
where and we use the notation
(10) 
with the Pauli operator acting on qubit . While Eq. (9) provides a general expression for a diagonal unitary, for practical ansätze it may be desirable to assume that the operators are local operators and the product contains a polynomial number of terms, i.e., is in . There is more flexibility in the construction of the ansätze for since these are generic unitary operations. A natural choice might be to use a hardwareefficient ansatz kandala2017hardware.
One of the main advantages of fsVFF is that diagonalization is only necessary over the subspace spanned by the initial state, rather than the entire Hilbert space which will be significantly larger. To outperform standard VFF, it is in our interest to take advantage of this small subspace to find compact ansätze.
The two main impeding factors we wish to minimize to aid diagonalization are error rates and optimization time. Therefore, when searching for ansätze, our priorities are to minimize the number of CNOT gates required (the noisiest component in the ansätze) and the number of rotation parameters. There is, however, a trade off between expressibility of the ansatz and its trainability. There needs to be enough freedom in the unitary to map the required eigenvectors to the computational basis but generically highly expressive ansätze exhibit barren plateaus Holmes2021Expressibility.
For systems with symmetries and/or systems that are nearby perturbations of known diagonalizable systems, it may be possible to find a fully expressive, compact ansatz by inspection. This is the case for a simple 2qubit XY Hamiltonian, as discussed in Section IV.
More generally, it can be challenging to analytically find compact but sufficiently expressible ansätze. Nonetheless, it is possible to variationally update the ansatz structure and thereby systematically discover simple structures. One straightforward approach is to use a layered ansatz where each layer initializes to the identity gate grant2019initialization; skolik2020layerwise. The ansatz can be optimized until it plateaus, redundant single qubit gates removed, then another layer can be appended and the process repeats. Alternatively, more sophisticated discrete optimization techniques may be used to variationally search the space of ansätze.
iii.4 Summary of algorithm
The fixed state Variational Fast Forwarding algorithm (fsVFF) is summarized in Fig. 1. We start with an initial state that we wish to evolve under the Hamiltonian .

The first step is to approximate the short time evolution using a single step Trotter approximation .

This Trotter approximation can be used to find an approximation for
, the dimension of the energy eigenspace spanned by
, using the method outlined in Section III.2. 
Equipped with a value for , we then variationally search for a diagonalization of over the energy subspace spanned by using , Eq. (5). At each iteration step the gradient of the cost with respect to a parameter is measured on the quantum computer for a fixed set of parameters using the analytic expressions for provided in Appendix D. These gradients are used to update the parameters using a classical optimizer, such as those in Refs. kubler2020adaptive; arrasmith2020operator; sweke2020stochastic. The output of the optimization loop is the set of parameters that minimize ,
(11) 
Finally, the state can be simulated for time using the circuit
(12) That is, by simply multiplying the parameters in the diagonalized unitary by a constant number of iterations .
In Appendix E, we show that the total simulation fidelity, in the limit that leakage is small, is expected to scale subquadratically with the number of fastforwarding time steps . Thus if the minimal cost from the optimization loop is sufficiently small, we expect the fsVFF algorithm to allow for long, high fidelity simulations.
Iv Hardware Implementation
In this section we demonstrate that fsVFF can be used to implement long time simulations on quantum hardware. Specifically, we simulate the XY spin chain, which has the Hamiltonian
(13) 
where and are Pauli operators on the qubit. In what follows, we first present results showing that we can determine for an initial state using the method described in Section III.2. We then demonstrate that the fsVFF cost can be trained to find an approximate diagonalization of on the subspace spanned by . We finally use this diagonalization to perform a long time fast forwarded simulation. In all cases we focus on a two qubit chain, i.e. , and we approximate its evolution operator using a firstorder Trotter approximation.
iv.1 Determining
The 2qubit XY Hamiltonian has the eigenvectors , corresponding to the eigenvalues {0, 1, 1, 0}. As proof of principle, we tested the algorithm for determining on the states (corresponding to ), () and (). As described in Section III.2, the of these states can be found by calculating for increasing values of since, as is increased, the determinant first equals 0 when .
To verify this for the states considered here, we first determine using a classical simulator. As seen in Figure 2, in this case exactly equals 0 when . We then measured on Honeywell’s quantum computer. Although on the real quantum device gate noise and sampling errors are introduced, the results reproduce the classical results reasonably well. Namely, at the correct value of , drastically reduces and approximately equals 0. Thus, we have shown that it is possible to determine for an initial state by measuring on quantum hardware.
iv.2 Training
We tested the training step of the algorithm on IBM and Rigetti’s quantum computers, specifically ibmq_toronto and Aspen8. For the purpose of implementing a complete simulation, we chose to focus on simulating the evolution of the state . As discussed in the previous section, this state spans eigenstates.
To diagonalize on the 2dimensional subspace spanned by , we used a hybrid quantumclassical optimization loop to minimize . For a state with the cost , Eq. (5), uses two training states where is the firstorder Trotter approximation of . On the IBM quantum computer we evaluated the full cost function for each gradient descent iteration. However, the time available on the Aspen8 device was limited, so to speed up the rate of optimization we evaluated the overlap on just one of the two training states per iteration, alternating between iterations (instead of evaluating the overlaps on both training states every iteration). To allow the movement through parameter space to use information averaged over the two timesteps, whilst only using a single training state per cost function evaluation, momentum was added to the gradient updates defazio2020understanding.
To take advantage of the fact that more compact ansätze are viable for fsVFF, we variationally searched for a short depth ansatz, tailored to the target problem. Specifically, we started training with a general 2qubit unitary and then during training the structure was minimised by pruning unnecessary gates. In Figure 3, we show the circuit for the optimal ansatz obtained using the method. The ansatz requires one CNOT gate and two single qubit gates for and only one rotation for . This is a substantial compression on the most general two qubit ansatz for which requires 3 CNOTs and 15 single qubit rotations and the most general 2 qubit ansatz for which requires 2 rotations and one 2qubit ZZ rotation (though in the case of the XY Hamiltonian this may be simplified to only 2 rotations lieb1961two).
Figure 4(a) shows the fsVFF cost function versus the number of iterations for the implementations on ibmq_toronto (yellow) and Aspen8 (red). The dashed line indicates the noisy cost value obtained from the quantum computer. To evaluate the quality of the optimization, we additionally classically compute the true cost (indicated by the solid lines) using the parameters found on ibmq_toronto and Aspen8. While the noisy cost saturates at around , we obtained a minimum noisefree cost of the order . The two orders of magnitude difference between the noisy and the noisefree cost is experimental evidence that the cost function is noise resilient on extant quantum hardware.
iv.3 Fast forwarding
Finally we took the two sets of parameters found from training on ibmq_toronto and Aspen8, and used them to implement a fastforwarded simulation of the state on ibmq_rome. To evaluate the quality of the fast forwarding we calculated the fidelity, , between the density matrix of the simulated state, , after timesteps, and the exact time evolved state, , at time . We used Quantum State Tomography to reconstruct the output density matrix, and then calculated the overlap with the exact state classically.
As shown by the plots of in Figure 4(b), fsVFF significantly outperforms the iterated Trotter method. Let us refer to the time before the simulation falls below an error threshold as the high fidelity time. Then the ratio of the high fidelity time for fsVFF () and for standard Trotterization () is a convenient measure of simulation performance,
(14) 
A simulation can be said to have been successfully fastforwarded if . The iterated Trotter method dropped below a simulation infidelity threshold of after 4 (8) timesteps. In comparison, fsVFF maintained a high fidelity for 625 (1275) timesteps. Thus we achieved a simulation fastforwarding ratio of ().
V Energy estimation
The diagonalization obtained from the optimization stage of fsVFF, , implicitly contains approximations of the eigenvalues and eigenvectors of the Hamiltonian of the system of interest. In this section we discuss methods for extracting the energy eigenstates and eigenvalues from a successful diagonalization and implement them on quantum hardware.
The energy eigenvectors spanned by the initial state can be determined by the following simple sampling method. The first step is to apply to the initial state . In the limit of perfect learning and vanishing Trotter error, this gives
(15) 
where and is a set of computational basis states. The energy eigenstates spanned by are then found by applying to any of the states obtained from measuring in the computational basis, that is .
Extracting the energy eigenvalues from is more subtle. Firstly, as and , even in the limit of perfectly minimizing the cost , may disagree by a global phase , at best we can hope to learn the difference between, rather than absolute values, of the energy eigenvalues of . For simple cases, where the diagonal ansatz is composed of a polynomial number of terms, these energy value differences may be extracted directly by rewriting in the computational basis. For example, in our hardware implementation and therefore the difference in energy between the two eigenvalues spanned by is given by . Here in an integer correcting for the arbitrary phase arising from taking the of that can be determined using the method described in Ref. VHD. Using this approach, we obtain 1.9995 and 2.0019 from the training on IBM and Rigetti respectively, in good agreement of the theoretically expected value of 2. For more complex cases, this simple postprocessing method will be become intractable and an algorithmic approach will be necessary.
Quantum Phase Estimation (QPE) nielsen2000quantum and Quantum Eigenvalue Estimation (QEE) somma2020quantum are fault tolerant quantum algorithms for estimating the eigenvalues of a unitary operation. However, their implementation on current quantum devices is limited by the reliance on the execution of controlled unitaries from ancillary qubits. These controlled unitaries require many entangling gates, and introduce too much noise to be realized for large scale systems on current hardware. Once an evolution operator has been diagonalized in the subspace of an initial state, fsVFF can be used to significantly reduce the circuit depth of QPE and QEE, as shown in Figure 5. In this manner, fsVFF provides a NISQ friendly means of estimating the eigenvalues within a subspace of a Hamiltonian.
To demonstrate the power of fsVFF to reduce the depth of QPE, we perform QPE using the diagonalization obtained from training on IBM’s quantum computer. Specifically, we consider the input eigenvector . This is one of the eigenvectors spanned by the input state of our earlier hardware implementation, . We then consider evolving under for a time step of . Since the energy of the state equals 1, we expect this to result in a phase shift of being applied to . We implemented QPE and fsVFF enhanced QPE to measure this phase using the circuits shown in Fig 5. We chose to measure to 3 bits of precision and therefore the output should be the measurement 001 with probability one. As Figure 6 shows, it appears that the standard QPE implementation was unable to discern this phase. In contrast, when fsVFF was used to reduce the circuit depth, the output distribution was strongly peaked at the correct state.
QEE requires only one ancillary qubit, a single implementation of
, and no Quantum Fourier Transform and therefore is less resource intensive than QPE. Nonetheless, we can again, as shown in Fig.
5, use fsVFF as a preprocessing step to reduce the circuit depth.We tested this on the 3qubit XY Hamiltonian by first performing fsVFF on a quantum simulator with the initial state . Having obtained an approximate diagonalization, we determined the eigenstates using the sampling method described earlier. Figure 7 shows the results of the measurement of , with four strong peaks corresponding to the four eigenvectors in this subspace.
Figure 8 shows the results of QEE implemented on ibmq_boeblingen. We use the basis states found from the sampling method as our inputs to reduce the depth of the circuit, and remove the need to use the timeseries method originally proposed for extracting the eigenvalues, as we could calculate the eigenvalues individually by inputting their corresponding eigenvectors. A value of was used so the phase calculated directly matched the eigenvalue. After removing a global phase, QEE had accurately found the eigenvalues of the four eigenvectors, with a meansquared error from the true values of .
Vi Discussion
In this work, we demonstrated that despite the modest size and noise levels of the quantum hardware that is currently available, it is possible to perform long time dynamical simulations with a high fidelity. Specifically, we have introduced fsVFF, a new algorithm for NISQ simulations, which we used to simulate a 2qubit XYmodel spin chain on the Rigetti and IBM quantum computers. We achieved a fidelity of at least 0.9 for over 600 time steps. This is a 150fold improvement on the standard iterated Trotter approach, which had a fidelity of less than 0.9 after only 4 time steps.
Central to the success of the fsVFF algorithm is the fact that it is tailored to simulating a particular fixed initial state rather than an arbitrary initial state. By focusing on this less demanding task, we showed that it is possible to substantially reduce the width and depth of the previously proposed VFF algorithm. In particular, since fsVFF only requires finding a diagonalization of a shorttime evolution unitary on the subspace spanned by the initial state (compared to the entire Hilbert space in the case of VFF), fsVFF can utilize much simpler ansätze. This is demonstrated in our hardware implementation, where one CNOT and two parameterized single qubit rotations proved sufficient for an effective ansatz for and one single qubit rotation was sufficient for .
While one expects that shorterdepth ansätze may be used to diagonalize a unitary on a subspace, rather than total Hilbert space, the question of how to design such ansätze remains an important open question for future research. We believe it would be worthwhile both to explore analytic approaches whereby the symmetry properties of the ansatz are used, as well as to use machine learning approaches to systematically search for suitable ansätze.
The fsVFF algorithm, similarly to VFF, is fundamentally limited by the initial Trotter error approximating the short time evolution of the system. The Variational Diagonalization Hamiltonian (VHD) algorithm VHD may be used to remove this error. However, like VFF, VHD is designed to simulate any possible initial state. There are a number of different approaches inspired by fsVFF that could be explored for reducing the resource requirements of the VHD algorithm by focusing on simulating a particular initial state. Such a “fixed state VHD” algorithm would allow for more accurate long time simulations on NISQ hardware.
More generally, our work highlights the trade off between the universality of an algorithm and the resources required to implement it. One can imagine a number of alternative ways in which the universality of an algorithm can be sacrificed, without significantly reducing its utility, in order to make it more NISQ friendly. For example, one is often interested in studying the evolution of a particular observable of interest, rather than all possible observables. It would be interesting to investigate whether a fixedobservable fsVFF could further reduce the resources required to implement long time high fidelity simulations. More broadly, an awareness of this trade off may prove useful beyond dynamical simulation for the ongoing challenge of adapting quantum algorithms to the constraints of NISQ hardware.
Vii Acknowledgements
JG and KG acknowledge support from the U.S. Department of Energy (DOE) through a quantum computing program sponsored by the Los Alamos National Laboratory (LANL) Information Science & Technology Institute. ZH, BC, and PJC acknowledge support from the LANL ASC Beyond Moore’s Law project. We acknowledge the LANL Laboratory Directed Research and Development (LDRD) program for support of AS and initial support of BC under project number 20190065DR as well as LC under project number 20200022DR. AA was supported by the U.S. Department of Energy (DOE), Office of Science, Office of High Energy Physics QuantISED program under Contract No. DEAC5206NA25396. LC and PJC were also supported by the U.S. DOE, Office of Science, Basic Energy Sciences, Materials Sciences and Engineering Division, Condensed Matter Theory Program. This research used quantum computing resources provided by the LANL Institutional Computing Program, which is supported by the U.S. Department of Energy National Nuclear Security Administration under Contract No. 89233218CNA000001. This research used additional quantum computational resources supported by the LANL ASC Beyond Moore’s Law program and by the Oak Ridge Leadership Computing Facility, which is a DOE Office of Science User Facility supported under Contract DEAC0500OR22725.
References
Appendix A Faithfulness of cost function
Here we demonstrate that the cost function is faithful in the limit that leakage from the subspace spanned by the initial state can be disregarded. That is, suppose we could compile the exact evolution of the system for a short timestep . Then the cost function vanishes,
(16) 
if and only if the fidelity of the fastforwarded simulation is perfect,
(17) 
for all times . Note, the reverse direction is trivial. If for all then .
Before embarking on the core of the proof let us first emphasize that in the definition of the cost (16) we average over terms, where is the number of eigenstates spanned by the initial state corresponding to unique eigenvalues of the Hamiltonian . The restriction to eigenstates with unique eigenvalues is important since it is this which determines the dimension of the subspace spanned by the future evolution of (which in turn determines the number of training states required to learn ).
To see this consider a Hamiltonian with a spectrum and corresponding eigenstates . The initial state can be expanded in the energy eigenbasis as
(18) 
The future evolution of such a state, i.e. the set of states
(19) 
with , is solely contained within the subspace spanned by since
(20) 
If the spectrum is nondegenerate then the evolution generates relative phases between all the eigenstates spanned by . In that case, will span the entirety of the dimensional subspace spanned by , i.e., an dimensional space. However, suppose the eigenstate expansion of includes two degenerate eigenstates, i.e. two eigenstates that share the same eigenvalue. In that case the evolution generates relative phases between only of the eigenstates and therefore is confined to a dimensional subspace. More generally, if the eigenstate expansion of includes states with unique eigenvalues, then evolution under generates relative phases between states and so will span a dimensional subspace of the space. In this manner, it is the number of eigenstates spanned by the initial state corresponding to unique eigenvalues, , that determines the subspace spanned by .
We note that any initial state can be written in the form Eq. (18) where, crucially, the sum is over only eigenstates corresponding to unique energies, i.e. . For a Hamiltonian with a nondegenerate spectrum this expansion is trivial. For a Hamiltonian with a degenerate spectrum there is some freedom in how the eigenstates corresponding to degenerate eigenvalues are defined, since the superposition of degenerate eigenstates is also an eigenstate corresponding to the same energy. Therefore, henceforth, for degenerate Hamiltonians, we suppose that the energy eigenbasis is defined such that the eigenstate expansion of only contains eigenstates with unique energies, that is terms.
We remark that our approach here may also be framed in the language of Krylov spaces. The Krylov subspace Krylov1931 associated with the operator and vector is the linear subspace spanned by the vectors generated by evolving under up to times. That is
(21) 
with . Now supposing and , we have that . Thus the future evolution of is confined to the Krylov space . This is an dimensional subspace, and therefore, as will become clear, we require training states in order to learn on this subspace.
To prove the forward direction, we first note that if then as , we have that for all . It thus follows that the action of on agrees with the action of on up to an unknown phase , i.e. for we have that
(22) 
Or equivalently, the action of and agree on the training states up to an unknown phase , that is
(23) 
where .
Now by construction (see Section III.2) the training states are linearly independent. Furthermore, since the initial simulation time may be chosen freely, the unitary can be chosen such that none of the states in are orthogonal. In this case, the unknown phases all agree and we have that for all . To see this note that given and are linearly independent but nonorthogonal, the state can be represented as follows:
(24) 
where and . Then from (23), we find that
(25)  
(26)  
(27) 
where (note that this use of is distinct from used in the main text for a parameterized eigenvector unitary). The above expression can be rearranged as
(28) 
Since and the aforementioned equation is satisfied if and only if
(29) 
which implies that . Then by recursively applying this procedure to the rest of the states in the training set, we find that for and .
To understand the constraints from the minimization of the cost function, it is helpful to consider the form of the unitary matrix . It follows from Eq. (23), and the fact that since is unitary (), that can be represented as poland2020no:
Here the upper left hand block spans the dimensional subspace spanned by the input training states and is an unknown unitary matrix acting on the dimensional space orthogonal to . For later convenience let us also note that the matrix is also a unitary matrix of the form
where is again a dimensional unitary matrix.
We are now in a position to show that if , and by construction the states in are linearly independent and nonorthogonal, then
(30) 
for all times . To see this first note that for all times that follows directly from Eq. (22). Now by construction, any state for linearly depends on the states in and so it can be written as
(31) 
where . Thus we have that
(32) 
for any time . It straightforwardly follows from Eq. (22) and Eq. (32) that the simulation fidelity at time equals 1,
(33)  
(34) 
Now, let us consider the simulation fidelity at time ,
(35)  
(36)  
(37)  
(38) 
where we have again used Eq. (22) and Eq. (32). Finally, the simulation fidelity at an arbitrary time can be evaluated as follows,
(39) 
Thus, as claimed, if the cost function vanishes the simulation fidelity is perfect for all times.
Appendix B Noise Resilience of Cost
To make a connection with prior results on noise resilience, we first note that , Eq. (5), can be rewritten as
(40) 
with
(41) 
Here with . Let denote the circuit used to evaluate the cost term . Let denote the noisy version of the cost term , that is the cost evaluated when the circuit is run in the presence of noise. Let and denote the sets of unitaries that optimise and respectively. Now, it was shown in sharma2019noise that the costs are resilient to measurement noise, gate noise, and Pauli channel noise in the sense that for circuits experiencing such noise we have . This means that any set of parameters that minimize the noisy cost are guaranteed to also minimize the true exact cost .
We will now argue that this implies that is also noise resilient. To do so, we first note that the costs can be minimized simultaneously by any unitary that matches the target unitary up to a global phase , i.e. such that . Therefore, assuming that the ansatz for is sufficiently expressive that the costs can be simultaneously minimized, the total cost is minimized by the set of unitaries that minimize each of the costs simultaneously, that is the set . (Note, if the ansatz is not sufficiently expressive then it might not be possible to simultaneously minimize each of the terms and therefore the intersection might be empty). Now, given that , it follows that . Thus, as claimed the total cost is also noise resilient in the sense that any set of parameters that minimize the noisy cost also minimize the true exact cost .
Appendix C Local cost with trainability guarantee
While was motivated in Section III.1 as a natural choice of cost function to learn the evolution induced by a target unitary on a fixed initial state, it encounters what is known as a barren plateau for large simulation sizes mcclean2018BP; cerezo2020cost. (See Refs. cerezo2020impact; arrasmith2020effect; Holmes2020BP; Holmes2021Expressibility; volkoff2021large; Sharma2020BP; pesah2020absence; uvarov2020barren; marrero2020entanglement; patti2020entanglement) for further details about the barren plateau phenomenon.) Namely, the gradient of the cost vanishes exponentially with . As a result, for large systems the cost landscape is prohibitively flat and therefore an exponential precision is required to discern a minimization direction. This precludes successful training.
In this appendix we introduce a local cost function to surmount this difficulty. To motivate our local cost, we first recall that , Eq. (5), can be rewritten as