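Let $S$ denote a quantum system associated with a Hilbert space $\mathcal{H}$ of dimensionality $d$. The simple example of a spin chain [Yao_16_Interferometric, Zhu_16_Measurement, Li_16_Measuring, Garttner_16_Measuring] informs this paper: Quantities will be summed over, as spin operators have discrete spectra. Integrals replace the sums if operators have continuous spectra.
Let $\mathcal{W} = \sum_{w_\ell, \alpha_{w_\ell}} w_\ell \, |w_\ell, \alpha_{w_\ell}\rangle\langle w_\ell, \alpha_{w_\ell}|$ and $V = \sum_{v_\ell, \lambda_{v_\ell}} v_\ell \, |v_\ell, \lambda_{v_\ell}\rangle\langle v_\ell, \lambda_{v_\ell}|$ denote local unitary operators. The eigenvalues are denoted by $w_\ell$ and $v_\ell$; the degeneracy parameters, by $\alpha_{w_\ell}$ and $\lambda_{v_\ell}$. $\mathcal{W}$ and $V$ may commute. They need not be Hermitian. Examples include single-qubit Pauli operators localized at opposite ends of a spin chain.
We will consider measurements of eigenvalue-and-degeneracy-parameter tuples $(w_\ell, \alpha_{w_\ell})$ and $(v_\ell, \lambda_{v_\ell})$. Such tuples can be measured as follows. A Hermitian operator $G_{\mathcal{W}} = \sum_{w_\ell, \alpha_{w_\ell}} g(w_\ell) \, |w_\ell, \alpha_{w_\ell}\rangle\langle w_\ell, \alpha_{w_\ell}|$ generates the unitary $\mathcal{W}$. The generator's eigenvalues are labeled by the unitary's eigenvalues: $w_\ell = e^{i g(w_\ell)}$. Additionally, there exists a Hermitian operator that shares its eigenbasis with $\mathcal{W}$ but whose spectrum is nondegenerate: $\tilde{G}_{\mathcal{W}} = \sum_{w_\ell, \alpha_{w_\ell}} \tilde{g}(\alpha_{w_\ell}) \, |w_\ell, \alpha_{w_\ell}\rangle\langle w_\ell, \alpha_{w_\ell}|$, wherein $\tilde{g}(\alpha_{w_\ell})$ denotes a real one-to-one function. I refer to a collective measurement of $G_{\mathcal{W}}$ and $\tilde{G}_{\mathcal{W}}$ as a $\tilde{\mathcal{W}}$ measurement. Analogous statements concern $V$. If $d$ is large, measuring $\tilde{\mathcal{W}}$ and $\tilde{V}$ may be challenging but is possible in principle. Such measurements may be reasonable if $S$ is small. Schemes for avoiding measurements of the $\alpha_{w_\ell}$'s and $\lambda_{v_\ell}$'s are under investigation [BrianDisc].
Let $H$ denote a time-independent Hamiltonian. The unitary $U = e^{-iHt}$ evolves $S$ forward in time for an interval $t$. Heisenberg-picture operators are defined as $\mathcal{W}(t) := U^\dagger \mathcal{W} U$ and $\mathcal{W}^\dagger(t) = [\mathcal{W}(t)]^\dagger = U^\dagger \mathcal{W}^\dagger U$.
The OTOC is conventionally evaluated on a Gibbs state $e^{-H/T}/Z$, wherein $T$ denotes a temperature: $F(t) = \mathrm{Tr}\left( \frac{e^{-H/T}}{Z} \, \mathcal{W}^\dagger(t) V^\dagger \mathcal{W}(t) V \right)$. Theorem 1 generalizes beyond $e^{-H/T}/Z$ to arbitrary density operators $\rho = \sum_j p_j |j\rangle\langle j| \in \mathcal{D}(\mathcal{H})$. [$\mathcal{D}(\mathcal{H})$ denotes the set of density operators defined on $\mathcal{H}$.]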
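The OTOC's definition can be checked numerically on the smallest possible system. The following is a minimal sketch, not a spin-chain simulation: it assumes a single qubit, the illustrative Hamiltonian $H = \sigma^x$, the choice $\mathcal{W} = V = \sigma^z$, and the infinite-temperature state $\rho = \mathbb{1}/2$. None of these choices comes from the text; they are picked because $U = e^{-iHt}$ then has a closed form and the correlator reduces to $\cos(4t)$.

```python
import math

def matmul(a, b):
    """Multiply two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def dagger(a):
    """Conjugate transpose of a square matrix."""
    n = len(a)
    return [[complex(a[j][i]).conjugate() for j in range(n)] for i in range(n)]

def trace(a):
    return sum(a[i][i] for i in range(len(a)))

SZ = [[1, 0], [0, -1]]  # sigma_z, used for both W and V (assumption)

def evolution(t):
    """U = exp(-i t sigma_x) = cos(t) 1 - i sin(t) sigma_x (assumed H = sigma_x)."""
    c, s = math.cos(t), math.sin(t)
    return [[c, -1j * s], [-1j * s, c]]

def otoc(t):
    """F(t) = Tr(rho W^dag(t) V^dag W(t) V) with rho = 1/2."""
    U = evolution(t)
    W_t = matmul(matmul(dagger(U), SZ), U)  # Heisenberg-picture W(t)
    prod = [[0.5, 0], [0, 0.5]]             # start from rho
    for op in (dagger(W_t), dagger(SZ), W_t, SZ):
        prod = matmul(prod, op)
    return trace(prod)

for t in (0.0, 0.3, 0.7):
    # The two printed columns should agree for this toy model.
    print(t, otoc(t).real, math.cos(4 * t))
```

For this toy model, $F(0) = 1$ because $\mathcal{W}$ and $V$ commute at $t = 0$; the later oscillation reflects the growing noncommutation of $\mathcal{W}(t)$ with $V$.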
Jarzynski's Equality concerns thermodynamic work, $W$. $W$ is a random variable calculated from measurement outcomes. The out-of-time ordering in $F(t)$ requires two such random variables. I label these variables $W$ and $W'$.
Two stepping stones connect $\mathcal{W}$ and $V$ to $W$ and $W'$. First, I define a complex probability amplitude $A_\rho(w_2, \alpha_{w_2}; v_1, \lambda_{v_1}; w_1, \alpha_{w_1}; j)$ associated with a quantum protocol. I combine amplitudes $A_\rho$ into a quantity $\tilde{A}_\rho$ inferable from weak measurements and from interference. $\tilde{A}_\rho$ resembles a quasiprobability, a quantum generalization of a probability. In terms of the $w_\ell$'s and $v_\ell$'s in $\tilde{A}_\rho$, I define the measurable random variables $W$ and $W'$.
Jarzynski's Equality involves a probability distribution $P(W)$ over possible values of the work. I define a complex analog $P(W, W')$. These definitions are designed to parallel expressions in [TLH_07_Work]. Talkner, Lutz and Hänggi cast Jarzynski's Equality in terms of a time-ordered correlation function. Modifying their derivation will lead to the OTOC Jarzynski-like equality.
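2.1 Quantum probability amplitude $A_\rho$
The probability amplitude $A_\rho$ is defined in terms of the following protocol, $\mathcal{P}$:
1. Prepare $\rho$.
2. Measure the eigenbasis of $\rho$, $\{ |j\rangle\langle j| \}$.
3. Evolve $S$ forward in time under $U$.
4. Measure $\tilde{\mathcal{W}}$.
5. Evolve $S$ backward in time under $U^\dagger$.
6. Measure $\tilde{V}$.
7. Evolve $S$ forward under $U$.
8. Measure $\tilde{\mathcal{W}}$.
An illustration appears in Fig. 0(a). Consider implementing $\mathcal{P}$ in one trial. The complex probability amplitude associated with the measurements' yielding $j$, then $(w_1, \alpha_{w_1})$, then $(v_1, \lambda_{v_1})$, then $(w_2, \alpha_{w_2})$ is
\[ A_\rho(w_2, \alpha_{w_2}; v_1, \lambda_{v_1}; w_1, \alpha_{w_1}; j) := \langle w_2, \alpha_{w_2} | U | v_1, \lambda_{v_1} \rangle \langle v_1, \lambda_{v_1} | U^\dagger | w_1, \alpha_{w_1} \rangle \langle w_1, \alpha_{w_1} | U | j \rangle \sqrt{p_j} . \tag{2.1} \]
The square modulus $|A_\rho(w_2, \alpha_{w_2}; v_1, \lambda_{v_1}; w_1, \alpha_{w_1}; j)|^2$ equals the joint probability that these measurements yield these outcomes.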
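The amplitude's structure, a product of three transition amplitudes and $\sqrt{p_j}$, can be checked on a toy model. The sketch below assumes a single qubit with $U = e^{-i t \sigma^x}$, nondegenerate $\mathcal{W} = V = \sigma^z$ (so no degeneracy labels are needed), and $\rho = \mathrm{diag}(0.7, 0.3)$. All of these are illustrative assumptions, as are the helper names. The code confirms that the square moduli are nonnegative and sum to one, as a joint probability distribution must.

```python
import math
from itertools import product

def dagger(a):
    """Conjugate transpose of a 2x2 matrix stored as nested lists."""
    return [[complex(a[j][i]).conjugate() for j in range(2)] for i in range(2)]

def elem(bra, op, ket):
    """Matrix element <bra|op|ket>."""
    return sum(complex(bra[i]).conjugate() * op[i][j] * ket[j]
               for i in range(2) for j in range(2))

t = 0.3  # illustrative evolution time (assumption)
U = [[math.cos(t), -1j * math.sin(t)],
     [-1j * math.sin(t), math.cos(t)]]  # exp(-i t sigma_x) in closed form
U_DAG = dagger(U)

# sigma_z eigenpairs, used for both the W and the V measurements (assumption).
EIG = {1: [1, 0], -1: [0, 1]}
# rho = sum_j p_j |j><j|, diagonal in the computational basis (assumption).
P_J = {0: 0.7, 1: 0.3}
KET_J = {0: [1, 0], 1: [0, 1]}

def amplitude(w2, v1, w1, j):
    """<w2|U|v1> <v1|U^dag|w1> <w1|U|j> sqrt(p_j)."""
    return (elem(EIG[w2], U, EIG[v1])
            * elem(EIG[v1], U_DAG, EIG[w1])
            * elem(EIG[w1], U, KET_J[j])
            * math.sqrt(P_J[j]))

# Joint probabilities of the four measurement outcomes.
joint = {(w2, v1, w1, j): abs(amplitude(w2, v1, w1, j)) ** 2
         for w2, v1, w1 in product((1, -1), repeat=3) for j in (0, 1)}
total_probability = sum(joint.values())
print(total_probability)
```

The amplitudes themselves are complex; only their square moduli are accessible from the outcome statistics of the protocol.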
Suppose that $[\rho, H] = 0$. For example, suppose that $S$ occupies the thermal state $\rho = e^{-H/T}/Z$. (I set Boltzmann's constant to one: $k_{\rm B} = 1$.) Protocol $\mathcal{P}$ and Eq. (2.1) simplify: The first $U$ can be eliminated, because $[\rho, U] = 0$. Why $[\rho, U] = 0$ obviates the unitary will become apparent when we combine $A_\rho$'s into $\tilde{A}_\rho$.
The protocol $\mathcal{P}$ defines $A_\rho$; $\mathcal{P}$ is not a prescription for measuring $A_\rho$. Consider implementing $\mathcal{P}$ many times and gathering statistics about the measurements' outcomes. From the statistics, one can infer the probability $|A_\rho|^2$, not the probability amplitude $A_\rho$. $\mathcal{P}$ is merely the process whose probability amplitude equals $A_\rho$. One must calculate combinations of $A_\rho$'s to calculate the correlator. These combinations, labeled $\tilde{A}_\rho$, can be inferred from weak measurements and interference.
2.2 Combined quantum amplitude $\tilde{A}_\rho$
Combining quantum amplitudes $A_\rho$ yields a quantity $\tilde{A}_\rho$ that is nearly a probability but that differs due to the OTOC's out-of-time ordering. I first define $\tilde{A}_\rho$, which resembles the Kirkwood-Dirac quasiprobability [Kirkwood_33_Quantum, Dirac_45_On, Dressel_15_Weak, BrianDisc]. We gain insight into $\tilde{A}_\rho$ by supposing that $[\rho, \mathcal{W}(t)] = 0$, e.g., that $\rho$ is the infinite-temperature Gibbs state $\mathbb{1}/d$. $\tilde{A}_\rho$ can reduce to a probability in this case, and protocols for measuring $\tilde{A}_\rho$ simplify. I introduce weak-measurement and interference schemes for inferring $\tilde{A}_\rho$ experimentally.
2.2.1 Definition of the combined quantum amplitude
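Consider measuring the probability amplitudes $A_\rho$ associated with all the possible measurement outcomes. Consider fixing an outcome septuple $(w_2, \alpha_{w_2}; v_1, \lambda_{v_1}; w_1, \alpha_{w_1}; j)$. The amplitude $A_\rho(w_2, \alpha_{w_2}; v_1, \lambda_{v_1}; w_1, \alpha_{w_1}; j)$ describes one realization, illustrated in Fig. 0(a), of the protocol $\mathcal{P}$. Call this realization $a$.
Consider the $\mathcal{P}$ realization, labeled $b$, illustrated in Fig. 0(b). The initial and final measurements yield the same outcomes as in $a$ [outcomes $j$ and $(w_2, \alpha_{w_2})$]. Let $(w_3, \alpha_{w_3})$ and $(v_2, \lambda_{v_2})$ denote the outcomes of the second and third measurements in $b$. Realization $b$ corresponds to the probability amplitude $A_\rho(w_2, \alpha_{w_2}; v_2, \lambda_{v_2}; w_3, \alpha_{w_3}; j)$.
Let us complex-conjugate the $b$ amplitude and multiply by the $a$ amplitude. We marginalize over $j$ and over $(w_1, \alpha_{w_1})$, forgetting about the corresponding measurement outcomes:
\[ \tilde{A}_\rho(w, v, \alpha_w, \lambda_v) := \sum_{j, (w_1, \alpha_{w_1})} A_\rho^*(w_2, \alpha_{w_2}; v_2, \lambda_{v_2}; w_3, \alpha_{w_3}; j) \, A_\rho(w_2, \alpha_{w_2}; v_1, \lambda_{v_1}; w_1, \alpha_{w_1}; j) . \]
The shorthand $w$ encapsulates the list $(w_2, w_3)$. The shorthands $v$, $\alpha_w$ and $\lambda_v$ are defined analogously.
Let us substitute in from Eq. (2.1) and invoke $\langle A | B \rangle^* = \langle B | A \rangle$. The sum over $(w_1, \alpha_{w_1})$ evaluates to a resolution of unity. The sum over $j$ evaluates to $\rho$:
\[ \tilde{A}_\rho(w, v, \alpha_w, \lambda_v) = \langle w_3, \alpha_{w_3} | U | v_2, \lambda_{v_2} \rangle \langle v_2, \lambda_{v_2} | U^\dagger | w_2, \alpha_{w_2} \rangle \langle w_2, \alpha_{w_2} | U | v_1, \lambda_{v_1} \rangle \langle v_1, \lambda_{v_1} | \rho U^\dagger | w_3, \alpha_{w_3} \rangle . \tag{2.2.1} \]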
This resembles the Kirkwood-Dirac quasiprobability [Dressel_15_Weak, BrianDisc]. Quasiprobabilities surface in quantum optics and quantum foundations [Carmichael_02_Statistical, Ferrie_11_Quasi]. Quasiprobabilities generalize probabilities to quantum settings. Whereas probabilities remain between 0 and 1, quasiprobabilities can assume negative and nonreal values. Nonclassical values signal quantum phenomena such as entanglement. The best-known quasiprobabilities include the Wigner function, the Glauber-Sudarshan representation, and the Husimi representation. Kirkwood and Dirac defined another quasiprobability in 1933 and in 1945 [Kirkwood_33_Quantum, Dirac_45_On]. Interest in the Kirkwood-Dirac quasiprobability has revived recently. The distribution can assume nonreal values, obeys Bayesian updating, and has been measured experimentally [Lundeen_11_Direct, Lundeen_12_Procedure, Bamber_14_Observing, Mirhosseini_14_Compressive].
The Kirkwood-Dirac distribution for a state $\sigma$ has the form $\langle f | a \rangle \langle a | \sigma | f \rangle$, wherein $\{ |f\rangle\langle f| \}$ and $\{ |a\rangle\langle a| \}$ denote bases for $\mathcal{H}$ [Dressel_15_Weak]. Equation (2.2.1) has the same form, except that it contains more outer products. Marginalizing $\tilde{A}_\rho$ over every variable except one $w_\ell$ [or one $v_\ell$, one $\alpha_{w_\ell}$, or one $\lambda_{v_\ell}$] yields a probability, as does marginalizing the Kirkwood-Dirac distribution over every variable except one. The precise nature of the relationship between $\tilde{A}_\rho$ and the Kirkwood-Dirac quasiprobability is under investigation [BrianDisc]. For now, I harness the similarity to formulate a weak-measurement scheme for $\tilde{A}_\rho$ in Sec. 2.2.3.
$\tilde{A}_\rho$ is nearly a probability: $\tilde{A}_\rho$ results from multiplying a complex-conjugated probability amplitude $A_\rho^*$ by a probability amplitude $A_\rho$. So does the quantum mechanical probability density $p(x) = \psi^*(x) \psi(x)$. Hence the quasiprobability resembles a probability. Yet the argument of the $\psi^*$ equals the argument of the $\psi$. The argument of the $A_\rho^*$ does not equal the argument of the $A_\rho$. This discrepancy stems from the OTOC's out-of-time ordering. $\tilde{A}_\rho$ can be regarded as like a probability, differing due to the out-of-time ordering. $\tilde{A}_\rho$ reduces to a probability under conditions discussed in Sec. 2.2.2. The reduction reinforces the parallel between Theorem 1 and the fluctuation-relation work [TLH_07_Work], which involves a probability distribution that resembles $\tilde{A}_\rho$.
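2.2.2 Simple case, reduction of $\tilde{A}_\rho$ to a probability
Suppose that $\rho$ shares the $\tilde{\mathcal{W}}(t)$ eigenbasis: $\rho = \rho_{\mathcal{W}(t)} := \sum_{w_\ell, \alpha_{w_\ell}} p_{w_\ell, \alpha_{w_\ell}} U^\dagger |w_\ell, \alpha_{w_\ell}\rangle\langle w_\ell, \alpha_{w_\ell}| U$. For example, $\rho$ may be the infinite-temperature Gibbs state $\mathbb{1}/d$. Equation (2.2.1) becomes
\[ \tilde{A}_{\rho_{\mathcal{W}(t)}}(w, v, \alpha_w, \lambda_v) = \langle w_3, \alpha_{w_3} | U | v_2, \lambda_{v_2} \rangle \langle v_2, \lambda_{v_2} | U^\dagger | w_2, \alpha_{w_2} \rangle \langle w_2, \alpha_{w_2} | U | v_1, \lambda_{v_1} \rangle \langle v_1, \lambda_{v_1} | U^\dagger | w_3, \alpha_{w_3} \rangle \, p_{w_3, \alpha_{w_3}} . \tag{2.2.2} \]
The weak-measurement protocol simplifies, as discussed in Sec. 2.2.3.
Equation (2.2.2) reduces to a probability if $(w_3, \alpha_{w_3}) = (w_2, \alpha_{w_2})$ or if $(v_2, \lambda_{v_2}) = (v_1, \lambda_{v_1})$. For example, suppose that $(w_3, \alpha_{w_3}) = (w_2, \alpha_{w_2})$:
\[ \tilde{A}_{\rho_{\mathcal{W}(t)}} = | \langle v_2, \lambda_{v_2} | U^\dagger | w_2, \alpha_{w_2} \rangle |^2 \, | \langle v_1, \lambda_{v_1} | U^\dagger | w_2, \alpha_{w_2} \rangle |^2 \, p_{w_2, \alpha_{w_2}} = p(v_2, \lambda_{v_2} | w_2, \alpha_{w_2}) \, p(v_1, \lambda_{v_1} | w_2, \alpha_{w_2}) \, p_{w_2, \alpha_{w_2}} . \tag{7} \]
The $p_{w_2, \alpha_{w_2}}$ denotes the probability that preparing $\rho$ and measuring $\tilde{\mathcal{W}}(t)$ will yield $(w_2, \alpha_{w_2})$. Each $p(v_\ell, \lambda_{v_\ell} | w_2, \alpha_{w_2})$ denotes the conditional probability that preparing $|w_2, \alpha_{w_2}\rangle$, backward-evolving under $U^\dagger$, and measuring $\tilde{V}$ will yield $(v_\ell, \lambda_{v_\ell})$. Hence the combination $\tilde{A}_\rho$ of probability amplitudes is nearly a probability: $\tilde{A}_\rho$ reduces to a probability under simplifying conditions.
Equation (7) strengthens the analogy between Theorem 1 and the fluctuation relation in [TLH_07_Work]. Equation (10) in [TLH_07_Work] contains a conditional probability multiplied by a probability. These probabilities parallel the $p(v_1, \lambda_{v_1} | w_2, \alpha_{w_2})$ and $p_{w_2, \alpha_{w_2}}$ in Eq. (7). Equation (7) contains another conditional probability, $p(v_2, \lambda_{v_2} | w_2, \alpha_{w_2})$, due to the OTOC's out-of-time ordering.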
2.2.3 Weak-measurement scheme for the combined quantum amplitude
$\tilde{A}_\rho$ is related to the Kirkwood-Dirac quasiprobability, which has been inferred from weak measurements [Dressel_14_Understanding, Kofman_12_Nonperturbative, Lundeen_11_Direct, Lundeen_12_Procedure, Bamber_14_Observing, Mirhosseini_14_Compressive]. I sketch a weak-measurement scheme for inferring $\tilde{A}_\rho$. Details appear in Appendix F.1.
Let $\mathcal{P}_{\rm weak}$ denote the following protocol:
1. Prepare $\rho$.
2. Couple the system's $\tilde{V}$ weakly to an ancilla $\mathcal{A}_a$. Measure $\mathcal{A}_a$ strongly.
3. Evolve $S$ forward under $U$.
4. Couple the system's $\tilde{\mathcal{W}}$ weakly to an ancilla $\mathcal{A}_b$. Measure $\mathcal{A}_b$ strongly.
5. Evolve $S$ backward under $U^\dagger$.
6. Couple the system's $\tilde{V}$ weakly to an ancilla $\mathcal{A}_c$. Measure $\mathcal{A}_c$ strongly.
7. Evolve $S$ forward under $U$.
8. Measure $\tilde{\mathcal{W}}$ strongly (e.g., projectively).
Consider performing $\mathcal{P}_{\rm weak}$ many times. From the measurement statistics, one can infer the form of $\tilde{A}_\rho(w, v, \alpha_w, \lambda_v)$.
$\mathcal{P}_{\rm weak}$ offers an experimental challenge: Concatenating weak measurements raises the number of trials required to infer a quasiprobability. The challenge might be met with modifications to existing set-ups (e.g., [White_16_Preserving, Dressel_14_Implementing]). Additionally, $\mathcal{P}_{\rm weak}$ simplifies in the case discussed in Sec. 2.2.2: if $\rho$ shares the $\tilde{\mathcal{W}}(t)$ eigenbasis, e.g., if $\rho = \mathbb{1}/d$. The number of weak measurements reduces from three to two. Appendix F.1 contains details.
2.2.4 Interference-based measurement of $\tilde{A}_\rho$
$\tilde{A}_\rho$ can be inferred not only from weak measurement, but also from interference. In certain cases (if $\rho$ shares neither the $\tilde{\mathcal{W}}(t)$ nor the $\tilde{V}$ eigenbasis), quantum state tomography is needed as well. From interference, one infers the inner products $\langle a | \mathcal{U} | b \rangle$ in $\tilde{A}_\rho$. Eigenstates of $\tilde{\mathcal{W}}$ and $\tilde{V}$ are labeled by $a$ and $b$; and $\mathcal{U} = U, U^\dagger$. The matrix element $\langle v_1, \lambda_{v_1} | \rho U^\dagger | w_3, \alpha_{w_3} \rangle$ is inferred from quantum state tomography in certain cases.
The interference scheme proceeds as follows. An ancilla $\mathcal{A}$ is prepared in a superposition $\frac{1}{\sqrt{2}} ( |0\rangle + |1\rangle )$. The system $S$ is prepared in a fiducial state $|f\rangle$. The ancilla controls a conditional unitary on $S$: If $\mathcal{A}$ is in state $|0\rangle$, $S$ is rotated to $\mathcal{U} |b\rangle$. If $\mathcal{A}$ is in $|1\rangle$, $S$ is rotated to $|a\rangle$. The ancilla's state is rotated about the $x$-axis [if the imaginary part $\Im ( \langle a | \mathcal{U} | b \rangle )$ is being inferred] or about the $y$-axis [if the real part $\Re ( \langle a | \mathcal{U} | b \rangle )$ is being inferred]. The ancilla's $\sigma^z$ and the system's $\{ |a\rangle \}$ are measured. The outcome probabilities imply the value of $\langle a | \mathcal{U} | b \rangle$. Details appear in Appendix F.2.
The time parameter $t$ need not be negated in any implementation of the protocol. The absence of time reversal has been regarded as beneficial in OTOC-measurement schemes [Yao_16_Interferometric, Zhu_16_Measurement], as time reversal can be difficult to implement.
Interference and weak measurement have been performed with cold atoms [Smith_04_Continuous], which have been proposed as platforms for realizing scrambling and quantum chaos [Swingle_16_Measuring, Yao_16_Interferometric, Danshita_16_Creating]. Yet cold atoms are not necessary for measuring $\tilde{A}_\rho$. The measurement schemes in this paper are platform-nonspecific.
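2.3 Measurable random variables $W$ and $W'$
The combined quantum amplitude $\tilde{A}_\rho$ is defined in terms of two realizations of the protocol $\mathcal{P}$. The realizations yield measurement outcomes $w_2$, $w_3$, $v_1$, and $v_2$. Consider complex-conjugating two outcomes: $w_3 \mapsto w_3^*$, and $v_2 \mapsto v_2^*$. The four values are combined into
\[ W := w_3^* v_2^* \quad \text{and} \quad W' := w_2 v_1 . \]
Suppose, for example, that $\mathcal{W}$ and $V$ denote single-qubit Paulis. $(W, W')$ can equal $(1, 1)$, $(1, -1)$, $(-1, 1)$, or $(-1, -1)$. $W$ and $W'$ function analogously to the thermodynamic work in Jarzynski's Equality: $W$, $W'$, and work are random variables calculable from measurement outcomes.
2.4 Complex distribution function $P(W, W')$
Jarzynski's Equality depends on a probability distribution $P(W)$. I define an analog $P(W, W')$ in terms of the combined quantum amplitude $\tilde{A}_\rho$.
Consider fixing $W$ and $W'$. For example, let $(W, W') = (1, -1)$. Consider the set of all possible outcome octuples $(w_2, \alpha_{w_2}; w_3, \alpha_{w_3}; v_1, \lambda_{v_1}; v_2, \lambda_{v_2})$ that satisfy the constraints $W = w_3^* v_2^*$ and $W' = w_2 v_1$. Each octuple corresponds to a combined quantum amplitude $\tilde{A}_\rho(w, v, \alpha_w, \lambda_v)$. These $\tilde{A}_\rho$'s are summed, subject to the constraints:
\[ P(W, W') := \sum_{w, v, \alpha_w, \lambda_v} \tilde{A}_\rho(w, v, \alpha_w, \lambda_v) \, \delta_{W (w_3^* v_2^*)} \, \delta_{W' (w_2 v_1)} . \]
The Kronecker delta is denoted by $\delta_{ab}$.
$P(W, W')$ resembles a joint probability distribution. Summing any function $f(W, W')$ with weights $P(W, W')$ yields the average-like quantity
\[ \langle f(W, W') \rangle := \sum_{W, W'} f(W, W') \, P(W, W') . \]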
The above definitions feature in the Jarzynski-like equality for the OTOC.
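Theorem 1. The out-of-time-ordered correlator obeys the Jarzynski-like equality
\[ F(t) = \frac{\partial^2}{\partial \beta \, \partial \beta'} \left\langle e^{-(\beta W + \beta' W')} \right\rangle \Big|_{\beta, \beta' = 0} , \tag{11} \]
wherein $\beta, \beta' \in \mathbb{R}$.
The derivation of Eq. (11) is inspired by [TLH_07_Work]. Talkner et al. cast Jarzynski's Equality in terms of a time-ordered correlator of two exponentiated Hamiltonians. Those authors invoke the characteristic function
\[ G(s) := \int dW \, e^{isW} P(W) , \]
the Fourier transform of the probability distribution $P(W)$. The integration variable $s$ is regarded as an imaginary inverse temperature: $is = -\beta$. We analogously invoke the (discrete) Fourier transform of $P(W, W')$:
\[ G(s, s') := \sum_{W} \sum_{W'} e^{isW} e^{is'W'} P(W, W') , \]
wherein $is = -\beta$ and $is' = -\beta'$.
Substituting in the definition of $P(W, W')$ and Eq. (2.2.1) yields
\[ G(s, s') = \sum_{w, v, \alpha_w, \lambda_v} e^{is w_3^* v_2^*} e^{is' w_2 v_1} \langle w_3, \alpha_{w_3} | U | v_2, \lambda_{v_2} \rangle \langle v_2, \lambda_{v_2} | U^\dagger | w_2, \alpha_{w_2} \rangle \langle w_2, \alpha_{w_2} | U | v_1, \lambda_{v_1} \rangle \sum_j p_j \langle v_1, \lambda_{v_1} | j \rangle \langle j | U^\dagger | w_3, \alpha_{w_3} \rangle . \]
The $\langle v_1, \lambda_{v_1} | \rho U^\dagger | w_3, \alpha_{w_3} \rangle$ in Eq. (2.2.1) has been replaced with $\sum_j p_j \langle v_1, \lambda_{v_1} | j \rangle \langle j | U^\dagger | w_3, \alpha_{w_3} \rangle$, wherein $\rho = \sum_j p_j |j\rangle\langle j|$.
The sum over $(w_3, \alpha_{w_3})$ is recast as a trace. Under the trace's protection, $\rho$ is shifted to the argument's left-hand side. The other sums and the exponentials are distributed across the product:
\[ G(s, s') = \mathrm{Tr} \Big( \rho \Big[ \sum_{w_3, \alpha_{w_3}} U^\dagger |w_3, \alpha_{w_3}\rangle\langle w_3, \alpha_{w_3}| U \sum_{v_2, \lambda_{v_2}} e^{is w_3^* v_2^*} |v_2, \lambda_{v_2}\rangle\langle v_2, \lambda_{v_2}| \Big] \Big[ \sum_{w_2, \alpha_{w_2}} U^\dagger |w_2, \alpha_{w_2}\rangle\langle w_2, \alpha_{w_2}| U \sum_{v_1, \lambda_{v_1}} e^{is' w_2 v_1} |v_1, \lambda_{v_1}\rangle\langle v_1, \lambda_{v_1}| \Big] \Big) . \]
The $(v_1, \lambda_{v_1})$ and $(v_2, \lambda_{v_2})$ sums are eigendecompositions of exponentials of unitaries:
\[ G(s, s') = \mathrm{Tr} \Big( \rho \Big[ \sum_{w_3, \alpha_{w_3}} U^\dagger |w_3, \alpha_{w_3}\rangle\langle w_3, \alpha_{w_3}| U \, e^{is w_3^* V^\dagger} \Big] \Big[ \sum_{w_2, \alpha_{w_2}} U^\dagger |w_2, \alpha_{w_2}\rangle\langle w_2, \alpha_{w_2}| U \, e^{is' w_2 V} \Big] \Big) . \]
The unitaries time-evolve the $\mathcal{W}$'s: Each $U^\dagger |w_\ell, \alpha_{w_\ell}\rangle\langle w_\ell, \alpha_{w_\ell}| U$ projects onto an eigenstate of $\mathcal{W}(t)$, so $\sum_{w_\ell, \alpha_{w_\ell}} w_\ell \, U^\dagger |w_\ell, \alpha_{w_\ell}\rangle\langle w_\ell, \alpha_{w_\ell}| U = \mathcal{W}(t)$, and the analogous sum weighted by $w_\ell^*$ equals $\mathcal{W}^\dagger(t)$.
We differentiate with respect to $is' = -\beta'$ and with respect to $is = -\beta$. Then, we take the limit as $\beta, \beta' \to 0$:
\[ \frac{\partial^2}{\partial \beta \, \partial \beta'} G(i\beta, i\beta') \Big|_{\beta, \beta' = 0} = \mathrm{Tr} \big( \rho \, \mathcal{W}^\dagger(t) V^\dagger \mathcal{W}(t) V \big) = F(t) . \]
Since $G(i\beta, i\beta') = \langle e^{-(\beta W + \beta' W')} \rangle$, Eq. (11) follows.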
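The content of Theorem 1 can be checked numerically on a toy model: the second derivative of the characteristic function at $\beta = \beta' = 0$ equals $\sum_{W, W'} W W' P(W, W')$, which should coincide with the directly computed trace. The sketch below assumes a single qubit, $U = e^{-it\sigma^x}$, nondegenerate $\mathcal{W} = V = \sigma^z$, and $\rho = \mathrm{diag}(0.7, 0.3)$. These choices and all helper names are illustrative assumptions, not part of the derivation.

```python
import math
from itertools import product

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def dagger(a):
    return [[complex(a[j][i]).conjugate() for j in range(2)] for i in range(2)]

def elem(bra, op, ket):
    """Matrix element <bra|op|ket>."""
    return sum(complex(bra[i]).conjugate() * op[i][j] * ket[j]
               for i in range(2) for j in range(2))

t = 0.3  # illustrative evolution time (assumption)
U = [[math.cos(t), -1j * math.sin(t)],
     [-1j * math.sin(t), math.cos(t)]]  # exp(-i t sigma_x)
U_DAG = dagger(U)
SZ = [[1, 0], [0, -1]]                  # W = V = sigma_z (assumption)
RHO = [[0.7, 0], [0, 0.3]]              # illustrative state (assumption)
EIG = {1: [1, 0], -1: [0, 1]}           # sigma_z eigenpairs
RHO_U_DAG = matmul(RHO, U_DAG)

def quasi(v1, w2, v2, w3):
    """<w3|U|v2> <v2|U^dag|w2> <w2|U|v1> <v1|rho U^dag|w3>."""
    return (elem(EIG[w3], U, EIG[v2]) * elem(EIG[v2], U_DAG, EIG[w2])
            * elem(EIG[w2], U, EIG[v1]) * elem(EIG[v1], RHO_U_DAG, EIG[w3]))

# Complex distribution P(W, W') with W = w3* v2* and W' = w2 v1.
# The eigenvalues are real here, so the conjugations are trivial.
P = {}
for v1, w2, v2, w3 in product((1, -1), repeat=4):
    key = (w3 * v2, w2 * v1)
    P[key] = P.get(key, 0) + quasi(v1, w2, v2, w3)

norm = sum(P.values())
otoc_from_P = sum(W * Wp * p for (W, Wp), p in P.items())

# Direct evaluation of Tr(rho W^dag(t) V^dag W(t) V) for comparison.
W_t = matmul(matmul(U_DAG, SZ), U)
direct = RHO
for op in (dagger(W_t), SZ, W_t, SZ):   # V^dag = V for sigma_z
    direct = matmul(direct, op)
otoc_direct = direct[0][0] + direct[1][1]
print(norm, otoc_from_P, otoc_direct)
```

The individual values of `P` need not be real or nonnegative, yet they sum to one and their second moment reproduces the correlator.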
Theorem 1 resembles Jarzynski's fluctuation relation in several ways. Jarzynski's Equality encodes a scheme for measuring the difficult-to-calculate $\Delta F$ from realizable nonequilibrium trials. Theorem 1 encodes a scheme for measuring the difficult-to-calculate $F(t)$ from realizable nonequilibrium trials. $\Delta F$ depends on just a temperature and two Hamiltonians. Similarly, the conventional $F(t)$ (defined with respect to $\rho = e^{-H/T}/Z$) depends on just a temperature, a Hamiltonian, and two unitaries. Jarzynski relates $\Delta F$ to the characteristic function of a probability distribution. Theorem 1 relates $F(t)$ to (a moment of) the characteristic function of a (complex) distribution.
The complex distribution, $P(W, W')$, is a combination of probability amplitudes $\tilde{A}_\rho$ related to quasiprobabilities. The distribution in Jarzynski's Equality is a combination of probabilities. The quasiprobability-vs.-probability contrast fittingly arises from the OTOC's out-of-time ordering. $F(t)$ signals quantum behavior (noncommutation), as quasiprobabilities signal quantum behaviors (e.g., entanglement). Time-ordered correlators similar to $F(t)$ track only classical behaviors and are moments of (summed) classical probabilities [BrianDisc]. OTOCs that encode more time reversals than $F(t)$ are moments of combined quasiprobability-like distributions lengthier than $\tilde{A}_\rho$ [BrianDisc].
The Jarzynski-like equality for the out-of-time correlator combines an important tool from nonequilibrium statistical mechanics with an important tool from quantum information, high-energy theory, and condensed matter. The union opens all these fields to new modes of analysis.
For example, Theorem 1 relates the OTOC to a combined quantum amplitude $\tilde{A}_\rho$. This $\tilde{A}_\rho$ is closely related to a quasiprobability. The OTOC and quasiprobabilities have signaled nonclassical behaviors in distinct settings: in high-energy theory and condensed matter and in quantum optics, respectively. The relationship between OTOCs and quasiprobabilities merits study: What is the relationship's precise nature? How does $\tilde{A}_\rho$ behave over time scales during which $F(t)$ exhibits known behaviors (e.g., until the dissipation time or from the dissipation time to the scrambling time [Swingle_16_Measuring])? Under what conditions does $\tilde{A}_\rho$ behave nonclassically (assume negative or nonreal values)? How does a chaotic system's $\tilde{A}_\rho$ look? These questions are under investigation [BrianDisc].
As another example, fluctuation relations have been used to estimate the free-energy difference $\Delta F$ from experimental data. Experimental measurements of $F(t)$ are possible for certain platforms, in certain regimes [Swingle_16_Measuring, Yao_16_Interferometric, Zhu_16_Measurement, Li_16_Measuring, Garttner_16_Measuring]. Theorem 1 expands the set of platforms and regimes. Measuring quantum amplitudes, as via weak measurements [Lundeen_11_Direct, Lundeen_12_Procedure, Bamber_14_Observing, Mirhosseini_14_Compressive], now offers access to $F(t)$. Inferring small systems' $\tilde{A}_\rho$'s with existing platforms [White_16_Preserving] might offer a challenge for the near future.
Finally, Theorem 1 can provide a new route to bounding $F(t)$. A Lyapunov exponent $\lambda_{\rm L}$ governs the chaotic decay of $F(t)$. The exponent has been bounded, including with Lieb-Robinson bounds and complex analysis [Maldacena_15_Bound, Lashkari_13_Towards, Kitaev_15_Simple]. The right-hand side of Eq. (11) can provide an independent bounding method that offers new insights.
5 Technical introduction
This review consists of three parts. In Sec. 5.1, we overview the KD quasiprobability. Section 5.2 introduces our set-up and notation. In Sec. 5.3, we review the OTOC and its quasiprobability $\tilde{A}_\rho$. We also overview the weak-measurement and interference schemes for measuring $\tilde{A}_\rho$ and $F(t)$.
The quasiprobability section (5.1) provides background for quantum-information, high-energy, and condensed-matter readers. The OTOC section (5.3) targets quasiprobability and weak-measurement readers. We encourage all readers to study the set-up (5.2), as well as $\tilde{A}_\rho$ and the schemes for measuring $\tilde{A}_\rho$ (5.4).
5.1 The KD quasiprobability in quantum optics
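The Kirkwood-Dirac quasiprobability is defined as follows. Let $S$ denote a quantum system associated with a Hilbert space $\mathcal{H}$. Let $\{ |a\rangle \}$ and $\{ |f\rangle \}$ denote orthonormal bases for $\mathcal{H}$. Let $\mathcal{B}(\mathcal{H})$ denote the set of bounded operators defined on $\mathcal{H}$, and let $\mathcal{O} \in \mathcal{B}(\mathcal{H})$. The KD quasiprobability
\[ \tilde{A}^{(1)}_{\mathcal{O}}(a, f) := \langle f | a \rangle \langle a | \mathcal{O} | f \rangle , \tag{25} \]
regarded as a function of $a$ and $f$, contains all the information in $\mathcal{O}$, if $\langle a | f \rangle \neq 0$ for all $a, f$. Density operators $\mathcal{O} = \rho$ are often focused on in the literature and in this paper. This section concerns the context, structure, and applications of $\tilde{A}^{(1)}_{\mathcal{O}}(a, f)$.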
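A small numerical example may make the definition concrete. The sketch below evaluates $\langle f | a \rangle \langle a | \rho | f \rangle$ for a single qubit, taking $\{ |a\rangle \}$ to be the $\sigma^z$ eigenbasis, $\{ |f\rangle \}$ to be the $\sigma^x$ eigenbasis, and $\rho$ to be a $\sigma^y$ eigenstate. These choices, and the helper names, are illustrative assumptions. The values come out nonreal, yet they sum to one and their marginals are ordinary probabilities.

```python
import math

s = 1 / math.sqrt(2)
A_BASIS = {"0": [1, 0], "1": [0, 1]}      # sigma_z eigenbasis (assumption)
F_BASIS = {"+": [s, s], "-": [s, -s]}     # sigma_x eigenbasis (assumption)
psi = [s, 1j * s]                         # sigma_y eigenstate (assumption)
RHO = [[psi[i] * complex(psi[j]).conjugate() for j in range(2)]
       for i in range(2)]                 # rho = |psi><psi|

def inner(bra, ket):
    """Inner product <bra|ket>."""
    return sum(complex(x).conjugate() * y for x, y in zip(bra, ket))

def elem(bra, op, ket):
    """Matrix element <bra|op|ket>."""
    return sum(complex(bra[i]).conjugate() * op[i][j] * ket[j]
               for i in range(2) for j in range(2))

def kd(a, f):
    """KD quasiprobability <f|a><a|rho|f>."""
    return inner(F_BASIS[f], A_BASIS[a]) * elem(A_BASIS[a], RHO, F_BASIS[f])

table = {(a, f): kd(a, f) for a in A_BASIS for f in F_BASIS}
total = sum(table.values())
# Marginalizing over f leaves <a|rho|a>, an ordinary probability.
marginal_a = {a: sum(table[(a, f)] for f in F_BASIS) for a in A_BASIS}

for key, value in table.items():
    print(key, value)
print("total:", total, "marginals:", marginal_a)
```

For these choices, the entry for $(a, f) = (0, +)$ equals $(1 - i)/4$, illustrating that KD values can leave the real line even though each marginal is a valid probability.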
We set the stage with phase-space representations of quantum mechanics, alternative quasiprobabilities, and historical background. Equation (25) facilitates retrodiction, or inference about the past, reviewed in Sec. 5.1.2. How to decompose an operator in terms of KD-quasiprobability values appears in Sec. 5.1.3. The quasiprobability has mathematical properties reviewed in Sec. 5.1.4.
Much of this section parallels Sec. 9, our theoretical investigation of the OTOC quasiprobability. More background appears in [Dressel_15_Weak].
5.1.1 Phase-space representations, alternative quasiprobabilities, and history
Phase-space distributions form a mathematical toolkit applied in Liouville mechanics [Landau_80_Statistical]. Let $S$ denote a system of $6N$ degrees of freedom (DOFs). An example system consists of $N$ particles, lacking internal DOFs, in a three-dimensional space. We index the particles with $i$ and let $\alpha = x, y, z$. The $\alpha^{\rm th}$ component $q_i^\alpha$ of particle $i$'s position is conjugate to the $\alpha^{\rm th}$ component $p_i^\alpha$ of the particle's momentum. The variables $q_i^\alpha$ and $p_i^\alpha$ label the axes of phase space.
Suppose that the system contains many DOFs: $N \gg 1$. Tracking all the DOFs is difficult. Which phase-space point $S$ occupies, at any instant, may be unknown. The probability that, at time $t$, $S$ occupies an infinitesimal volume element localized at $(q_1^x, \ldots, p_N^z)$ is $\rho ( \{ q_i^\alpha \}, \{ p_i^\alpha \}; t ) \, d^{3N}q \, d^{3N}p$. The phase-space distribution $\rho ( \{ q_i^\alpha \}, \{ p_i^\alpha \}; t )$ is a probability density.
$q_i^\alpha$ and $p_i^\alpha$ seem absent from quantum mechanics (QM), prima facie. Most introductions to QM cast quantum states in terms of operators, Dirac kets $|\psi\rangle$, and wave functions $\psi(x)$. Classical variables are relegated to measurement outcomes and to the classical limit. Wigner, Moyal, and others represented QM in terms of phase space [Carmichael_02_Statistical]. These representations are used most in quantum optics.
In such a representation, a quasiprobability density replaces the statistical-mechanical probability density $\rho$. [Footnote 4: We will focus on discrete quantum systems, motivated by a spin-chain example. Discrete systems are governed by quasiprobabilities, which resemble probabilities. Continuous systems are governed by quasiprobability densities, which resemble probability densities. Our quasiprobabilities can be replaced with quasiprobability densities, and our sums can be replaced with integrals, in, e.g., quantum field theory.] Yet quasiprobabilities violate axioms of probability [Ferrie_11_Quasi]. Probabilities are nonnegative, for example. Quasiprobabilities can assume negative values, associated with nonclassical physics such as contextuality [Spekkens_08_Negativity, Ferrie_11_Quasi, Kofman_12_Nonperturbative, Dressel_14_Understanding, Dressel_15_Weak, Delfosse_15_Wigner], and nonreal values. Relaxing different axioms leads to different quasiprobabilities. Different quasiprobabilities correspond also to different orderings of noncommutative operators [Dirac_45_On]. The best-known quasiprobabilities include the Wigner function, the Glauber-Sudarshan representation, and the Husimi function [Carmichael_02_Statistical].
The KD quasiprobability resembles a little brother of theirs, whom hardly anyone has heard of [Banerji_07_Exploring]. Kirkwood and Dirac defined the quasiprobability independently in 1933 [Kirkwood_33_Quantum] and 1945 [Dirac_45_On]. Their finds remained under the radar for decades. Rihaczek rediscovered the distribution in 1968, in classical-signal processing [Rihaczek_68_Signal, Cohen_89_Time]. (The KD quasiprobability is sometimes called “the Kirkwood-Rihaczek distribution.”) The quantum community’s attention has revived recently. Reasons include experimental measurements, mathematical properties, and applications to retrodiction and state decompositions.
5.1.2 Bayes-type theorem and retrodiction with the KD quasiprobability
Prediction is inference about the future. Retrodiction is inference about the past. One uses the KD quasiprobability to infer about a time $t'$, using information about an event that occurred before $t'$ and information about an event that occurred after $t'$. This forward-and-backward propagation evokes the OTOC's out-of-time ordering.
We borrow notation from, and condense the explanation in, [Dressel_15_Weak]. Let $S$ denote a discrete quantum system. Consider preparing $S$ in a state $|i\rangle$ at time $t = 0$. Suppose that $S$ evolves under a time-independent Hamiltonian that generates the family $U_t$ of unitaries. Let $F$ denote an observable measured at time $t'' > 0$. Let $F = \sum_f f \, |f\rangle\langle f|$ be the eigendecomposition, and let $f$ denote the outcome.
Let $\mathcal{A} = \sum_a a \, |a\rangle\langle a|$ be the eigendecomposition of an observable that fails to commute with $F$. Let $t'$ denote a time in $(0, t'')$. Which value can we most reasonably attribute to the system's time-$t'$ $\mathcal{A}$, knowing that $S$ was prepared in $|i\rangle$ and that the final measurement yielded $f$?
Propagating the initial state forward to time $t'$ yields $|i'\rangle := U_{t'} |i\rangle$. Propagating the final state backward yields $|f'\rangle := U^\dagger_{t'' - t'} |f\rangle$. Our best guess about $\mathcal{A}$ is the weak value [Ritchie_91_Realization, Hall_01_Exact, Johansen_04_Nonclassical, Hall_04_Prior, Pryde_05_Measurement, Dressel_11_Experimentals, Groen_13_Partial]
\[ \mathcal{A}_{\rm weak}(i, f) := \Re \left( \frac{ \langle f' | \mathcal{A} | i' \rangle }{ \langle f' | i' \rangle } \right) . \tag{26} \]
The real part of a complex number $z$ is denoted by $\Re(z)$. The guess's accuracy is quantified with a distance metric (Sec. 9.2) and with comparisons to weak-measurement data.
Aharonov et al. discovered weak values in 1988 [Aharonov_88_How]. Weak values can be anomalous, or strange: $\mathcal{A}_{\rm weak}$ can exceed the greatest eigenvalue $a_{\max}$ of $\mathcal{A}$ and can dip below the least eigenvalue $a_{\min}$. Anomalous weak values concur with negative quasiprobabilities and nonclassical physics [Kofman_12_Nonperturbative, Dressel_14_Understanding, Pusey_14_Anomalous, Dressel_15_Weak, Waegell_16_Confined]. Debate has surrounded weak values' role in quantum mechanics [Ferrie_14_How, Vaidman_14_Comment, Cohen_14_Comment, Aharonov_14, Sokolovski_14_Comment, Brodutch_15_Comment, Ferrie_15_Ferrie].
The weak value $\mathcal{A}_{\rm weak}$, we will show, depends on the KD quasiprobability. We replace the $\mathcal{A}$ in Eq. (26) with its eigendecomposition. Factoring out the eigenvalues yields
\[ \mathcal{A}_{\rm weak}(i, f) = \sum_a a \, \Re \left( \frac{ \langle f' | a \rangle \langle a | i' \rangle }{ \langle f' | i' \rangle } \right) . \tag{27} \]
The weight $\Re(\cdot)$ is a conditional quasiprobability. It resembles a conditional probability: the likelihood that, if $|i\rangle$ was prepared and the measurement yielded $f$, $a$ is the value most reasonably attributable to $\mathcal{A}$. Multiplying and dividing the argument by $\langle i' | f' \rangle$ yields
\[ \tilde{p}(a | i, f) := \frac{ \Re \left( \langle f' | a \rangle \langle a | i' \rangle \langle i' | f' \rangle \right) }{ | \langle f' | i' \rangle |^2 } . \tag{28} \]
Substituting into Eq. (27) yields
\[ \mathcal{A}_{\rm weak}(i, f) = \sum_a a \, \tilde{p}(a | i, f) . \tag{29} \]
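The link between anomalous weak values and negative conditional quasiprobabilities can be seen in a two-level example. The sketch below assumes $\mathcal{A} = \sigma^z$ and nearly orthogonal propagated states $|i'\rangle = (\cos\theta, \sin\theta)$ and $|f'\rangle = (\cos\theta, -\sin\theta)$ with $\theta = 0.7$. These states and the helper names are illustrative assumptions. For them, the weak value equals $1/\cos(2\theta)$, which exceeds the greatest eigenvalue, 1.

```python
import math

theta = 0.7  # illustrative angle (assumption); close to pi/4
i_state = [math.cos(theta), math.sin(theta)]    # |i'>
f_state = [math.cos(theta), -math.sin(theta)]   # |f'>
EIG = {1: [1, 0], -1: [0, 1]}                   # sigma_z eigenpairs

def inner(bra, ket):
    """Inner product <bra|ket>."""
    return sum(complex(x).conjugate() * y for x, y in zip(bra, ket))

overlap = inner(f_state, i_state)               # <f'|i'> = cos(2 theta)

# Weak value: Re( <f'|A|i'> / <f'|i'> ), with A expanded in its eigenbasis.
weak_value = (sum(a * inner(f_state, v) * inner(v, i_state)
                  for a, v in EIG.items()) / overlap).real

def cond_quasi(a):
    """Conditional quasiprobability Re(<f'|a><a|i'><i'|f'>) / |<f'|i'>|^2."""
    v = EIG[a]
    num = inner(f_state, v) * inner(v, i_state) * inner(i_state, f_state)
    return num.real / abs(overlap) ** 2

quasi = {a: cond_quasi(a) for a in EIG}
average = sum(a * p for a, p in quasi.items())
print("weak value:", weak_value)
print("conditional quasiprobabilities:", quasi)
```

The quasiprobabilities sum to one, but the weight on the eigenvalue $-1$ is negative, which is what allows their average to leave the interval $[-1, 1]$.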
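Equation (29) illustrates why negative quasiprobabilities concur with anomalous weak values. Suppose that $\tilde{p}(a | i, f) \geq 0 \; \forall a$. The triangle inequality, followed by the Cauchy-Schwarz inequality, implies
\begin{align}
| \mathcal{A}_{\rm weak}(i, f) | &= \Big| \sum_a a \, \tilde{p}(a | i, f) \Big| \nonumber \\
&\leq \sum_a | a \, \tilde{p}(a | i, f) | \nonumber \\
&\leq | a_{\max} | \sum_a | \tilde{p}(a | i, f) | \tag{32} \\
&= | a_{\max} | \sum_a \tilde{p}(a | i, f) \nonumber \\
&= | a_{\max} | . \nonumber
\end{align}
The penultimate equality follows from $\tilde{p}(a | i, f) \geq 0$. Suppose, now, that the quasiprobability contains a negative value $\tilde{p}(a_- | i, f) < 0$. The distribution remains normalized. Hence the rest of the $\tilde{p}$ values sum to $> 1$. The RHS of (32) exceeds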