## References

- [1] C. G. Knott. Life and Scientific Work of Peter Guthrie Tait. Cambridge University Press, Cambridge, United Kingdom, 1911. Letter from Maxwell to Tait, 11 December 1867, quoted herein pp. 213-214.
- [2] R. Landauer. Irreversibility and heat generation in the computing process. IBM J. Res. Develop., 5(3):183–191, 1961.
- [3] C. H. Bennett. Thermodynamics of computation—A review. Intl. J. Theo. Phys., 21:905, 1982.
- [4] J. M. R. Parrondo, J. M. Horowitz, and T. Sagawa. Thermodynamics of information. Nature Physics, 11(2):131–139, 2015.
- [5] A. B. Boyd, D. Mandal, and J. P. Crutchfield. Leveraging environmental correlations: The thermodynamics of requisite variety. J. Stat. Phys., 167(6):1555–1585, 2016.
- [6] D. Mandal and C. Jarzynski. Work and information processing in a solvable model of Maxwell’s demon. Proc. Natl. Acad. Sci. USA, 109(29):11641–11645, 2012.
- [7] A. B. Boyd and J. P. Crutchfield. Maxwell demon dynamics: Deterministic chaos, the Szilard map, and the intelligence of thermodynamic systems. Phys. Rev. Lett., 116:190601, 2016.
- [8] G. N. Bochkov and Y. E. Kuzovlev. Nonlinear fluctuation-dissipation relations and stochastic models in nonequilibrium thermodynamics: I. generalized fluctuation-dissipation theorem. Physica A: Stat. Mech. App., 106(3):443–479, 1981.
- [9] D. J. Evans and D. J. Searles. Equilibrium microstates which generate second law violating steady states. Phys. Rev. E, 50(2):1645–1648, 1994.
- [10] C. Jarzynski. Nonequilibrium equality for free energy differences. Phys. Rev. Lett., 78(14):2690–2693, 1997.
- [11] G. E. Crooks. Nonequilibrium measurements of free energy differences for microscopically reversible Markovian systems. J. Stat. Phys., 90(5/6):1481–1487, 1998.
- [12] G. E. Crooks. Entropy production fluctuation theorem and the nonequilibrium work relation for free energy differences. Phys. Rev. E, 60:2721, 1999.
- [13] U. Seifert. Stochastic thermodynamics, fluctuation theorems and molecular machines. Rep. Prog. Phys., 75:126001, 2012.
- [14] R. Klages, W. Just, and C. Jarzynski, editors. Nonequilibrium Statistical Physics of Small Systems: Fluctuation Relations and Beyond. Wiley, New York, 2013.
- [15] P. Maragakis, M. Spichty, and M. Karplus. A differential fluctuation theorem. J. Phys. Chem. B, 112(19):6168–6174, 2008.
- [16] I. Junier, A. Mossa, M. Manosas, and F. Ritort. Recovery of free energy branches in single molecule experiments. Phys. Rev. Lett., 102(7):070602, 2009.
- [17] A. Alemany, A. Mossa, I. Junier, and F. Ritort. Experimental free-energy measurements of kinetic molecular states using fluctuation theorems. Nature Physics, 8:688–694, 2012.
- [18] B. Lambson, D. Carlton, and J. Bokor. Exploring the thermodynamic limits of computation in integrated systems: Magnetic memory, nanomagnetic logic, and the Landauer limit. Phys. Rev. Lett., 107:010604, 2011.
- [19] A. Berut, A. Petrosyan, and S. Ciliberto. Detailed Jarzynski equality applied to a logically irreversible procedure. Euro. Phys. Let., 103:60002, 2013.
- [20] M. Madami, M. d’Aquino, G. Gubbiotti, S. Tacchi, C. Serpico, and G. Carlotti. Micromagnetic study of minimum-energy dissipation during Landauer erasure of either isolated or coupled nanomagnetic switches. Phys. Rev. B, 90:104405, 2014.
- [21] Y. Jun, M. Gavrilov, and J. Bechhoefer. High-precision test of Landauer’s principle. Phys. Rev. Lett., 113:190601, 2014.
- [22] A. Berut, A. Petrosyan, and S. Ciliberto. Information and thermodynamics: Experimental verification of Landauer’s erasure principle. J. Stat. Mech.: Theory and Experiment, 2015(6):P06015, 2015.
- [23] J. Hong, B. Lambson, S. Dhuey, and J. Bokor. Experimental test of Landauer’s principle in single-bit operations on nanomagnetic memory bits. Sci. Adv., 2:e1501492, 2016.
- [24] J. V. Koski, A. Kutvonen, I. M. Khaymovich, T. Ala-Nissila, and J. P. Pekola. On-chip Maxwell’s demon as an information-powered refrigerator. Phys. Rev. Lett., 115:260602, 2015.
- [25] T. M. Cover and J. A. Thomas. Elements of Information Theory. Wiley-Interscience, New York, second edition, 2006.
- [26] C. Jarzynski. Rare events and the convergence of exponentially averaged work values. Phys. Rev. E, 73(4):046105, 2006.
- [27] A. B. Boyd, A. Patra, C. Jarzynski, and J. P. Crutchfield. Shortcuts to thermodynamic computing: The cost of fast and faithful erasure. arXiv:1812.11241.
- [28] A. B. Boyd, D. Mandal, and J. P. Crutchfield. Identifying functional thermodynamics in autonomous Maxwellian ratchets. New J. Physics, 18:023049, 2016.
- [29] S. Still, D. A. Sivak, A. J. Bell, and G. E. Crooks. Thermodynamics of prediction. Phys. Rev. Lett., 109:120604, 2012.
- [30] A. B. Boyd, D. Mandal, P. M. Riechers, and J. P. Crutchfield. Transient dissipation and structural costs of physical information transduction. Phys. Rev. Lett., 118:220602, 2017.
- [31] A. B. Boyd, D. Mandal, and J. P. Crutchfield. Correlation-powered information engines and the thermodynamics of self-correction. Phys. Rev. E, 95(1):012152, 2017.
- [32] A. B. Boyd, D. Mandal, and J. P. Crutchfield. Thermodynamics of modularity: Structural costs beyond the Landauer bound. Phys. Rev. X, 8(3):031036, 2018.
- [33] P. M. Riechers and J. P. Crutchfield. Fluctuations when driving between nonequilibrium steady states. J. Stat. Phys., 168(4):873–918, 2017.
- [34] C. Aghamohammadi and J. P. Crutchfield. Thermodynamics of random number generation. Phys. Rev. E, 95(6):062139, 2017.
- [35] P. R. Zulkowski and M. R. DeWeese. Optimal control of overdamped systems. Phys. Rev. E, 92(5):032117, 2015.
- [36] T. R. Gingrich, G. M. Rotskoff, G. E. Crooks, and P. L. Geissler. Near optimal protocols in complex nonequilibrium transformations. Proc. Natl. Acad. Sci. U.S.A., 113(37):10263–10268, 2016.
- [37] A. Patra and C. Jarzynski. Classical and quantum shortcuts to adiabaticity in a tilted piston. J. Phys. Chem. B, 121:3403–3411, 2017.
- [38] A. Gomez-Marin, J. M. R. Parrondo, and C. Van den Broeck. Lower bounds on dissipation upon coarse graining. Phys. Rev. E, 78(1):011107, 2008.
- [39] G.E. Crooks. Excursions in statistical dynamics. PhD thesis, University of California, Berkeley, 1999.
- [40] C. Jarzynski. Comparison of far-from-equilibrium work relations. Comptes Rendus Physique, 8(5-6):495–506, 2007.
- [41] A. Berut, A. Arakelyan, A. Petrosyan, S. Ciliberto, R. Dillenschneider, and E. Lutz. Experimental verification of Landauer’s principle linking information and thermodynamics. Nature, 483:187–190, 2012.
- [42] S. Han, J. Lapointe, and J. E. Lukens. Effect of a two-dimensional potential on the rate of thermally induced escape over the potential barrier. Phys. Rev. B, 46:6338, 1992.

## I Principles of Thermodynamic Computing: A recent synopsis

A number of closely related thermodynamic costs of computing have been identified, above and beyond the *house-keeping heat* that maintains a system’s overall nonequilibrium dynamical state. First, there is the *information-processing Second Law* [28] that extends Landauer’s original bound on erasure [2] to dissipation in general computing and properly highlights the central role of information generation measured via the physical substrate’s dynamical Kolmogorov-Sinai entropy. It specifies the minimum amount of energy that must be supplied to drive a given amount of computation forward. Second, when coupling thermodynamic systems together, even a single system and a complex environment, there are transient costs as the system synchronizes to, predicts, and then adapts to errors in its environment [29, 30, 31]. Third, the very modularity of a system’s organization imposes thermodynamic costs [32]. Fourth, since computing is necessarily far out of equilibrium and nonsteady state, there are costs due to driving transitions between information-storage states [33]. Fifth, there are costs to generating randomness [34], which is itself a widely useful resource. Finally, by way of harnessing these principles, new strategies for optimally controlling nonequilibrium transformations have been introduced [35, 36, 37, 27].

## II Microscopic Stochastic Thermodynamical System

For concreteness, we concentrate on a one-dimensional system: a particle with position $x$ and momentum $p$ in an external potential $V(x,t)$ and in contact with a heat reservoir at temperature $T$. An external controller adds or removes energy from a work reservoir to change the form of the potential $V(\cdot,t)$ via a predetermined *erasure protocol* $\{(b_f(t), c_f(t)) : 0 \le t \le \tau\}$. (See Supplementary Materials (SM) VI for details on the alternative definitions of work.) The potential takes the form:

$$
V(x,t) = a x^4 - b_0 b_f(t) x^2 + c_0 c_f(t) x ,
$$

with constants $a, b_0, c_0 > 0$. During the erasure protocol, $b_f(t)$ and $c_f(t)$ change one at a time piecewise-linearly through four protocol substages: (1) *drop barrier*, (2) *tilt*, (3) *raise barrier*, and (4) *untilt*, as shown in Table S1. The system starts at time $t = 0$ in the equilibrium distribution for a double-well $V(x,0)$ at temperature $T$. Being equiprobable, the informational states associated with each of the two wells thus contain $1$ bit of information [25]. The effect of the control protocol on the system potential and system response is graphically displayed in Fig. 1.

| Stage | Drop Barrier | Tilt | Raise Barrier | Untilt |
|---|---|---|---|---|

Table S1: The four substages of the erasure protocol.

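We model the physical information processing of erasure with underdamped Langevin dynamics:

$$
\begin{aligned}
\frac{dx}{dt} &= \frac{p}{m} , \\
\frac{dp}{dt} &= -\frac{\partial}{\partial x} V(x,t) - \frac{\gamma}{m} p + \sqrt{2 \gamma k_B T} \, r(t) ,
\end{aligned}
\tag{S1}
$$

where $k_B$ is Boltzmann’s constant, $\gamma$ is the coupling between the heat reservoir and system, $m$ is the particle’s mass, and $r(t)$ is a memoryless Gaussian random variable with $\langle r(t) \rangle = 0$ and $\langle r(t) r(t') \rangle = \delta(t - t')$.

For comparison to experiment, we simulated erasure with the following parameters, sufficient to fully specify the dynamics: , , , and . The resulting potential, snapshotted at times during the erasure substages, is shown in Fig. 1 (Inner plot sequence).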

Reliable information processing dictates that we set time scales so that the system temporarily, but stably, stores information. To support metastable-quasistatic behavior at all times, the relaxation rates of the informational states must be much faster than the rate of change of the potential. This keeps the system near metastable equilibrium throughout. The entropy production for such protocols tends to be minimized.
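The dynamics described above can be sketched numerically. The following is a minimal illustration only, assuming a quartic double-well with a linear tilt, four equal-length substages, and a simple Euler-Maruyama integrator. The function names, the constants `a`, `b0`, `c0`, and all parameter values are hypothetical placeholders, not the values used for the reported simulations.

```python
import math
import random


def force(x, barrier, tilt, a=1.0, b0=2.0, c0=1.0):
    """Return -dV/dx for V = a*x**4 - b0*barrier*x**2 + c0*tilt*x (assumed form)."""
    return -(4.0 * a * x**3 - 2.0 * b0 * barrier * x + c0 * tilt)


def protocol(t, tau):
    """Piecewise-linear (barrier, tilt) values; equal-length substages are an assumption."""
    s = 4.0 * min(max(t / tau, 0.0), 1.0)  # stage coordinate in [0, 4]
    if s < 1.0:                 # (1) drop barrier: 1 -> 0
        return 1.0 - s, 0.0
    if s < 2.0:                 # (2) tilt: 0 -> 1
        return 0.0, s - 1.0
    if s < 3.0:                 # (3) raise barrier: 0 -> 1
        return s - 2.0, 1.0
    return 1.0, 4.0 - s         # (4) untilt: 1 -> 0


def simulate_erasure(x0, p0, tau=100.0, dt=0.01, m=1.0, gamma=1.0, kT=0.1, seed=0):
    """Integrate underdamped Langevin dynamics; returns the list of positions."""
    rng = random.Random(seed)
    x, p = x0, p0
    xs = [x]
    for k in range(int(round(tau / dt))):
        barrier, tilt = protocol(k * dt, tau)
        # Thermal kick with variance 2*gamma*kT*dt (fluctuation-dissipation relation).
        kick = math.sqrt(2.0 * gamma * kT * dt) * rng.gauss(0.0, 1.0)
        p += (force(x, barrier, tilt) - gamma * p / m) * dt + kick
        x += (p / m) * dt
        xs.append(x)
    return xs
```

Choosing `tau` much larger than the in-well relaxation time `m / gamma` is one way to approximate the metastable-quasistatic regime described above; the specific ratio needed is not fixed by this sketch.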

## III Trajectory-Class Fluctuation Theorems: Use and Interpretation

Here, we describe the trajectory-class fluctuation theorems, explaining several of their possible implications and exploring their application to both the simulations and flux qubit experiment. Their derivations are given in the section following.

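First, consider a forward process distribution $P(\vec{z})$, defined by the probabilities of the system microstate trajectories $\vec{z}$ due to an initial equilibrium microstate distribution evolving forward in time under a control protocol. Then, the reverse process distribution $R(\vec{z})$ is determined by preparing the system in equilibrium in the final protocol configuration and running the reverse protocol. The reverse protocol is the original protocol conducted in reverse order but also with objects that are odd under time reversal, like magnetic fields, negated. The time-reversal of a trajectory $\vec{z} = \{ z_t \}_{0 \le t \le \tau}$ is $\vec{z}^\dagger = \{ z_{\tau - t}^\dagger \}_{0 \le t \le \tau}$, where each microstate coordinate $s$ reverses as $s^\dagger = -s$ if $s$ is odd under time-reversal (e.g., momentum or spin), otherwise $s^\dagger = s$. For a measurable subset of trajectories $C \subseteq \mathcal{Z}$, with $\mathcal{Z}$ the set of all trajectories, let $\langle \cdot \rangle_C$ denote an average over the ensemble of forward process trajectories conditioned on the trajectory class $C$. Let $P(C)$ and $R(C^\dagger)$ denote the probabilities of observing the class $C$ in the forward process and the reverse class $C^\dagger = \{ \vec{z}^\dagger : \vec{z} \in C \}$ in the reverse process, respectively.

We first introduce a *trajectory-class fluctuation theorem* (TCFT) for the class-averaged exponential work $\langle e^{-W/k_B T} \rangle_C$:

$$
\langle e^{-W/k_B T} \rangle_C = \frac{R(C^\dagger)}{P(C)} \, e^{-\Delta F / k_B T} ,
\tag{S2}
$$

with $\Delta F$ the system equilibrium free energy change. We also introduce a *class-averaged work* TCFT:

$$
\langle W \rangle_C = \Delta F + k_B T \ln \frac{P(C)}{R(C^\dagger)} + k_B T \, D_{\mathrm{KL}} \left[ P(\vec{z} \,|\, C) \,\middle\|\, R(\vec{z}^\dagger \,|\, C^\dagger) \right] .
\tag{S3}
$$

This employs the Kullback-Leibler divergence $D_{\mathrm{KL}}$ taken between forward and reverse process distributions over all class trajectories $\vec{z} \in C$, conditioned on the forward class $C$ and reverse class $C^\dagger$, respectively. If we disregard this divergence, which is nonnegative and would generally be difficult to obtain experimentally, we then find the lower bound on the class-averaged work of Eq. (1): $\langle W \rangle_C \ge \Delta F + k_B T \ln [ P(C) / R(C^\dagger) ]$.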

In the limit of class $C$ possessing only a single trajectory, we recover detailed fluctuation theorems as in Ref. [12]. If, however, we take $C$ to be the entire set of trajectories $\mathcal{Z}$, we recover integral fluctuation theorems as in Jarzynski’s equality [10]. Thus, the TCFTs are a suite that spans the space of fluctuation theorems between the extreme of the detailed theorems, which require very precise information about an individual trajectory, and the integral theorems, which describe the system’s entire trajectory ensemble. SM IV below provides proofs for both TCFTs.

We can rearrange Eq. (S2) to obtain Eq. (2), an expression for estimating equilibrium free energy changes:

$$
\Delta F = -k_B T \ln \left( \frac{P(C)}{R(C^\dagger)} \, \langle e^{-W/k_B T} \rangle_C \right) ,
\tag{S4}
$$

where $P(C)$ and $R(C^\dagger)$ are the probabilities of the class in the forward process and of the time-reversed class in the reverse process. Thus, to estimate free energy one sees that statistics are needed for only one particular class and its reverse. Generally, this gives a substantial statistical advantage over direct use of Jarzynski’s equality:

$$
\Delta F = -k_B T \ln \langle e^{-W/k_B T} \rangle ,
$$

since rare microstate trajectories may generate negative work values that dominate the average exponential work [26]. The problem is clear in the case of erasure. Recall from Fig. 2 (Three front panels) that Fail trajectories generate the most-negative work values. In the limit of higher success-rate protocols that maintain low entropy production, failures generate more and more negative works, leading them to dominate when estimating average exponential works.

In contrast, to efficiently determine the change in equilibrium free energy from Eq. (2), its form indicates that one should choose a class that (i) is common in the forward process, (ii) has a reverse class that is common in the reverse process, and (iii) generates a narrow work distribution. This maximizes the accuracy of statistical estimates for the three factors on the RHS. For example, while the equilibrium free energy change in the case of our erasure protocol is theoretically simple (zero), the Success class fits the criteria.
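The two free-energy estimators compared above can be written as a short sketch. It assumes that work samples are already available and that the forward class probability and reverse-class probability have been estimated separately; the function and argument names are illustrative, not taken from the authors' analysis code.

```python
import math


def jarzynski_delta_f(works, kT=1.0):
    """Free-energy estimate from the full-ensemble exponential work average."""
    mean_exp = sum(math.exp(-w / kT) for w in works) / len(works)
    return -kT * math.log(mean_exp)


def tcft_delta_f(class_works, p_class, r_class_reverse, kT=1.0):
    """Free-energy estimate from a single trajectory class.

    class_works: work values of forward trajectories that fall in the class.
    p_class: estimated probability of the class in the forward process.
    r_class_reverse: estimated probability of the reversed class in the reverse process.
    """
    mean_exp = sum(math.exp(-w / kT) for w in class_works) / len(class_works)
    return -kT * math.log((p_class / r_class_reverse) * mean_exp)
```

When the class is the whole trajectory set, both probabilities equal one and the class estimator reduces to the Jarzynski form. For a narrow class the exponential average is dominated by typical rather than rare samples, which is the statistical advantage noted above.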

We can then monitor the class-averaged work in excess of its bound:

$$
E_C \equiv \langle W \rangle_C - \langle W \rangle_C^{\min} , \qquad \langle W \rangle_C^{\min} \equiv \Delta F + k_B T \ln \frac{P(C)}{R(C^\dagger)} .
$$

The inequality in Eq. (1) is a refinement of the equilibrium Second Law and therefore the bound $\langle W \rangle_C^{\min}$ generally provides a more accurate estimate of the average work of trajectories in a class compared to the equilibrium free energy change $\Delta F$. More precisely, as we will see below, an average of the excess $E_C$ over all classes in a partition of trajectories can be no larger than the dissipated work $\langle W \rangle - \Delta F$. For trajectory classes with narrow work distributions, this can be a significant improvement. We can see this by Taylor expanding the LHS of Eq. (S2) about the mean dimensionless work $\langle W \rangle_C / k_B T$. This shows that Eq. (1) becomes an equality when the variance and higher moments vanish. SM V below delves more into moment approximations. In any case, trajectory classes with narrow work distributions have small excess works $E_C$.

To estimate $R(C^\dagger)$, we ran simulations of the reverse process. Table S2 shows that the Success and Fail classes have small excesses and, as seen in Fig. 2 (Three front panels), these classes indeed have narrow work distributions. Elsewhere we explore these and additional partition schemes, finding that the Transitional trajectories can be further partitioned to yield narrow work distributions so that all trajectory classes have small excesses $E_C$. In short, this demonstrates how well-formulated trajectory classes allow accurate estimates of the works for all trajectories.

Simulation

| Class | Class-average work | TCFT lower bound | Excess |
|---|---|---|---|
| All | 0.634 | 0.0 | 0.634 |
| Success | 0.713 | 0.683 | 0.030 |
| Fail | -3.885 | -3.951 | 0.066 |
| Transitional | -0.546 | -1.650 | 1.170 |

| Partition | Ensemble-average work | Partition bound | Excess |
|---|---|---|---|
| Trivial | 0.634 | 0.0 | 0.634 |
| Untilt-Centric I | 0.634 | 0.560 | 0.074 |
| Untilt-Centric II | 0.634 | 0.601 | 0.032 |

Experiment

| Class | Class-average work | TCFT lower bound | Excess |
|---|---|---|---|
| All | 0.668 | 0.0 | 0.668 |
| Success | 0.742 | 0.643 | 0.099 |
| Fail | -3.132 | -3.475 | 0.343 |
| Transitional | -0.443 | -1.215 | 0.772 |

| Partition | Ensemble-average work | Partition bound | Excess |
|---|---|---|---|
| Trivial | 0.668 | 0.0 | 0.668 |
| Untilt-Centric I | 0.668 | 0.554 | 0.115 |

Table S2: (Top Left) Simulation results by trajectory class: *All* trajectories, *Success* trajectories, *Fail* trajectories, and *Transitional* trajectories. These are identified in Fig. 2 (Four front panels). From left to right, columns give the estimated class-average work, TCFT lower bound, and their difference. Simulations were run for each of the forward and reverse processes. (Top Right) Comparison of ensemble-average work and bounds due to different partitions: *Trivial* partition; *Untilt-Centric I* partition, composed of Success, Fail, and Transitional; and *Untilt-Centric II* partition, described in follow-on work. From left to right, columns give the estimated ensemble-average work, the partition bound, and their difference. All values in units of $k_B T$. (Bottom) Parallel results from the flux qubit experiment for *All* trajectories, *Success*, *Fail*, and *Transitional* trajectories identified in Fig. 3(D). Data are from trajectories of the forward protocol and trajectories of the reverse protocol.

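To measure the efficacy of a given partition $Q$ of trajectories into classes, we ask what the ensemble-average of class-average excess works is:

$$
E_Q \equiv \sum_{C \in Q} P(C) \left( \langle W \rangle_C - \langle W \rangle_C^{\min} \right) = \langle W \rangle - \langle W \rangle_Q^{\min} ,
$$

with $\langle W \rangle_Q^{\min} \equiv \sum_{C \in Q} P(C) \langle W \rangle_C^{\min}$ and $\langle W \rangle_C^{\min} = \Delta F + k_B T \ln [ P(C) / R(C^\dagger) ]$ the class bound of Eq. (1).

From Eq. (1), we see that $\langle W \rangle_Q^{\min} - \Delta F$ is the coarse-grained lower bound on ensemble-average dissipation from Ref. [38]:

$$
\langle W \rangle - \Delta F \ge k_B T \, D_{\mathrm{KL}} \left[ P(C) \,\middle\|\, R(C^\dagger) \right] = k_B T \sum_{C \in Q} P(C) \ln \frac{P(C)}{R(C^\dagger)} ,
$$

where $D_{\mathrm{KL}} [ P(C) \| R(C^\dagger) ]$ is the Kullback-Leibler divergence between forward and reverse process distributions over the trajectory classes $C \in Q$. Since Kullback-Leibler divergences are nonnegative, such a bound is never weaker than the equilibrium Second Law. Table S2 shows both $\langle W \rangle_Q^{\min}$ and $E_Q$ for the trivial partition $Q = \{ \mathcal{Z} \}$ containing the set $\mathcal{Z}$ of all trajectories as its only class, our three-class partition, labeled *Untilt-Centric I*, and the improved partition described in follow-on work, labeled *Untilt-Centric II*. In this case, the latter two also provide an improvement on the nonequilibrium Second Law which, assuming metastable starting and ending distributions, provides a lower bound on the average work equal to $\Delta F^{\text{neq}}$, the change in nonequilibrium free energy.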
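As a rough illustration of how a partition bound is assembled from class statistics, the sketch below computes the per-class bound and its probability-weighted average. The class probabilities used in the usage lines are hypothetical and chosen only to show the mechanics; they are not the values behind Table S2.

```python
import math


def class_bound(p_class, r_class_reverse, delta_f=0.0, kT=1.0):
    """Lower bound on the class-average work: delta_f + kT * ln(P(C) / R(C_reversed))."""
    return delta_f + kT * math.log(p_class / r_class_reverse)


def partition_bound(p_classes, r_classes_reverse, delta_f=0.0, kT=1.0):
    """Probability-weighted average of class bounds over a partition.

    Equals delta_f + kT * D_KL between the forward and reverse class distributions.
    Classes with zero forward probability contribute nothing to the sum.
    """
    kl = sum(p * math.log(p / r)
             for p, r in zip(p_classes, r_classes_reverse) if p > 0.0)
    return delta_f + kT * kl


# Hypothetical three-class statistics (forward, reverse); each list sums to one.
p_forward = [0.90, 0.02, 0.08]
r_reverse = [0.45, 0.50, 0.05]
trivial = partition_bound([1.0], [1.0])            # single class containing everything
refined = partition_bound(p_forward, r_reverse)    # three-class partition
```

With these placeholder numbers the trivial partition returns the equilibrium Second Law bound (zero excess over `delta_f`), while the refined partition returns a strictly larger bound, mirroring the ordering seen in the partition rows of Table S2.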

We can appeal to Landauer’s erasure bound, $k_B T \ln 2 \approx 0.693 \, k_B T$, to calibrate the class-average and partition-average excesses. We see for the simulation data that our three-class partition Untilt-Centric I provides class-average work bounds that, on average, are only about 11% of $k_B T \ln 2$ from the actual class-average works (an excess of $0.074 \, k_B T$ in Table S2). The more refined Untilt-Centric II partition reduces this excess to about 5%, while the trivial partition fails by about 91% of $k_B T \ln 2$.

The experimental data match these results, with the largest discrepancy occurring for the class-average excess for the Fail class. This is not wholly a coincidence, since we determined the parameters of the experimental protocol by adjusting the parameters of a simple two-state Markov simulation to obtain a work distribution similar to that obtained by our Langevin simulations of the Duffing potential system described in the main text. However, it is interesting that this was sufficient to provide matches in both the clean decomposition of the total work distribution by trajectory class and the quantities of Table S2.

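We also recover the equality of Ref. [38] for the ensemble-average work by averaging Eq. (S3) over each class $C$ of a partition, weighted by the forward class probability $P(C)$:

$$
\langle W \rangle = \Delta F + k_B T \, D_{\mathrm{KL}} \left[ P(C) \,\middle\|\, R(C^\dagger) \right] + k_B T \sum_{C} P(C) \, D_{\mathrm{KL}} \left[ P(\vec{z} \,|\, C) \,\middle\|\, R(\vec{z}^\dagger \,|\, C^\dagger) \right] ,
$$

which of course is lower bounded by $\Delta F + k_B T \, D_{\mathrm{KL}} [ P(C) \| R(C^\dagger) ]$. Here $R(C^\dagger)$ is the probability of the time-reversed class in the reverse process and $\vec{z}$ ranges over the trajectories in $C$.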

These results suggest the criterion for optimal trajectory partitions: Select a partition sufficiently refined to yield tight bounds on class-average works, but no finer. Machine learning methods for model order selection will provide a basis for a natural classification scheme for trajectories that captures all relevant thermodynamics and information processing.

By changing our forward and reverse processes to begin in system microstate distributions other than equilibrium, a yet-broader class of TCFTs emerges. We can then find analogous results for heats and comparisons with works and nonequilibrium free-energy changes. We explore these in depth elsewhere.

## IV TCFT Derivations

Assume that the system dynamics is described by a Hamiltonian specified in part by an external control protocol, as well as by a weak coupling to a thermal environment that induces steady relaxation to canonical equilibrium.

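Start the system in equilibrium distribution $\pi_0$ for Hamiltonian $H_0$ and run a protocol until time $\tau$, causing the system Hamiltonian to evolve to $H_\tau$. If we then hold the Hamiltonian at $H_\tau$ for a long time, the system relaxes into the equilibrium distribution $\pi_\tau$. The system’s ensemble entropy change from $\pi_0$ to $\pi_\tau$ is then:

$$
\Delta S_{\text{sys}} = -k_B \int dz_\tau \, \pi_\tau(z_\tau) \ln \pi_\tau(z_\tau) + k_B \int dz_0 \, \pi_0(z_0) \ln \pi_0(z_0) .
$$

The trajectory-wise system entropy difference is defined to be:

$$
\Delta s_{\text{sys}}(\vec{z}) = -k_B \ln \pi_\tau(z_\tau) + k_B \ln \pi_0(z_0) ,
$$

where $z_0$ and $z_\tau$ are the initial and final microstates of system microstate trajectory $\vec{z}$, respectively. Averaged over all trajectories $\vec{z}$, this then becomes the ensemble entropy change.

Let $P(\vec{z} \,|\, z_0)$ denote the probability of obtaining system microstate trajectory $\vec{z}$ via the protocol conditioned on starting the system in state $z_0$.

Now, start the system Hamiltonian at $H_\tau$ and run the reverse protocol, ending the Hamiltonian at $H_0$. We then obtain the trajectory $\vec{z}^\dagger$ with a different conditional probability: $R(\vec{z}^\dagger \,|\, z_\tau^\dagger)$.

Assuming microscopic reversibility and given a system trajectory $\vec{z}$, the change in the heat bath’s entropy is:

$$
\Delta S_{\text{bath}}(\vec{z}) = -\frac{Q(\vec{z})}{T} = k_B \ln \frac{P(\vec{z} \,|\, z_0)}{R(\vec{z}^\dagger \,|\, z_\tau^\dagger)} ,
\tag{S5}
$$

where $Q(\vec{z})$ is the net energy that flows out of the heat bath into the system given the trajectory $\vec{z}$, and $\dagger$ denotes time-reversal. This holds for systems with strictly finite energies and Markov dynamics that induce the equilibrium distribution when control parameters are held fixed [39]. Both our simulated Duffing potential system and flux qubit obey these requirements at sufficiently short time scales. Then we can express the total trajectory-wise entropy production $\Sigma(\vec{z})$ due to a trajectory as the sum of system and heat reservoir entropy changes:

$$
\Sigma(\vec{z}) = \Delta s_{\text{sys}}(\vec{z}) + \Delta S_{\text{bath}}(\vec{z}) = k_B \ln \frac{\pi_0(z_0) \, P(\vec{z} \,|\, z_0)}{\pi_\tau(z_\tau) \, R(\vec{z}^\dagger \,|\, z_\tau^\dagger)} = k_B \ln \frac{P(\vec{z})}{R(\vec{z}^\dagger)} ,
$$

where the last step uses $\pi_\tau(z_\tau^\dagger) = \pi_\tau(z_\tau)$ for the equilibrium distribution from which the reverse process starts.

Since $\pi_t(z) = e^{-(H_t(z) - F_t)/k_B T}$, with $F_t$ the system’s equilibrium free energy at time $t$, the system entropy difference is $\Delta s_{\text{sys}}(\vec{z}) = [ H_\tau(z_\tau) - H_0(z_0) - \Delta F ] / T$ with $\Delta F = F_\tau - F_0$. Using the First Law, $H_\tau(z_\tau) - H_0(z_0) = W(\vec{z}) + Q(\vec{z})$, we can write:

$$
R(\vec{z}^\dagger) = P(\vec{z}) \, e^{-(W(\vec{z}) - \Delta F)/k_B T} .
\tag{S6}
$$

From here, we derive our first TCFT by integrating each side of Eq. (S6) over all trajectories in a measurable set $C$. Starting with the LHS and recalling the Iverson bracket $[\cdot]$, which is $1$ when the interior expression is true and $0$ when false, we have:

$$
\begin{aligned}
\int d\vec{z} \, [\vec{z} \in C] \, R(\vec{z}^\dagger)
&= \int d\vec{z}^\dagger \, [\vec{z} \in C] \, R(\vec{z}^\dagger) \\
&= \int d\vec{z}^\dagger \, [\vec{z}^\dagger \in C^\dagger] \, R(\vec{z}^\dagger) \\
&= \int d\vec{z} \, [\vec{z} \in C^\dagger] \, R(\vec{z}) \\
&= R(C^\dagger) .
\end{aligned}
$$

The first three steps used the unity of the Jacobian in reversing a microstate, the definition $C^\dagger = \{ \vec{z}^\dagger : \vec{z} \in C \}$, and swapping all instances of $\vec{z}^\dagger$ with $\vec{z}$. Integrating the RHS of Eq. (S6) then gives:

$$
\int d\vec{z} \, [\vec{z} \in C] \, P(\vec{z}) \, e^{-(W(\vec{z}) - \Delta F)/k_B T} = P(C) \, e^{\Delta F / k_B T} \, \langle e^{-W/k_B T} \rangle_C .
$$

Combining, we have our first TCFT, Eq. (S2).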

## V Class-Averaged Work Approximation for Narrow Distributions

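Here, we demonstrate that the class-averaged work $\langle W \rangle_C$ approaches its bound $\Delta F + k_B T \ln [ P(C) / R(C^\dagger) ]$ when the variance and higher moments of the class’ distribution of works vanish. One concludes that this bound is a good approximation for $\langle W \rangle_C$ when the class’ work distribution is narrow.

We first express the LHS of Eq. (S2) in terms of the unitless distance of work from its class-average:

$$
\langle e^{-W/k_B T} \rangle_C = e^{-\langle W \rangle_C / k_B T} \, \langle e^{-\delta w} \rangle_C ,
$$

with $\delta w \equiv ( W - \langle W \rangle_C ) / k_B T$. Then, we Taylor expand the exponential inside the class-average, noting that the first-order term vanishes since $\langle \delta w \rangle_C = 0$:

$$
\langle e^{-\delta w} \rangle_C = 1 + \epsilon ,
$$

with $\epsilon \equiv \sum_{n=2}^{\infty} \frac{(-1)^n}{n!} \langle \delta w^n \rangle_C$. Equation (S2) then gives:

$$
\langle W \rangle_C = \Delta F + k_B T \ln \frac{P(C)}{R(C^\dagger)} + k_B T \ln ( 1 + \epsilon ) .
$$

Since $e^{-\delta w}$ is convex,

$$
\langle e^{-\delta w} \rangle_C \ge e^{-\langle \delta w \rangle_C} = 1 ,
$$

so $\epsilon \ge 0$. Then:

$$
\begin{aligned}
\langle W \rangle_C &= \Delta F + k_B T \ln \frac{P(C)}{R(C^\dagger)} + k_B T \ln ( 1 + \epsilon ) \\
&\ge \Delta F + k_B T \ln \frac{P(C)}{R(C^\dagger)} .
\end{aligned}
$$

The second line becomes an equality when $\epsilon$ goes to zero, which occurs as the variance and higher moments vanish.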
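A small numerical check of this argument is sketched below. It assumes only a list of work values for one class, in units where `kT` is supplied explicitly, and computes the gap between the class-average work and its bound as `kT * ln<exp(-(W - <W>)/kT)>`. The helper names are illustrative.

```python
import math


def class_excess(class_works, kT=1.0):
    """Gap between class-average work and its bound, computed from work samples alone."""
    mean_w = sum(class_works) / len(class_works)
    mean_exp = sum(math.exp(-(w - mean_w) / kT) for w in class_works) / len(class_works)
    return kT * math.log(mean_exp)


def half_variance(class_works, kT=1.0):
    """Leading (second-moment) term of the expansion: variance / (2 * kT)."""
    mean_w = sum(class_works) / len(class_works)
    var = sum((w - mean_w) ** 2 for w in class_works) / len(class_works)
    return var / (2.0 * kT)


narrow = [0.70, 0.71, 0.72]   # hypothetical narrow class
wide = [0.20, 0.70, 1.20]     # hypothetical wide class with a similar mean
```

For the narrow sample the excess is close to the half-variance term and much smaller than for the wide sample, consistent with the bound becoming tight as the higher moments vanish.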

## VI Work Definitions and Experimental Estimation

Properly estimating the required works and devolved heats from experimental devices undergoing cyclic control protocols requires explicitly and consistently accounting for energy and information flows between the system, its environment, and the controlling laboratory apparatus. To this end, we construct a model Hamiltonian universe for common processes involving small systems interacting with laboratory apparatus and a thermal environment. After deriving key equalities for two definitions of work, the inclusive and exclusive works, we define a method of approximating them in appropriate cyclic protocols.

### VI.1 The Model Universe and Hamiltonian

To study a small system that exchanges energy with its environment in the forms
of heat and work, we introduce a model universe: a *system of interest*, a
*heat bath*, and a *lab* (laboratory apparatus) that controls the
system and derives any needed energy from a *work reservoir*. The system
directly interacts with both the heat bath and the lab, but the heat bath and
lab are not directly coupled.

We assume that a Hamiltonian $H_U$ describes the universe’s evolution and that there is a set of generalized coordinates which can be sensibly partitioned into those for the system, heat bath, and lab. Then, we decompose the universe Hamiltonian into the following form:

$$
H_U(s, b, l) = H_S(s) + H_B(b) + H_L(l) + H_{SB}(s, b) + H_{SL}(s, l) ,
$$

where $s$, $b$, and $l$ denote both the generalized coordinates and conjugate momenta for the system, bath, and lab, respectively. For any universe Hamiltonian $H_U$, there can be many choices for this decomposition.

We also define the system Hamiltonian $H_{\text{sys}}$ as the three components that depend on the system coordinates:

$$
H_{\text{sys}}(s, b, l) = H_S(s) + H_{SB}(s, b) + H_{SL}(s, l) .
$$

First, consider the subset of lab coordinates for which $H_{SL}$ has nontrivial dependence. These so-called *protocol parameters* $\lambda$ are often simple and much fewer than the entire set of $l$. We often assume that we have total control of their evolution. More precisely, under an appropriate preparation for the lab at time $t = 0$, a specific trajectory for the protocol parameters $\lambda(t)$ for $0 \le t \le \tau$ is guaranteed for all preparations of the heat bath and system coordinates. We refer to the parameter trajectory $\lambda(t)$ as the *protocol*.

Suppose the heat-bath degrees of freedom that interact with the system change much faster than the system’s. We can assume that the system response to the bath resembles Brownian motion. On the time scale of changes in the system coordinates, then, we ignore the system-bath interaction term $H_{SB}$ in writing the system Hamiltonian $H_{\text{sys}}$:

$$
H_{\text{sys}}(s, \lambda(t)) = H_S(s) + H_{SL}(s, \lambda(t)) = K(s) + V(s, \lambda(t)) .
$$

The latter decomposition into kinetic energy $K$ and potential energy $V$ can be used to write Langevin equations of motion for the system. Furthermore, if the heat bath has a relaxation time sufficiently short that it is roughly in equilibrium at all times with fixed temperature, then its influence on the system will be memoryless.

### VI.2 Inclusive and Exclusive Works and Heats

The basic scenario for executing a protocol is as follows. The universe coordinates begin according to a given initial distribution at time $t = 0$ and they evolve in isolation until $t = \tau$. As above, we assume that a well-defined protocol $\lambda(t)$ emerges due to our preparation of the lab coordinates.

We label all energy exchanged between the system and lab as *work* and all
energy exchanged between the system and heat bath as *heat*. Since the lab
is directly coupled only to the system, the work they exchange is given by the
change in energy of the lab’s work reservoir. Similarly, since the heat bath is
directly coupled only to the system, the heat exchanged is given by the change
in the heat bath’s energy.

Note that this requires choices as to what constitute the energies of the three universe subsystems. While $H_B$, $H_S$, and $H_L$ define energies for the heat bath, system, and work reservoir, respectively, what of the interaction terms $H_{SB}$ and $H_{SL}$? If all subsystems were macroscopic, these interaction terms would be negligible. While it may be desirable to assume that the system is only weakly coupled to the heat bath, so that $H_{SB}$ can be ignored, $H_{SL}$ can be significant in many important small systems.

And so, in general, we define the system energy to be $H_S$ plus any portions of $H_{SL}$ and $H_{SB}$. Then the work reservoir energy is $H_L$ plus the rest of $H_{SL}$, while the heat bath energy is $H_B$ plus the rest of $H_{SB}$. To make these distinctions clear we label two types of works, corresponding to the two extremes for allocation of $H_{SL}$ between the system and work reservoir: the *inclusive work* $W$ and the *exclusive work* $W_0$ [40]. Specifically, over a time interval from $t_1$ to $t_2$:

$$
\begin{aligned}
W &\equiv - \left[ H_L(t_2) - H_L(t_1) \right] , \\
W_0 &\equiv - \left[ ( H_L + H_{SL} )(t_2) - ( H_L + H_{SL} )(t_1) \right] .
\end{aligned}
$$

We can similarly define the *inclusive heat* $Q$ and *exclusive heat* $Q_0$ depending on how we allocate $H_{SB}$ between the system and heat bath:

$$
\begin{aligned}
Q &\equiv - \left[ H_B(t_2) - H_B(t_1) \right] , \\
Q_0 &\equiv - \left[ ( H_B + H_{SB} )(t_2) - ( H_B + H_{SB} )(t_1) \right] .
\end{aligned}
$$

The inclusive work corresponds to fully including $H_{SL}$ in the system energy, while the exclusive work corresponds to excluding it. Inclusive and exclusive heat correspond similarly with respect to $H_{SB}$.

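There is a key relation between the inclusive work $W$ and exclusive work $W_0$:

$$
W = W_0 + \Delta H_{SL} .
\tag{S7}
$$

That is, the inclusive work for an interval of time equals the sum of the exclusive work and the change in the system-lab interaction term $H_{SL}$.

In the above expressions, calculating the rate of change of a work or heat requires the time derivative of one or more of the lab and bath energies $H_L$ and $H_B$. This can be problematic. Fortunately, there are alternate forms that are more amenable to calculation. One can show that the inclusive work rate is given by:

$$
\frac{dW}{dt} = \dot{\lambda} \cdot \frac{\partial H_{SL}}{\partial \lambda} ,
\tag{S8}
$$

with $\lambda$ the protocol parameters on which $H_{SL}$ depends. This is a more common definition for the work rate in small-system nonequilibrium thermodynamics. And, it allows the work to be calculated as:

$$
W = \int_{t_1}^{t_2} dt \, \dot{\lambda}(t) \cdot \left. \frac{\partial H_{SL}(s(t), \lambda)}{\partial \lambda} \right|_{\lambda = \lambda(t)} .
\tag{S9}
$$

The exclusive work has a corresponding form:

$$
\frac{dW_0}{dt} = - \dot{s} \cdot \frac{\partial H_{SL}}{\partial s} ,
\tag{S10}
$$

with $s$ the system coordinates. For the case where $H_{SL}$ is a scalar potential for the system position $x$, this is the product of the corresponding force $F = -\partial H_{SL} / \partial x$ with velocity. This makes the exclusive work equal to a familiar mechanics definition of work as the integral of the dot product of force and displacement:

$$
W_0 = \int F \cdot dx .
$$

In this way, we write the inclusive and exclusive work rates in terms of the rates of change of the system and work-reservoir interaction term with respect to either the system or work reservoir coordinates.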
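The relation between the two works can be checked on a discretized trajectory. The sketch below assumes a scalar interaction potential `V(x, lam)` supplied by the caller and a path sampled at matching times, and it alternates parameter updates and state updates within each step. This splitting is one common discretization choice, not necessarily the one used for the reported results.

```python
def works_along_path(xs, lams, V):
    """Return (inclusive, exclusive) work for a sampled path.

    xs[k], lams[k]: system coordinate and protocol parameter at step k.
    V(x, lam): system-lab interaction potential (hypothetical callable).
    """
    w_inclusive = 0.0
    w_exclusive = 0.0
    for k in range(len(xs) - 1):
        # Parameter changes at fixed state: contributes to inclusive work.
        w_inclusive += V(xs[k], lams[k + 1]) - V(xs[k], lams[k])
        # State moves at fixed parameter: force times displacement, exclusive work.
        w_exclusive -= V(xs[k + 1], lams[k + 1]) - V(xs[k], lams[k + 1])
    return w_inclusive, w_exclusive


def tilt(x, lam):
    """Toy interaction term that vanishes when lam = 0 (illustrative only)."""
    return lam * x


# Cyclic toy protocol: the parameter returns to its initial value.
w_in, w_ex = works_along_path([0.0, 1.0, 2.0], [0.0, 1.0, 0.0], tilt)
```

With this discretization the difference between the inclusive and exclusive sums telescopes exactly to the change in `V` between the endpoints, so the two works agree whenever the interaction term takes the same value at the start and end of the path.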

### VI.3 Approximating Inclusive Work Experimentally

For the flux qubit experimental system investigated here, we assume the following:

$$
H_S + H_{SL} = K + V(\phi, \lambda(t)) ,
$$

with $K$ the kinetic energy, $V$ the potential, $\phi$ the system flux, and $\lambda(t)$ the protocol. That is, as far as the flux qubit and work reservoir are concerned, the only relevant energies at least partially ascribable to the flux qubit are its kinetic energy and the potential energy with the work reservoir. The system-lab interaction term $H_{SL}$ must then capture the change in the potential due to changes in the protocol parameters. We could simply define $H_S = K$ so that $H_{SL} = V$. However, it is more useful to allocate the initial potential energy to $H_S$. That is:

$$
\begin{aligned}
H_S &= K + V(\phi, \lambda(0)) , \\
H_{SL} &= V(\phi, \lambda(t)) - V(\phi, \lambda(0)) .
\end{aligned}
$$

For cyclic protocols where $\lambda(\tau) = \lambda(0)$, such as in our erasure operation, $H_{SL}$ vanishes for all trajectories at $t = 0$ and $t = \tau$. By Eq. (S7) we then have the useful equality $W = W_0$ between inclusive and exclusive works taken over the entire protocol.

Estimating the inclusive work $W$ for a system trajectory is then equivalent to estimating the exclusive work $W_0$ for the cyclic protocols we consider. In the flux qubit, the form of $V$ is known and the specific protocol $\lambda(t)$ is known. Unfortunately, we lack sufficient information about its instantaneous state at all times, since the device’s physics precludes precise measurements of system flux $\phi$, the relevant part of the system coordinates for determining the potential $V$. Instead, we do have reliable measurement of large and stable changes in the flux $\phi$. This specifically monitors when the system moves between wells in a double-well potential $V$, if the rate of transition between wells is sufficiently slow.

And so, we can use information about the flux $\phi$ to approximate the exclusive work contribution at each moment in time. Then, adding up these contributions yields an approximation to the total exclusive work over the entire protocol and therefore of the inclusive work over the entire protocol. Note that the protocols used here maintain two wells at all times for the system flux $\phi$. We develop the approximation in two steps.

#### VI.3.1 First-Order Approximation

We first partition the potential in flux space into three segments. Two segments constitute the wells for the flux $\phi$, in which the system spends all its time except for very brief transitions between wells. Then, the third segment connects the two wells, capturing the dynamics arising from crossing the barrier that separates them.

We require that the partitioning allows the following two approximations. First, the particle spends negligible total duration in between the two wells. Second, the wells do not change shape over the protocol, but instead simply raise or lower in potential at different times, if they change at all. This means that the shape of the system-lab interaction term $H_{SL}$ at any time is very simple in the two wells: flat.

The result is that the exclusive work over any time duration is easily calculable from the experimental data. During times when the flux remains in a well, the exclusive work must be zero, since $H_{SL}$ does not change with $\phi$. During a transition, the shape of $H_{SL}$ does not change due to the first approximation. Then, the exclusive work is the difference in heights of the two wells as measured by $H_{SL}$:

$$
W_0 \approx H_{SL}(\phi_i, \lambda) - H_{SL}(\phi_f, \lambda) ,
\tag{S11}
$$

where $\lambda$ is the protocol parameter setting at any time during the transition and $\phi_i$ and $\phi_f$ are arbitrary flux values in the starting and ending wells, respectively.

Thus, the total inclusive work over the protocol for a trajectory is simply the sum of the jump contributions above for each transition.
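The jump-summing procedure can be sketched as follows, assuming the measurement record has already been reduced to a sequence of well labels and that the well heights at each sample time are known from the protocol. The data layout and names here are hypothetical conveniences, not the format of the experimental data.

```python
def first_order_work(wells, well_heights):
    """Sum exclusive-work jump contributions over a trajectory of well labels.

    wells[k]: index (0 or 1) of the well occupied at sample k.
    well_heights[k][i]: height of well i, as measured by the interaction term, at sample k.
    Each transition contributes (height of starting well) - (height of ending well),
    evaluated at the protocol setting of the sample where the transition is detected.
    """
    total = 0.0
    for k in range(1, len(wells)):
        start, end = wells[k - 1], wells[k]
        if start != end:
            total += well_heights[k][start] - well_heights[k][end]
    return total


# Toy record: the system hops from well 0 to well 1 while well 1 is lowered by one unit.
labels = [0, 0, 1, 1]
heights = [(0.0, 0.0), (0.0, -1.0), (0.0, -1.0), (0.0, 0.0)]
work = first_order_work(labels, heights)
```

In this toy record the single downhill hop contributes one unit of work and the time spent inside a well contributes nothing, as the flat-well approximation requires.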

#### VI.3.2 Second-Order Approximation

In point of fact, the potential wells do change shape. Fortunately, our method for calculating the inclusive work over the protocol remains valid under weaker constraints on the protocol.

We first require the protocol to maintain two metastable regions, the
*informational states*, at all times, each possessing a unique local
potential minimum continuously in time. We denote the flux value at the
potential minima of informational state at time as . The
protocol must also evolve slowly enough so that the potential landscape changes
slowly compared to the system’s relaxation rate in each metastable region.
Both of these criteria are met by our erasure protocol.

Consider a duration short enough that the potential changes little, yet long compared to the relaxation times of the informational states. There are two cases: either the system crosses the barrier between the two informational states during this time or it remains in one informational state.

First, suppose that the system transitions from one informational state to the other . Denote the system flux at the beginning of the transition as and at the end as . By Eq. (S7), the exclusive work contribution is the difference of the inclusive work contribution and the change in system-lab interaction term . The change can itself be broken down into two terms, one for the difference in between the informational-state minima and the other for the change in local to the respective minima. In other words:

(S12) | ||||

(S13) |

where , Eq. (S12)’s first term, is the change in at the informational-state minima and , Eq. (S12)’s second term is the change in of the system with respect to the informational-state minima. Our protocol ensures that the total number of transitions is so small and the time durations so narrow that we can ignore the total contributions of inclusive works due to these transition durations. Then, we approximate the exclusive work contribution during a transition via:

Suppose, now, that the system remains in one informational state during a time interval . Since the relaxation rate is fast compared to the duration , we assume that the system visits all microstates in the informational state roughly in proportion to the local equilibrium distribution. Then, the inclusive work contribution is approximately independent of the specific system trajectory during this time and, instead, is determined by the time duration and the informational state . If during this time we simultaneously shift the entire potential up by a given amount, we add an inclusive work contribution equal to the potential shift but the system trajectory is unchanged. Thus, the actual inclusive work contribution is equal to an amount due to the change in the system-lab interaction term at the informational-state minimum plus an amount due solely to the change in potential shape at the informational state with respect to its minimum. That is:

(S14) |

where is the inclusive work contribution due to the change in potential shape at the informational state. Equation (S13) applies equally well here in describing the change in system-lab interaction term. Thus, the exclusive work contribution for this time interval is:

(S15) | ||||

(S16) |

The result is that we have exclusive work contributions for both durations when the system transitions between informational states and when it remains in one.

To find the total exclusive work over the protocol for a given trajectory we add up the contributions. The sum of all local changes over all durations is the net local change in . Recall, though, that the minima of the informational states begin and end at the same values. And so, the total local change in reduces to the absolute change in . However, since we chose , this must vanish:

We can now specify our final approximation: At any time , the inclusive work contribution due to the change in potential shape is independent of the informational state. This is reasonable for our erasure protocol since the asymmetric contribution to the change in potential—the tilt—is slight. While it clearly breaks the symmetry of the double-well potential by changing the well heights, it has less effect on the well shapes and even less in making those shapes distinct.

Then, we can assume that the sum of for a trajectory is the same as that staying in one informational state the entire time. Since the protocol is very slow and cyclic, though, a particle that stays in one informational state the entire time must receive approximately zero inclusive work . Given that the sum over all must be equal to for such a trajectory, it must also be negligible.

Altogether, the total exclusive work is approximately given by the sum over all transitions between informational states of the difference in potential at the informational-state minima:

(S17) |

To reiterate, since , this is also the total inclusive work for a trajectory over the entire protocol.
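A minimal numerical sketch of this final estimate follows. It locates the two informational-state minima by grid search at the time of each transition and sums the differences in potential between them. The tilted quartic potential, the grid bounds, the well labels (0 for negative flux, 1 for non-negative flux), and the sign convention (starting minimum minus ending minimum) are all assumptions made for illustration. The potential here is not the device potential.

```python
def well_minimum(potential, t, well, lo=-2.0, hi=2.0, n=4001):
    """Grid-search the local minimum of one well at time t.

    Well 0 is taken to be the x < 0 half of the grid and well 1 the
    x >= 0 half (an assumption that the barrier sits near x = 0).
    Returns (x_min, V_min).
    """
    xs = [lo + (hi - lo) * k / (n - 1) for k in range(n)]
    xs = [x for x in xs if (x < 0) == (well == 0)]
    x_best = min(xs, key=lambda x: potential(x, t))
    return x_best, potential(x_best, t)


def second_order_exclusive_work(labels, times, potential):
    """Sum, over transitions, the potential difference between minima."""
    total = 0.0
    for i in range(1, len(labels)):
        start, end = labels[i - 1], labels[i]
        if start != end:
            _, v_start = well_minimum(potential, times[i], start)
            _, v_end = well_minimum(potential, times[i], end)
            total += v_start - v_end  # assumed sign convention
    return total


def tilted_quartic(x, t):
    """Hypothetical double well whose tilt grows linearly in time."""
    return x**4 - 2.0 * x**2 + 0.1 * t * x


work = second_order_exclusive_work([0, 1, 1, 0], [0, 1, 2, 3], tilted_quartic)
```

Only the heights of the minima enter the sum, which reflects the argument that shape-dependent contributions are approximately the same in both informational states and cancel over a slow, cyclic protocol.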

## VII Substage Work Distributions Commentary

Here, we briefly interpret several features of the substage work distributions observed in Fig. 1 (outer left plots).

The distributions for barrier dropping and tilting are narrow, symmetric peaks; see Fig. 1 (Outer left plots). Barrier raising also has a rather narrow peak, composed primarily of trajectories always in the state, but also exhibits a bulge toward positive work; see Fig. 1 (Top right). Note that the state is created mid-way through barrier raising, allowing for trajectories that spend some time in either informational state, but disallowing trajectories that spend all time in the state. The former induce the positive work bulge toward less negative works, which while notable will not be further explored here.

The substage work distribution for untilting presents the most striking picture; see Fig. 1 (Bottom right). Always- trajectories induce a large positive work peak (red), always- trajectories induce a large negative work peak (orange), and all other trajectories induce a ramp between them (blue).

These features can be directly interpreted by following the locations of the potential minima over time and noting how the shifting potential adds or removes energy from a particle. During barrier dropping, to take one example, the protocol raises both minima by over , resulting in a narrow, peaked work distribution with a mean near .
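The interpretation above can be made concrete with a small sketch that accumulates, step by step, the change in height of whichever well minimum the system currently occupies. The callable `min_height(well, step)`, the labels 0 and 1, and the linear rise and fall of the two minima are hypothetical. The sketch also ignores contributions from changes in well shape.

```python
def substage_inclusive_work(labels, min_height):
    """Approximate inclusive work from the motion of the occupied minimum.

    labels[i] is the well occupied during the interval from step i to
    step i + 1; min_height(well, step) is a hypothetical callable giving
    the potential at that well's minimum at a given step.
    """
    return sum(
        min_height(well, i + 1) - min_height(well, i)
        for i, well in enumerate(labels)
    )


N_STEPS = 10


def example_min_height(well, step):
    """Hypothetical substage: well 1 rises by one unit, well 0 falls by one."""
    sign = 1.0 if well == 1 else -1.0
    return sign * step / N_STEPS


always_rising = substage_inclusive_work([1] * N_STEPS, example_min_height)
always_falling = substage_inclusive_work([0] * N_STEPS, example_min_height)
switch_midway = substage_inclusive_work([0] * 5 + [1] * 5, example_min_height)
```

In this toy setting, a trajectory that stays in the rising well gains the most work, one that stays in the falling well gains the least, and a trajectory that switches partway through lands in between, which is the qualitative structure of two peaks joined by a ramp.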

Most interesting is the untilt substage. Since most particles start and then stay in the state for this substage, a large positive work is probable, due to the rising -state well. However, it is also possible for the system to start in and then get stuck in the -state well, resulting in a large negative work. The final possibility is transitioning between states during untilting, resulting in an intermediate range of less-likely work values. Trajectories that do transition between states during untilting are more likely to spend more time in the state, since it is energetically favored. This produces a probability that rises with increasing work, giving rise to the log-linear ramp in the work distribution.

Note that there are small peaks at each end of this third class's distribution that require a more nuanced explanation. When a particle crosses a barrier due to random thermal excitation, its surplus energy may quickly send it back to the previous well before that energy can be dissipated. Such particles then spend almost all of the substage in this first well, generating a work value accordingly. The statistics of the ramp proper are due to particles that have time to locally equilibrate before crossing any barriers.

Follow-on work develops the theory underlying this detailed mechanistic analysis and analyzes similar behavior in all metastable-quasistatic processes.

## VIII Flux Qubit Device, Calibration, and Measurement

The flux qubit device offers several benefits. First, its physics provides genuinely two-degree-of-freedom dynamics, whereas comparable experiments on Maxwellian demons and bit erasure use very high-dimensional systems that only indirectly yield effectively few-degree-of-freedom dynamics [41, 19, 21]. Second, it operates at very high frequency, so one readily captures the substantial amounts of data required to accurately estimate rare-fluctuation statistics. Third, it leverages recent advances in manufacturing technology led by efforts in quantum computing. Fourth, being constructed via modern integrated-circuit technology, it forms the basis of a technology that will scale to large, multicomponent circuit devices for more sophisticated thermodynamic computing. Finally, in the near future flux qubits will facilitate experiments that probe the thermodynamics of the transition to quantum information processing.

At the microscopic level, a fraction of the electrons in a superconducting metal form bosonic Cooper pairs—a quantum-coherent condensate. For designing superconducting electronic circuits, though, one can forgo the microscopic description and work with higher-level phenomena, such as flux quantization and the Josephson relations for weak links. Importantly, the circuit-level degrees of freedom are not coarse-grained quantities, but display a full range of quantum behavior, including quantized excitations, coherent superpositions, and entangled states in such circuits. For our purposes here, however, we run the device so that it exhibits only classical stochastic dynamics, reserving quantum information thermodynamic explorations for the future.

This section lays out the basic physics of the flux qubit device and details of the experimental implementation.

### VIII.1 Flux qubit physics

Our experimental information processor is a special type of superconducting quantum interference device (SQUID) with two degrees of freedom: a gradiometric flux qubit or the variable-β rf SQUID introduced by Ref. [42]. Notably, the energies associated with the motion perpendicular to and along the escape direction differ substantially by about a factor of . Practically, this asymmetry reduces the two-dimensional potential to one dimension. The net result is a device with an effective double-well potential with barriers as low as that operates at frequencies in the GHz range. The potential shape is set by fluxes that are, in turn, readily controlled by currents. The SQUID device parameters, used to determine the potential shape and energy scales, were all independently determined.

The variable-β rf SQUID replaces the single Josephson junction in a standard rf SQUID with a symmetric dc SQUID with small inductance , where is the loop inductance, is the sum of critical currents of the two junctions, and is the flux quantum . This architecture gives a device whose parameters can be accurately measured and that can be selected to exhibit a range of phenomena including thermal activation, macroscopic quantum tunneling, incoherent relaxation, photon-induced transitions, and macroscopic quantum coherence. It also allows us to perform, as we demonstrate, nanoscale thermodynamic computing.

Its macroscopic dynamical variables are the magnetic flux through the rf SQUID loop and through the dc SQUID loop. Based on the resistively-capacitively-shunted junction model of Josephson junctions, in the classical limit the variable-β rf SQUID’s deterministic equations of motion are [42]:

(S18) |

In units of , the 2D potential for the variable-β rf SQUID is with:
