1 Introduction
The predictive capacity of molecular dynamics (MD) calculations on large biomolecules is limited by a disparity between the long time scales of conformational motions and the short periods accessible to simulation ^{1, 2, 3}. This metastable trapping of simulations behind large free energy barriers is a consequence of rough free energy surfaces (FES) and leads to incomplete sampling of the thermally accessible phase space ^{1, 2, 4, 5}. Comprehensive sampling of all biologically relevant system states is required to compute converged thermodynamic averages, sample the structurally and functionally relevant molecular configurations, and explore large-scale collective motions critical to biological function such as allostery, substrate gating, and ligand binding ^{1, 2, 3}.
To confront this limitation, a plethora of enhanced sampling techniques have been developed to accelerate barrier crossing in MD simulation ^{4, 5}. These methodologies can be broadly partitioned into two categories: tempering / generalized ensemble techniques, and collective variable (CV) biasing ^{5}. We distinguish enhanced sampling methods that perform accelerated exploration of configurational space from path-based techniques such as transition path sampling (TPS) ^{6, 7, 8, 9, 10}, string methods ^{11, 12, 13, 14}, and forward flux sampling (FFS) ^{15, 16}, which quantify transition pathways and rates between predefined system states. Tempering techniques include simulated annealing ^{17}, the multicanonical algorithm ^{18}, replica exchange ^{19, 20}, and Hamiltonian exchange ^{21, 22, 23, 24, 25}, which modify the system temperature and/or Hamiltonian to accelerate barrier crossing ^{5}. Collective variable biasing techniques accelerate system dynamics along preselected CVs ^{4, 5}, and include non-Boltzmann sampling approaches such as umbrella sampling (US) ^{26}, hyperdynamics ^{27}, conformational flooding ^{28}, metadynamics ^{29, 30, 31}, adiabatic free energy dynamics (AFED) ^{32}, temperature accelerated molecular dynamics (TAMD) ^{33} / driven adiabatic free energy dynamics (d-AFED) ^{34}, temperature accelerated dynamics (TAD) ^{35}, blue moon sampling ^{36, 37, 38}, adaptive biasing force (ABF) ^{39}, and thermodynamic integration (TI) ^{40, 41}. Where “good” CVs coincident with the important collective motions that separate metastable states are available, CV biasing is typically more computationally efficient than tempering approaches, since it performs targeted acceleration along preselected order parameters ^{5, 42}. In instances where such CVs are not known or poor choices are made, CV biasing can fail badly, and tempering approaches generally prove more efficient and successful. It is normally difficult to intuit good CVs for all but the simplest systems.
Machine learning presents a systematic means to discover data-driven CVs. Conceptually, these approaches posit that the molecular dynamics in the 3N-dimensional space of the Cartesian coordinates of the N atoms are effectively restrained to a low-dimensional “intrinsic manifold” containing the slow dynamics to which the remaining fast degrees of freedom are slaved ^{4, 43}. The validity of this assumption has been demonstrated many times and corresponds to the emergence of a small number of collective modes from cooperative couplings between system degrees of freedom ^{44, 43, 45, 46, 47, 48, 49, 50, 51}. CV discovery algorithms are therefore essentially dimensionality reduction or manifold learning approaches ^{44, 4}, and the CVs are “good” in that they parameterize the intrinsic geometry of the phase space and can correspond to the slow motions of the system over the manifold ^{44, 45}. Data-driven CV discovery suffers from a fundamental difficulty: to perform good sampling of configurational space one requires good CVs that are explicit functions of the atomic coordinates, but to discover these CVs one needs to perform good sampling of configurational space to generate data for CV discovery. This is the root of the biased sampling “chicken-and-egg problem” identified by Clementi and coworkers ^{4}. Accordingly, data-driven CV discovery must typically be performed in an iterative fashion by interleaving rounds of accelerated sampling and CV discovery: each round of sampling provides more comprehensive sampling of configurational space, and each round of CV discovery determines collective variables parameterizing the slow dynamical motions over the space explored by the biased calculations.

Linear dimensionality reduction approaches such as principal components analysis (PCA)
^{52, 48, 51} and multidimensional scaling (MDS) ^{53} are straightforward to apply and furnish CVs that are explicit functions of the atomic coordinates. As inherently linear approaches, however, they typically furnish poor descriptors for complex biomolecules possessing convoluted and nonlinear intrinsic manifolds ^{4, 44, 45}. Kernel PCA presents a means to alleviate this problem by applying a known nonlinear transformation to the atomic coordinates prior to dimensionality reduction ^{54}, but the specification of appropriate kernels can be almost as difficult as intuiting the CVs themselves ^{44}. Nonlinear manifold learning approaches include local tangent space alignment ^{55}, locally linear embedding (LLE) ^{56, 57}, Isomap ^{49, 58, 59, 60}, Laplacian eigenmaps ^{61}, Hessian eigenmaps ^{62}, semidefinite embedding (SDE) / maximum variance unfolding (MVU)
^{59}, diffusion maps ^{63, 45, 64, 65, 66}, and sketch maps ^{67, 68, 69}. These approaches have demonstrated great success in parameterizing the nonlinear intrinsic manifolds of complex bio- and macromolecules by discovering nonlinear CVs corresponding to concerted structural motions ^{63, 45, 49, 50, 65, 66, 70, 71, 72}. For the purposes of CV biasing, however, these approaches all share a critical deficiency: they do not furnish the mapping of atomic coordinates to the CVs. An explicit and differentiable mapping from the atomic coordinates to the CVs is required to perform biased MD simulations in the CVs, in order to propagate biasing forces in the CVs to forces on atoms ^{1, 4, 44, 73, 74}. (Biased Monte Carlo requires that the CVs be explicit, but not necessarily differentiable, functions.) Without this mapping, it is not possible to perform biasing directly in CVs discovered by nonlinear learning.

A small number of approaches have been developed to perform indirect and approximate biasing in CVs for which the atomic mapping is unknown. We previously introduced perhaps the simplest approach, which correlates CVs with candidate physical variables (e.g., dihedral angles, radius of gyration) and performs biased sampling in these proxy variables ^{63}. High-throughput screening can sieve putative physical variables and identify those best correlated with the CVs ^{75, 76, 77}. This approach is, however, typically rather unsatisfactory, since it surrenders sampling in the CVs and is very sensitive to the quality of the physical proxies. A related approach fits the parameters of CVs with predefined functional forms ^{78}, but the definition of appropriate functions can be almost as difficult as defining the CVs themselves.
A more sophisticated approach expresses atomic coordinates as a sum of localized basis functions over a small number of “landmarks” in the CV embedding ^{1, 60, 73, 79}, and projects new data by performing a weighted nearest-neighbors embedding ^{73, 79} or solving a linear optimization problem to minimize the projection error ^{1, 60}. Accordingly, direct sampling in the CVs is relinquished for an interpolative and approximate basis function expansion ^{44}. An elegant strategy termed diffusion-map-directed molecular dynamics (DM-d-MD) proposed by Clementi and coworkers accelerates sampling by initializing unbiased simulations in poorly sampled regions of the CV projection ^{66, 72}. Kevrekidis and coworkers recently proposed intrinsic map dynamics (iMapD), an ingenious approach that first employs diffusion maps to construct a nonlinear parameterization of the local manifold, then combines boundary detection and local principal components analysis to smoothly extend the boundary and initialize short bursts of unbiased simulations in unexplored regions of configurational space ^{80}. Both DM-d-MD and iMapD use CVs discovered by nonlinear manifold learning to define the limits of current exploration of configurational space, then judiciously initialize new unbiased simulations to drive further exploration and accelerate sampling. Neither approach, however, actually performs biased sampling directly in the CVs. This can make phase space exploration inefficient by presenting difficulties in surmounting high free energy barriers down which unbiased simulations can rapidly tumble.

In this work we present a new approach based on autoassociative artificial neural networks (“autoencoders”) ^{81, 82, 83, 84, 85} to discover nonlinear CVs and perform biasing along these coordinates. Crucially, neural networks offer a flexible and powerful means to discover nonlinear CVs and furnish an explicit and differentiable function of the atomic coordinates. These CVs may then be implemented directly in any CV biasing approach for enhanced sampling, such as umbrella sampling or metadynamics. This feature makes autoencoders uniquely suited to CV discovery for enhanced sampling, and stands in contrast to existing nonlinear machine learning approaches that require appealing to proxy variables, basis set projections, or local linear dimensionality reduction. We term our approach Molecular Enhanced Sampling with Autoencoders (MESA).
In this article we introduce the theoretical and mathematical underpinnings of MESA, demonstrate its application to alanine dipeptide and Trp-cage, and make the approach freely available to the community through an open source plugin to the molecular simulation package OpenMM ^{86, 87, 88}.
The structure of this paper is as follows. In the next section, we establish the theoretical and mathematical underpinnings of MESA, including autoencoders, data augmentation, the L-method for dimensionality identification, umbrella sampling, and the weighted histogram analysis method (WHAM). In Section 3, we demonstrate our approach to discover nonlinear CVs, perform enhanced sampling in these coordinates, and efficiently recover the free energy surfaces of alanine dipeptide and Trp-cage. In Section 4, we close with our conclusions and an outlook for future work and development.
2 Methods
2.1 Nonlinear dimensionality reduction using autoencoders
2.1.1 Autoassociative artificial neural networks (autoencoders)
Artificial neural networks (ANNs) are collections of simple computational nodes (“artificial neurons”) linked together into a network ^{89}, so called because of their similarity to biological neural networks. ANNs have risen to prominence as a flexible tool to perform nonlinear regression and classification due to their large expressive power and predictive capacity ^{90}, with simple multilayer networks proven to be universal approximators of any nonlinear input-output relationship ^{89, 91}. In this work, we focus on a class of autoassociative ANNs known as autoencoders, which are specifically designed to perform unsupervised nonlinear dimensionality reduction and furnish explicit and differentiable functions for the nonlinear projection ^{82, 92, 93}. The structure of a prototypical 5-layer autoencoder is shown in Fig. 1. In this section we review the key mathematical details of these networks that we exploit within our CV discovery and enhanced sampling framework.

Architecture. The fundamental ansatz of the autoencoder is that there exists a low-dimensional subspace – the so-called “intrinsic manifold” ^{45} – containing most of the variance in the high-dimensional data. Training of the autoencoder can discover both the nonlinear projection of the high-dimensional data to the intrinsic manifold and an approximate reconstruction of the high-dimensional states from their low-dimensional projections ^{82, 94}. The architecture of the network encodes this ansatz within a five-layer, symmetric, fully-connected, feed-forward topology. The projection function is encoded in the first three layers, in which an ensemble of N high-dimensional data points {x^(n) ∈ R^d}, comprising system snapshots harvested from a molecular dynamics trajectory, are fed to an input layer containing d nodes, passed to the first hidden layer containing M nodes, and finally projected down into a bottleneck layer comprising K nodes containing the low-dimensional projections ξ^(n) ∈ R^K. The reconstruction function is encoded in the last three layers, in which the low-dimensional projections in the bottleneck layer are passed through the second hidden layer of M nodes, and finally to the d-node output layer containing the reconstructed data points x̂^(n). This is known as a d–M–K–M–d autoencoder architecture. The number of input and output nodes d is specified by the dimensionality of the data. The number of hidden nodes M is flexible and may be tuned via cross-validation to trade off network expressive flexibility against complexity. Choosing M intermediate between K and d often provides a good initial guess. The number of bottleneck nodes K specifies the intrinsic dimensionality of the system. We present below an automated approach to ascertain an appropriate value of K.

Activation functions. Denoting the input, output, and bias associated with node j of layer l as a_j^(l), o_j^(l), and b_j^(l), the weight of the connection between node i in layer (l−1) and node j in layer l as w_ji^(l), and the activation function of layer l as σ^(l), we can build up an explicit mathematical mapping from x to ξ. The inputs to the input layer are simply the components of the high-dimensional data points x. The input to any node in any other layer is the weighted sum of the outputs of all of the nodes in the previous layer plus a bias, a_j^(l) = Σ_i w_ji^(l) o_i^(l−1) + b_j^(l), and the output of that node is the result of the activation function acting on its input, o_j^(l) = σ^(l)(a_j^(l)). The input layer possesses linear activation functions (i.e., identity mappings) whereas the remaining layers possess nonlinear activation functions that we select to be hyperbolic tangents ^{95}. Importantly for our application, these smooth activation functions possess analytical first derivatives. Summing over the first three layers, the analytical expression for the nonlinear projection is,
\xi_k = \sigma^{(3)}\!\left( b_k^{(3)} + \sum_{j=1}^{M} w_{kj}^{(3)} \, \sigma^{(2)}\!\left( b_j^{(2)} + \sum_{i=1}^{d} w_{ji}^{(2)} x_i \right) \right), \quad k = 1, \ldots, K,    (1)
and for the nonlinear reconstruction is,
\hat{x}_i = \sigma^{(5)}\!\left( b_i^{(5)} + \sum_{j=1}^{M} w_{ij}^{(5)} \, \sigma^{(4)}\!\left( b_j^{(4)} + \sum_{k=1}^{K} w_{jk}^{(4)} \xi_k \right) \right), \quad i = 1, \ldots, d.    (2)
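As a concrete illustration of the d–M–K–M–d forward pass of Eqns. 1–2, the following minimal NumPy sketch implements the encoder and decoder with tanh activations. The weights shown are random placeholders standing in for a trained network, and all variable names are ours rather than part of any released code.

```python
import numpy as np

rng = np.random.default_rng(0)
d, M, K = 6, 4, 2                               # input, hidden, bottleneck widths

# Random stand-ins for the trained weights w^(l) and biases b^(l) in the text.
W2, b2 = rng.normal(size=(M, d)), np.zeros(M)   # layer 2: encoder hidden
W3, b3 = rng.normal(size=(K, M)), np.zeros(K)   # layer 3: bottleneck
W4, b4 = rng.normal(size=(M, K)), np.zeros(M)   # layer 4: decoder hidden
W5, b5 = rng.normal(size=(d, M)), np.zeros(d)   # layer 5: output

def encode(x):
    """Eqn. 1: nonlinear projection xi(x) onto the bottleneck CVs."""
    return np.tanh(W3 @ np.tanh(W2 @ x + b2) + b3)

def decode(xi):
    """Eqn. 2: approximate reconstruction x_hat(xi) of the input."""
    return np.tanh(W5 @ np.tanh(W4 @ xi + b4) + b5)

x = rng.normal(size=d)                          # one (centered) configuration
xi = encode(x)                                  # CVs: explicit function of x
x_hat = decode(xi)                              # reconstruction
```

Because encode is a composition of smooth functions, its Jacobian with respect to the inputs is available analytically (Eqn. 11), which is what permits biasing forces on the CVs to be propagated to the atoms.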
Training. Training of the autoencoder amounts to optimizing the network to reconstruct its own inputs (i.e., to autoencode). Mathematically, this corresponds to specifying the network weights {w} and biases {b} to minimize the error function,
E = \frac{1}{2N} \sum_{n=1}^{N} \left\| \hat{x}^{(n)} - x^{(n)} \right\|^2 + \frac{1}{2} \sum_{l=2}^{5} \lambda^{(l)} \sum_{j,i} \left( w_{ji}^{(l)} \right)^2    (3)
The first term is the reconstruction error, and the second term is a regularizing penalty known as weight decay that is typically included to prevent overfitting by controlling the growth of the network weights ^{82, 95}. In this work, we choose to set λ^(l) = 0 for all layers to eliminate the weight decay term, and instead regularize network training using early stopping, an approach that plays essentially the same role in guarding against overfitting without the need to tune the λ^(l) hyperparameters ^{96, 97}. Early stopping is implemented by randomly selecting 80% of the data as the training partition and the remaining 20% as the validation partition, and terminating training when we either reach the maximum number of training epochs or the error function evaluated over the validation partition has not decreased for 30 consecutive epochs. In each epoch, the neural network parameters are updated using minibatch stochastic gradient descent with momentum
^{89, 98, 99, 100}.

Dimensionality determination. The number of nodes K in the bottleneck layer specifies the dimensionality of the nonlinear projection. We empirically determine the dimensionality of the intrinsic manifold from the data by training d–M–K–M–d autoencoder architectures over a range of K, where M is tuned via cross-validation ^{95}. An appropriate value of K is determined by computing the fraction of variance explained (FVE) by the reconstructed outputs,
\mathrm{FVE} = 1 - \frac{SS_{\mathrm{res}}}{SS_{\mathrm{tot}}}    (4)
where SS_{\mathrm{res}} = \sum_{n=1}^{N} \| \hat{x}^{(n)} - x^{(n)} \|^2 is the residual sum of squares, SS_{\mathrm{tot}} = \sum_{n=1}^{N} \| x^{(n)} - \bar{x} \|^2 is the total sum of squares, and \bar{x} = \frac{1}{N} \sum_{n=1}^{N} x^{(n)} is the mean input vector. The emergence of a plateau or knee in the FVE curve at a particular value of K indicates that the predominance of the variance in the data exists within a K-dimensional intrinsic manifold. We determine the location of the knee in an automated fashion using the L-method of Salvador and Chan ^{101}. For this value of K and a high FVE in excess of 70%, we assert that the autoencoder has discovered that the high-dimensional data in R^d admit a low-dimensional representation in R^K from which the original data can be approximately reconstructed. The outputs ξ of the bottleneck layer in the trained autoencoder define the CVs parameterizing the low-dimensional intrinsic manifold. The explicit mapping of the high-dimensional input data to these CVs is furnished by the projection ξ(x) in Eqn. 1. Provided the activation functions of all of the nodes in the network are smooth functions, this explicit mapping also possesses analytical first derivatives with respect to the input data.

2.1.2 Elimination of translational and rotational invariances
Our application of autoencoders is to learn nonlinear low-dimensional CVs describing the important configurational motions of biomolecules. The training data comprise snapshots of the molecule harvested from molecular dynamics simulations, recording the Cartesian coordinates of the constituent atoms. When studying the configurational motions of a molecule in an isotropic medium, we typically wish to resolve internal structural reconfigurations and exclude trivial changes in rotational orientation or center of mass location. Accordingly, we must properly treat the training configurations x^(n) to eliminate rotation and translation. It is straightforward to eliminate translational degrees of freedom by subtracting the center of mass coordinates from each x^(n) prior to passing these data to the autoencoder. We eliminate rotational invariances using a data augmentation approach that is commonly used in deep learning to eliminate invariances and enhance training sets, and which has proved particularly valuable in image classification applications ^{102}. For each translationally centered molecular configuration x^(n), we generate an ensemble of R additional configurations constructed by applying random three-dimensional rotations to x^(n). In training the autoencoder we then assert that the network learn that all of these inputs should “autoencode” to the same output, thereby teaching the network to disregard rotations as a relevant degree of freedom and discover CVs that are a function only of the internal molecular configuration. Mathematically, this requires that we modify the error function defined in Eqn. 3 to,

E = \frac{1}{2NR} \sum_{n=1}^{N} \sum_{r=1}^{R} \left\| \mathcal{A}\!\left( \mathcal{R}_r x^{(n)} \right) - \hat{x}\!\left( \mathcal{R}_r x^{(n)} \right) \right\|^2    (5)
where \mathcal{R}_r denotes a rotation of configuration x^(n) by a randomly selected angle around a randomly selected axis in 3D space, and \mathcal{A} represents the optimal rotational alignment of the rotated configuration \mathcal{R}_r x^(n) to a reference configuration that has been previously translated such that its center of mass is coincident with the origin. The operation \mathcal{A} can be efficiently performed using the Kabsch algorithm ^{103}. Training of the autoencoder weights and biases to minimize Eqn. 5 teaches the network to map arbitrary rotations of a particular configuration to the alignment of that configuration with a specific reference structure. A deficiency of this approach is that we must select a particular reference configuration, and this selection can affect the particular set of trained weights and biases. We observe that the bias introduced by the choice of reference state can be mitigated, and the autoencoder made robust to this particular choice, by employing N_ref reference configurations, each of which is reconstructed by a subset of the outputs of an augmented terminal layer. For example, we show in Fig. 2 an autoencoder with 4–8–2–8–(2×4) topology employing two reference configurations. The network is identical to that in Fig. 1, except that approximations to each of the two reference configurations are computed in the terminal layer. We then define the error function as the sum of the errors associated with each reference structure,
E = \sum_{s=1}^{N_{\mathrm{ref}}} \frac{1}{2NR} \sum_{n=1}^{N} \sum_{r=1}^{R} \left\| \mathcal{A}_s\!\left( \mathcal{R}_r x^{(n)} \right) - \hat{x}_s\!\left( \mathcal{R}_r x^{(n)} \right) \right\|^2    (6)
where \mathcal{A}_s denotes alignment to reference configuration s and \hat{x}_s the corresponding subset of the terminal layer outputs, allowing the weights and biases of the output layer to differ for each of the reference configurations. In this manner, we confine the effect of different reference structure rotations to the parameters of the terminal layer, and the weights and biases of the first four layers are trained to be invariant to the choice of reference structure. In practice, we find satisfactory performance for autoencoders trained with N_ref = 2 reference structures and R = 16 random rotations.
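The augmentation entering Eqns. 5–6 can be sketched as follows: generate a random rotation \mathcal{R}_r via Rodrigues' rotation formula and compute the Kabsch alignment \mathcal{A} to a centered reference. This is an illustrative reconstruction under our own naming, not the authors' code; configurations are (n_atoms, 3) arrays.

```python
import numpy as np

def random_rotation(rng):
    """Rotation by a random angle about a random axis (Rodrigues' formula)."""
    axis = rng.normal(size=3)
    axis /= np.linalg.norm(axis)
    theta = rng.uniform(0.0, 2.0 * np.pi)
    Kx = np.array([[0.0, -axis[2], axis[1]],
                   [axis[2], 0.0, -axis[0]],
                   [-axis[1], axis[0], 0.0]])
    return np.eye(3) + np.sin(theta) * Kx + (1.0 - np.cos(theta)) * Kx @ Kx

def kabsch_align(P, ref):
    """Rotate centered configuration P to minimize RMSD to the centered ref."""
    U, _, Vt = np.linalg.svd(ref.T @ P)
    sign = np.sign(np.linalg.det(U @ Vt))          # exclude improper rotations
    R = U @ np.diag([1.0, 1.0, sign]) @ Vt
    return P @ R.T

rng = np.random.default_rng(1)
ref = rng.normal(size=(5, 3))
ref -= ref.mean(axis=0)                            # center the reference
rotated = ref @ random_rotation(rng).T             # R_r x^(n): augmented input
aligned = kabsch_align(rotated, ref)               # A(R_r x^(n)): training target
assert np.allclose(aligned, ref, atol=1e-8)        # alignment undoes the rotation
```

The final assertion illustrates the point of Eqn. 5: every random rotation of a configuration maps to the same aligned target, so the network is taught to ignore rigid-body rotations.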
2.2 Enhanced sampling in autoencoder CVs
2.2.1 Umbrella sampling
The principal advantage of autoencoders over competing nonlinear dimensionality reduction techniques is that the mapping of the high dimensional data – here molecular configurations – to the intrinsic manifold spanned by the discovered CVs is an explicit and differentiable function of the input coordinates given by Eqn. 1
. Accordingly, biasing forces on the CVs can be analytically propagated to biasing forces on the atoms by repeated application of the chain rule. Enhanced sampling can then be conducted directly in the discovered CVs using any of the collective variable biasing techniques listed in Section
1. In this work we choose to use umbrella sampling, but other popular approaches such as metadynamics or adaptive biasing force calculations could be straightforwardly employed.

Umbrella sampling (US) was first introduced by Torrie and Valleau in 1977 ^{26}. The method proceeds by augmenting the configurational part of the classical Hamiltonian, written in the coordinates of all constituent atoms, with a (typically harmonic) restraining potential U(ξ) in a set of predefined collective variables ξ. These potentials confine the system to a narrow region of CV space that looks locally flat, within which the system can comprehensively sample the accessible configurational space ^{104, 105}. By tiling CV space with a number of overlapping umbrella windows, free energy barriers can be surmounted and the full extent of CV space completely explored. Success of the approach depends crucially on the selection of “good” CVs that track the slow collective molecular motions of the system, adequately discriminate the metastable states, and resolve the free energy barriers within the system.
The bias-augmented system Hamiltonian H_biased = H + U(ξ) determines the force experienced by each atom through its first derivatives with respect to the atomic position \vec{r}_i,
\vec{F}_i = -\frac{\partial H_{\mathrm{biased}}}{\partial \vec{r}_i} = -\frac{\partial H}{\partial \vec{r}_i} - \frac{\partial U}{\partial \vec{r}_i} = \vec{f}_i + \vec{f}_i^{\,\mathrm{bias}}    (7)
where we identify \vec{f}_i = -\partial H / \partial \vec{r}_i as the unbiased forces on the atoms arising from the system Hamiltonian in the absence of the biasing potential, and \vec{f}_i^{\,\mathrm{bias}} = -\partial U / \partial \vec{r}_i as the artificial biasing forces on the atoms arising from the potential applied to the CVs. In performing a biased molecular dynamics simulation, the unbiased forces are computed from the molecular force field in exactly the same manner as in an unbiased calculation. These are supplemented by the biasing forces derived from the artificial restraining potential applied to the CVs. We now proceed to derive analytical expressions for the biasing forces. In Section 2.5 we describe our computational implementation within biased molecular dynamics simulations.
We first recognize that the CVs ξ furnished by the bottleneck layer depend on the autoencoder inputs x through the map defined in Eqn. 1. In applications of nonlinear machine learning to biomolecular folding it is conventional to treat the solvent atom coordinates implicitly through their influence on the conformational ensemble explored by the biomolecule, but it is a possible extension of this work to explicitly consider solvent degrees of freedom ^{44}. Accordingly, the autoencoder inputs x comprise a list of the Cartesian coordinates of the atoms constituting the biomolecule as a subset of the coordinates of all atoms in the system. Armed with these nested dependencies, we may expand \partial U / \partial \vec{r}_i as a product of three Jacobian matrices resulting from repeated application of the chain rule,
\frac{\partial U}{\partial \vec{r}_i} = \frac{\partial U}{\partial \xi} \, \frac{\partial \xi}{\partial x} \, \frac{\partial x}{\partial \vec{r}_i}    (8)
where \partial U / \partial \xi is a 1-by-K matrix, \partial \xi / \partial x is a K-by-d matrix, and \partial x / \partial \vec{r}_i is a d-by-3 matrix ^{1}. Each of these three matrices possesses a straightforward analytical expression.
Assuming the standard choice of a harmonic biasing potential in each of the K CVs, the artificial biasing potential can be written as,
U(\xi) = \sum_{k=1}^{K} \frac{\kappa_k}{2} \left( \xi_k - \xi_k^{0} \right)^2    (9)
where \kappa_k is the harmonic force constant and \xi_k^{0} the harmonic center for CV k. The elements of the Jacobian matrix \partial U / \partial \xi immediately follow as,
\frac{\partial U}{\partial \xi_k} = \kappa_k \left( \xi_k - \xi_k^{0} \right)    (10)
Appealing to Eqn. 1, the elements of the Jacobian matrix \partial \xi / \partial x follow from the low-dimensional projection furnished by the trained autoencoder,
\frac{\partial \xi_k}{\partial x_i} = \sigma^{(3)\prime}\!\left( a_k^{(3)} \right) \sum_{j=1}^{M} w_{kj}^{(3)} \, \sigma^{(2)\prime}\!\left( a_j^{(2)} \right) w_{ji}^{(2)}    (11)
where the w^{(l)} and b^{(l)} are the weights and biases of the trained network, \sigma^{(2)} and \sigma^{(3)} are the activation functions associated with the second and third layers, and \sigma^{(2)\prime} and \sigma^{(3)\prime} are their first derivatives, evaluated at the layer inputs a_j^{(2)} = b_j^{(2)} + \sum_{i=1}^{d} w_{ji}^{(2)} x_i and a_k^{(3)} = b_k^{(3)} + \sum_{j=1}^{M} w_{kj}^{(3)} \sigma^{(2)}( a_j^{(2)} ).
Since the inputs x constitute a subset of the atomic Cartesian coordinates, the final Jacobian matrix \partial x / \partial \vec{r}_i possesses a very simple form. For an atom i whose coordinates appear in the autoencoder input, the d-by-3 matrix contains exactly three unit elements: a 1 in the first column of the row where x holds the x-coordinate of atom i, a 1 in the second column of the row holding its y-coordinate, and a 1 in the third column of the row holding its z-coordinate, with all other entries zero. For atoms that do not feature in the input, all matrix elements are zero. Accordingly, this matrix serves as a filter that passes forces onto only those atoms whose coordinates feature in the inputs to the autoencoder.
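The chain rule of Eqns. 8–11 can be checked numerically. The sketch below is our own illustration with random stand-in weights: it computes the analytical encoder Jacobian and the resulting bias forces on the autoencoder inputs for the harmonic potential of Eqn. 9, and verifies the Jacobian against central finite differences.

```python
import numpy as np

rng = np.random.default_rng(2)
d, M, K = 6, 5, 2
W2, b2 = rng.normal(size=(M, d)) * 0.3, np.zeros(M)
W3, b3 = rng.normal(size=(K, M)) * 0.3, np.zeros(K)

def encode(x):
    """Eqn. 1: the two nonlinear encoder layers (the input layer is linear)."""
    return np.tanh(W3 @ np.tanh(W2 @ x + b2) + b3)

def jacobian(x):
    """Eqn. 11: the K-by-d matrix of derivatives d(xi_k)/d(x_i)."""
    a2 = W2 @ x + b2                              # hidden-layer inputs a^(2)
    a3 = W3 @ np.tanh(a2) + b3                    # bottleneck-layer inputs a^(3)
    # tanh'(a) = 1 - tanh(a)^2
    inner = (1.0 - np.tanh(a2) ** 2)[:, None] * W2
    return (1.0 - np.tanh(a3) ** 2)[:, None] * (W3 @ inner)

def bias_forces(x, kappa, xi0):
    """Eqns. 8-10: -dU/dx for U = sum_k kappa_k/2 (xi_k - xi0_k)^2."""
    dU_dxi = kappa * (encode(x) - xi0)            # Eqn. 10, a 1-by-K row
    return -(dU_dxi @ jacobian(x))                # propagate onto the d inputs

# Verify the analytical Jacobian against central finite differences.
x = rng.normal(size=d)
eps = 1e-6
J_fd = np.stack([(encode(x + eps * e) - encode(x - eps * e)) / (2 * eps)
                 for e in np.eye(d)], axis=1)
assert np.allclose(jacobian(x), J_fd, atol=1e-6)
```

When the umbrella center coincides with the current CV values, the bias forces vanish, as expected from Eqn. 10.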
2.2.2 Weighted histogram analysis method (WHAM)
Having conducted a series of umbrella sampling simulations tiling CV space, we estimate the unbiased free energy surface F(ξ) in these CVs by reweighting and combining the biased histograms recorded in each window using the weighted histogram analysis method (WHAM) ^{106, 107, 108, 109, 110}. The WHAM estimator minimizes statistical errors in the estimated unbiased probability distribution P(ξ) ^{106, 107, 108, 109, 110, 111, 112, 113, 114}, and the free energy surface is then computed from the estimated unbiased distribution using the standard statistical mechanical relation ^{115},

F(\xi) = -k_B T \ln P(\xi) + C    (12)

where C is an arbitrary additive constant reflecting our ignorance of the absolute free energy scale. Free energy surfaces in (collective) variables other than those in which biased sampling was conducted are estimated from the biased simulation data and the WHAM solution using the approach detailed in Ref. ^{63}.
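Eqn. 12 reduces to a one-line transformation of the WHAM-estimated histogram. A minimal sketch (ours, with kT in reduced units and the additive constant C fixed here by zeroing the global minimum):

```python
import numpy as np

def free_energy_surface(P, kT=1.0):
    """Eqn. 12: F(xi) = -kT ln P(xi) + C, with C set so min(F) = 0."""
    F = np.full_like(P, np.inf)                   # unsampled bins: F -> +inf
    nonzero = P > 0
    F[nonzero] = -kT * np.log(P[nonzero])
    return F - F[nonzero].min()                   # fix the arbitrary constant C

# Toy unbiased distribution over four CV bins (last bin never sampled).
P = np.array([0.5, 0.3, 0.2, 0.0])
F = free_energy_surface(P)
assert F[0] == 0.0                                # most probable state at F = 0
assert np.isinf(F[3])                             # unsampled bins undefined
```

In practice P(ξ) would come from the self-consistent WHAM solution over all umbrella windows rather than a single histogram; only the final conversion to F(ξ) is shown here.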
2.3 MESA: Interleaved onthefly CV discovery and enhanced sampling
We now present our novel framework for nonlinear CV discovery and accelerated sampling that we term MESA. The heart of the approach is the discovery of CVs that are explicit and differentiable functions of the atomic coordinates, which can then be used to perform enhanced sampling by surgical targeting of biased calculations in these collective coordinates. Having conducted additional sampling of the space, the data-driven CVs spanning this expanded space will, in general, differ from those parameterizing the original space, and we must retrain our autoencoder including the new data. Furthermore, the dimensionality of the system may change with location in configurational space, and new CVs can emerge as we conduct additional sampling ^{63, 80}. This is the origin of the “chicken-and-egg problem” identified by Clementi and coworkers ^{4}, and requires that we interleave successive rounds of CV discovery and biased sampling until the discovered CVs and the free energy surface they parameterize stabilize. This informs a six-step iterative protocol for the application of MESA, for which a schematic flowchart is provided in Fig. 3. We observe that the computational cost associated with autoencoder training is vastly lower than that of the biased molecular simulations for all but the most trivial of molecular systems. For example, execution of 15 × 2 ns biased simulations of the 20-residue Trp-cage miniprotein ^{116} in a bath of 2773 explicit water molecules required 8 GPU-hours, whereas training of an autoencoder over the 32,000 harvested molecular configurations required only 5 GPU-minutes.
Step 1 – Generate initial training data. Perform an initial unbiased simulation to generate an initial set of atomic coordinates for the first round of autoencoder training. The CV discovery process performs better for larger data sets that explore larger volumes of configurational space, and so it is typically beneficial to conduct relatively long unbiased calculations. As a rule of thumb, the initial unbiased run is sufficiently good for initial CV discovery if the fraction of variance explained by the reconstructed outputs (Eqn. 4) exceeds 40%. For systems possessing very rugged free energy surfaces, it can be useful to conduct this unbiased run at elevated temperature, employ a biased calculation in intuited CVs based on expert knowledge, or run calculations of a related but less computationally expensive model, such as a biomolecule in vacuum or implicit solvent or a coarse-grained analog, as a form of transfer learning ^{1, 117}.

Step 2 – Autoencoder CV discovery. Train an ensemble of d–M–K–M–d autoencoders with different values of K corresponding to various numbers of bottleneck nodes; M is tuned at each value of K by cross-validation. Perform training over all molecular configurations collected in all previous rounds of biased and unbiased sampling using the data-augmented training procedure described in Section 2.1.2, in which we eliminate translational and rotational invariances. It is typically good practice to train a number of autoencoders at each value of K by initializing each with different randomly assigned weights and biases and selecting the top performer as measured by the fraction of variance explained in the reconstructed outputs (Eqn. 4). Apply the L-method ^{101} to identify a knee in the FVE plot, determine the dimensionality of the intrinsic manifold, and select an appropriate value of K.
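The FVE criterion (Eqn. 4) and the L-method knee detection of Step 2 can be sketched as follows. The two-segment fitting convention (segments sharing the candidate knee point, weighted by segment length) is our reading of Salvador and Chan ^{101}, and all names are illustrative.

```python
import numpy as np

def fve(X, X_hat):
    """Eqn. 4: fraction of variance explained by the reconstructions X_hat."""
    ss_res = np.sum((X - X_hat) ** 2)
    ss_tot = np.sum((X - X.mean(axis=0)) ** 2)
    return 1.0 - ss_res / ss_tot

def l_method_knee(y):
    """Fit two straight lines to (K, FVE) and place the knee at the split
    minimizing the length-weighted total root-mean-square fit error."""
    x = np.arange(1, len(y) + 1, dtype=float)
    best_c, best_err = None, np.inf
    for c in range(2, len(y) - 1):              # each segment needs >= 2 points
        err = 0.0
        for xs, ys in ((x[:c], y[:c]), (x[c - 1:], y[c - 1:])):
            coef = np.polyfit(xs, ys, 1)
            rmse = np.sqrt(np.mean((ys - np.polyval(coef, xs)) ** 2))
            err += (len(xs) / len(y)) * rmse
        if err < best_err:
            best_c, best_err = c, err
    return best_c                               # value of K at the knee

# A synthetic FVE curve that rises steeply up to K = 3 and then plateaus.
fve_curve = np.array([0.30, 0.55, 0.80, 0.82, 0.84, 0.85, 0.86])
assert l_method_knee(fve_curve) == 3
```

In MESA the curve would be built by evaluating fve over trained autoencoders at successive K; the knee then fixes the bottleneck dimensionality for the next round of biased sampling.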
Step 3 – Boundary detection. Map all molecular configurations to the low-dimensional intrinsic manifold using the trained autoencoder. Employ a boundary detection algorithm to define the (K−1)-dimensional boundary of the explored region of the K-dimensional manifold. Possible approaches include alpha shapes ^{80, 118, 119} or “wrapping” algorithms ^{120}. Here we employ a grid-based approach motivated by the normalized graph Laplacian that discretizes CV space to identify cells that are both sparsely populated and adjacent to densely populated regions.

1. Divide the space spanned by the collected data into a cubic grid comprising cells with side lengths l_k along each of the K CVs. The size of the cells controls how finely we resolve the boundary and how far we extrapolate beyond the explored region. The optimal values are system dependent, but we find the algorithm to be quite robust to the particular choice. As a rule of thumb, we find that taking l_k equal to a few percent of the linear range of the exploration in ξ_k provides satisfactory results. Collect histograms over this grid and let n_j denote the number of points falling within cell j.

2. Compute p_j = e^{−n_j} for each cell. This operation applies a Laplacian / exponential kernel with unit variance to each cell count, which has the effect of amplifying the difference between empty cells and non-empty cells.

3. Let N_j be the set containing the indexes of the neighboring cells of cell j. Compute d_j = (1/|N_j|) Σ_{m∈N_j} p_m − p_j. This operation measures the difference between the population measure of each grid cell and the mean population measure of its neighbors.

4. Discard all cell indexes j with positive d_j. Sort the remaining d_j and select the n_US most negative values (i.e., largest absolute values) as the cells in which to launch biased umbrella sampling runs. In this work we select n_US = 10–20, but this can be tuned based on available computational resources.
We find this procedure to perform quite well both in identifying the boundary of the explored region of the intrinsic manifold and in identifying internal holes within this region. It is also quite robust to the shape and dimensionality of the explored region. An illustration of the boundary detection procedure is provided in Fig. 4.
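A self-contained two-dimensional sketch of steps 1–4 above is given below. The neighbor-difference convention (mean of the neighbors' kernel values minus the cell's own) is one self-consistent reading of the procedure, chosen so that sparsely populated cells adjacent to densely populated regions receive the most negative scores; the grid size and all names are illustrative.

```python
import numpy as np

def boundary_cells(points, n_bins=10, n_launch=5):
    """Return up to n_launch sparsely populated grid cells that adjoin
    densely populated regions of a 2D CV space on the unit square."""
    counts, _, _ = np.histogram2d(points[:, 0], points[:, 1],
                                  bins=n_bins, range=[[0.0, 1.0], [0.0, 1.0]])
    p = np.exp(-counts)                          # Laplacian kernel on counts
    scores = np.zeros_like(p)
    for i in range(n_bins):
        for j in range(n_bins):
            nbrs = [p[a, b] for a, b in ((i - 1, j), (i + 1, j),
                                         (i, j - 1), (i, j + 1))
                    if 0 <= a < n_bins and 0 <= b < n_bins]
            scores[i, j] = np.mean(nbrs) - p[i, j]
    ranked = sorted((scores[i, j], (i, j))
                    for i in range(n_bins) for j in range(n_bins)
                    if scores[i, j] < 0)         # keep only negative scores
    return [cell for _, cell in ranked[:n_launch]]

# Points clustered in one corner of the unit square: the selected cells are
# the empty cells hugging that cluster, where umbrella runs would be launched.
rng = np.random.default_rng(3)
pts = rng.uniform(0.0, 0.3, size=(500, 2))
cells = boundary_cells(pts)
```

Deep interior cells and cells far outside the explored region score near zero and are discarded, so only the frontier (and any internal holes) is returned.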
Step 4 – Enhanced sampling. Run umbrella sampling simulations in each of the identified boundary cells, employing harmonic restraining potentials given by Eqn. 9. This step both advances the frontier of the explored region and fills internal holes by direct biasing in the identified CVs. In this work we select the umbrella potential centers ξ^0 to be the centers of the identified cubic cells, but it is straightforward to place the biased runs closer to or further from the boundary. Appropriate values for the harmonic force constants κ_k are strongly system dependent, and must typically be tuned by trial and error. For systems possessing high free energy barriers, we suggest that it may be useful to adaptively tune κ_k based on the distance between the center of the umbrella sampling point cloud and the potential center. Specifically, if the center of mass of the point cloud generated by the biased calculation falls outside of the grid cell, the harmonic force constant may be progressively increased up to a prespecified maximum. In this manner, the biased simulation can adapt to the local gradient and topography of the underlying free energy landscape ^{80}.
Step 5 – Convergence assessment. Determine whether the umbrella sampling runs have led to substantial new exploration of configurational space. We test this in two ways. First, we project the new molecular configurations onto the intrinsic manifold using the autoencoder and determine whether the boundary of the explored region has changed after the addition of the new data. Second, we perform Step 2 to retrain autoencoders on the augmented data and identify whether the dimensionality of the intrinsic manifold and its parameterizing CVs have converged. The dimensionality of the intrinsic manifold is assessed by applying the L-method^{101} to identify a knee in the FVE plot. Stabilization of the CVs is assessed by performing a linear regression of the old CVs in terms of the new and looking for coefficients of determination in excess of 95%. This procedure permits trivial linear transformations between the old and new CVs – sign changes and rotations – that do not affect the information content of the parameterization. If the boundary has not moved, the dimensionality has not changed, and the CVs have stabilized, then we terminate MESA. Otherwise, we perform a new round of interleaved CV discovery and biased sampling by cycling through the loop again from Step 2.
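The CV-stabilization test described above – regressing each old CV linearly on the new CVs and requiring a coefficient of determination above 0.95 – can be sketched as below; the array-based interface is our assumption:

```python
import numpy as np

def cvs_stabilized(old_cvs, new_cvs, r2_threshold=0.95):
    """Regress each old CV linearly on the new CVs (plus an intercept) and
    require R^2 > r2_threshold for every old CV. Trivial linear maps between
    old and new CVs (sign flips, rotations) therefore count as 'unchanged'.

    old_cvs, new_cvs : (n_samples, n_cvs) arrays of CV projections of the
    same configurations under the old and new autoencoders.
    """
    X = np.column_stack([new_cvs, np.ones(len(new_cvs))])  # design matrix
    r2 = []
    for j in range(old_cvs.shape[1]):
        y = old_cvs[:, j]
        coef, *_ = np.linalg.lstsq(X, y, rcond=None)
        resid = y - X @ coef
        r2.append(1.0 - resid.var() / y.var())              # R^2 per old CV
    return all(r > r2_threshold for r in r2), r2
```

A pure rotation of the CVs passes the test with R² ≈ 1, while genuinely new CVs uncorrelated with the old ones fail it.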
Step 6 – Compute free energy surface. Perform umbrella sampling over the complete space spanned by the converged CVs specified by the terminal autoencoder. Construct the optimal estimate of the unbiased free energy surface in the collective variables by solving the WHAM equations, and optionally reweighting the unbiased free energy surface into arbitrary collective coordinates (Section 2.2.2).
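As a rough illustration of the WHAM step, a minimal 1D solver for harmonic umbrella windows on a shared grid might look like the following. The interface and the fixed iteration count are our assumptions; a production implementation would iterate to a convergence tolerance and handle multidimensional CVs:

```python
import numpy as np

def wham_1d(hists, centers, kappa, grid, beta=1.0, n_iter=500):
    """Minimal 1D WHAM solver.

    hists   : (n_windows, n_bins) biased histograms.
    centers : umbrella centers xi_i of the harmonic biases.
    kappa   : harmonic force constant (energy per CV unit squared).
    grid    : bin centers of the shared CV grid.
    Returns the unbiased free energy on the grid in units of kT.
    """
    n_samples = hists.sum(axis=1)                         # N_i per window
    bias = 0.5 * kappa * (grid[None, :] - centers[:, None]) ** 2
    boltz = np.exp(-beta * bias)                          # exp(-beta U_i(xi))
    f = np.ones(len(hists))                               # window factors
    for _ in range(n_iter):
        # optimal unbiased probability estimate given current f_i
        prob = hists.sum(axis=0) / (
            n_samples[:, None] * f[:, None] * boltz).sum(axis=0)
        # self-consistency condition for the window factors f_i
        f = 1.0 / (boltz * prob[None, :]).sum(axis=1)
    prob /= prob.sum()
    return -np.log(prob + 1e-12) / beta
```

Given exactly consistent biased histograms over a flat underlying landscape, the iteration converges to a flat free energy profile, which makes a convenient self-test.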
This algorithm possesses a number of attractive features. First, all steps are largely automated, requiring minimal user intervention, including training of the autoencoder, boundary detection, and identification and launch of new umbrella runs. The only user input required beyond specifying the parameters of the algorithm is the judgment in Step 5 of whether the boundary and/or CVs are sufficiently similar to warrant termination of MESA. In principle, this step could also be automated. Second, by interleaving CV discovery and enhanced sampling, we converge towards a set of global data-driven CVs providing a good parameterization of the accessible configurational space. Third, the availability of the explicit mapping from the atomic coordinates to the CVs enables enhanced sampling directly in the CVs. This stands in contrast to existing approaches that initialize unbiased simulations at the edge of the explored region in CV space to drive sampling of unexplored configurational space ^{80, 66, 72} and/or perform sampling in proxy physical variables^{63} or in an approximate basis function expansion ^{1, 60, 73, 78, 79}. Fourth, the collective coordinates are useful not only for enhanced sampling, but also for understanding the important slow collective motions governing the long-time evolution of the system. The explicit mapping between the atomic coordinates and CVs can help provide a transparent physical interpretation of these collective modes. Although these relations can be somewhat obscured by the complexity of the mapping function (Eqn. 1), the weights within the trained network can be quite informative in illuminating which atoms play important roles in each of the discovered CVs.
2.4 Molecular dynamics simulations
We demonstrate and validate MESA in applications to two peptides: alanine dipeptide and Trp-cage (Fig. 5). All molecular simulations were performed using the OpenMM 7.0 molecular dynamics package ^{86, 87, 88}.
2.4.1 Alanine dipeptide
Simulations of alanine dipeptide (N-acetyl-L-alanine-N′-methylamide, Ac-Ala-NHMe) were performed in vacuum using the Amber99sb force field ^{122}. Calculations were performed at T = 300 K and the equations of motion numerically integrated using a Langevin integrator with a friction coefficient of 1 ps⁻¹ and a 2 fs time step ^{123}. All bond lengths were fixed using the LINCS algorithm ^{124}. Lennard-Jones interactions were treated with no cutoff, and we employed the Lorentz-Berthelot combining rules to determine interaction parameters between dissimilar atom types ^{125}. Electrostatic interactions were computed exactly in real space using no cutoff. A single 800 ps initial unbiased calculation was performed, and snapshots saved every 1 ps to generate an initial unbiased ensemble of 800 molecular configurations. Enhanced sampling calculations were performed using an in-house biasing force plugin to OpenMM to launch 10–20 umbrella windows in each round employing harmonic restraining potentials with force constants of 3000 kJ/mol·(unit of CV)⁻² in each CV. Each umbrella calculation was performed for 100 ps, saving configurations every 1 ps. All calculations were performed on Intel i7-5820K chips (6 cores, 15 MB cache, 3.8 GHz overclocked), achieving execution speeds of 6 μs/day·core for unbiased simulations and 1.3 μs/day·core for biased calculations.
2.4.2 Trpcage
Simulations of Trp-cage (NLYIQWLKDGGPSSGRPPPS; PDB ID: 1L2Y) were performed in a bath of 2773 TIP4P-Ew water molecules ^{126}, treating the protein using the Amber03 force field ^{122}. A single Cl⁻ counterion was added to neutralize the net single positive charge of Trp-cage. Initial molecular configurations were downloaded from the Protein Data Bank ^{116, 127}. Calculations were conducted at T = 300 K using an Andersen thermostat ^{128} and P = 1 atm using a Monte Carlo barostat^{129, 130}. Equations of motion were numerically integrated using the leapfrog Verlet algorithm with a 2 fs time step ^{131}. The lengths of all bonds involving H atoms were fixed using the LINCS algorithm ^{124}. Lennard-Jones interactions were subjected to a 1 nm cutoff, and Lorentz-Berthelot combining rules used to determine interaction parameters between dissimilar atom types ^{125}. Electrostatic interactions were treated using particle mesh Ewald with a 1 nm cutoff ^{132}. Three independent 10 ns unbiased simulations were conducted, saving snapshots every 20 ps, to generate an initial unbiased ensemble of 1500 molecular configurations. Enhanced sampling calculations were performed using an in-house biasing force plugin to OpenMM to launch 10–20 umbrella windows in each round employing harmonic restraining potentials with force constants of a minimum of 3000 kJ/mol·(unit of CV)⁻² in each CV. Higher force constants were employed as necessary to achieve good sampling in rugged or steep regions of the landscape. Each umbrella calculation was performed for 2 ns, saving configurations every 20 ps. All calculations were performed on GeForce GTX 960 GPU cards, achieving execution speeds of 100 ns/day per card for the unbiased simulations and 95 ns/day per card for the biased calculations.
2.5 Implementation and open source availability
All molecular simulations were performed using the open source OpenMM 7.0 simulation package available for free download from http://openmm.org ^{86, 88}. All artificial neural network construction and training was conducted using the Keras deep learning Python API, available from https://github.com/fchollet/keras and running on top of the Theano libraries ^{133}. Umbrella sampling calculations were performed using an in-house biasing force plugin ANN_Force to OpenMM that computes all of the Jacobian matrix elements defined in Section 2.2.1 from a trained autoencoder, and calculates from these the atomic biasing forces for any given molecular configuration. We have implemented this plugin for both the CPU and CUDA platforms of OpenMM. We have made all of our software implementations available for free and open source public download under the MIT License. The ANN_Force biasing force plugin to OpenMM is available from https://github.com/weiHelloWorld/ANN_Force. The enhanced sampling framework to train autoencoders, perform boundary detection, deploy and run biased umbrella runs, and post-process the biased simulation data is available from https://github.com/weiHelloWorld/accelerated_sampling_with_autoencoder.
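To illustrate the kind of computation a plugin like ANN_Force must perform, the following sketch propagates a harmonic umbrella potential in the CVs of a toy one-hidden-layer encoder into atomic biasing forces via the analytic Jacobian of the network. The tiny network and its shapes are hypothetical stand-ins for the trained autoencoder's encoder half:

```python
import numpy as np

def encoder(x, W1, b1, W2, b2):
    """Toy one-hidden-layer encoder: CVs xi(x) = W2 tanh(W1 x + b1) + b2."""
    return W2 @ np.tanh(W1 @ x + b1) + b2

def biasing_forces(x, W1, b1, W2, b2, xi0, kappa):
    """Propagate a harmonic umbrella U = 0.5*kappa*|xi(x) - xi0|^2 in the
    CVs into atomic forces F = -dU/dx via the encoder Jacobian dxi/dx."""
    h = W1 @ x + b1
    xi = W2 @ np.tanh(h) + b2
    # Analytic Jacobian of the encoder: dxi/dx = W2 diag(1 - tanh(h)^2) W1
    J = W2 @ (np.diag(1.0 - np.tanh(h) ** 2) @ W1)
    # Chain rule: dU/dx = kappa (xi - xi0)^T J, and F = -dU/dx
    return -kappa * (J.T @ (xi - xi0))
```

A finite-difference check of the potential gradient confirms that the analytic forces are correct, which is the standard way to validate such a plugin.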
3 Results and Discussion
We now discuss the application of MESA to two test peptides – alanine dipeptide and Trp-cage – to demonstrate, validate, and benchmark the approach (Fig. 5).
3.1 Alanine dipeptide
Alanine dipeptide is a canonical and well-studied test system for new biomolecular methods ^{134, 63, 66, 72, 135, 136, 137, 138} (Fig. 5a). It possesses a 2D intrinsic manifold that is conventionally parameterized by the φ and ψ backbone dihedrals and lies on the surface of a flat torus ^{134, 139, 66, 140}. The free energy surface supported by this intrinsic manifold is very well studied, and in vacuum supports a landscape containing three local minima ^{139, 66}. Interconversion between the two lower-lying minima occurs on the order of 50 ps, but transitions to the third minimum are around 100-fold slower, making efficient exploration and recovery of converged free energy surfaces a good test of our approach ^{66}. Unbiased and biased simulations of alanine dipeptide in vacuum were conducted as detailed in Section 2.4.1.
Generation of initial unbiased data (Step 1). We commenced application of MESA by conducting an 800 ps unbiased simulation initialized within one of the two rapidly interconverting minima. As illustrated by the Ramachandran plot in Fig. 6, this long unbiased simulation comprehensively explores these two local minima, but fails to transition even once over the high free energy barriers separating them from the third state.
Iterative CV discovery and sampling (Steps 2–5). We then performed 10 rounds of interleaved CV discovery and enhanced sampling using MESA. Collective variable discovery was performed by training autoencoders over 21-dimensional input data comprising the Cartesian coordinates of the seven backbone atoms of alanine dipeptide. We employ two randomly selected reference configurations in our error function (Eqn. 6), such that the output data are 42-dimensional. The number of nodes in the hidden layers was specified to be 40 by cross-validation. The number of bottleneck layer nodes in each iteration of MESA was determined by ascertaining the dimensionality of the intrinsic manifold from a plot of the fraction of variance explained as a function of the number of bottleneck nodes (Eqn. 4). As illustrated in Fig. A1 in the Appendix, we find a knee at two bottleneck nodes in all 10 iterations, motivating the use of two bottleneck nodes in each loop of MESA. It is known that the intrinsic manifold of alanine dipeptide is 2D, so it is encouraging that our automated approach can accurately ascertain this from the data. Furthermore, it is known that the periodicity in the φ and ψ dihedrals parameterizing the manifold endows it with the topology of a flat torus ^{134}. As carefully investigated by Hashemian and Arroyo ^{134}, topological periodicities in the intrinsic manifold can lead to inefficiencies in sampling and difficulties in CV determination. We shall see that these periodicities do not cause any such problems for MESA, but do introduce artifacts into the free energy surface. We quantify these effects below and present a protocol for their elimination.
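The knee-finding step can be illustrated with a minimal L-method implementation over an FVE-versus-dimensionality curve. The two-segment least-squares fit below follows the usual L-method recipe, with our own simplifying choice of non-overlapping segments:

```python
import numpy as np

def l_method_knee(y):
    """Locate the knee of an FVE-vs-dimensionality curve with the L-method:
    fit one straight line to the points left of each candidate split and one
    to the points on the right, and pick the split minimizing the
    size-weighted root-mean-square error of the two fits.

    y[k] is assumed to be the FVE obtained with k+1 bottleneck nodes.
    Returns the dimensionality at the knee.
    """
    x = np.arange(1, len(y) + 1, dtype=float)
    best_c, best_err = None, np.inf
    for c in range(2, len(y) - 1):          # each segment keeps >= 2 points
        err = 0.0
        for xs, ys in ((x[:c], y[:c]), (x[c:], y[c:])):
            coef = np.polyfit(xs, ys, 1)    # straight-line fit
            resid = ys - np.polyval(coef, xs)
            err += len(xs) * np.sqrt(np.mean(resid ** 2))
        if err < best_err:
            best_c, best_err = c, err
    return best_c
```

For a curve that jumps sharply between one and two bottleneck nodes and then flattens, the method returns a knee at two, matching the behavior reported in Fig. A1.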
We present in Fig. 7 the results of the 10 MESA iterations, illustrating the expansion of the exploration of the 2D intrinsic manifold, stabilization of the CVs and explored region, and projection of the sampled points into φ–ψ space. The explored region of the intrinsic manifold converges within 10 iterations, with the frontier of the intrinsic manifold and CVs stabilizing between iterations 9 and 10. The projections into φ–ψ space show that the system is driven to sample the previously unvisited basin by iteration 6, indicating the ability of our approach to discover important collective motions and drive sampling in these collective coordinates.
Inspection of the first two columns of Fig. 7 indicates that there is no simple monotonic relationship between the autoencoder-discovered CVs and the φ and ψ backbone dihedrals known to provide a good parameterization of the intrinsic manifold. The source of this apparent discrepancy is the inherent periodicity in these two dihedral angles, which prevents the emergence of a simple bijective mapping with the autoencoder CVs. Close inspection of Fig. 7 reveals the emergence of closed loops in CV space corresponding to wrapping of the enhanced sampling through the periodic boundary. Accordingly, the intrinsic manifold spanned by the CVs actually represents a projection of the topologically closed flat torus upon which the alanine dipeptide intrinsic manifold resides ^{134}. Indeed, the flat torus can only be embedded in four dimensions or higher ^{141}. Mathematically, four dimensions are required to preserve the topological structure of the flat torus and define an invertible map such that every point on the surface of the torus is uniquely located within the projection and the object is fully unfolded in the embedding without artificial intersections ^{141, 142, 143}. We should therefore expect projections into lower dimensional spaces to contain regions where distant points on the flat torus are projected to identical points in the low-dimensional embedding ^{134}. Such artifacts are visually apparent in the last row of Fig. 7, where one quadrant of a closed loop in CV space appears to lie on top of configurations drawn from a distant region of the φ–ψ plane. In the present application these artifacts do not appear to be so severe as to introduce any difficulties for CV determination or enhanced sampling using MESA, but we do observe that they can impede sampling efficiency by causing exploration to become stuck within these loops for a few iterations before MESA is able to drive sampling in the other degree of freedom.
Free energy surface (Step 6). Having converged MESA, we collate all simulation snapshots collected over the 10 iterations to train a terminal autoencoder that will be used to perform umbrella sampling over the complete CV space. An analogous plot to Fig. 7 for this terminal autoencoder is presented in Fig. 8.
Using this terminal autoencoder, we conduct 200 umbrella sampling calculations applying biasing potentials to tile the intrinsic manifold spanned by the CVs. We then analyze these biased trajectories to estimate the unbiased free energy surface by solving the WHAM equations (Section 2.2.2). We present in Fig. 9a the free energy surface in CV space and in Fig. 9b the estimated uncertainties in the landscape computed by 10 rounds of bootstrap resampling with replacement. In Fig. 9c–d we present analogous plots for the free energy surface in the φ–ψ dihedrals and its estimated uncertainties, computed by reweighting our biased simulation data collected in the CVs. As a point of comparison, we present in Fig. 9e–f an estimate of the free energy surface and its uncertainties computed by performing umbrella sampling directly in the φ–ψ dihedral angles as a ground truth. Finally, we present in Fig. 9g a plot of the difference between the optimally aligned landscapes computed by reweighting umbrella sampling data collected in the CVs (Fig. 9c) and by direct umbrella sampling in the dihedrals (Fig. 9e). Analysis of Fig. 9 reveals that MESA ably resolves the basins corresponding to the three metastable states, but Fig. 9g shows that the shape, location, and depth of one of the basins differ from those recovered by direct umbrella sampling in the dihedrals, and a spurious metastable minimum is resolved at [2.5 rad, 0.75 rad] in the φ–ψ plane. These artifacts in the free energy surface are a direct result of the periodicities in the underlying intrinsic manifold, which populates the surface of a flat torus and cannot be correctly embedded in a 2D projection ^{134}.
A straightforward way to solve the projection problem with periodic variables is to simply increase the dimensionality of the autoencoder nonlinear projection by increasing the number of bottleneck layer nodes. The Whitney Embedding Theorem states that any m-dimensional manifold can be embedded (i.e., fully unfolded without spurious crossings) in R^d for d = (2m+1) ^{142}. The theorem specifies the embedding dimensionality at or above which we are guaranteed to achieve a proper embedding, but depending on the topology of the manifold it may be possible to employ d < (2m+1). Accordingly, if we have prior knowledge that the intrinsic manifold of our system may contain periodicities, or we observe the emergence of closed loops in the CVs identified by MESA (e.g., Fig. 7), then we propose that the number of bottleneck nodes be increased to (2m+1), where m is the dimensionality of the intrinsic manifold identified by the FVE approach described in Section 2.1.1. Increasing dimensionality does come with increased computational cost, since it is more expensive to train the autoencoder (the number of network parameters increases), to run biased calculations (the analytical expression for the nonlinear projection becomes more complicated), and to conduct umbrella sampling (due to the curse of dimensionality). It is also more challenging to interpret and visualize data in high-dimensional spaces. In practice, therefore, we suggest that the number of bottleneck nodes may be increased one at a time and the dimensionality increase terminated when projections of the manifold cease to depend on the additional dimensions. In the present case, we know that four dimensions are sufficient to embed the flat torus containing the intrinsic manifold of alanine dipeptide ^{134}, and so we train a new terminal autoencoder with four bottleneck nodes over the collated data from all 10 rounds of MESA conducted using two-bottleneck-node topologies. We then performed 1000 umbrella sampling simulations to tile the 4D space spanned by the autoencoder CVs and recovered an estimate of the unbiased FES by solving the WHAM equations. In Fig. 10a we present the free energy surface in the φ–ψ dihedrals computed by reweighting the umbrella sampling calculations, and in Fig. 10b the estimated uncertainties computed by 10 rounds of bootstrap resampling with replacement. In Fig. 10c, we present the difference between Fig. 10a and the surface recovered by direct umbrella sampling in the dihedrals (Fig. 9e). By fully expanding the flat torus into 4D space, we eliminate the projection problem associated with periodic variables. MESA ably identifies four CVs parameterizing the embedding of the flat torus, and projection of the 4D free energy surface into φ–ψ space shows excellent agreement with that recovered by direct sampling in these dihedral angles.
3.2 Trpcage
Trp-cage (PDB ID: 1L2Y) is a 20-residue miniprotein ^{116} with a ~4 μs folding time and a native state containing an α-helix, a 3₁₀-helix, a polyproline II helix, and a salt bridge that together "cage" Trp6 in a hydrophobic core ^{144, 145, 146, 147, 148} (Fig. 5b). A fast-folding miniprotein containing secondary structural elements present in large proteins, Trp-cage has been extensively studied by both computation^{144, 149, 148, 150, 151, 152, 153, 154, 155, 156, 157, 158} and experiment^{145, 159, 160, 161, 162, 163} as the "hydrogen atom of protein folding" ^{147, 164}. In particular, Juraszek and Bolhuis identified two distinct folding pathways using transition path sampling ^{148}. Deng et al. used Markov state models to analyze a long 208 μs simulation to resolve similar patterns in overall structural changes and folding routes ^{156}. Kim et al. ran dozens of folding simulations starting from unfolded states, then projected these trajectories into a 2D collective variable space identified by diffusion maps^{144}. Both Deng et al. and Kim et al. achieved good agreement of their estimated folding times with experiment, demonstrating the ability of MD simulations to quantitatively recapitulate biophysical folding processes. We employ Trp-cage as a more realistic and challenging test case for MESA. Simulations of Trp-cage in explicit water were conducted as detailed in Section 2.4.2.
Generation of initial unbiased data (Step 1). We ran three independent 10 ns unbiased simulations commencing from the Trp-cage native state downloaded from the Protein Data Bank (PDB ID: 1L2Y) ^{116, 127}. Snapshots were saved every 20 ps to generate an initial unbiased training set of 1500 molecular configurations. These data achieve a maximum root mean squared deviation (RMSD) of the Cα atoms relative to the native state of 3.5 Å, and the first iteration of MESA results in an autoencoder with an FVE of approximately 60%. Accordingly, these unbiased data provide a reasonable starting ensemble for iterative application of MESA.
Iterative CV discovery and sampling (Steps 2–5). We then performed six rounds of interleaved CV discovery and enhanced sampling using MESA before achieving convergence. Collective variable discovery was performed by training autoencoders over 180-dimensional input data comprising the Cartesian coordinates of the 60 backbone atoms of Trp-cage. We again used two randomly selected reference configurations in our error function (Eqn. 6), so that the output data are 360-dimensional. The number of nodes in the hidden layers was tuned to 50 by cross-validation. The number of bottleneck layer nodes was determined in each iteration of MESA by plotting the FVE as a function of the number of bottleneck nodes using Eqn. 4. As illustrated in Fig. A2 in the Appendix, we find a knee at two bottleneck nodes in all six iterations, supporting the use of two bottleneck nodes in each iteration of MESA. This estimated dimensionality of the Trp-cage intrinsic manifold is consistent with prior work ^{148, 144, 154}.
We present in Fig. 11 the results of the six MESA iterations illustrating accelerated sampling and exploration of the intrinsic manifold. We project the sampled molecular configurations into the 2D intrinsic manifold spanned by the two MESA CVs discovered in each iteration, colored according to three physical order parameters to illuminate the configurational diversity in the accelerated sampling: the RMSD of the Cα atoms relative to the native state, the end-to-end extent of the molecule measured between the Cα atoms of Asn1 and Ser20, and the sine of the nonlocal dihedral angle formed by the Cα atoms of Asn1, Asp9, Ser14, and Ser20. As illustrated in Fig. 5b, this dihedral measures the signed angle between the plane containing the Cα atoms of {Asn1, Asp9, Ser14} and that containing those of {Asp9, Ser14, Ser20}. The RMSD measures the departure of the configurational sampling from the native state, the end-to-end distance measures the linear extent of the peptide chain, and the sine of the dihedral measures the global chirality of the molecule: positive values indicate that the N-terminal and C-terminal strands are arranged such that the molecule adopts a right-handed chirality, negative values that they adopt a left-handed chirality, and zero that the N-terminal and C-terminal strands lie in the same plane. Convergence of the frontier of the intrinsic manifold, the volume of configurational space explored, and stabilization of the identified CVs is achieved between iterations 5 and 6.
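The chirality order parameter described above – the sine of the signed dihedral defined by four Cα positions – can be computed with the standard signed-dihedral construction; the helper below is our sketch, not the authors' code:

```python
import numpy as np

def sin_dihedral(p0, p1, p2, p3):
    """Sine of the signed dihedral angle between the plane of {p0, p1, p2}
    and that of {p1, p2, p3} (for Trp-cage, the C-alpha positions of Asn1,
    Asp9, Ser14, Ser20). Positive for right-handed, negative for left-handed
    arrangements, and zero when all four points are coplanar."""
    b1, b2, b3 = p1 - p0, p2 - p1, p3 - p2
    n1, n2 = np.cross(b1, b2), np.cross(b2, b3)       # plane normals
    # Standard signed-dihedral construction: sin(phi) = ((n1 x n2) . b2-hat)
    return float(np.dot(np.cross(n1, n2), b2) / (
        np.linalg.norm(n1) * np.linalg.norm(n2) * np.linalg.norm(b2)))
```

Coplanar points give zero, and mirror-image configurations give sines of opposite sign, which is exactly the left-/right-handed distinction used to color Fig. 11.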
Free energy surface (Step 6). Upon convergence of MESA, we compile all of the simulation snapshots collected over the six iterative cycles to train a terminal autoencoder with which to perform umbrella sampling over the intrinsic manifold. An analogous plot to Fig. 11 for this terminal autoencoder is presented in Fig. 12.
We then perform 30 umbrella sampling runs tiling the explored region of the intrinsic manifold spanned by the converged CVs identified by the terminal autoencoder. Finally, we estimate the unbiased free energy surface by solving the WHAM equations (Section 2.2.2). We present in Fig. 13a the free energy surface and in Fig. 13b the estimated uncertainties in the landscape computed by 10 rounds of bootstrap resampling with replacement. We superpose onto the landscape representative molecular configurations to illustrate the folding pathways, and also mark the native state (N) and a partially unfolded state (L) that is similar to the previously reported "loop structure" in which the 3₁₀ helix is unfolded while the other secondary structure elements remain intact^{144, 148, 165, 166}. Comparison of Fig. 13a with Fig. 12 reveals a physical interpretation of the CVs. We find the first CV to be strongly correlated with the global molecular chirality (Pearson correlation coefficient of 0.92 with a vanishingly small two-tailed p-value), indicating that MESA has discovered a CV that distinguishes left-handed, right-handed, and planar molecular configurations. Meanwhile, the second CV is strongly correlated with the end-to-end distance (Pearson correlation coefficient of 0.91 with a vanishingly small two-tailed p-value), indicating that MESA identifies a second CV tracking the global linear extent of the peptide.
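The CV-interpretation step above reduces to computing Pearson correlation coefficients between each CV and candidate physical order parameters (chirality, end-to-end distance, RMSD). A minimal helper:

```python
import numpy as np

def pearson_r(a, b):
    """Pearson correlation coefficient between a CV time series and a
    physical order parameter evaluated on the same configurations."""
    a = np.asarray(a, float) - np.mean(a)
    b = np.asarray(b, float) - np.mean(b)
    return float((a @ b) / np.sqrt((a @ a) * (b @ b)))
```

Values near ±1 flag a CV as tracking (or anti-tracking) the order parameter, while values near zero indicate no linear association.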
These two CVs describing the important collective motions of Trp-cage map out a rugged 2D folding landscape and resolve a low-free-energy folding pathway indicated by the blue arrows in Fig. 13a. Starting from extended configurations with an intact N-terminal helix, a fully unfolded 3₁₀ helix, and different global chiralities – states B, C, D, E – these states move along shallow valleys within a relatively flat free energy plateau to reach an intermediate state A where the hydrophobic core and a 7 Å salt bridge between Asp9 and Arg16 are formed. In this configuration, Trp6 is buried in the hydrophobic core and the free energy of the system lies 3–7 kBT lower than the plateau. Intermediate state A is structurally quite similar to the native state, possessing a Cα RMSD of approximately 3 Å and global molecular planarity. The final transition to the native state N proceeds down a steep free energy valley corresponding to folding of the 3₁₀ helix and a further decrease in free energy.
The loop structure L represents a low-lying metastable state that is only ~2 kBT less stable than the native fold, and accessible over a ~2 kBT free energy barrier. Transformation of the native fold into the loop structure is accompanied by unfolding of the 3₁₀ helix and the concomitant adoption of a left-handed chirality. In the results reported by Juraszek et al. ^{148} and Kim et al.^{144}, the L state is an important intermediate in the nucleation-condensation folding path.
In Fig. 13c we project our FES into two physical order parameters (RMSD, helix-RMSD) – where helix-RMSD is the RMSD of the Cα atoms in the N-terminal α-helix (residues 2–8) from the Trp-cage native fold – in order to draw comparisons with landscapes and folding pathways previously reported by Juraszek et al. (Fig. 3) ^{148}, Kim et al. (Fig. 2) ^{144}, Paschek et al. (Fig. 3) ^{167}, and Deng et al. (Fig. 5) ^{156}. The FES mapped out in these order parameters by MESA shows good agreement with that computed by replica exchange molecular dynamics (REMD) by Juraszek et al. employing the OPLS-AA force field ^{168} and SPC water model^{169}. Our folding pathway also closely resembles the diffusion-collision IPN folding path resolved in that study using transition path sampling (TPS). Conversely, in our calculations the N-terminal helix is always intact, whereas Juraszek et al. observe it to unfold, opening up a second nucleation-condensation folding pathway via the L state. Interestingly, this folding path is only observable using TPS, and proceeds through regions of the FES unmapped by the REMD calculations, which – as the authors observe – had difficulty surmounting high free energy barriers and may not be fully converged. We note that preservation of helical content in the unfolded state is consistent with spectroscopy measurements conducted by Ahmed et al. ^{170} suggesting early formation of the helix ^{148}. The Amber03 force field employed in this work possesses an elevated propensity for helicity relative to the OPLS-AA force field employed by Juraszek et al. ^{171, 148, 167}, which we suggest may be at the root of the high stability of the helix observed in our simulations.
Similarly, our FES also shows good correspondence with that determined by Kim et al. ^{144} employing the Amber ff03w force field^{172} and TIP4P/2005 water model^{173}. In particular, the line of metastable basins at helix-RMSD values of approximately 0.3 Å maps to the F, I, J, and K basins in the vicinity of the native fold identified in that work. However, our landscape only extends up to helix-RMSD values of approximately 2 Å with the N-terminal helix always intact, whereas Kim et al. resolve helix-RMSD values of up to 4.5 Å. These authors achieve helix unfolding by high-temperature denaturation and estimate the FES from 25 independent folding trajectories that are terminated upon attaining the native fold. Accordingly – as the authors observe – the simulation data are not sampled from an equilibrated distribution, and the FES cannot be considered to be thermodynamically converged. We therefore suggest that the high-helix-RMSD configurations resolved by Kim et al. actually lie far higher in free energy, are not thermally accessible at room temperature, and are a result of the simulation protocol in which folding runs were initialized from thermally denatured starting structures. The folding pathway resolved by MESA shows good correspondence with the latter portion of the diffusion-collision pathway (Path B) identified by Kim et al. using diffusion maps. We also resolve the metastable loop structure L termed LOOP-I by Kim et al., and the near-native intermediate B termed HLX-I. We do not resolve the upper portion of the diffusion-collision pathway, nor the nucleation-condensation pathway (Path A), both of which correspond to folding of the initially unfolded N-terminal helix. Interestingly, the structural details and relative propensity of the nucleation-condensation pathways identified by Juraszek et al. and Kim et al. are not in agreement. Kim et al.
ascribe these discrepancies to differences in the force field, but they are also likely a result of the nonequilibrium manner in which data were collected in that work ^{144}. We also suggest that the root of the approximate symmetry of the horseshoe-like landscape resolved by Kim et al. is the global molecular chirality ^{144} – the same origin as the approximate symmetry in our first CV – and that this CV does not distinguish the two folding mechanisms. Careful inspection of Fig. 8 in that work, in which points are colored by helix-RMSD, suggests that both the diffusion-collision and nucleation-condensation pathways can proceed under each global chirality ^{144}. In our work, we also observe the diffusion-collision path to proceed along both the left- and right-handed chiralities, but do not resolve the nucleation-condensation routes since the N-terminal helix remains intact.
Our folding pathways show the best correspondence with those identified by Paschek et al. ^{167}, who employed REMD simulations using the Amber ff94 force field^{174} and TIP3P water model ^{175}. In excellent agreement with our calculations, these authors identified a single diffusion-collision folding pathway in good correspondence with that identified by both Juraszek et al. ^{148} and Kim et al. ^{144}. Paschek et al. share our concerns, however, that the excessive stability of helices in the protein force field may artificially overemphasize this folding route ^{167}. Our results are also in good agreement with the work of Deng et al. ^{156}, who used Markov state models to analyze a long 208 μs simulation performed by Lindorff-Larsen et al.^{176} employing the CHARMM22 force field ^{177, 178} and TIP3P water model ^{175}. These researchers also resolved competing diffusion-collision and nucleation-condensation pathways, but found the former to be overwhelmingly more probable than the latter.
In sum, MESA achieves automated data-driven discovery of two CVs parameterizing the important collective motions of Trp-cage, in which it executes accelerated sampling of configurational space. The resultant FES in these two CVs identifies the important (meta)stable configurational states, and our accelerated sampling protocol probes a ~20 kBT range in free energy. The FES, and the folding pathways it contains, are in good agreement with prior work ^{148, 144, 167, 156}, although in the present study we do not observe unfolding of the N-terminal helix and therefore do not resolve the nucleation-condensation folding pathway. The stability of the N-terminal helix is likely largely attributable to the favorability of helical secondary structure within the Amber03 force field ^{122, 171, 167}, and it would be of interest to compare the CVs, FES, and folding pathways resolved by MESA under different choices of protein force fields and solvent models.
We observe that MESA also offers a means to enhance configurational sampling of particular secondary structural elements by tuning the error function used to train the autoencoder to focus on particular regions of the protein (Eqn. 6). Our current choice of error function trains the network to optimally reconstruct – up to translation and rigid rotation – the protein coordinates, wherein the contribution of each atom to the reconstruction error is equally weighted. By adapting the error function to more heavily weight particular atoms, we can train the network to focus on accurate reconstruction of particular regions of the protein secondary structure. In the present case, placing more weight on the atoms residing within the N-terminal helix may nudge MESA to discover CVs tailored to this region and assist in improved sampling of secondary structural changes there.
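As a minimal sketch of this weighting idea (our own illustrative code, not the MESA implementation; the function and argument names are hypothetical, and alignment to remove translation and rotation is assumed to have been applied beforehand), a per-atom weighted reconstruction error might look like:

```python
import numpy as np

def weighted_reconstruction_error(x, x_hat, weights):
    """Per-atom weighted mean-squared error between input and reconstructed
    Cartesian coordinates, each of shape (n_atoms, 3). Atoms carrying larger
    weights contribute more to the loss, focusing network training on
    accurately reconstructing those regions of the protein."""
    weights = np.asarray(weights, dtype=float)
    sq_dev = np.sum((np.asarray(x) - np.asarray(x_hat)) ** 2, axis=1)
    return float(np.sum(weights * sq_dev) / np.sum(weights))

# Uniform weights recover the standard equally weighted error; up-weighting,
# e.g., the N-terminal helix atoms steers the CVs toward that region.
```

Setting all weights equal reproduces the current error function, so this generalization contains the existing scheme as a special case.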
Benchmarking of MESA acceleration. In aggregate, the six rounds of MESA required 72 GPU-hours on a GeForce GTX 960 card, with the computational cost dominated not by network training but by the 300 ns of molecular simulation. To quantify the acceleration in exploration of configurational space offered by MESA, we conducted a 72 GPU-hour unbiased simulation commencing from the Trp-cage native state to generate a 300 ns simulation trajectory. Projection in Fig. 14a of the simulation snapshots resulting from this unbiased calculation into the CVs of the terminal autoencoder reveals that it is entirely confined to the native basin, and samples a tiny fraction of the configurational space explored by MESA. The corresponding unbiased FES in Fig. 14b shows that the unbiased run probes a free energy range of only 7 k_BT, compared to the far more expansive sampling and 20 k_BT range explored by MESA (Fig. 13a).
4 Conclusions
In this work, we have introduced a new methodology – Molecular Enhanced Sampling with Autoencoders (MESA) – for the data-driven discovery of order parameters and accelerated sampling of the configurational space of macromolecular and biomolecular systems. The methodology requires minimal prior knowledge of the system, requiring only an initial short simulation trajectory to seed iterative, on-the-fly discovery of low-dimensional collective variables and accelerated sampling of configurational space. The essence of our approach is to iteratively train auto-associative artificial neural networks (autoencoders) to learn low-dimensional projections into nonlinear collective variables (CVs) that are coincident with the important collective molecular motions and – in contrast with other nonlinear manifold learning approaches – are explicit and differentiable functions of the atomic coordinates. The availability of this explicit mapping admits analytical expressions propagating biasing potentials in the CVs to biasing forces on the atoms, thereby enabling accelerated sampling directly in the data-driven CVs using collective variable biasing techniques. The capacity to apply biasing forces directly within the nonlinear CVs distinguishes our approach from existing data-driven accelerated sampling techniques that appeal to proxy variables, basis set projections, local linear dimensionality reduction, or judicious initialization of unbiased calculations to enhance sampling. We perform interleaved rounds of autoencoder CV discovery and biased sampling until we achieve convergence in the CVs and the region of explored configurational space.
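To make the chain-rule propagation of biasing forces concrete, the following sketch uses a toy one-dimensional differentiable CV standing in for the autoencoder bottleneck (the weights, the tanh form, and the harmonic umbrella potential are illustrative assumptions, not the MESA network itself):

```python
import numpy as np

# Toy stand-in for the autoencoder encoding: a single nonlinear CV
# xi(x) = tanh(w . x). The weights below are illustrative, not MESA's.
w = np.array([0.3, -0.5, 0.2])

def cv(x):
    """Collective variable as an explicit, differentiable function of x."""
    return np.tanh(w @ x)

def dcv_dx(x):
    """Analytic gradient of the CV with respect to the coordinates."""
    return (1.0 - np.tanh(w @ x) ** 2) * w

def bias_force(x, xi0, k=10.0):
    """Forces on the coordinates from a harmonic umbrella potential
    U = (k/2)(xi(x) - xi0)^2, via the chain rule:
    F = -dU/dx = -(dU/dxi) * (dxi/dx)."""
    dU_dxi = k * (cv(x) - xi0)
    return -dU_dxi * dcv_dx(x)
```

Because the encoding is explicit and differentiable, the gradient dxi/dx is available in closed form, which is precisely what permits biasing forces to be applied directly in the learned CVs rather than through proxy variables.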
We have largely automated the MESA framework to require minimal user intervention, and made it freely available for public download (https://github.com/weiHelloWorld/accelerated_sampling_with_autoencoder) together with a plugin implementing the biasing forces within the OpenMM simulation package ^{86, 88} (https://github.com/weiHelloWorld/ANN_Force).
We have validated MESA in simulations of alanine dipeptide in vacuum and Trp-cage in explicit water. In each case we recover low-dimensional nonlinear CVs and efficiently drive sampling of configurational space to recover converged free energy surfaces (FES) resolving the important metastable states and folding pathways. Our application to alanine dipeptide illuminates the difficulties associated with periodicities in the inherent manifold – in this case a flat torus – and we demonstrate how to confront this issue by increasing the dimensionality of the embedding space to fully unfold the manifold and prevent spurious intersections. In the case of Trp-cage, we recover a FES and diffusion-collision folding mechanism in good agreement with prior work. The favorability of helical secondary structure within the protein force field means that we did not observe unfolding of the N-terminal helix, and therefore did not resolve the nucleation-condensation folding pathway. It would be of interest in future work to compare the MESA CVs and FES predictions under different choices of protein and solvent potential functions. Benchmarking of the acceleration offered by MESA in sampling the Trp-cage configurational space demonstrated a threefold increase in the sampled range of free energy and massive enhancement over the exploration achieved by unbiased calculations utilizing the same computational resources. We expect MESA to prove most valuable for large biomolecular systems about which there is little expert knowledge of the important conformational motions and where expensive computation is at a premium. Preliminary testing demonstrates that autoencoders with 15,000-1000-3-1000-15,000 topologies can be efficiently trained over 32,000 configurations in under 40 GPU-minutes, supporting the scalability of network training to large molecular systems.
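To give a sense of the scale of such a network, a back-of-the-envelope parameter count (our own sketch, assuming fully connected layers with bias terms) for the 15,000-1000-3-1000-15,000 topology quoted above is:

```python
def autoencoder_param_count(layer_sizes):
    """Number of trainable parameters in a fully connected feed-forward
    network: for each pair of successive layers, a weight matrix of size
    n_in * n_out plus n_out bias terms."""
    return sum(n_in * n_out + n_out
               for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:]))

# The topology quoted in the text: 15,000 inputs, 1000-unit hidden layers,
# and a 3-unit bottleneck defining the CVs.
topology = [15_000, 1_000, 3, 1_000, 15_000]
n_params = autoencoder_param_count(topology)  # roughly 3.0e7 parameters
```

A count of this order, set against 32,000 training configurations, underscores why the regularization discussed below is essential to avoid overfitting.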
Generation of sufficient quantities of training data through biased simulation and robust application of regularization are of prime concern in achieving good network performance while avoiding overfitting in these sizable networks. We note that the computational burden of network training may be alleviated through coarse-grained representations of the molecule to the autoencoder employing, for example, only backbone atom coordinates or internal dihedral angles, and of biased simulation through the use of coarse-grained or multi-resolution molecular models ^{179}.
We see many avenues for future development of our approach. First, we have employed umbrella sampling as an efficient and embarrassingly parallel accelerated sampling approach, but anticipate that variants of metadynamics ^{29, 30, 31} may offer an attractive alternative by eliminating the need for boundary detection, initialization of swarms of independent umbrella runs, and adaptive tuning of umbrella potential force constants. Second, the development of bespoke error functions used to train the network presents a powerful means to incorporate prior system knowledge and focus the calculated CVs and accelerated sampling on particular regions and features of the system. For example, this approach may be of value in unfolding particular secondary structure elements, focusing sampling on enzymatic active sites, or actuating important allosteric motions. One drawback of the current error function is that there is no ranking of the CVs discovered by the autoencoder, such that the relative importance of the collective modes is not apparent. The hierarchical error functions proposed by Scholz and Vigário ^{82, 81} extract CVs within a nested hierarchy of subnetworks that discover successively higher dimensional projections that minimize the reconstruction error subject to the constraint that all lower dimensional projections also minimize the reconstruction error. This imposition of a rank ordering on the CVs can help interpret the relative importance of the identified collective motions and also stabilize the CV discovery ^{82}. Third, rather than training a single network over all of the collated simulation data in each iteration to discover and refine good global CVs, we propose that it may be beneficial to train independent networks over different regions of the inherent manifold in order to discover good local CVs. 
In the spirit of local tangent space alignment ^{55} and iMapD ^{80}, local CVs adapted to their neighborhood of configurational space can resolve the collective motions relevant to that region and improve the efficiency of the enhanced sampling. The biased sampling data furnished from each network may then be patched together using reweighting techniques ^{106}, and the terminal landscape constructed in the union of all local CVs or a unified set of global CVs identified post hoc in the final MESA iteration.
Acknowledgments
This material is based upon work supported by the National Science Foundation under Grant No. CHE1664426.
References
 1 Hashemian, B.; Millán, D. and Arroyo, M., The Journal of Chemical Physics, 2013, 139(21).
 2 Bernardi, R. C.; Melo, M. C. and Schulten, K., Biochimica et Biophysica Acta, 2015, 1850(5), 872–877.
 3 Karplus, M. and Petsko, G. A., Nature, 1990, 347(6294), 631–639.
 4 Rohrdanz, M. A.; Zheng, W. and Clementi, C., Annual Review of Physical Chemistry, 2013, 64, 295–316.
 5 Abrams, C. and Bussi, G., Entropy, 2013, 16(1), 163–199.
 6 Dellago, C.; Bolhuis, P. G.; Csajka, F. S. and Chandler, D., The Journal of Chemical Physics, 1998, 108(5), 1964–1977.
 7 Bolhuis, P. G.; Chandler, D.; Dellago, C. and Geissler, P. L., Annual Review of Physical Chemistry, 2002, 53(1), 291–318.
 8 Rogal, J. and Bolhuis, P. G., The Journal of Chemical Physics, 2008, 129(22), 224107.
 9 Moroni, D.; Bolhuis, P. G. and van Erp, T. S., The Journal of Chemical Physics, 2004, 120(9), 4055–4065.
 10 van Erp, T. S.; Moroni, D. and Bolhuis, P. G., The Journal of Chemical Physics, 2003, 118(17), 7762–7774.
 11 Weinan, E.; Ren, W. and Vanden-Eijnden, E., Physical Review B, 2002, 66(5), 052301.
 12 Weinan, E.; Ren, W. and Vanden-Eijnden, E., The Journal of Physical Chemistry B, 2005, 109(14), 6688–6693.
 13 Jónsson, H.; Mills, G. and Jacobsen, K. W. In Classical and Quantum Dynamics in Condensed Phase Simulations, Berne, B. J.; Ciccotti, G. and Coker, D. F., Eds.; World Scientific, Singapore, 1998; page 385.
 14 Sheppard, D.; Terrell, R. and Henkelman, G., The Journal of Chemical Physics, 2008, 128(13), 134106.
 15 Borrero, E. E. and Escobedo, F. A., The Journal of Chemical Physics, 2007, 127(16), 164101.
 16 Allen, R. J.; Valeriani, C. and ten Wolde, P. R., Journal of Physics: Condensed Matter, 2009, 21(46), 463102.
 17 Brooks, S. P. and Morgan, B. J., The Statistician, 1995, pages 241–257.
 18 Berg, B. A. and Neuhaus, T., Physical Review Letters, 1992, 68(1), 9.
 19 Hansmann, U. H., Chemical Physics Letters, 1997, 281(1), 140–150.
 20 Sugita, Y. and Okamoto, Y., Chemical Physics Letters, 1999, 314(1), 141–151.
 21 Sugita, Y. and Okamoto, Y., Chemical Physics Letters, 2000, 329(3), 261–270.
 22 Mitsutake, A.; Sugita, Y. and Okamoto, Y., Peptide Science, 2001, 60(2), 96–123.
 23 Fukunishi, H.; Watanabe, O. and Takada, S., The Journal of Chemical Physics, 2002, 116(20), 9058–9067.
 24 Okamoto, Y., Journal of Molecular Graphics and Modelling, 2004, 22(5), 425–439.
 25 Liu, P.; Huang, X.; Zhou, R. and Berne, B. J., The Journal of Physical Chemistry B, 2006, 110(38), 19018–19022.
 26 Torrie, G. M. and Valleau, J. P., Journal of Computational Physics, 1977, 23(2), 187–199.
 27 Voter, A. F., Physical Review Letters, 1997, 78(20), 3908.
 28 Grubmüller, H., Physical Review E, 1995, 52(3), 2893.
 29 Laio, A. and Parrinello, M., Proceedings of the National Academy of Sciences, 2002, 99(20), 12562–12566.
 30 Huber, T.; Torda, A. E. and van Gunsteren, W. F., Journal of Computer-Aided Molecular Design, 1994, 8(6), 695–708.
 31 Barducci, A.; Bussi, G. and Parrinello, M., Physical Review Letters, 2008, 100(2), 020603.
 32 Rosso, L.; Mináry, P.; Zhu, Z. and Tuckerman, M. E., The Journal of Chemical Physics, 2002, 116(11), 4389–4402.
 33 Maragliano, L. and Vanden-Eijnden, E., Chemical Physics Letters, 2006, 426(1), 168–175.
 34 Abrams, J. B. and Tuckerman, M. E., The Journal of Physical Chemistry B, 2008, 112(49), 15742–15757.
 35 Sørensen, M. R. and Voter, A. F., The Journal of Chemical Physics, 2000, 112(21), 9599–9606.
 36 den Otter, W. K. and Briels, W. J., The Journal of Chemical Physics, 1998, 109(11), 4139–4146.
 37 Carter, E.; Ciccotti, G.; Hynes, J. T. and Kapral, R., Chemical Physics Letters, 1989, 156(5), 472–477.
 38 Ciccotti, G. and Ferrario, M., Molecular Simulation, 2004, 30(11–12), 787–793.
 39 Darve, E.; RodríguezGómez, D. and Pohorille, A., The Journal of Chemical Physics, 2008, 128(14), 144120.
 40 Kirkwood, J. G., The Journal of Chemical Physics, 1935, 3(5), 300–313.
 41 Straatsma, T. and Berendsen, H., The Journal of Chemical Physics, 1988, 89(9), 5876–5886.
 42 Wang, J. and Ferguson, A., Molecular Simulation, 2017, pages 1–18.
 43 Hegger, R.; Altis, A.; Nguyen, P. H. and Stock, G., Physical Review Letters, 2007, 98(2), 028102.
 44 Ferguson, A. L.; Panagiotopoulos, A. Z.; Kevrekidis, I. G. and Debenedetti, P. G., Chemical Physics Letters, 2011, 509(1), 1–11.
 45 Ferguson, A. L.; Panagiotopoulos, A. Z.; Debenedetti, P. G. and Kevrekidis, I. G., Proceedings of the National Academy of Sciences, 2010, 107(31), 13597–13602.
 46 Zhuravlev, P. I.; Materese, C. K. and Papoian, G. A., The Journal of Physical Chemistry B, 2009, 113(26), 8800–8812.
 47 Amadei, A.; Linssen, A. and Berendsen, H. J., Proteins: Structure, Function, and Bioinformatics, 1993, 17(4), 412–425.
 48 García, A. E., Physical Review Letters, 1992, 68(17), 2696.
 49 Das, P.; Moll, M.; Stamati, H.; Kavraki, L. E. and Clementi, C., Proceedings of the National Academy of Sciences, 2006, 103(26), 9885–9890.
 50 Stamati, H.; Clementi, C. and Kavraki, L. E., Proteins: Structure, Function, and Bioinformatics, 2010, 78(2), 223–235.
 51 Ichiye, T. and Karplus, M., Proteins: Structure, Function, and Bioinformatics, 1991, 11(3), 205–217.
 52 Pearson, K., The London, Edinburgh, and Dublin Philosophical Magazine and Journal of Science, 1901, 2(11), 559–572.
 53 Troyer, J. M. and Cohen, F. E., Proteins: Structure, Function, and Bioinformatics, 1995, 23(1), 97–110.
 54 Schölkopf, B.; Smola, A. and Müller, K.-R. In International Conference on Artificial Neural Networks, pages 583–588. Springer, 1997.
 55 Wang, J., Geometric Structure of High-Dimensional Data and Dimensionality Reduction, Springer, 2011.
 56 Roweis, S. T. and Saul, L. K., Science, 2000, 290(5500), 2323–2326.
 57 Zhang, Z. and Wang, J. In Advances in Neural Information Processing Systems, pages 1593–1600, 2007.
 58 Tenenbaum, J. B.; De Silva, V. and Langford, J. C., Science, 2000, 290(5500), 2319–2323.
 59 Weinberger, K. Q. and Saul, L. K., International Journal of Computer Vision, 2006, 70(1), 77–90.
 60 Li, C.-g.; Guo, J.; Chen, G.; Nie, X.-f. and Yang, Z. In 2006 International Conference on Machine Learning and Cybernetics, pages 3201–3206. IEEE, 2006.
 61 Belkin, M. and Niyogi, P. In Advances in Neural Information Processing Systems, pages 585–591, 2002.
 62 Donoho, D. L. and Grimes, C., Proceedings of the National Academy of Sciences, 2003, 100(10), 5591–5596.
 63 Ferguson, A. L.; Panagiotopoulos, A. Z.; Debenedetti, P. G. and Kevrekidis, I. G., The Journal of Chemical Physics, 2011, 134(13), 135103.
 64 Coifman, R. R. and Lafon, S., Applied and Computational Harmonic Analysis, 2006, 21(1), 5–30.
 65 Rohrdanz, M. A.; Zheng, W.; Maggioni, M. and Clementi, C., The Journal of Chemical Physics, 2011, 134(12), 03B624.
 66 Preto, J. and Clementi, C., Physical Chemistry Chemical Physics, 2014, 16(36), 19181–19191.
 67 Ceriotti, M.; Tribello, G. A. and Parrinello, M., Proceedings of the National Academy of Sciences, 2011, 108(32), 13023–13028.
 68 Tribello, G. A.; Ceriotti, M. and Parrinello, M., Proceedings of the National Academy of Sciences, 2012, 109(14), 5196–5201.
 69 Ceriotti, M.; Tribello, G. A. and Parrinello, M., Journal of Chemical Theory and Computation, 2013, 9(3), 1521–1532.
 70 Ferguson, A. L.; Zhang, S.; Dikiy, I.; Panagiotopoulos, A. Z.; Debenedetti, P. G. and Link, A. J., Biophysical Journal, 2010, 99(9), 3056–3065.
 71 Zheng, W.; Rohrdanz, M. A.; Maggioni, M. and Clementi, C., The Journal of Chemical Physics, 2011, 134(14), 144109.
 72 Zheng, W.; Rohrdanz, M. A. and Clementi, C., The journal of physical chemistry B, 2013, 117(42), 12769–12776.
 73 Spiwok, V. and Králová, B., The Journal of Chemical Physics, 2011, 135(22), 224504.
 74 Fiorin, G.; Klein, M. L. and Hénin, J., Molecular Physics, 2013, 111(22–23), 3345–3362.
 75 Ma, A. and Dinner, A. R., The Journal of Physical Chemistry B, 2005, 109(14), 6769–6779.
 76 Peters, B. and Trout, B. L., The Journal of Chemical Physics, 2006, 125(5), 054108.
 77 Peters, B.; Beckham, G. T. and Trout, B. L., The Journal of Chemical Physics, 2007, 127(3), 034109.
 78 Abrams, C. F. and Vanden-Eijnden, E., Chemical Physics Letters, 2012, 547, 114–119.
 79 Branduardi, D.; Gervasio, F. L. and Parrinello, M., The Journal of Chemical Physics, 2007, 126(5), 054103.
 80 Chiavazzo, E.; Covino, R.; Coifman, R. R.; Gear, C. W.; Georgiou, A. S.; Hummer, G. and Kevrekidis, I. G., Proceedings of the National Academy of Sciences, 2017, page 201621481.
 81 Scholz, M. and Vigário, R. In ESANN’2002 proceedings  European Symposium on Artificial Neural Networks, pages 439–444, 2002.
 82 Scholz, M.; Fraunholz, M. and Selbig, J. In Principal Manifolds for Data Visualization and Dimension Reduction; Springer, 2008; pages 44–67.
 83 Hinton, G. E. and Salakhutdinov, R. R., Science, 2006, 313(5786), 504–507.
 84 Yan, S.; Xu, D.; Zhang, B.; Zhang, H.J.; Yang, Q. and Lin, S., IEEE Transactions on Pattern Analysis and Machine Intelligence, 2007, 29(1), 40–51.
 85 Wang, W.; Huang, Y.; Wang, Y. and Wang, L. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, pages 490–497, 2014.
 86 Eastman, P.; Swails, J.; Chodera, J. D.; McGibbon, R. T.; Zhao, Y.; Beauchamp, K. A.; Wang, L.-P.; Simmonett, A. C.; Harrigan, M. P. and Stern, C. D., PLOS Computational Biology, 2017, 13(7), e1005659.
 87 Eastman, P.; Friedrichs, M. S.; Chodera, J. D.; Radmer, R. J.; Bruns, C. M.; Ku, J. P.; Beauchamp, K. A.; Lane, T. J.; Wang, L.-P. and Shukla, D., Journal of Chemical Theory and Computation, 2012, 9(1), 461–469.
 88 Friedrichs, M. S.; Eastman, P.; Vaidyanathan, V.; Houston, M.; Legrand, S.; Beberg, A. L.; Ensign, D. L.; Bruns, C. M. and Pande, V. S., Journal of Computational Chemistry, 2009, 30(6), 864–872.
 89 Hassoun, M. H., Fundamentals of Artificial Neural Networks, MIT press, 1995.
 90 McCulloch, W. S. and Pitts, W., The Bulletin of Mathematical Biophysics, 1943, 5(4), 115–133.
 91 Chen, T. and Chen, H., IEEE Transactions on Neural Networks, 1995, 6(4), 911–917.
 92 Kramer, M. A., AIChE Journal, 1991, 37(2), 233–243.
 93 Kirby, M. J. and Miranda, R., Neural Computation, 1996, 8(2), 390–402.
 94 Bourlard, H. and Kamp, Y., Biological Cybernetics, 1988, 59(4), 291–294.
 95 Friedman, J.; Hastie, T. and Tibshirani, R., The Elements of Statistical Learning, Vol. 1, Springer Series in Statistics, New York, 2001.
 96 Collobert, R. and Bengio, S. In Proceedings of the Twenty-First International Conference on Machine Learning, page 23. ACM, 2004.
 97 Bengio, Y. In Neural networks: Tricks of the trade; Springer, 2012; pages 437–478.
 98 Rumelhart, D. E.; Durbin, R.; Golden, R. and Chauvin, Y., Backpropagation: Theory, architectures and applications, 1995, pages 1–34.
 99 Rumelhart, D. E.; Hinton, G. E. and Williams, R. J. In Cognitive Modeling, Polk, T. A. and Seifert, C. M., Eds.; MIT Press, 1988; chapter 8.
 100 Sutskever, I.; Martens, J.; Dahl, G. and Hinton, G. In International conference on machine learning, pages 1139–1147, 2013.
 101 Salvador, S. and Chan, P. In 16th IEEE International Conference on Tools with Artificial Intelligence (ICTAI 2004), pages 576–584. IEEE, 2004.
 102 Krizhevsky, A.; Sutskever, I. and Hinton, G. E. In Advances in Neural Information Processing Systems, pages 1097–1105, 2012.
 103 Kabsch, W., Acta Crystallographica Section A: Crystal Physics, Diffraction, Theoretical and General Crystallography, 1976, 32(5), 922–923.
 104 Roux, B., Computer Physics Communications, 1995, 91(1–3), 275–282.
 105 Chipot, C. and Pohorille, A., Free Energy Calculations, Springer, 2007.
 106 Ferguson, A. L., Journal of Computational Chemistry, 2017, 38(18), 1583–1605.
 107 Kumar, S.; Rosenberg, J. M.; Bouzida, D.; Swendsen, R. H. and Kollman, P. A., Journal of Computational Chemistry, 1992, 13(8), 1011–1021.
 108 Bartels, C., Chemical Physics Letters, 2000, 331(5), 446–454.
 109 Ferrenberg, A. M. and Swendsen, R. H., Physical Review Letters, 1989, 63(12), 1195.
 110 Bennett, C. H., Journal of Computational Physics, 1976, 22(2), 245–268.
 111 Bartels, C. and Karplus, M., Journal of Computational Chemistry, 1997, 18(12), 1450–1462.
 112 Habeck, M., Physical Review Letters, 2012, 109(10), 100601.
 113 Zhu, F. and Hummer, G., Journal of Computational Chemistry, 2012, 33(4), 453–465.
 114 Gallicchio, E.; Andrec, M.; Felts, A. K. and Levy, R. M., The Journal of Physical Chemistry B, 2005, 109(14), 6722–6731.
 115 Hartmann, C.; Latorre, J. C. and Ciccotti, G., The European Physical JournalSpecial Topics, 2011, 200(1), 73–89.
 116 Neidigh, J. W.; Fesinmeyer, R. M. and Andersen, N. H., Nature Structural & Molecular Biology, 2002, 9(6), 425.
 117 Sultan, M. M. and Pande, V. S., The Journal of Physical Chemistry B, 2017, doi: 10.1021/acs.jpcb.7b06896.
 118 Edelsbrunner, H.; Kirkpatrick, D. and Seidel, R., IEEE Transactions on Information Theory, 1983, 29(4), 551–559.
 119 Edelsbrunner, H. and Mücke, E. P., ACM Transactions on Graphics, 1994, 13(1), 43–72.
 120 Edelsbrunner, H., Algorithms and Combinatorics, 2003, 25, 379–404.
 121 Humphrey, W.; Dalke, A. and Schulten, K., Journal of Molecular Graphics, 1996, 14(1), 33–38.
 122 Wang, J.; Wolf, R. M.; Caldwell, J. W.; Kollman, P. A. and Case, D. A., Journal of Computational Chemistry, 2004, 25(9), 1157–1174.
 123 Schlick, T., Molecular Modeling and Simulation: An Interdisciplinary Guide, Vol. 21, Springer Science & Business Media, 2010.
 124 Hess, B.; Bekker, H.; Berendsen, H. J. and Fraaije, J. G., Journal of Computational Chemistry, 1997, 18(12), 1463–1472.
 125 Allen, M. and Tildesley, D., Computer Simulation of Liquids, Oxford University Press, New York, 1987.
 126 Horn, H. W.; Swope, W. C.; Pitera, J. W.; Madura, J. D.; Dick, T. J.; Hura, G. L. and Head-Gordon, T., The Journal of Chemical Physics, 2004, 120(20), 9665–9678.
 127 Berman, H. M.; Battistuz, T.; Bhat, T. N.; Bluhm, W. F.; Bourne, P. E.; Burkhardt, K.; Feng, Z.; Gilliland, G. L.; Iype, L.; Jain, S.; Fagan, P.; Marvin, J.; Padilla, D.; Ravichandran, V.; Schneider, B.; Thanki, N.; Weissig, H.; Westbrook, J. D. and Zardecki, C., Acta Crystallographica Section D: Biological Crystallography, 2002, 58(6), 899–907.
 128 Andersen, H. C., The Journal of Chemical Physics, 1980, 72(4), 2384–2393.
 129 Chow, K.-H. and Ferguson, D. M., Computer Physics Communications, 1995, 91(1–3), 283–289.
 130 Åqvist, J.; Wennerström, P.; Nervall, M.; Bjelic, S. and Brandsdal, B. O., Chemical Physics Letters, 2004, 384(4), 288–294.
 131 Hockney, R. W. and Eastwood, J. W., Computer Simulation Using Particles, CRC Press, 1988.
 132 Darden, T.; York, D. and Pedersen, L., The Journal of Chemical Physics, 1993, 98(12), 10089–10092.
 133 Al-Rfou, R.; Alain, G.; Almahairi, A.; Angermueller, C.; Bahdanau, D.; Ballas, N.; Bastien, F.; Bayer, J.; Belikov, A.; Belopolsky, A.; Bengio, Y.; Bergeron, A.; Bergstra, J.; Bisson, V.; Bleecher Snyder, J.; Bouchard, N.; Boulanger-Lewandowski, N.; Bouthillier, X.; de Brébisson, A.; Breuleux, O.; Carrier, P.-L.; Cho, K.; Chorowski, J.; Christiano, P.; Cooijmans, T.; Côté, M.-A.; Côté, M.; Courville, A.; Dauphin, Y. N.; Delalleau, O.; Demouth, J.; Desjardins, G.; Dieleman, S.; Dinh, L.; Ducoffe, M.; Dumoulin, V.; Ebrahimi Kahou, S.; Erhan, D.; Fan, Z.; Firat, O.; Germain, M.; Glorot, X.; Goodfellow, I.; Graham, M.; Gulcehre, C.; Hamel, P.; Harlouchet, I.; Heng, J.-P.; Hidasi, B.; Honari, S.; Jain, A.; Jean, S.; Jia, K.; Korobov, M.; Kulkarni, V.; Lamb, A.; Lamblin, P.; Larsen, E.; Laurent, C.; Lee, S.; Lefrancois, S.; Lemieux, S.; Léonard, N.; Lin, Z.; Livezey, J. A.; Lorenz, C.; Lowin, J.; Ma, Q.; Manzagol, P.-A.; Mastropietro, O.; McGibbon, R. T.; Memisevic, R.; van Merriënboer, B.; Michalski, V.; Mirza, M.; Orlandi, A.; Pal, C.; Pascanu, R.; Pezeshki, M.; Raffel, C.; Renshaw, D.; Rocklin, M.; Romero, A.; Roth, M.; Sadowski, P.; Salvatier, J.; Savard, F.; Schlüter, J.; Schulman, J.; Schwartz, G.; Serban, I. V.; Serdyuk, D.; Shabanian, S.; Simon, E.; Spieckermann, S.; Subramanyam, S. R.; Sygnowski, J.; Tanguay, J.; van Tulder, G.; Turian, J.; Urban, S.; Vincent, P.; Visin, F.; de Vries, H.; Warde-Farley, D.; Webb, D. J.; Willson, M.; Xu, K.; Xue, L.; Yao, L.; Zhang, S. and Zhang, Y., arXiv e-prints, 2016, http://arxiv.org/abs/1605.02688.
 134 Hashemian, B. and Arroyo, M., The Journal of Chemical Physics, 2015, 142(4), 044102.
 135 Chodera, J. D.; Swope, W. C.; Pitera, J. W.; Seok, C. and Dill, K. A., Journal of Chemical Theory and Computation, 2007, 3(1), 26–41.
 136 Hummer, G. and Kevrekidis, I. G., The Journal of Chemical Physics, 2003, 118(23), 10762–10773.
 137 Chodera, J. D.; Swope, W. C.; Pitera, J. W. and Dill, K. A., Multiscale Modeling & Simulation, 2006, 5(4), 1214–1226.
 138 Chodera, J. D.; Singhal, N.; Pande, V. S.; Dill, K. A. and Swope, W. C., The Journal of Chemical Physics, 2007, 126(15), 04B616.
 139 Bolhuis, P. G.; Dellago, C. and Chandler, D., Proceedings of the National Academy of Sciences, 2000, 97(11), 5877–5882.
 140 Jákli, I.; Jensen, S. J. K.; Csizmadia, I. G. and Perczel, A., Chemical Physics Letters, 2012, 547, 82–88.
 141 Patrangenaru, V. and Ellingson, L., Nonparametric Statistics on Manifolds and Their Applications to Object Data Analysis, CRC Press, 2015.
 142 Whitney, H., Annals of Mathematics, 1936, pages 645–680.
 143 Broomhead, D. S. and King, G. P., Physica D: Nonlinear Phenomena, 1986, 20(2–3), 217–236.
 144 Kim, S. B.; Dsilva, C. J.; Kevrekidis, I. G. and Debenedetti, P. G., The Journal of Chemical Physics, 2015, 142(8), 085101.
 145 Qiu, L.; Pabit, S. A.; Roitberg, A. E. and Hagen, S. J., Journal of the American Chemical Society, 2002, 124(44), 12952–12953.
 146 Seshasayee, A. S. N., Theoretical Biology and Medical Modelling, 2005, 2(1), 7.
 147 Heyda, J.; Kozisek, M.; Bednárova, L.; Thompson, G.; Konvalinka, J.; Vondrasek, J. and Jungwirth, P., The Journal of Physical Chemistry B, 2011, 115(28), 8910–8924.
 148 Juraszek, J. and Bolhuis, P., Proceedings of the National Academy of Sciences, 2006, 103(43), 15859–15864.
 149 Kim, S. B.; Gupta, D. R. and Debenedetti, P. G., Scientific Reports, 2016, 6, 25612.
 150 Hatch, H. W.; Stillinger, F. H. and Debenedetti, P. G., The Journal of Physical Chemistry B, 2014, 118(28), 7761–7769.
 151 Kannan, S. and Zacharias, M., PloS ONE, 2014, 9(2), e88383.
 152 Schug, A.; Wenzel, W. and Hansmann, U. H., The Journal of Chemical Physics, 2005, 122(19), 194711.
 153 Snow, C. D.; Zagrovic, B. and Pande, V. S., Journal of the American Chemical Society, 2002, 124(49), 14548–14549.
 154 Zhou, R., Proceedings of the National Academy of Sciences, 2003, 100(23), 13280–13285.
 155 Zhou, R., Proteins: Structure, Function, and Bioinformatics, 2003, 53(2), 148–161.
 156 Deng, N.-j.; Dai, W. and Levy, R. M., The Journal of Physical Chemistry B, 2013, 117(42), 12787–12799.
 157 Patriksson, A.; Adams, C. M.; Kjeldsen, F.; Zubarev, R. A. and van der Spoel, D., The Journal of Physical Chemistry B, 2007, 111(46), 13147–13150.
 158 Pitera, J. W. and Swope, W., Proceedings of the National Academy of Sciences, 2003, 100(13), 7587–7592.
 159 Meuzelaar, H.; Marino, K. A.; HuertaViga, A.; Panman, M. R.; Smeenk, L. E.; Kettelarij, A. J.; van Maarseveen, J. H.; Timmerman, P.; Bolhuis, P. G. and Woutersen, S., The Journal of Physical Chemistry B, 2013, 117(39), 11490–11501.
 160 Barua, B.; Lin, J. C.; Williams, V. D.; Kummler, P.; Neidigh, J. W. and Andersen, N. H., Protein Engineering, Design & Selection, 2008, 21(3), 171–185.
 161 Iavarone, A. T. and Parks, J. H., Journal of the American Chemical Society, 2005, 127(24), 8606–8607.
 162 Rovó, P.; Stráner, P.; Láng, A.; Bartha, I.; Huszár, K.; Nyitray, L. and Perczel, A., Chemistry – A European Journal, 2013, 19(8), 2628–2640.
 163 Adams, C. M.; Kjeldsen, F.; Patriksson, A.; van Der Spoel, D.; Gräslund, A.; Papadopoulos, E. and Zubarev, R. A., International Journal of Mass Spectrometry, 2006, 253(3), 263–273.
 164 Bolhuis, P. G., Frontiers in Bioscience, 2009, 14, 2801–2828.
 165 Kannan, S. and Zacharias, M., Proteins: Structure, Function, and Bioinformatics, 2009, 76(2), 448–460.
 166 Shao, Q.; Shi, J. and Zhu, W., The Journal of Chemical Physics, 2012, 137(12), 125103.
 167 Paschek, D.; Hempel, S. and García, A. E., Proceedings of the National Academy of Sciences, 2008, 105(46), 17754–17759.
 168 Kaminski, G. A.; Friesner, R. A.; Tirado-Rives, J. and Jorgensen, W. L., The Journal of Physical Chemistry B, 2001, 105(28), 6474–6487.
 169 Lindahl, E.; Hess, B. and Van Der Spoel, D., Journal of Molecular Modeling, 2001, 7(8), 306–317.
 170 Ahmed, Z.; Beta, I. A.; Mikhonin, A. V. and Asher, S. A., Journal of the American Chemical Society, 2005, 127(31), 10943–10950.
 171 Best, R. B.; Buchete, N.-V. and Hummer, G., Biophysical Journal, 2008, 95(1), L07–L09.
 172 Best, R. B. and Mittal, J., The Journal of Physical Chemistry B, 2010, 114(46), 14916–14923.
 173 Abascal, J. L. and Vega, C., The Journal of Chemical Physics, 2005, 123(23), 234505.
 174 Cornell, W. D.; Cieplak, P.; Bayly, C. I.; Gould, I. R.; Merz, K. M.; Ferguson, D. M.; Spellmeyer, D. C.; Fox, T.; Caldwell, J. W. and Kollman, P. A., Journal of the American Chemical Society, 1995, 117(19), 5179–5197.
 175 Jorgensen, W. L.; Chandrasekhar, J.; Madura, J. D.; Impey, R. W. and Klein, M. L., The Journal of Chemical Physics, 1983, 79(2), 926–935.
 176 Lindorff-Larsen, K.; Piana, S.; Dror, R. O. and Shaw, D. E., Science, 2011, 334(6055), 517–520.
 177 MacKerell, A. D.; Bashford, D.; Bellott, M.; Dunbrack, R. L.; Evanseck, J. D.; Field, M. J.; Fischer, S.; Gao, J.; Guo, H.; Ha, S.; Joseph-McCarthy, D.; Kuchnir, L.; Kuczera, K.; Lau, F. T. K.; Mattos, C.; Michnick, S.; Ngo, T.; Nguyen, D. T.; Prodhom, B.; Reiher, W. E.; Roux, B.; Schlenkrich, M.; Smith, J. C.; Stote, R.; Straub, J.; Watanabe, M.; Wiórkiewicz-Kuczera, J.; Yin, D. and Karplus, M., The Journal of Physical Chemistry B, 1998, 102(18), 3586–3616.
 178 MacKerell, A. D.; Feig, M. and Brooks, C. L., Journal of Computational Chemistry, 2004, 25(11), 1400–1415.
 179 Kmiecik, S.; Gront, D.; Kolinski, M.; Wieteska, L.; Dawid, A. E. and Kolinski, A., Chemical Reviews, 2016, 116(14), 7898–7936.