Learning Gradient Fields for Molecular Conformation Generation

Chence Shi, et al.

We study a fundamental problem in computational chemistry known as molecular conformation generation, i.e., predicting stable 3D structures from 2D molecular graphs. Existing machine learning approaches usually first predict distances between atoms and then generate a 3D structure satisfying the distances, where noise in the predicted distances may induce extra errors during 3D coordinate generation. Inspired by the traditional force field methods for molecular dynamics simulation, in this paper, we propose a novel approach called ConfGF that directly estimates the gradient fields of the log density of atomic coordinates. The estimated gradient fields allow stable conformations to be generated directly via Langevin dynamics. The problem is challenging, however, as the gradient fields are roto-translation equivariant. We notice that estimating the gradient fields of atomic coordinates can be translated into estimating the gradient fields of interatomic distances, and hence develop a novel algorithm based on recent score-based generative models to effectively estimate these gradients. Experimental results across multiple tasks show that ConfGF outperforms previous state-of-the-art baselines by a significant margin.



1 Introduction

Graph-based representations of molecules, where nodes represent atoms and edges represent bonds, have been widely applied to a variety of molecular modeling tasks, ranging from property prediction (Gilmer et al., 2017; Hu et al., 2020) to molecule generation (Jin et al., 2018; You et al., 2018; Shi et al., 2020). In reality, however, a more natural and intrinsic representation of a molecule is its three-dimensional structure, in which atoms are described by their 3D coordinates. These structures are also widely known as the molecular geometry or conformation.

Nevertheless, obtaining valid and stable conformations for a molecule remains a long-standing challenge. Existing approaches to molecular conformation generation mainly rely on molecular dynamics (MD) (De Vivo et al., 2016), where the coordinates of atoms are sequentially updated based on the forces acting on each atom. The per-atom forces come either from computationally intensive Density Functional Theory (DFT) (Parr, 1989) or from hand-designed force fields (Rappé et al., 1992; Halgren, 1996), which are often crude approximations of the actual molecular energies (Kanal et al., 2017). Recently, there has been growing interest in developing machine learning methods for conformation generation (Mansimov et al., 2019; Simm and Hernandez-Lobato, 2020; Xu et al., 2021). In general, these approaches (Simm and Hernandez-Lobato, 2020; Xu et al., 2021) proceed in two stages: they first predict the molecular distance geometry, i.e., the interatomic distances, and then convert the distances to 3D coordinates via post-processing algorithms (Liberti et al., 2014). Such methods effectively model the roto-translation equivariance of molecular conformations and have achieved impressive performance. Nevertheless, because they generate conformations in a two-stage fashion, noise in the generated distances may affect the 3D coordinate reconstruction, often leading to less accurate or even erroneous structures. We therefore seek an approach that can generate molecular conformations within a single stage.

Inspired by the traditional force field methods for molecular dynamics simulation (Frenkel and Smit, 1996), in this paper, we propose a novel and principled approach called ConfGF for molecular conformation generation. The basic idea is to directly learn the gradient fields of the log density w.r.t. the atomic coordinates, which can be viewed as pseudo-forces acting on each atom. With the estimated gradient fields, conformations can be generated directly using Langevin dynamics (Welling and Teh, 2011), which amounts to moving atoms toward high-likelihood regions guided by the pseudo-forces. The key challenge with this approach is that such gradient fields of atomic coordinates are roto-translation equivariant, i.e., the gradients rotate together with the molecular system and are invariant under translation.

To tackle this challenge, we observe that interatomic distances are continuously differentiable w.r.t. atomic coordinates. We therefore propose to first estimate the gradient fields of the log density w.r.t. interatomic distances, and show that, under a minor assumption, the gradient fields of the log density w.r.t. atomic coordinates can be calculated from those w.r.t. distances in closed form using the chain rule. We also demonstrate that the gradient fields designed this way satisfy roto-translation equivariance. In contrast to distance-based methods (Simm and Hernandez-Lobato, 2020; Xu et al., 2021), our approach generates conformations in a one-stage fashion, i.e., the sampling process is done directly in the coordinate space, which avoids the errors induced by post-processing and greatly enhances generation quality. In addition, our approach requires no surrogate losses and allows flexible network architectures, making it better able to generate diverse and realistic conformations.

We conduct extensive experiments on the GEOM (Axelrod and Gomez-Bombarelli, 2020) and ISO17 (Simm and Hernandez-Lobato, 2020) benchmarks, and compare ConfGF against previous state-of-the-art neural methods as well as the empirical method on multiple tasks ranging from conformation generation and distance modeling to property prediction. Numerical results show that ConfGF outperforms previous state-of-the-art baselines by a clear margin.

2 Related Work

Molecular Conformation Generation The standard approach to molecular conformation generation is molecular dynamics (MD). MD starts with an initial conformation and applies small changes to it step by step based on the forces acting on each atom until it converges; it is considered the gold standard for sampling equilibrium states of molecules. However, such methods are inefficient, especially for large molecules, as they involve computationally expensive quantum mechanical calculations (Parr, 1989; Shim and MacKerell Jr, 2011; Ballard et al., 2015). A much more efficient but less accurate alternative is the empirical method, which fixes distances and angles between atoms in a molecule to idealized values according to a set of rules (Blaney and Dixon, 2007).

Recent advances in deep generative models open the door to data-driven approaches, which have shown great potential to strike a good balance between computational efficiency and accuracy. For example, Mansimov et al. (2019) introduce the Conditional Variational Graph Auto-Encoder (CVGAE), which takes molecular graphs as input and directly generates 3D coordinates for each atom. This method fails to model the roto-translation equivariance of molecular conformations, resulting in large discrepancies between generated and ground-truth structures. To address this issue, Simm and Hernandez-Lobato (2020) and Xu et al. (2021) propose to model the interatomic distances of molecular conformations using a VAE and a continuous flow respectively. These approaches preserve the roto-translation equivariance of molecular conformations and consist of two stages: first generating distances between atoms, then feeding the distances to distance geometry algorithms (Liberti et al., 2014) to reconstruct 3D coordinates. However, the distance geometry algorithms are vulnerable to noise in the generated distances, which may lead to inaccurate or even erroneous structures. Another line of research (Gogineni et al., 2020) leverages reinforcement learning for conformation generation by sequentially determining torsional angles; it relies on an additional classical force field for state transition and reward evaluation, and it is incapable of modeling other geometric properties such as bond angles and bond lengths, which distinguishes it from all the other works.

Neural Force Fields Neural networks have also been employed to estimate molecular potential energies and force fields (Schütt et al., 2017a; Zhang et al., 2018; Hu et al., 2021). The goal of these models is to predict energies and forces as accurately as possible, serving as an alternative to quantum chemical methods such as Density Functional Theory (DFT). Training these models requires access to molecular dynamics trajectories along with ground-truth energies and forces from expensive quantum mechanical calculations. Different from these methods, our goal is to generate equilibrium conformations within a single stage. To this end, we define gradient fields (such gradient fields, however, are not force fields, as they are not energy conserving) analogous to force fields, which serve as pseudo-forces acting on each atom that gradually move atoms toward high-density areas until they converge to an equilibrium. The only data we need to train the model is equilibrium conformations.

3 Preliminaries

3.1 Problem Definition

Notations In this paper, a molecular graph is represented as an undirected graph G = (V, E), where V is the set of nodes representing atoms, and E is the set of edges between atoms in the molecule. Each node v ∈ V is associated with a nuclear charge Z_v and a 3D vector r_v ∈ ℝ³, indicating its atomic type and atomic coordinate respectively. Each edge e_{uv} ∈ E is associated with a bond type and a scalar d_{uv} = ‖r_u − r_v‖₂ denoting the Euclidean distance between the positions of u and v. All distances between connected nodes can be represented as a vector d ∈ ℝ^|E|.

As the bonded edges in a molecule are not sufficient to characterize a conformation (bonds are rotatable), we extend the original molecular graph by adding auxiliary edges, i.e., virtual bonds, between atoms that are 2 or 3 hops away from each other, which is a widely-used technique in previous work (Simm and Hernandez-Lobato, 2020; Xu et al., 2021)

for reducing the degrees of freedom in 3D coordinates. The extra edges between second neighbors help fix angles between atoms, while those between third neighbors fix dihedral angles. Hereafter, we assume all molecular graphs are extended unless otherwise stated.
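The graph extension step amounts to a breadth-first search for second and third neighbors. The sketch below is illustrative: the function name and edge-list representation are our own assumptions, not the authors' implementation.

```python
def extend_graph(num_nodes, bonds):
    """Add virtual bonds between atoms 2 or 3 hops apart on the bond graph."""
    adj = {i: set() for i in range(num_nodes)}
    for u, v in bonds:
        adj[u].add(v)
        adj[v].add(u)

    def hop_counts(src):
        # BFS distances from src over the bonded graph.
        dist, frontier = {src: 0}, [src]
        while frontier:
            nxt = []
            for u in frontier:
                for w in adj[u]:
                    if w not in dist:
                        dist[w] = dist[u] + 1
                        nxt.append(w)
            frontier = nxt
        return dist

    edges = {tuple(sorted(e)) for e in bonds}
    for i in range(num_nodes):
        for j, h in hop_counts(i).items():
            if h in (2, 3):  # 2-hop edges fix bond angles, 3-hop edges fix dihedrals
                edges.add((min(i, j), max(i, j)))
    return sorted(edges)

# A 4-atom chain 0-1-2-3: adds (0,2), (1,3) for angles and (0,3) for the dihedral.
print(extend_graph(4, [(0, 1), (1, 2), (2, 3)]))
```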

Problem Definition Given an extended molecular graph G, our goal is to learn a generative model p(R | G) that generates molecular conformations R based on the molecular graph.

3.2 Score-Based Generative Modeling

The score function of a continuously differentiable probability density p(x) is defined as ∇_x log p(x). Score-based generative modeling (Song and Ermon, 2019; Song et al., 2020; Meng et al., 2020; Song and Ermon, 2020) is a class of generative models that approximate the score function of p(x) using neural networks and generate new samples with Langevin dynamics (Welling and Teh, 2011). Modeling the score function instead of the density function allows getting rid of the intractable partition function of p(x) and avoids the extra computation of the higher-order gradients of an energy-based model (Song and Ermon, 2019).

To address the difficulty of estimating score functions in regions without training data, Song and Ermon (2019) propose to perturb the data using Gaussian noise of various intensities and jointly estimate the score functions, i.e., ∇_x log q_σ(x), for all noise levels. Specifically, given a data distribution p_data(x) and a sequence of noise levels {σ_i}_{i=1}^L with a noise distribution q_σ(x̃ | x), e.g., N(x̃ | x, σ²I), the training objective ℓ(θ; σ) for each noise level σ is as follows:

ℓ(θ; σ) = (1/2) E_{p_data(x)} E_{q_σ(x̃|x)} [ ‖s_θ(x̃, σ) + (x̃ − x)/σ²‖² ].    (1)
After the noise conditional score networks are trained, samples are generated using annealed Langevin dynamics (Song and Ermon, 2019), where samples from each noise level serve as initializations for the Langevin dynamics of the next noise level. For detailed design choices of noise levels and sampling hyperparameters, we refer readers to Song and Ermon (2020).

3.3 Equivariance in Molecular Geometry

Equivariance is ubiquitous in physical systems, e.g., the 3D roto-translation equivariance of molecular conformations or point clouds (Thomas et al., 2018; Weiler et al., 2018; Fuchs et al., 2020; Miller et al., 2020; Simm et al., 2021). Endowing models with such inductive biases is critical for better generalization and successful learning (Köhler et al., 2020). Formally, a function F being equivariant can be represented as follows (strictly speaking, T_g does not have to be the same on both sides of the equation, as long as it is a representation of the same group; note that invariance is a special case of equivariance):

F(T_g(x)) = T_g(F(x)),    (2)

where T_g is a transformation function, e.g., a rotation. Intuitively, Eq. 2 says that applying T_g to the input has the same effect as applying it to the output.

In this paper, we aim to model the score function of p(R | G), i.e., ∇_R log p(R | G). As p(R | G) is roto-translation invariant with respect to conformations, the score function is therefore roto-translation equivariant with respect to conformations. To pursue a generalizable and elegant approach, we explicitly build such equivariance into the model design.

Figure 1: Overview of the proposed ConfGF approach. (a) Illustration of the training procedure. We perturb the interatomic distances with random Gaussian noise of various magnitudes, and train the noise conditional score network with denoising score matching. (b) Score estimation of atomic coordinates via chain rule. The procedure amounts to calculating the resultant force from a set of interatomic forces.

4 Proposed Method

4.1 Overview

Our approach extends denoising score matching (Vincent, 2011; Song and Ermon, 2019) to molecular conformation generation by estimating the gradient fields of the log density of atomic coordinates, i.e., ∇_R log p(R | G). Directly parameterizing such gradient fields using vanilla graph neural networks (GNNs) (Duvenaud et al., 2015; Kearnes et al., 2016; Kipf and Welling, 2016; Gilmer et al., 2017; Schütt et al., 2017b,a) is problematic, as these GNNs only leverage node, edge, or distance information, resulting in node representations that are roto-translation invariant. However, the gradient fields of atomic coordinates should be roto-translation equivariant (see Section 3.3), i.e., the vector fields should rotate together with the molecular system and be invariant under translation.

To tackle this issue, we assume that the log density of atomic coordinates given the molecular graph can be parameterized as log p(R | G) ∝ f_G(d) up to a constant, where d = g(R) denotes a function that maps a set of atomic coordinates R to the corresponding interatomic distances, and f_G is a graph neural network that estimates the negative energy of a molecule based on the interatomic distances d and the graph representation G. Such a design is common practice in the existing literature (Schütt et al., 2017a; Hu et al., 2020; Klicpera et al., 2020), favored because it ensures several physical invariances, e.g., rotation and translation invariance of energy prediction. We observe that the interatomic distances d are continuously differentiable w.r.t. the atomic coordinates R. Therefore, the two gradient fields, s(d) = ∇_d f_G(d) and s(R) = ∇_R f_G(g(R)), are connected by the chain rule, and the gradients can be backpropagated from s(d) to s(R) as follows:

s(R)_i = Σ_{j ∈ N(i)} (∂d_{ij}/∂r_i) · s(d)_{ij} = Σ_{j ∈ N(i)} ((r_i − r_j)/d_{ij}) · s(d)_{ij},    (3)

where s(R)_i denotes the score w.r.t. the coordinate r_i of atom i, s(d)_{ij} = ∂f_G/∂d_{ij}, and N(i) denotes atom i's neighbors. The second equation of Eq. 3 holds because of the chain rule over the partial derivative of the composite function f_G(g(R)).
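Eq. 3 can be checked numerically. The numpy sketch below (function names are our own, not the paper's code) turns a vector of per-edge distance scores into per-atom coordinate scores; each edge contributes equal and opposite components to its two endpoints, since d_{ij} appears in both atoms' partial derivatives:

```python
import numpy as np

def coord_scores(R, edges, dist_scores):
    """Eq. 3: s(R)_i = sum over j in N(i) of (r_i - r_j) / d_ij * s(d)_ij."""
    s = np.zeros_like(R)
    for (i, j), sd in zip(edges, dist_scores):
        diff = R[i] - R[j]
        unit = diff / np.linalg.norm(diff)  # direction of the component force
        s[i] += unit * sd                   # contribution of edge (i, j) to atom i
        s[j] -= unit * sd                   # opposite contribution to atom j
    return s

# Roto-translation equivariance: rotating the conformation rotates the
# coordinate scores, and translating it leaves them unchanged.
rng = np.random.default_rng(0)
R = rng.standard_normal((4, 3))
edges = [(0, 1), (1, 2), (2, 3), (0, 2)]
sd = rng.standard_normal(len(edges))
Q, _ = np.linalg.qr(rng.standard_normal((3, 3)))  # a random orthogonal matrix
assert np.allclose(coord_scores(R @ Q.T, edges, sd), coord_scores(R, edges, sd) @ Q.T)
assert np.allclose(coord_scores(R + 5.0, edges, sd), coord_scores(R, edges, sd))
```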

Inspired by the above observations, we take the interatomic pairwise distances as intermediate variables and propose to first estimate the gradient fields of interatomic distances corresponding to different noise levels, i.e., training a noise conditional score network s_θ(d, σ) to approximate ∇_d log q_σ(d | G). We then calculate the gradient fields of atomic coordinates, i.e., s_θ(R), from s_θ(d) via Eq. 3, without relying on any extra parameters. The score network designed this way enjoys the property of roto-translation equivariance. Formally, we have:

Proposition 1.

Under the assumption that we parameterize log p(R | G) as a function f_G of the interatomic distances d and the molecular graph representation G, the score network s_θ(R) defined in Eq. 3 is roto-translation equivariant, i.e., the gradients rotate together with the molecular system and are invariant under translation.

Proof Sketch.

The score network of distances, s_θ(d), is roto-translation invariant as it only depends on interatomic distances, and the vector (r_i − r_j)/d_{ij} rotates together with the molecular system, which endows the score network of atomic coordinates with roto-translation equivariance. See Supplementary material for a formal proof. ∎

After training the score network, based on the estimated gradient fields of atomic coordinates corresponding to different noise levels, conformations are generated by annealed Langevin dynamics (Song and Ermon, 2019), combining information from all noise levels. The overview of our proposed method is illustrated in Figure 1 and Figure 2. Below we describe the design of the noise conditional score network in Section 4.2, and introduce the generation procedure in Section 4.3. Finally, we show how the designed gradient fields are connected with the classical laws of mechanics in Section 4.4.

Figure 2: Generation procedure of the proposed ConfGF via Langevin dynamics. Starting from a random initialization, the conformation is sequentially updated with the gradient information of atomic coordinates calculated from the score network.

4.2 Noise Conditional Score Networks for Distances

Let {σ_i}_{i=1}^L be a positive geometric progression with common ratio γ, i.e., σ_i/σ_{i−1} = γ. We define the perturbed conditional distance distribution to be q_σ(d̃ | G) = ∫ p(d | G) N(d̃ | d, σ²I) dd. In this setting, we aim to learn a conditional score network s_θ(d̃, σ) to jointly estimate the scores of all perturbed distance distributions, i.e., s_θ(d̃, σ) ≈ ∇_{d̃} log q_σ(d̃ | G) for all σ ∈ {σ_i}_{i=1}^L. Note that s_θ(d̃, σ) ∈ ℝ^|E|; we therefore formulate it as an edge regression problem.
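A geometric progression of noise levels can be built directly with `numpy.geomspace`; the endpoint values below are illustrative assumptions, not the paper's settings.

```python
import numpy as np

sigma_1, sigma_L, L = 10.0, 0.01, 50          # largest/smallest levels (illustrative)
sigmas = np.geomspace(sigma_1, sigma_L, L)    # sigma_1 > sigma_2 > ... > sigma_L
gamma = (sigma_L / sigma_1) ** (1 / (L - 1))  # the common ratio sigma_i / sigma_{i-1}
# Every consecutive ratio equals the common ratio, as required.
assert np.allclose(sigmas[1:] / sigmas[:-1], gamma)
```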

Given a molecular graph G with a set of interatomic distances d computed from its conformation R, we first embed the node attributes and the edge attributes into low-dimensional feature spaces using feedforward networks:

h_v⁰ = MLP(Z_v), ∀ v ∈ V;    h_{e_{uv}} = MLP(b_{uv}, d_{uv}), ∀ e_{uv} ∈ E,

where b_{uv} denotes the bond type of edge e_{uv}.
We then adopt a variant of the Graph Isomorphism Network (GIN) (Xu et al., 2019; Hu et al., 2020) to compute node embeddings based on graph structures. At each layer, node embeddings are updated by aggregating messages from neighboring nodes and edges:

h_i^{l+1} = MLP( h_i^l + Σ_{j ∈ N(i)} ReLU(h_j^l + h_{e_{ij}}) ),
where N(i) denotes atom i's neighbors. After L rounds of message passing, we compute the final edge embeddings by concatenating the corresponding node embeddings for each edge as follows:

h_{e_{ij}} = [h_i^L ; h_j^L],
where [· ; ·] denotes the vector concatenation operation, and h_{e_{ij}} denotes the final embedding of edge e_{ij}. Following the suggestions of Song and Ermon (2020), we parameterize the noise conditional network s_θ(d̃, σ) as follows:

s_θ(d̃, σ) = s_θ(d̃)/σ,
where s_θ(d̃) is an unconditional score network whose output for each edge e_{ij} ∈ E is computed by an MLP that maps the edge embedding h_{e_{ij}} to a scalar. Note that s_θ(d̃, σ) is roto-translation invariant as we only use interatomic distances. Since ∇_{d̃} log q_σ(d̃ | d) = −(d̃ − d)/σ², the training loss of s_θ(d̃, σ) is thus:

L(θ; {σ_i}_{i=1}^L) = (1/L) Σ_{i=1}^L λ(σ_i) [ (1/2) E_{p(d|G)} E_{q_{σ_i}(d̃|d)} ‖s_θ(d̃, σ_i) + (d̃ − d)/σ_i²‖² ],

where λ(σ_i) is the coefficient function weighting the losses of different noise levels, chosen as λ(σ) = σ² following Song and Ermon (2019), and all expectations can be efficiently estimated using Monte Carlo estimation.
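The loss can be estimated with plain Monte Carlo. The numpy sketch below uses a stand-in `score_net` in place of the GNN (all names are illustrative) and shows how the λ(σ) = σ² weighting makes every noise level contribute on the same scale:

```python
import numpy as np

def dsm_loss(score_net, d, sigmas, rng):
    """Denoising score matching loss over edge distances, lambda(sigma) = sigma^2.
    d has shape (batch, num_edges); score_net(d_tilde, sigma) returns the same shape."""
    total = 0.0
    for sigma in sigmas:
        noise = rng.standard_normal(d.shape)
        d_tilde = d + sigma * noise                     # sample from q_sigma(d_tilde | d)
        target = -(d_tilde - d) / sigma**2              # score of q_sigma(d_tilde | d)
        sq = np.sum((score_net(d_tilde, sigma) - target) ** 2, axis=-1)
        total += sigma**2 * 0.5 * np.mean(sq)           # lambda(sigma) * per-level loss
    return total / len(sigmas)

# A blind score network (always zero) incurs loss ~ num_edges / 2 at *every*
# noise level, illustrating that the weighting balances the levels.
rng = np.random.default_rng(0)
d = rng.uniform(1.0, 3.0, size=(20000, 4))
loss = dsm_loss(lambda dt, s: np.zeros_like(dt), d, [1.0, 0.1], rng)
print(loss)  # close to 4 / 2 = 2
```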

4.3 Conformation Generation via Annealed Langevin Dynamics

Given the molecular graph representation G and the well-trained noise conditional score networks, molecular conformations are generated using annealed Langevin dynamics, i.e., gradually annealing down the noise level from large to small. We start annealed Langevin dynamics by first sampling an initial conformation R_0 from some fixed prior distribution, e.g., a uniform or Gaussian distribution. Empirically, the choice of the prior distribution is arbitrary as long as the supports of the perturbed distance distributions cover it, and we take the Gaussian distribution as the prior in our case. Then, we update the conformations by sampling from the series of trained noise conditional score networks sequentially. For each noise level σ_i, starting from the final sample of the previous noise level, we run Langevin dynamics for T steps with a gradually decreasing step size α_i. Specifically, at each sampling step t, we first obtain the interatomic distances d_{t−1} based on the current conformation R_{t−1}, and calculate the score function s_θ(R_{t−1}) via Eq. 3. The conformation is then updated using the gradient information from the score network. The whole sampling algorithm is presented in Algorithm 1.

0:  molecular graph G, noise levels {σ_i}_{i=1}^L, the smallest step size ε, and the number of sampling steps per noise level T.
1:  Initialize conformation R_0 from a prior distribution
2:  for i ← 1 to L do
3:     α_i ← ε · σ_i²/σ_L²   {α_i is the step size}
4:     for t ← 1 to T do
5:        d_{t−1} ← g(R_{t−1})   {get distances from R_{t−1}}
6:        s_θ(R_{t−1}) ← computed from s_θ(d_{t−1}, σ_i) via Eq. 3
7:        Draw z_t ∼ N(0, I)
8:        R_t ← R_{t−1} + (α_i/2) s_θ(R_{t−1}) + √α_i z_t
9:     end for
10:    R_0 ← R_T
11: end for
12: Generated conformation R_T.
Algorithm 1 Annealed Langevin dynamics sampling
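Algorithm 1 is a few lines of numpy once a score function is available. Below we run it on a toy target where the perturbed score is known in closed form (data drawn from a standard normal); this stand-in score is an assumption for illustration only, whereas in ConfGF the score comes from Eq. 3:

```python
import numpy as np

def annealed_langevin(score, R0, sigmas, eps, T, rng):
    """Algorithm 1: anneal the noise level from large to small, running T
    Langevin steps with step size alpha_i = eps * sigma_i^2 / sigma_L^2 each."""
    R = R0
    for sigma in sigmas:
        alpha = eps * sigma**2 / sigmas[-1]**2
        for _ in range(T):
            z = rng.standard_normal(R.shape)
            R = R + 0.5 * alpha * score(R, sigma) + np.sqrt(alpha) * z
    return R

# Toy check: for data ~ N(0, I), the score perturbed at level sigma is
# -R / (1 + sigma^2), so the final samples should be close to N(0, I).
rng = np.random.default_rng(0)
sigmas = np.geomspace(3.0, 0.1, 10)
samples = annealed_langevin(lambda R, s: -R / (1 + s**2),
                            rng.standard_normal((2000, 2)), sigmas,
                            eps=0.01, T=200, rng=rng)
print(samples.mean(), samples.var())  # near 0 and 1
```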

4.4 Discussion

From a physical perspective, s_θ(d, σ) reflects how the distances between connected atoms should change to increase the probability density of the conformation R, and can thus be viewed as a network that predicts the repulsion or attraction forces between atoms. Moreover, as shown in Figure 1, s_θ(R) can be viewed as the resultant forces acting on each atom computed from Eq. 3, where (r_i − r_j)/d_{ij} is a unit vector indicating the direction of each component force and s_θ(d)_{ij} specifies the strength of each component force.

5 Experiments

Figure 3: Visualization of conformations generated by different approaches. For each method, we sample multiple conformations and pick ones that are best-aligned with the reference ones, based on four random molecular graphs from the test set of GEOM-Drugs.

Following existing works on conformation generation, we evaluate the proposed method using the following three standard tasks:

  • Conformation Generation tests a model's capacity to learn the distribution of conformations by measuring the quality of generated conformations relative to reference ones (Section 5.1).

  • Distributions Over Distances evaluates the discrepancy with respect to distance geometry between the generated conformations and the ground-truth conformations (Section 5.2).

  • Property Prediction, first proposed in Simm and Hernandez-Lobato (2020), estimates chemical properties for molecular graphs based on a set of sampled conformations (Section 5.3). This is useful in a variety of applications such as drug discovery, where we want to predict molecular properties from microscopic structures.

Below we describe setups that are shared across the tasks. Additional setups are provided in the task-specific sections.

Data Following Xu et al. (2021), we use the GEOM-QM9 and GEOM-Drugs (Axelrod and Gomez-Bombarelli, 2020) datasets for the conformation generation task. We randomly draw 40,000 molecules and select the 5 most likely conformations (sorted by energy) for each molecule, and draw 200 molecules from the remaining data, resulting in a training set with 200,000 conformations and test sets with 22,408 and 14,324 conformations for GEOM-QM9 and GEOM-Drugs respectively. We evaluate the distance modeling task on the ISO17 dataset (Simm and Hernandez-Lobato, 2020), using the default split of Simm and Hernandez-Lobato (2020): a training set with 167 molecules and a test set with 30 molecules. For the property prediction task, we randomly draw 30 molecules from the GEOM-QM9 dataset with the training molecules excluded.

Baselines We compare our approach with four state-of-the-art methods for conformation generation. Specifically, CVGAE (Mansimov et al., 2019) is built upon a conditional variational graph autoencoder, which directly generates atomic coordinates based on molecular graphs. GraphDG (Simm and Hernandez-Lobato, 2020) and CGCF (Xu et al., 2021) are distance-based methods, which rely on post-processing algorithms to convert distances to final conformations; the difference between the two lies in how the neural networks are designed, i.e., VAE vs. continuous flow. RDKit (Riniker and Landrum, 2015) is a classical Euclidean distance geometry-based approach. Results of all baselines across experiments are obtained by running their official code unless otherwise stated.

Model Configuration ConfGF is implemented in PyTorch (Paszke et al., 2017). The GIN's hidden dimension is set to 256 across all modules. For training, we use an exponentially decayed learning rate starting from 0.001 with a decay rate of 0.95. The model is optimized with the Adam (Kingma and Ba, 2014) optimizer on a single Tesla V100 GPU. All hyperparameters related to noise levels as well as annealed Langevin dynamics are selected according to Song and Ermon (2020). See Supplementary material for full details.
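The learning-rate schedule above is a plain exponential decay; whether the decay is applied per epoch or per step is not stated, so the granularity in this sketch is an assumption:

```python
# lr_t = 0.001 * 0.95**t, assumed here to be decayed once per epoch.
lrs = [0.001 * 0.95**t for t in range(5)]
print(lrs[0], lrs[1])  # 0.001, then 0.001 * 0.95
```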

5.1 Conformation Generation

Setup The first task is to generate conformations with high diversity and accuracy based on molecular graphs. For each molecular graph in the test set, we sample twice as many conformations as the reference ones from each model. To measure the discrepancy between two conformations, following existing work (Hawkins, 2017; Li et al., 2021), we adopt the root mean squared deviation (RMSD) of the heavy atoms between aligned conformations:

RMSD(R, R̂) = min_Φ ( (1/n) Σ_{i=1}^n ‖Φ(r_i) − r̂_i‖² )^{1/2},
where n is the number of heavy atoms and Φ is an alignment function that aligns two conformations by rotation and translation. Following Xu et al. (2021), we report the Coverage (COV) and Matching (MAT) scores to measure diversity and accuracy respectively:

COV(S_g, S_r) = (1/|S_r|) · |{ R ∈ S_r | RMSD(R, R̂) < δ, ∃ R̂ ∈ S_g }|,
MAT(S_g, S_r) = (1/|S_r|) · Σ_{R ∈ S_r} min_{R̂ ∈ S_g} RMSD(R, R̂),

where S_g and S_r are the generated and reference conformation sets respectively, and δ is a given RMSD threshold. In general, a higher COV score indicates better diversity, while a lower MAT score indicates better accuracy.
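Both metrics reduce to a best-match search over aligned RMSD values. The numpy sketch below uses the Kabsch algorithm for the rigid alignment (the paper does not spell out its alignment implementation, so this choice is our assumption) and then computes COV/MAT:

```python
import numpy as np

def aligned_rmsd(R, R_ref):
    """RMSD after optimal rigid alignment (Kabsch algorithm for the min over Phi)."""
    A, B = R - R.mean(0), R_ref - R_ref.mean(0)              # remove translation
    U, _, Vt = np.linalg.svd(A.T @ B)                        # covariance SVD
    D = np.diag([1.0, 1.0, np.sign(np.linalg.det(U @ Vt))])  # forbid reflections
    return float(np.sqrt(np.mean(np.sum((A @ (U @ D @ Vt) - B) ** 2, axis=1))))

def cov_mat(gen, ref, delta):
    """COV and MAT over generated/reference conformation sets (lists of arrays)."""
    best = np.array([min(aligned_rmsd(g, r) for g in gen) for r in ref])
    return float(np.mean(best < delta)), float(np.mean(best))

# Sanity check: a rotated and translated copy of a conformation has RMSD ~ 0.
rng = np.random.default_rng(0)
R = rng.standard_normal((5, 3))
Rot, _ = np.linalg.qr(rng.standard_normal((3, 3)))
Rot[:, 0] *= np.linalg.det(Rot)  # force a proper rotation (det +1)
print(aligned_rmsd(R @ Rot.T + 1.0, R))  # ~ 0
```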

Results We evaluate the mean and median COV and MAT scores on both GEOM-QM9 and GEOM-Drugs datasets for all baselines, where mean and median are taken over all molecules in the test set. Results in Table 1 show that our ConfGF achieves the state-of-the-art performance on all four metrics. In particular, as a score-based model that directly estimates the gradients of the log density of atomic coordinates, our approach consistently outperforms other neural models by a large margin. By contrast, the performance of GraphDG and CGCF is inferior to ours due to their two-stage generation procedure, which incurs extra errors when converting distances to conformations. The CVGAE seems to struggle with both of the datasets as it fails to model the roto-translation equivariance of conformations.

When compared with the rule-based method, our approach outperforms the RDKit on 7 out of 8 metrics without relying on any post-processing algorithms, e.g., force fields, as done in previous work (Mansimov et al., 2019; Xu et al., 2021), making it appealing in real-world applications. Figure 3 visualizes several conformations that are best-aligned with the reference ones generated by different methods, which demonstrates ConfGF’s strong ability to generate realistic and diverse conformations.

Dataset Method COV (%) MAT (Å)
Mean Median Mean Median
QM9 RDKit 83.26 90.78 0.3447 0.2935
CVGAE 0.09 0.00 1.6713 1.6088
GraphDG 73.33 84.21 0.4245 0.3973
CGCF 78.05 82.48 0.4219 0.3900
ConfGF 88.49 94.13 0.2673 0.2685
Drugs RDKit 60.91 65.70 1.2026 1.1252
CVGAE 0.00 0.00 3.0702 2.9937
GraphDG 8.27 0.00 1.9722 1.9845
CGCF 53.96 57.06 1.2487 1.2247
ConfGF 62.15 70.93 1.1629 1.1596
Table 1: COV and MAT scores of different approaches on the GEOM-QM9 and GEOM-Drugs datasets. The COV threshold δ is set following Xu et al. (2021) for QM9 and Drugs respectively. See Supplementary material for additional results.
Method Single Pair All
Mean Median Mean Median Mean Median
RDKit 3.4513 3.1602 3.8452 3.6287 4.0866 3.7519
CVGAE 4.1789 4.1762 4.9184 5.1856 5.9747 5.9928
GraphDG 0.7645 0.2346 0.8920 0.3287 1.1949 0.5485
CGCF 0.4490 0.1786 0.5509 0.2734 0.8703 0.4447
ConfGF 0.3684 0.2358 0.4582 0.3206 0.6091 0.4240
Table 2: Accuracy of the distributions over distances generated by different approaches compared to the ground-truth. Two different metrics are used: mean and median MMD between ground-truth conformations and generated ones. Results of baselines are taken from Xu et al. (2021).

5.2 Distributions Over Distances

Setup The second task is to match the generated distributions over distances with the ground-truth. As ConfGF is not designed for modeling molecular distance geometry, we treat interatomic distances calculated from generated conformations as the generated distance distributions. For each molecular graph in the test set, we sample 1000 conformations with each method. Following previous work (Simm and Hernandez-Lobato, 2020; Xu et al., 2021), we report the discrepancy between the generated distributions and the ground-truth distributions by calculating the maximum mean discrepancy (MMD) (Gretton et al., 2012) with a Gaussian kernel. Specifically, for each graph G in the test set, we ignore edges involving H atoms and evaluate the distributions of all distances (All), pairwise distances (Pair), and individual distances (Single).
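The MMD with a Gaussian kernel has a simple closed form over pairwise kernel evaluations. A numpy sketch (the kernel bandwidth here is an arbitrary choice, not the paper's setting):

```python
import numpy as np

def gaussian_mmd(x, y, bandwidth=1.0):
    """Squared MMD between two 1-D samples of distances under a Gaussian kernel."""
    def k(a, b):  # Gaussian kernel matrix between two samples
        return np.exp(-((a[:, None] - b[None, :]) ** 2) / (2 * bandwidth**2))
    return float(k(x, x).mean() + k(y, y).mean() - 2 * k(x, y).mean())

rng = np.random.default_rng(0)
x = rng.normal(1.5, 0.1, 500)  # e.g. samples of one C-C distance, in Angstroms
print(gaussian_mmd(x, x))      # identical samples: exactly 0
print(gaussian_mmd(x, x + 0.5) > gaussian_mmd(x, x + 0.1))  # farther apart => larger
```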

Results As shown in Table 2, CVGAE suffers the worst performance due to its limited capacity. Despite the superior performance of RDKit on the conformation generation task, its performance here falls far short of the neural models. This is because RDKit incorporates an additional molecular force field (Rappé et al., 1992) to find equilibrium states of conformations after the initial coordinates are generated. By contrast, although not tailored for molecular distance geometry, ConfGF still achieves competitive results on all evaluation metrics. In particular, ConfGF outperforms the state-of-the-art method CGCF on 4 out of 6 metrics, and achieves comparable results on the other two.

5.3 Property Prediction

Setup The third task is to predict the ensemble properties (Axelrod and Gomez-Bombarelli, 2020) of molecular graphs. Specifically, the ensemble property of a molecular graph is calculated by aggregating the properties of its different conformations. For each molecular graph in the test set, we first calculate the energy, HOMO, and LUMO of all the ground-truth conformations using the quantum chemical calculation package Psi4 (Smith et al., 2020). Then, we calculate the ensemble properties, i.e., the average energy Ē, lowest energy E_min, average HOMO-LUMO gap Δε̄, minimum gap Δε_min, and maximum gap Δε_max, from the conformational properties of each molecular graph. Next, we use ConfGF and the baselines to generate 50 conformations for each molecular graph and calculate the above ensemble properties in the same way. We use mean absolute error (MAE) to measure the accuracy of ensemble property prediction. We exclude CVGAE from this analysis due to its poor sample quality, following Simm and Hernandez-Lobato (2020).
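The aggregation step is simple once per-conformation properties are available. The sketch below uses made-up numbers in place of Psi4 outputs, and the field names are our own:

```python
import numpy as np

def ensemble_props(energies, gaps):
    """Aggregate per-conformation energies (eV) and HOMO-LUMO gaps (eV)."""
    return {"E_mean": float(np.mean(energies)), "E_min": float(np.min(energies)),
            "gap_mean": float(np.mean(gaps)), "gap_min": float(np.min(gaps)),
            "gap_max": float(np.max(gaps))}

def mae(pred, ref):
    """Mean absolute error over the five ensemble properties."""
    return float(np.mean([abs(pred[k] - ref[k]) for k in ref]))

ref = ensemble_props([1.0, 2.0, 3.0], [0.5, 0.7])  # from ground-truth conformers
pred = ensemble_props([1.5, 2.5], [0.6, 0.8])      # from generated conformers
print(mae(pred, ref))  # approximately 0.16
```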

Results As shown in Table 3, ConfGF outperforms all the machine learning-based methods by a large margin and achieves better accuracy than RDKit on the lowest energy and the maximum HOMO-LUMO gap. Closer inspection of the generated samples shows that, although ConfGF has better generation quality in terms of diversity and accuracy, it sometimes generates a few outliers which have a negative impact on prediction accuracy. We therefore also present the median of absolute errors in Table 4. When measured by median absolute error, ConfGF has the best accuracy on the average energy, and its accuracy on the other properties also improves, which reveals the impact of the outliers.

Method    Avg. energy   Lowest energy   Avg. gap   Min. gap   Max. gap
RDKit     0.9233        0.6585          0.3698     0.8021     0.2359
GraphDG   9.1027        0.8882          1.7973     4.1743     0.4776
CGCF      28.9661       2.8410          2.8356     10.6361    0.5954
ConfGF    2.7886        0.1765          0.4688     2.1843     0.1433
Table 3: Mean absolute errors (MAE) of predicted ensemble properties. Unit: eV.
Method    Avg. energy   Lowest energy   Avg. gap   Min. gap   Max. gap
RDKit     0.8914        0.6629          0.2947     0.5196     0.1617
ConfGF    0.5328        0.1145          0.3207     0.7365     0.1337
Table 4: Median of absolute prediction errors. Unit: eV.
Dataset   Method        COV (%)             MAT (Å)
                        Mean     Median     Mean     Median
QM9       CGCF          78.05    82.48      0.4219   0.3900
          ConfGFDist    81.94    85.80      0.3861   0.3571
          ConfGF        88.49    94.13      0.2673   0.2685
Drugs     CGCF          53.96    57.06      1.2487   1.2247
          ConfGFDist    53.40    52.58      1.2493   1.2366
          ConfGF        62.15    70.93      1.1629   1.1596
Table 5: Ablation study on the performance of ConfGF. The results of CGCF are taken directly from Table 1. The threshold is set as 0.5 Å for QM9 and 1.25 Å for Drugs, following Table 1.

5.4 Ablation Study

So far, we have justified the effectiveness of the proposed ConfGF on multiple tasks. However, it remains unclear whether the proposed strategy, i.e., directly estimating the gradient fields of the log density of atomic coordinates, is what drives its superior performance. To gain insight into the behaviour of ConfGF, we conduct an ablation study in this section.

We adopt a variant of ConfGF, called ConfGFDist, which works in a way similar to distance-based methods (Simm and Hernandez-Lobato, 2020; Xu et al., 2021). Specifically, ConfGFDist bypasses the chain rule and directly generates interatomic distances via annealed Langevin dynamics using the estimated gradient fields of interatomic distances (Section 4.2), after which we convert the distances to conformations with the same post-processing algorithm as Xu et al. (2021). Following the setup of Section 5.1, we evaluate the two approaches on the conformation generation task and summarize the results in Table 5. As presented in Table 5, ConfGFDist performs significantly worse than ConfGF on both datasets and only on par with the state-of-the-art method CGCF. These results indicate that the two-stage generation procedure of distance-based methods does harm the quality of conformations, and that our one-stage strategy effectively addresses this issue, leading to much more accurate and diverse conformations.
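To make the contrast concrete, the distance-to-coordinate conversion of a two-stage pipeline can be sketched with classical multidimensional scaling. This is an illustrative stand-in for the post-processing algorithm of Xu et al. (2021), not their exact procedure; the point it illustrates is that any noise in the predicted distances propagates directly into the recovered coordinates.

```python
import numpy as np

def distances_to_conformation(d, dim=3):
    """Embed a pairwise distance matrix d into `dim`-dimensional coordinates
    via classical multidimensional scaling (exact when d is a noiseless
    Euclidean distance matrix, up to a rigid motion)."""
    n = d.shape[0]
    j = np.eye(n) - np.ones((n, n)) / n        # centering matrix
    b = -0.5 * j @ (d ** 2) @ j                # double-centered Gram matrix
    w, v = np.linalg.eigh(b)                   # eigenvalues, ascending order
    w, v = w[::-1][:dim], v[:, ::-1][:, :dim]  # keep the top `dim` components
    return v * np.sqrt(np.clip(w, 0.0, None))
```

With exact distances the embedding reproduces the geometry; with noisy predicted distances, the errors surface as distorted coordinates, which is precisely the failure mode the one-stage ConfGF avoids.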

6 Conclusion and Future Work

In this paper, we propose a novel principle for molecular conformation generation called ConfGF, where samples are generated in a single stage via Langevin dynamics using physically inspired gradient fields of the log density of atomic coordinates. We develop a novel algorithm that effectively estimates these gradients while preserving their roto-translation equivariance. Comprehensive experiments over multiple tasks verify that ConfGF outperforms previous state-of-the-art baselines by a significant margin. Our future work will include extending the ConfGF approach to 3D molecular design tasks and many-body particle systems.


  • S. Axelrod and R. Gomez-Bombarelli (2020) GEOM: energy-annotated molecular conformations for property prediction and molecular generation. arXiv preprint arXiv:2006.05531. Cited by: §1, §5.3, §5.
  • A. J. Ballard, S. Martiniani, J. D. Stevenson, S. Somani, and D. J. Wales (2015) Exploiting the potential energy landscape to sample free energy. Wiley Interdisciplinary Reviews: Computational Molecular Science 5 (3), pp. 273–289. Cited by: §2.
  • J. Blaney and J. Dixon (2007) Distance geometry in molecular modeling. ChemInform 25. Cited by: §2.
  • M. De Vivo, M. Masetti, G. Bottegoni, and A. Cavalli (2016) Role of molecular dynamics and related methods in drug discovery. Journal of medicinal chemistry 59 (9), pp. 4035–4061. Cited by: §1.
  • D. K. Duvenaud, D. Maclaurin, J. Iparraguirre, R. Bombarell, T. Hirzel, A. Aspuru-Guzik, and R. P. Adams (2015) Convolutional networks on graphs for learning molecular fingerprints. In Advances in neural information processing systems, pp. 2224–2232. Cited by: §4.1.
  • D. Frenkel and B. Smit (1996) Understanding molecular simulation: from algorithms to applications. Cited by: §1.
  • F. Fuchs, D. Worrall, V. Fischer, and M. Welling (2020) SE(3)-transformers: 3d roto-translation equivariant attention networks. NeurIPS. Cited by: §3.3.
  • J. Gilmer, S. S. Schoenholz, P. F. Riley, O. Vinyals, and G. E. Dahl (2017) Neural message passing for quantum chemistry. In Proceedings of the 34th International Conference on Machine Learning-Volume 70, pp. 1263–1272. Cited by: §1, §4.1.
  • T. Gogineni, Z. Xu, E. Punzalan, R. Jiang, J. A. Kammeraad, A. Tewari, and P. Zimmerman (2020) TorsionNet: a reinforcement learning approach to sequential conformer search. ArXiv abs/2006.07078. Cited by: §2.
  • A. Gretton, K. M. Borgwardt, M. J. Rasch, B. Schölkopf, and A. Smola (2012) A kernel two-sample test. The Journal of Machine Learning Research 13 (1), pp. 723–773. Cited by: §5.2.
  • T. A. Halgren (1996) Merck molecular force field. v. extension of mmff94 using experimental data, additional computational data, and empirical rules. Journal of Computational Chemistry 17 (5-6), pp. 616–641. Cited by: §1.
  • P. C. Hawkins (2017) Conformation generation: the state of the art. Journal of Chemical Information and Modeling 57 (8), pp. 1747–1756. Cited by: §5.1.
  • W. Hu, B. Liu, J. Gomes, M. Zitnik, P. Liang, V. Pande, and J. Leskovec (2020) Strategies for pre-training graph neural networks. In International Conference on Learning Representations, Cited by: §1, §4.1, §4.2.
  • W. Hu, M. Shuaibi, A. Das, S. Goyal, A. Sriram, J. Leskovec, D. Parikh, and L. Zitnick (2021) ForceNet: a graph neural network for large-scale quantum chemistry simulation. Cited by: §2.
  • W. Jin, R. Barzilay, and T. Jaakkola (2018) Junction tree variational autoencoder for molecular graph generation. arXiv preprint arXiv:1802.04364. Cited by: §1.
  • I. Y. Kanal, J. A. Keith, and G. Hutchison (2017) A sobering assessment of small-molecule force field methods for low energy conformer predictions. International Journal of Quantum Chemistry 118. Cited by: §1.
  • S. Kearnes, K. McCloskey, M. Berndl, V. Pande, and P. Riley (2016) Molecular graph convolutions: moving beyond fingerprints. Journal of computer-aided molecular design 30 (8), pp. 595–608. Cited by: §4.1.
  • D. P. Kingma and J. Ba (2014) Adam: a method for stochastic optimization. In 3rd International Conference on Learning Representations, Cited by: §5.
  • T. N. Kipf and M. Welling (2016) Semi-supervised classification with graph convolutional networks. arXiv preprint arXiv:1609.02907. Cited by: §4.1.
  • J. Klicpera, J. Groß, and S. Günnemann (2020) Directional message passing for molecular graphs. In International Conference on Learning Representations, Cited by: §4.1.
  • J. Köhler, L. Klein, and F. Noe (2020) Equivariant flows: exact likelihood generative learning for symmetric densities. In Proceedings of the 37th International Conference on Machine Learning, Cited by: §3.3.
  • Z. Li, S. Yang, G. Song, and L. Cai (2021) Conformation-guided molecular representation with hamiltonian neural networks. In International Conference on Learning Representations, Cited by: §5.1.
  • L. Liberti, C. Lavor, N. Maculan, and A. Mucherino (2014) Euclidean distance geometry and applications. SIAM review 56 (1), pp. 3–69. Cited by: §1, §2.
  • E. Mansimov, O. Mahmood, S. Kang, and K. Cho (2019) Molecular geometry prediction using a deep generative graph neural network. arXiv preprint arXiv:1904.00314. Cited by: §1, §2, §5.1, §5.
  • C. Meng, L. Yu, Y. Song, J. Song, and S. Ermon (2020) Autoregressive score matching. NeurIPS. Cited by: §3.2.
  • B. Miller, M. Geiger, T. Smidt, and F. Noé (2020) Relevance of rotationally equivariant convolutions for predicting molecular properties. ArXiv abs/2008.08461. Cited by: §3.3.
  • R. Parr (1989) Density-functional theory of atoms and molecules. Cited by: §1, §2.
  • A. Paszke, S. Gross, S. Chintala, G. Chanan, E. Yang, Z. DeVito, Z. Lin, A. Desmaison, L. Antiga, and A. Lerer (2017) Automatic differentiation in pytorch. In NIPS-W, Cited by: §5.
  • A. K. Rappé, C. J. Casewit, K. Colwell, W. A. Goddard III, and W. M. Skiff (1992) UFF, a full periodic table force field for molecular mechanics and molecular dynamics simulations. Journal of the American chemical society 114 (25), pp. 10024–10035. Cited by: §1, §5.2.
  • S. Riniker and G. A. Landrum (2015) Better informed distance geometry: using what we know to improve conformation generation. Journal of Chemical Information and Modeling 55 (12), pp. 2562–2574. Cited by: §5.
  • K. Schütt, P. Kindermans, H. E. Sauceda Felix, S. Chmiela, A. Tkatchenko, and K. Müller (2017a) SchNet: a continuous-filter convolutional neural network for modeling quantum interactions. In Advances in Neural Information Processing Systems, pp. 991–1001. Cited by: §2, §4.1, §4.1.
  • K. T. Schütt, F. Arbabzadah, S. Chmiela, K. R. Müller, and A. Tkatchenko (2017b) Quantum-chemical insights from deep tensor neural networks. Nature communications 8, pp. 13890. Cited by: §4.1.
  • C. Shi, M. Xu, Z. Zhu, W. Zhang, M. Zhang, and J. Tang (2020) GraphAF: a flow-based autoregressive model for molecular graph generation. arXiv preprint arXiv:2001.09382. Cited by: §1.
  • J. Shim and A. D. MacKerell Jr (2011) Computational ligand-based rational design: role of conformational sampling and force fields in model development. MedChemComm 2 (5), pp. 356–370. Cited by: §2.
  • G. Simm and J. M. Hernandez-Lobato (2020) A generative model for molecular distance geometry. In Proceedings of the 37th International Conference on Machine Learning, H. D. III and A. Singh (Eds.), Vol. 119, pp. 8949–8958. Cited by: §1, §1, §1, §2, §3.1, 3rd item, §5.2, §5.3, §5.4, §5, §5.
  • G. N. C. Simm, R. Pinsler, G. Csányi, and J. M. Hernández-Lobato (2021) Symmetry-aware actor-critic for 3d molecular design. In International Conference on Learning Representations, Cited by: §3.3.
  • D. G. A. Smith, L. Burns, A. Simmonett, R. Parrish, M. C. Schieber, R. Galvelis, P. Kraus, H. Kruse, R. D. Remigio, A. Alenaizan, A. M. James, S. Lehtola, J. P. Misiewicz, et al. (2020) Psi4 1.4: open-source software for high-throughput quantum chemistry. The Journal of chemical physics. Cited by: §5.3.
  • Y. Song and S. Ermon (2019) Generative modeling by estimating gradients of the data distribution. In Advances in Neural Information Processing Systems, Vol. 32, pp. 11918–11930. Cited by: §3.2, §3.2, §3.2, §4.1, §4.1, §4.2.
  • Y. Song and S. Ermon (2020) Improved techniques for training score-based generative models. NeurIPS. Cited by: §3.2, §3.2, §4.2, §5.
  • Y. Song, S. Garg, J. Shi, and S. Ermon (2020) Sliced score matching: a scalable approach to density and score estimation. In Proceedings of The 35th Uncertainty in Artificial Intelligence Conference, R. P. Adams and V. Gogate (Eds.), Vol. 115, pp. 574–584. Cited by: §3.2.
  • N. Thomas, T. Smidt, S. M. Kearnes, L. Yang, L. Li, K. Kohlhoff, and P. Riley (2018) Tensor field networks: rotation- and translation-equivariant neural networks for 3d point clouds. ArXiv. Cited by: §3.3.
  • P. Vincent (2011) A connection between score matching and denoising autoencoders. Neural Comput.. Cited by: §4.1.
  • M. Weiler, M. Geiger, M. Welling, W. Boomsma, and T. Cohen (2018) 3D steerable cnns: learning rotationally equivariant features in volumetric data. In NeurIPS, Cited by: §3.3.
  • M. Welling and Y. W. Teh (2011) Bayesian learning via stochastic gradient langevin dynamics. In Proceedings of the 28th International Conference on International Conference on Machine Learning, ICML’11, pp. 681–688. Cited by: §1, §3.2.
  • K. Xu, W. Hu, J. Leskovec, and S. Jegelka (2019) How powerful are graph neural networks?. In International Conference on Learning Representations, Cited by: §4.2.
  • M. Xu, S. Luo, Y. Bengio, J. Peng, and J. Tang (2021) Learning neural generative dynamics for molecular conformation generation. In International Conference on Learning Representations, Cited by: §1, §1, §2, §3.1, §5.1, §5.1, §5.2, §5.4, Table 1, Table 2, §5, §5.
  • J. You, B. Liu, Z. Ying, V. Pande, and J. Leskovec (2018) Graph convolutional policy network for goal-directed molecular graph generation. In Advances in neural information processing systems, pp. 6410–6421. Cited by: §1.
  • L. Zhang, J. Han, H. Wang, R. Car, and W. E (2018) Deep Potential Molecular Dynamics: A Scalable Model with the Accuracy of Quantum Mechanics. Physical Review Letters 120 (14), pp. 143001. Cited by: §2.

Appendix A Proof of Proposition 1

Let $R = (r_1, \dots, r_n)$ denote a molecular conformation. Let $T_g$ denote any 3D translation function, where $T_g(R)_i = r_i + t$, and let $R_g$ denote any 3D rotation function whose rotation matrix representation is $Q$, i.e., $R_g(R)_i = Q r_i$. In our context, we say a score function of atomic coordinates $s_\theta(R)$ is roto-translation equivariant if it satisfies:

$$s_\theta\big(R_g(T_g(R))\big) = R_g\big(s_\theta(R)\big).$$

Intuitively, the above equation says that applying any translation to the input has no effect on the output, while applying any rotation to the input has the same effect as applying it to the output, i.e., the gradients rotate together with the molecular system and are invariant under translation.


Denote $\hat{R} = R_g(T_g(R))$. According to the definition of the translation and rotation functions, we have

$$\hat{r}_i = Q(r_i + t), \qquad \hat{d}_{ij} = d_{ij}. \tag{11}$$

According to Eq. 3, we have

$$s_\theta(\hat{R})_i = \sum_{j \ne i} \frac{s_\theta(\hat{d})_{ij}}{\hat{d}_{ij}} \, (\hat{r}_i - \hat{r}_j) = Q \sum_{j \ne i} \frac{s_\theta(d)_{ij}}{d_{ij}} \, (r_i - r_j) = Q \, s_\theta(R)_i. \tag{12}$$

Here $\hat{d}_{ij} = d_{ij}$ and $s_\theta(\hat{d}) = s_\theta(d)$ because rotation and translation do not change interatomic distances, and the translation cancels in $\hat{r}_i - \hat{r}_j = Q(r_i - r_j)$. Together with Eq. 11 and Eq. 12, we conclude that the score network defined in Eq. 3 is roto-translation equivariant. ∎
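The equivariance argument can also be checked numerically: assemble a coordinate-space score from any scalar function of distances via the chain rule, then verify that rotating and translating the input rotates the output. In the sketch below, `dist_score` is a placeholder for the learned distance-space score network.

```python
import numpy as np

def coord_score(r, dist_score):
    """Coordinate-space score built from a distance-space score by the chain
    rule: s(R)_i = sum_{j != i} dist_score(d_ij) * (r_i - r_j) / d_ij."""
    n = r.shape[0]
    s = np.zeros_like(r)
    for i in range(n):
        for j in range(n):
            if i != j:
                d = np.linalg.norm(r[i] - r[j])
                s[i] += dist_score(d) * (r[i] - r[j]) / d
    return s
```

Because interatomic distances are unchanged by rigid motions and the translation cancels in each difference vector, the output rotates exactly with the input.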

Appendix B Additional Hyperparameters

We summarize additional hyperparameters of ConfGF in Table 6, including the largest noise level, the smallest noise level, the number of noise levels, the number of sampling steps per noise level, the smallest step size, the batch size, and the number of training epochs.

Dataset      Largest noise   Smallest noise   Noise levels   Steps per level   Smallest step size   Batch size   Training epochs
GEOM-QM9     10              0.01             50             100               2.4e-6               128          200
GEOM-Drugs   10              0.01             50             100               2.4e-6               128          200
ISO17        3               0.1              30             100               2.0e-4               128          100
Table 6: Additional hyperparameters of our ConfGF.
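Given these hyperparameters, sampling proceeds by annealed Langevin dynamics. The sketch below shows how the Table 6 entries would be used; the geometric schedule between the largest and smallest noise levels and the quadratic step-size rule follow Song and Ermon (2019, 2020) and are our assumptions, with a toy score function standing in for the trained network.

```python
import numpy as np

def annealed_langevin(score, x0, sigma_max=10.0, sigma_min=0.01,
                      n_levels=50, steps_per_level=100, eps=2.4e-6, seed=0):
    """Annealed Langevin dynamics sampling (Song and Ermon, 2019).
    `score(x, sigma)` approximates the gradient of the log density of the
    data perturbed with noise of scale sigma. Defaults mirror the
    GEOM-QM9/Drugs row of Table 6."""
    rng = np.random.default_rng(seed)
    sigmas = np.geomspace(sigma_max, sigma_min, n_levels)
    x = np.array(x0, dtype=float)
    for sigma in sigmas:
        alpha = eps * (sigma / sigmas[-1]) ** 2   # step size for this level
        for _ in range(steps_per_level):
            z = rng.normal(size=x.shape)
            x = x + 0.5 * alpha * score(x, sigma) + np.sqrt(alpha) * z
    return x
```

Each noise level runs a short Langevin chain whose step size shrinks with the noise scale, so early levels explore broadly and later levels refine the samples.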

Appendix C Additional Experiments

We present more results of the Coverage (COV) score at different thresholds δ for the GEOM-QM9 and GEOM-Drugs datasets in Tables 7 and 8, respectively. The results in Tables 7 and 8 indicate that the proposed ConfGF consistently outperforms previous state-of-the-art baselines, including GraphDG and CGCF. In addition, we also report the Mismatch (MIS) score, defined as follows:

$$\mathrm{MIS}(S_g, S_r) = \frac{1}{|S_g|} \Big| \big\{ R \in S_g \;\big|\; \mathrm{RMSD}(R, R') > \delta, \ \forall R' \in S_r \big\} \Big|,$$

where $S_g$ and $S_r$ denote the sets of generated and reference conformations, respectively, and $\delta$ is the threshold. The MIS score measures the fraction of generated conformations that are not matched by any ground-truth conformation in the reference set given a threshold δ. A lower MIS score indicates fewer bad samples and better generation quality. As shown in Tables 7 and 8, the MIS scores of ConfGF are consistently lower than those of the competing baselines, demonstrating that ConfGF generates fewer invalid conformations than the existing models.
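Both metrics reduce to simple reductions over a matrix of pairwise RMSD values. The sketch below assumes such a matrix has already been computed (the RMSD alignment itself is omitted), with rows indexing reference conformations and columns indexing generated ones; this layout is our convention for illustration.

```python
import numpy as np

def cov_mis(rmsd, delta):
    """COV and MIS from a matrix of RMSD values, where rmsd[i, j] is the RMSD
    between reference conformation i and generated conformation j.
    COV: fraction of references matched by some generated sample within delta.
    MIS: fraction of generated samples matched by no reference within delta."""
    cov = float((rmsd.min(axis=1) <= delta).mean())
    mis = float((rmsd.min(axis=0) > delta).mean())
    return cov, mis
```

A high COV with a low MIS indicates a sample set that is both diverse (covers the references) and clean (few unmatched samples).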

QM9 Mean COV (%) Median COV (%) Mean MIS (%) Median MIS (%)
δ (Å) GraphDG CGCF ConfGF GraphDG CGCF ConfGF GraphDG CGCF ConfGF GraphDG CGCF ConfGF
0.10 1.03 0.14 17.59 0.00 0.00 10.29 99.70 99.93 92.30 100.00 100.00 96.23
0.20 13.05 10.97 43.60 2.99 3.95 37.92 96.12 96.91 81.67 99.04 99.04 85.65
0.30 32.26 31.02 61.94 18.81 22.94 59.66 87.15 89.21 73.06 94.44 94.44 77.27
0.40 53.53 53.65 75.45 50.00 52.63 80.64 72.60 78.35 65.38 82.63 82.63 70.00
0.50 73.33 78.05 88.49 84.21 82.48 94.13 56.09 63.51 53.56 64.66 64.66 56.59
0.60 88.24 94.85 97.71 98.83 98.79 100.00 40.36 44.82 34.78 43.73 43.73 35.86
0.70 95.93 99.05 99.52 100.00 100.00 100.00 27.93 29.64 21.00 23.38 23.38 15.64
0.80 98.70 99.47 99.68 100.00 100.00 100.00 19.15 20.98 12.86 10.72 10.72 5.23
0.90 99.33 99.50 99.77 100.00 100.00 100.00 12.76 16.74 8.98 3.65 3.65 1.53
1.00 99.48 99.50 99.86 100.00 100.00 100.00 8.00 14.19 6.76 0.47 0.47 0.36
1.10 99.51 99.51 99.91 100.00 100.00 100.00 4.99 12.26 5.57 0.00 0.00 0.00
1.20 99.51 99.51 99.94 100.00 100.00 100.00 2.95 9.68 3.48 0.00 0.00 0.00
1.30 99.51 99.51 99.96 100.00 100.00 100.00 1.65 7.48 1.91 0.00 0.00 0.00
1.40 99.51 99.51 99.96 100.00 100.00 100.00 0.84 5.94 1.05 0.00 0.00 0.00
1.50 99.52 99.51 99.97 100.00 100.00 100.00 0.41 4.94 0.70 0.00 0.00 0.00
Table 7: COV and MIS scores of different approaches on the GEOM-QM9 dataset at different thresholds δ.
Drugs Mean COV (%) Median COV (%) Mean MIS (%) Median MIS (%)
δ (Å) GraphDG CGCF ConfGF GraphDG CGCF ConfGF GraphDG CGCF ConfGF GraphDG CGCF ConfGF
0.25 0.00 0.06 0.17 0.00 0.00 0.00 100.00 99.99 99.97 100.00 100.00 100.00
0.50 0.26 0.80 1.15 0.00 0.00 0.00 99.95 99.80 99.52 100.00 100.00 100.00
0.75 0.75 5.81 9.15 0.00 0.00 0.50 99.69 97.86 96.94 100.00 100.00 99.75
1.00 2.39 24.67 30.60 0.00 11.81 18.89 99.14 90.82 89.63 100.00 96.50 95.58
1.25 8.27 53.96 62.15 0.00 57.06 70.93 97.92 78.32 76.58 100.00 86.28 84.48
1.50 19.96 79.37 86.62 4.00 92.46 98.79 94.40 63.80 60.06 99.14 66.39 63.81
1.75 36.86 91.47 96.53 26.58 100.00 100.00 87.68 49.72 43.63 95.83 47.09 41.72
2.00 55.79 96.73 98.62 55.26 100.00 100.00 76.99 37.53 29.80 87.35 30.90 22.44
2.25 71.43 99.05 99.83 80.00 100.00 100.00 61.76 27.30 18.68 69.74 20.07 10.93
2.50 83.53 99.47 100.00 95.45 100.00 100.00 44.32 18.97 11.09 42.96 12.33 3.31
2.75 91.09 99.60 100.00 100.00 100.00 100.00 27.92 12.52 6.32 16.67 6.82 0.74
3.00 95.00 99.96 100.00 100.00 100.00 100.00 15.97 7.67 3.36 2.46 3.32 0.00
Table 8: COV and MIS scores of different approaches on the GEOM-Drugs dataset at different thresholds δ.

Appendix D More Generated Samples

We present more visualizations of conformations generated by our ConfGF in Figure 4, covering models trained on the GEOM-QM9 and GEOM-Drugs datasets.

Figure 4: Visualization of conformations generated by our ConfGF. For each molecular graph, we randomly sample 10 generated conformations from our model. For the top 5 rows, the molecular graphs are drawn from the GEOM-QM9 test set, while for the bottom 5 rows, they are drawn from the GEOM-Drugs test set.