Data-Driven Approach to Encoding and Decoding 3-D Crystal Structures

09/03/2019
by Jordan Hoffmann, et al.

Generative models have achieved impressive results in many domains, including image and text generation. In the natural sciences, generative models have led to rapid progress in automated drug discovery. Many current methods focus on 1-D or 2-D representations of typically small, drug-like molecules. However, many molecules require 3-D descriptors and exceed the chemical complexity of commonly used datasets. We present a method to encode and decode the positions of atoms in 3-D molecules from a dataset of nearly 50,000 stable crystal unit cells that contain from 1 to over 100 atoms. We construct a smooth and continuous 3-D density representation of each crystal based on the positions of its atoms. Two neural networks were trained on a dataset of over 120,000 three-dimensional samples of single and repeating crystal structures, made by rotating the single unit cells. The first, an Encoder-Decoder pair, constructs a compressed latent space representation of each molecule and then decodes this description into an accurate reconstruction of the input. The second network segments the resulting output into atoms and assigns each atom an atomic number. By generating compressed, continuous latent space representations of molecules, we are able to decode random samples, interpolate between two molecules, and alter known molecules.
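As a rough illustration of what a smooth, continuous 3-D density built from atom positions can look like, the sketch below sums one isotropic Gaussian per atom on a cubic voxel grid. It is a minimal sketch and not the authors' implementation: the abstract does not specify the kernel, grid resolution, or handling of atom types and periodic images, so the function name `density_grid` and the parameters `grid_size`, `box_length`, and `sigma` are all assumptions made for this example.

```python
import numpy as np


def density_grid(positions, grid_size=16, box_length=1.0, sigma=0.05):
    """Sum one isotropic Gaussian per atom on a cubic voxel grid.

    Hypothetical helper: the kernel shape, grid resolution, and width are
    illustrative assumptions, not values taken from the paper. Periodic
    images and atom types are ignored for simplicity.
    """
    positions = np.asarray(positions, dtype=float).reshape(-1, 3)

    # Coordinates of voxel centres along one axis of the cubic box.
    axis = (np.arange(grid_size) + 0.5) * box_length / grid_size
    gx, gy, gz = np.meshgrid(axis, axis, axis, indexing="ij")

    grid = np.zeros((grid_size, grid_size, grid_size))
    for x, y, z in positions:
        # Squared distance from this atom to every voxel centre.
        sq_dist = (gx - x) ** 2 + (gy - y) ** 2 + (gz - z) ** 2
        grid += np.exp(-sq_dist / (2.0 * sigma ** 2))
    return grid


# Made-up two-atom cell in fractional coordinates, for illustration only.
example_cell = [[0.25, 0.25, 0.25], [0.75, 0.75, 0.75]]
example_density = density_grid(example_cell)
```

The resulting array varies smoothly as atoms move, which is the property that makes a density field a convenient input for a convolutional encoder. A representation that also distinguishes elements would need something extra, such as one channel per species or a species-dependent width.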


research
10/17/2020

Learning a Continuous Representation of 3D Molecular Structures with Deep Generative Models

Machine learning methods in drug discovery have primarily focused on vir...
research
04/17/2020

Continuous Representation of Molecules Using Graph Variational Autoencoder

In order to continuously represent molecules, we propose a generative mo...
research
03/26/2018

Fréchet ChemblNet Distance: A metric for generative models for molecules

The new wave of successful generative models in machine learning has inc...
research
09/03/2021

IMG2SMI: Translating Molecular Structure Images to Simplified Molecular-input Line-entry System

Like many scientific fields, new chemistry literature has grown at a sta...
research
08/18/2022

Improving Small Molecule Generation using Mutual Information Machine

We address the task of controlled generation of small molecules, which e...
research
06/02/2019

Symmetry-adapted generation of 3d point sets for the targeted discovery of molecules

Deep learning has proven to yield fast and accurate predictions of quant...
research
06/13/2023

3D molecule generation by denoising voxel grids

We propose a new score-based approach to generate 3D molecules represent...
