Improving Molecular Design by Stochastic Iterative Target Augmentation

02/11/2020
by Kevin Yang, et al.

Generative models in molecular design tend to be richly parameterized, data-hungry neural models, as they must create complex structured objects as outputs. Estimating such models from data may be challenging due to the lack of sufficient training data. In this paper, we propose a surprisingly effective self-training approach for iteratively creating additional molecular targets. We first pre-train the generative model together with a simple property predictor. The property predictor is then used as a likelihood model for filtering candidate structures from the generative model. Additional targets are iteratively produced and used in the course of stochastic EM iterations to maximize the log-likelihood that the candidate structures are accepted. A simple rejection (re-weighting) sampler suffices to draw posterior samples since the generative model is already reasonable after pre-training. We demonstrate significant gains over strong baselines for both unconditional and conditional molecular design. In particular, our approach outperforms the previous state-of-the-art in conditional molecular design by over 10% in absolute gain.
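The loop described in the abstract (pre-train, sample candidates, filter them with the property predictor, refit on the accepted candidates) can be illustrated with a toy sketch. This is not the authors' implementation: the "molecules" are the integers 0 to 9, the generative model is a smoothed categorical distribution, and the predictor is a hypothetical threshold rule. All function names, the threshold, and the hyperparameters are assumptions made for illustration only.

```python
import random
from collections import Counter

VOCAB = list(range(10))  # toy stand-in for the space of molecules


def predictor_accepts(y):
    """Hypothetical property predictor: accept candidates with y >= 7."""
    return y >= 7


def fit(targets, smoothing=1.0):
    """Maximum-likelihood fit of a categorical model with additive smoothing."""
    counts = Counter(targets)
    total = len(targets) + smoothing * len(VOCAB)
    return [(counts[y] + smoothing) / total for y in VOCAB]


def acceptance_rate(model):
    """Probability mass the model places on candidates the predictor accepts."""
    return sum(p for y, p in zip(VOCAB, model) if predictor_accepts(y))


def iterative_target_augmentation(initial_targets, rounds=5, k=200, seed=0):
    rng = random.Random(seed)
    model = fit(initial_targets)  # stand-in for pre-training
    history = [acceptance_rate(model)]
    for _ in range(rounds):
        # E-step (stochastic): draw candidates from the current model and
        # keep those the predictor accepts, i.e. rejection sampling.
        candidates = rng.choices(VOCAB, weights=model, k=k)
        accepted = [y for y in candidates if predictor_accepts(y)]
        if not accepted:
            continue  # no new targets this round; keep the current model
        # M-step: refit the model on the accepted candidates (new targets).
        model = fit(accepted)
        history.append(acceptance_rate(model))
    return model, history


if __name__ == "__main__":
    model, history = iterative_target_augmentation(initial_targets=VOCAB)
    print([round(r, 3) for r in history])
```

In this toy setting the acceptance rate starts at 0.3 (a uniform model) and rises across rounds, because each refit concentrates probability mass on candidates the predictor accepts. A real system would replace the categorical model with a neural generator and the threshold rule with a learned property predictor.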
