Analysis of training and seed bias in small molecules generated with a conditional graph-based variational autoencoder – Insights for practical AI-driven molecule generation

07/19/2021
by Seung-gu Kang, et al.

The application of deep learning to generative molecule design has shown early promise for accelerating lead series development. However, questions remain about how factors such as training, dataset, and seed bias affect the technology's utility to medicinal and computational chemists. In this work, we analyze the impact of seed and training bias on the output of an activity-conditioned graph-based variational autoencoder (VAE). Leveraging a massive, labeled dataset corresponding to the dopamine D2 receptor, we show that our graph-based generative model excels at producing the desired conditioned activities and favorable unconditioned physical properties in generated molecules. We implement an activity swapping method that allows for the activation, deactivation, or retention of activity of molecular seeds, and we apply independent deep learning classifiers to verify the generative results. Overall, we uncover relationships between noise, molecular seeds, and training set selection across a range of latent-space sampling procedures, providing important insights for practical AI-driven molecule generation.
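
As a rough illustration of seed-based sampling with activity swapping, the sketch below perturbs a seed's latent vector with Gaussian noise and pairs it with either the original or the flipped activity label before decoding. It is not the authors' implementation. The binary label, the concatenation of label and latent vector, and all names (`sample_around_seed`, `swap_activity`, `decoder_input`, `noise_scale`) are assumptions made for illustration, and the encoder and decoder networks are omitted.

```python
import random

def sample_around_seed(z_seed, noise_scale, rng):
    """Perturb a seed latent vector with isotropic Gaussian noise."""
    return [z + noise_scale * rng.gauss(0.0, 1.0) for z in z_seed]

def swap_activity(condition):
    """Flip a binary activity label: 1 (active) <-> 0 (inactive)."""
    return 1 - condition

def decoder_input(z, condition):
    """Append the condition label to the latent vector (assumed decoder input layout)."""
    return list(z) + [float(condition)]

rng = random.Random(0)           # fixed seed so the sketch is reproducible
z_seed = [0.2, -1.0, 0.5]        # hypothetical latent encoding of a seed molecule
seed_activity = 1                # hypothetical label: the seed is active

# Retention: keep the seed's label while sampling near it in latent space.
retained = decoder_input(sample_around_seed(z_seed, 0.1, rng), seed_activity)

# Deactivation: same neighborhood, but with the activity label flipped.
swapped = decoder_input(
    sample_around_seed(z_seed, 0.1, rng), swap_activity(seed_activity)
)
```

In this sketch, a larger `noise_scale` moves samples further from the seed, which is one simple way to think about the trade-off between seed bias and noise that the abstract describes.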
