Augmenting Molecular Deep Generative Models with Topological Data Analysis Representations

06/08/2021
by   Yair Schiff, et al.

Deep generative models have emerged as a powerful tool for learning informative molecular representations and designing novel molecules with desired properties, with applications in drug discovery and material design. Deep generative auto-encoders defined over molecular SMILES strings have been a popular choice for that purpose. However, capturing salient molecular properties such as quantum-chemical energies remains challenging and typically requires sophisticated neural network models of molecular graphs or geometry-based information. As a simpler and more efficient alternative, we present a SMILES Variational Auto-Encoder (VAE) augmented with topological data analysis (TDA) representations of molecules, known as persistence images. Our experiments show that this TDA augmentation enables a SMILES VAE to capture the complex relation between 3D geometry and electronic properties. It also allows generation of novel, diverse, and valid molecules whose geometric features are consistent with the training data and which span a range of global electronic structural properties, such as a small HOMO-LUMO gap, a critical property for designing organic solar cells. We demonstrate that our TDA augmentation yields better success in downstream tasks than models trained without these representations and can assist in targeted molecule discovery.
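To make the notion of a persistence image concrete, here is a minimal pure-Python sketch of how a persistence diagram (a list of birth/death pairs) might be rasterized into a fixed-size grid. It is an illustration under stated assumptions, not the authors' implementation: the function name `persistence_image`, the 8 x 8 resolution, the Gaussian width `sigma`, the linear persistence weighting, and the example diagram are all hypothetical choices, and the Gaussian is evaluated at pixel centers rather than integrated over each pixel.

```python
import math


def persistence_image(diagram, resolution=8, sigma=0.1,
                      birth_range=(0.0, 1.0), pers_range=(0.0, 1.0)):
    """Rasterize (birth, death) pairs into a resolution x resolution grid.

    Rows index persistence (death - birth); columns index birth.
    All parameter defaults are illustrative assumptions.
    """
    b_lo, b_hi = birth_range
    p_lo, p_hi = pers_range
    b_step = (b_hi - b_lo) / resolution
    p_step = (p_hi - p_lo) / resolution
    norm = 1.0 / (2.0 * math.pi * sigma ** 2)

    image = [[0.0] * resolution for _ in range(resolution)]
    for birth, death in diagram:
        pers = death - birth
        if pers <= 0:
            continue  # points on or below the diagonal carry no information
        weight = pers  # assumed linear weighting: long-lived features count more
        for i in range(resolution):
            p_center = p_lo + (i + 0.5) * p_step
            for j in range(resolution):
                b_center = b_lo + (j + 0.5) * b_step
                d2 = (b_center - birth) ** 2 + (p_center - pers) ** 2
                # Gaussian bump centered on the (birth, persistence) point
                image[i][j] += weight * norm * math.exp(-d2 / (2.0 * sigma ** 2))
    return image


# Hypothetical diagram: two features, one of them long-lived.
example_diagram = [(0.1, 0.4), (0.2, 0.9)]
example_image = persistence_image(example_diagram)
# Flattened, the grid becomes a fixed-length vector that could be fed to a model.
example_vector = [value for row in example_image for value in row]
```

The point of the sketch is that a variable-size diagram becomes a fixed-length vector, which is the property that makes such a representation easy to combine with a sequence model like a SMILES VAE.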
