All SMILES VAE

05/30/2019
by Zaccary Alperstein, et al.

Variational autoencoders (VAEs) defined over SMILES string and graph-based representations of molecules promise to improve the optimization of molecular properties, thereby revolutionizing the pharmaceuticals and materials industries. However, these VAEs are hindered by the non-unique nature of SMILES strings and the computational cost of graph convolutions. To efficiently pass messages along all paths through the molecular graph, we encode multiple SMILES strings of a single molecule using a set of stacked recurrent neural networks, pooling hidden representations of each atom between SMILES representations, and use attentional pooling to build a final fixed-length latent representation. By then decoding to a disjoint set of SMILES strings of the molecule, our All SMILES VAE learns an almost bijective mapping between molecules and latent representations near the high-probability-mass subspace of the prior. Our SMILES-derived but molecule-based latent representations significantly surpass the state-of-the-art in a variety of fully- and semi-supervised property regression and molecular property optimization tasks.
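To make the encoder idea above concrete, here is a minimal PyTorch sketch, not the authors' implementation: several SMILES strings of the same molecule are encoded with a recurrent network, and learned queries perform attentional pooling over all per-token hidden states to produce a fixed-length latent distribution. The atom-aligned pooling between stacked RNN layers described in the paper is simplified here to pooling all variants' token states together; class and parameter names, layer sizes, and the latent dimension are illustrative assumptions.

```python
# Illustrative sketch (assumed names and sizes), not the All SMILES VAE reference code.
import torch
import torch.nn as nn


class AllSmilesEncoderSketch(nn.Module):
    def __init__(self, vocab_size=64, embed_dim=64, hidden_dim=128,
                 latent_dim=56, n_queries=16):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        # Stacked bidirectional GRU over SMILES tokens.
        self.rnn = nn.GRU(embed_dim, hidden_dim, num_layers=2,
                          batch_first=True, bidirectional=True)
        # Learned queries for attentional pooling to a fixed-length summary.
        self.queries = nn.Parameter(torch.randn(n_queries, 2 * hidden_dim))
        self.to_mu = nn.Linear(n_queries * 2 * hidden_dim, latent_dim)
        self.to_logvar = nn.Linear(n_queries * 2 * hidden_dim, latent_dim)

    def forward(self, smiles_variants):
        # smiles_variants: (batch, n_variants, seq_len) token ids, where each
        # example holds several distinct SMILES strings of one molecule.
        b, v, t = smiles_variants.shape
        tokens = smiles_variants.reshape(b * v, t)
        hidden, _ = self.rnn(self.embed(tokens))       # (b*v, t, 2*hidden_dim)
        hidden = hidden.reshape(b, v * t, -1)           # pool tokens of all variants together

        # Attentional pooling: each learned query attends over every hidden state.
        scores = torch.einsum("qd,btd->bqt", self.queries, hidden)
        attn = torch.softmax(scores / hidden.shape[-1] ** 0.5, dim=-1)
        summary = torch.einsum("bqt,btd->bqd", attn, hidden).reshape(b, -1)

        mu, logvar = self.to_mu(summary), self.to_logvar(summary)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)  # reparameterization
        return z, mu, logvar


if __name__ == "__main__":
    enc = AllSmilesEncoderSketch()
    fake_batch = torch.randint(0, 64, (4, 3, 40))  # 4 molecules, 3 SMILES variants each
    z, mu, logvar = enc(fake_batch)
    print(z.shape)  # torch.Size([4, 56])
```

In the full model, the resulting latent sample would be fed to a recurrent decoder trained to reconstruct a disjoint set of SMILES strings of the same molecule, which is what encourages the latent code to represent the molecule rather than any particular string.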

Related research

- Augmenting Molecular Images with Vector Representations as a Featurization Technique for Drug Classification (08/09/2020)
- Molecular Hypergraph Grammar with its Application to Molecular Optimization (09/08/2018)
- Deep Learning for Molecular Graphs with Tiered Graph Autoencoders and Graph Classification (10/24/2019)
- SELFIES and the future of molecular string representations (03/31/2022)
- Stepping Back to SMILES Transformers for Fast Molecular Representation Inference (12/26/2021)
- Recent advances in the Self-Referencing Embedding Strings (SELFIES) library (02/07/2023)
