Deep Molecular Dreaming: Inverse machine learning for de-novo molecular design and interpretability with surjective representations

12/17/2020
by Cynthia Shen, et al.

Computer-based de-novo design of functional molecules is one of the most prominent challenges in cheminformatics today. As a result, generative and evolutionary inverse design methods from the field of artificial intelligence have emerged at a rapid pace, with the aim of optimizing molecules for a particular chemical property. These models explore the chemical space 'indirectly': by learning latent spaces, policies, or distributions, or by applying mutations to populations of molecules. However, the recent development of the SELFIES string representation of molecules, a surjective alternative to SMILES, has made other techniques possible. Based on SELFIES, we therefore propose PASITHEA, a direct gradient-based molecule optimization method that applies inceptionism techniques from computer vision. PASITHEA exploits gradients by directly reversing the learning process of a neural network that is trained to predict real-valued chemical properties. Effectively, this forms an inverse regression model, which is capable of generating molecular variants optimized for a certain property. Although our results are preliminary, we observe a shift in the distribution of a chosen property during inverse training, a clear indication of PASITHEA's viability. A striking property of inceptionism is that we can directly probe the model's understanding of the chemical space it was trained on. We expect that extending PASITHEA to larger datasets, larger molecules, and more complex properties will lead to advances in the design of new functional molecules as well as in the interpretation and explanation of machine learning models.
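
The idea of "reversing the learning process" can be illustrated with a minimal sketch: the weights of a property regressor are frozen, and gradient descent is applied to the input encoding instead, until the predicted property approaches a chosen target. Everything below is an assumption made for illustration and is not taken from the PASITHEA implementation: the three-symbol alphabet, the two-unit network with hand-picked weights, the target value, and the argmax decoding of the relaxed one-hot encoding are all hypothetical.

```python
import math

# Hypothetical toy alphabet of SELFIES symbols; a real alphabet is far larger.
ALPHABET = ["[C]", "[N]", "[O]"]
N_POSITIONS = 3

# Frozen weights of a tiny stand-in property regressor (made-up values,
# not a trained model).
W1 = [[0.5, -0.3, 0.2] * N_POSITIONS,
      [-0.4, 0.6, 0.1] * N_POSITIONS]
B1 = [0.0, 0.0]
W2 = [1.0, -1.0]
B2 = 0.0

def predict(x):
    """Return (predicted property, hidden activations) for a flattened input."""
    hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(W1, B1)]
    return sum(w * h for w, h in zip(W2, hidden)) + B2, hidden

def input_gradient(x, target):
    """Gradient of (prediction - target)^2 with respect to the input x."""
    y, hidden = predict(x)
    d_y = 2.0 * (y - target)
    grad = [0.0] * len(x)
    for row, h, w_out in zip(W1, hidden, W2):
        d_pre = d_y * w_out * (1.0 - h * h)  # backpropagate through tanh
        for i, w in enumerate(row):
            grad[i] += d_pre * w
    return grad

def dream(x0, target, lr=0.05, steps=500):
    """Gradient descent on the input only; the weights are never updated."""
    x = list(x0)
    for _ in range(steps):
        x = [xi - lr * g for xi, g in zip(x, input_gradient(x, target))]
    return x

def decode(x):
    """Read the relaxed encoding back as symbols via argmax per position."""
    k = len(ALPHABET)
    rows = [x[p * k:(p + 1) * k] for p in range(N_POSITIONS)]
    return [ALPHABET[row.index(max(row))] for row in rows]

x0 = [1.0, 0.0, 0.0] * N_POSITIONS  # one-hot encoding of [C][C][C]
target = 0.0                        # hypothetical desired property value
x_opt = dream(x0, target)
```

Calling `predict(x0)` and `predict(x_opt)` shows the prediction moving toward `target`, and `decode(x_opt)` reads the optimized encoding back as a symbol sequence. The surjectivity of SELFIES is what makes the last step meaningful in the real setting: any sequence of symbols obtained this way corresponds to a valid molecule, which is not guaranteed for SMILES.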

