Multi-Constraint Molecular Generation using Sparsely Labelled Training Data for Localized High-Concentration Electrolyte Diluent Screening

01/12/2023
by Jonathan P. Mailoa, et al.

Recently, machine learning methods have been used to propose molecules with desired properties, which is especially useful for exploring large chemical spaces efficiently. However, these methods rely on fully labelled training data and are impractical when molecules must satisfy multiple property constraints: publicly available databases often lack sufficient training data for all of those properties, especially when ab initio simulation or experimental property data is also desired for training the conditional molecular generative model. In this work, we show how to modify a semi-supervised variational auto-encoder (SSVAE) model, which only works with fully labelled and fully unlabelled molecular property training data, into the ConGen model, which also works on training data with sparsely populated labels. We evaluate ConGen's performance in generating molecules with multiple constraints when trained on a dataset combined from multiple publicly available molecule property databases, and demonstrate an example application: building the virtual chemical space of potential diluents for lithium-ion battery localized high-concentration electrolytes (LHCE).
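
To make "sparsely populated labels" concrete, the sketch below shows one common way to train on such data: compute the property loss only over the entries that are actually labelled. This is an illustrative assumption, not the ConGen implementation; the function name `masked_mse` and the use of NaN to mark a missing label are hypothetical choices made for this example.

```python
import math

def masked_mse(y_true, y_pred):
    """Mean squared error over observed labels only.

    y_true and y_pred are lists of rows (one row per molecule, one
    column per property). A NaN in y_true marks a missing label
    (an assumption of this sketch) and is skipped.
    """
    total = 0.0
    n_observed = 0
    for row_true, row_pred in zip(y_true, y_pred):
        for t, p in zip(row_true, row_pred):
            if math.isnan(t):
                continue  # unlabelled entry contributes nothing
            total += (p - t) ** 2
            n_observed += 1
    # A batch with no labels at all yields zero property loss.
    return total / n_observed if n_observed else 0.0

nan = math.nan
# Three molecules, three properties: one row partially labelled,
# one fully unlabelled, one partially labelled.
y_true = [[1.0, nan, 3.0],
          [nan, nan, nan],
          [2.0, 0.5, nan]]
y_pred = [[1.5, 9.0, 3.0],
          [9.0, 9.0, 9.0],
          [2.0, 1.5, 9.0]]

loss = masked_mse(y_true, y_pred)
print(loss)
```

Under this scheme, fully labelled and fully unlabelled molecules are the two extremes of the same mask, so a dataset merged from several databases with different property coverage can be used without discarding partially labelled rows.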

