Constant Size Molecular Descriptors For Use With Machine Learning

01/23/2017
by Christopher R. Collins, et al.

A set of molecular descriptors whose length is independent of molecular size is developed for machine learning models that target thermodynamic and electronic properties of molecules. These features are evaluated by monitoring performance of kernel ridge regression models on well-studied data sets of small organic molecules. The features include connectivity counts, which require only the bonding pattern of the molecule, and encoded distances, which summarize distances between both bonded and non-bonded atoms and so require the full molecular geometry. In addition to having constant size, these features summarize information regarding the local environment of atoms and bonds, such that models can take advantage of similarities resulting from the presence of similar chemical fragments across molecules. Combining these two types of features leads to models whose performance is comparable to or better than the current state of the art. The features introduced here have the advantage of leading to models that may be trained on smaller molecules and then used successfully on larger molecules.
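
As a rough illustration of the constant-size idea, the sketch below builds a connectivity-count vector from the bonding pattern alone. It is a minimal example under stated assumptions, not the feature set defined in the paper: the element vocabulary `ELEMENTS`, the restriction to atom and bond counts, and the function name `connectivity_counts` are all hypothetical choices made for this example.

```python
from collections import Counter
from itertools import combinations_with_replacement

# Hypothetical fixed element vocabulary; the source does not specify one.
ELEMENTS = ("C", "H", "N", "O")
# Every unordered element pair, so the bond-count block has a fixed width.
BOND_TYPES = tuple(combinations_with_replacement(ELEMENTS, 2))


def connectivity_counts(atoms, bonds):
    """Return a fixed-length count vector from the bonding pattern alone.

    atoms: list of element symbols, one per atom (must be in ELEMENTS).
    bonds: list of (i, j) index pairs into `atoms`.
    """
    atom_counts = Counter(atoms)
    bond_counts = Counter()
    for i, j in bonds:
        # Order each pair by vocabulary position so (H, C) and (C, H) share a key.
        pair = tuple(sorted((atoms[i], atoms[j]), key=ELEMENTS.index))
        bond_counts[pair] += 1
    # Missing keys in a Counter read as 0, so the layout never changes.
    return [atom_counts[e] for e in ELEMENTS] + [bond_counts[b] for b in BOND_TYPES]


# Two molecules of different size map to vectors of the same length.
methane = connectivity_counts(
    ["C", "H", "H", "H", "H"],
    [(0, 1), (0, 2), (0, 3), (0, 4)],
)
ethane = connectivity_counts(
    ["C", "C", "H", "H", "H", "H", "H", "H"],
    [(0, 1), (0, 2), (0, 3), (0, 4), (1, 5), (1, 6), (1, 7)],
)
print(len(methane), len(ethane))
```

Because the vector length depends only on the vocabulary and not on the number of atoms, a model trained on such vectors for small molecules can in principle be applied to larger ones without changing its input size. The encoded distances described above would additionally require the molecular geometry and are not covered by this sketch.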
