Predicting Drug Solubility Using Different Machine Learning Methods – Linear Regression Model with Extracted Chemical Features vs Graph Convolutional Neural Network

08/23/2023
by John Ho, et al.

Predicting the solubility of a given molecule is an important task in the pharmaceutical industry, and the topic is consequently well studied. In this research, we revisit the problem with the advantage of modern computing resources. We applied two machine learning models, a linear regression model and a graph convolutional neural network (GCNN) model, to multiple experimental datasets. Both methods made reasonable predictions, with the GCNN model performing best. However, the current GCNN model is a black box, whereas feature importance analysis of the linear regression model offers more insight into the underlying chemical influences. Using the linear regression model, we show how each functional group affects the overall solubility. Ultimately, knowing how chemical structure influences chemical properties is crucial when designing new drugs. Future work should aim to combine the high performance of GCNNs with the interpretability of linear regression, unlocking new advances in next-generation high-throughput screening.
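
To make the interpretability argument concrete, the following is a minimal sketch of how the coefficients of a linear regression can be read as per-functional-group contributions to solubility. It is not the authors' implementation: the feature names, the synthetic data, and the "true" weights are all made up for illustration, and the helper names (`solve_linear_system`, `fit_linear_regression`, `make_toy_data`) are hypothetical. It uses only the Python standard library; in practice a library such as scikit-learn would be the more usual choice.

```python
import random

# Hypothetical functional-group count features (illustrative only,
# not the feature set used in the paper).
FEATURES = ["hydroxyl", "carboxyl", "aromatic_ring", "halogen"]


def solve_linear_system(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def fit_linear_regression(X, y):
    """Ordinary least squares via the normal equations (X^T X) beta = X^T y.

    Returns (intercept, coefficients).
    """
    design = [[1.0] + list(row) for row in X]  # prepend intercept column
    p = len(design[0])
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * t for r, t in zip(design, y)) for i in range(p)]
    beta = solve_linear_system(xtx, xty)
    return beta[0], beta[1:]


def make_toy_data(n=200, seed=0):
    """Synthetic data: solubility as a noisy linear function of group counts."""
    rng = random.Random(seed)
    true_coef = [0.8, 0.5, -1.2, -0.6]  # made-up weights, for illustration only
    X, y = [], []
    for _ in range(n):
        counts = [float(rng.randint(0, 3)) for _ in FEATURES]
        signal = sum(w * c for w, c in zip(true_coef, counts))
        X.append(counts)
        y.append(-2.0 + signal + rng.gauss(0.0, 0.1))
    return X, y, true_coef


X, y, true_coef = make_toy_data()
intercept, coef = fit_linear_regression(X, y)

# Rank features by absolute coefficient: a simple feature-importance reading.
ranked = sorted(zip(FEATURES, coef), key=lambda item: abs(item[1]), reverse=True)
for name, weight in ranked:
    print(f"{name:>14s}: {weight:+.2f}")
```

In this toy setting, a positive coefficient means that adding one more instance of that group raises the predicted solubility, and a negative coefficient lowers it. A GCNN offers no equally direct readout, which is the trade-off the abstract describes.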


