BERT Learns (and Teaches) Chemistry

07/11/2020
by Josh Payne, et al.

Modern computational organic chemistry is becoming increasingly data-driven. Many important problems in this area remain unsolved, such as product prediction given reactants, drug discovery, and metric-optimized molecule synthesis, and efforts to solve them with machine learning have increased in recent years. In this work, we propose using attention to study functional groups and other property-impacting molecular substructures from a data-driven perspective, applying a transformer-based model (BERT) to datasets of string representations of molecules and analyzing the behavior of its attention heads. We then apply the representations of functional groups and atoms learned by the model to problems of toxicity, solubility, drug-likeness, and synthesis accessibility on smaller datasets, both by using the learned representations as features for graph convolution and attention models on the graph structure of molecules and by fine-tuning BERT. Finally, we propose attention visualization as a tool that helps chemistry practitioners and students quickly identify the substructures that matter for various chemical properties.
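
To make the idea of attention over string representations of molecules concrete, the following is a minimal sketch and not the authors' implementation. It assumes SMILES as the string representation, uses a simplified, hypothetical tokenizer pattern, and replaces a trained BERT with random token vectors and no learned query or key projections, so the resulting weights only illustrate the shape of an attention map, not learned chemistry. The names `tokenize_smiles` and `toy_attention` are invented for illustration.

```python
import math
import random
import re

# Hypothetical, simplified pattern: bracket atoms, two-letter halogens,
# two-digit ring closures, then any single character.
SMILES_TOKEN = re.compile(r"\[[^\]]+\]|Br|Cl|%\d{2}|.")


def tokenize_smiles(smiles):
    """Split a SMILES string into atom, bond, ring, and branch tokens."""
    return SMILES_TOKEN.findall(smiles)


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    peak = max(row)
    exps = [math.exp(x - peak) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def toy_attention(tokens, dim=8, seed=0):
    """Scaled dot-product attention over random (untrained) token vectors.

    Assumption: queries and keys are the raw token vectors; a real
    transformer would apply learned projections first.
    """
    rng = random.Random(seed)
    vocab = {}
    for tok in tokens:
        if tok not in vocab:
            vocab[tok] = [rng.gauss(0, 1) for _ in range(dim)]
    vecs = [vocab[tok] for tok in tokens]
    scale = math.sqrt(dim)
    weights = []
    for query in vecs:
        scores = [sum(q * k for q, k in zip(query, key)) / scale for key in vecs]
        weights.append(softmax(scores))
    return weights


tokens = tokenize_smiles("CC(=O)O")  # acetic acid
print(tokens)  # ['C', 'C', '(', '=', 'O', ')', 'O']

# Each row is one token's attention distribution over all tokens.
for tok, row in zip(tokens, toy_attention(tokens)):
    print(tok, [round(w, 2) for w in row])
```

With a trained model, rows like these are what an attention visualization would display: a token belonging to a functional group (for example the carbonyl `=O`) placing high weight on related tokens would mark that substructure as relevant.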
