G-MATT: Single-step Retrosynthesis Prediction using Molecular Grammar Tree Transformer

05/04/2023
by Kevin Zhang, et al.

Various template-based and template-free approaches have been proposed for single-step retrosynthesis prediction in recent years. While these approaches demonstrate strong performance on data-driven metrics, many model architectures do not incorporate underlying chemistry principles. Here, we propose a novel chemistry-aware retrosynthesis prediction framework that combines powerful data-driven models with prior domain knowledge. We present a tree-to-sequence transformer architecture that utilizes hierarchical SMILES grammar-based trees, incorporating crucial chemistry information that is often overlooked by SMILES text-based representations, such as local structures and functional groups. The proposed framework, grammar-based molecular attention tree transformer (G-MATT), achieves significant performance improvements compared to baseline retrosynthesis models. G-MATT achieves a promising top-1 accuracy of 51% and a similarity rate of 74.8%. Analyses of G-MATT attention maps demonstrate the ability to retain chemistry knowledge without relying on excessively complex model architectures.
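
To make the grammar-tree idea concrete, the sketch below is our own illustration, not code from the paper: it parses a tiny, hypothetical subset of SMILES into a hierarchical parse tree of the kind a tree-to-sequence encoder could consume in place of a flat token string. The grammar rules and the names `Node` and `parse_smiles` are assumptions introduced for this example only.

```python
# Illustrative sketch (not the paper's grammar or model): build a
# grammar-based tree for a small SMILES subset.
# Hypothetical simplified grammar:
#   chain  -> atom (branch | bond? atom)*
#   branch -> '(' bond? chain ')'
#   atom   -> 'C' | 'N' | 'O'
#   bond   -> '=' | '#'
from dataclasses import dataclass, field
from typing import List

@dataclass
class Node:
    label: str                         # grammar symbol or terminal token
    children: List["Node"] = field(default_factory=list)

def parse_smiles(s: str) -> Node:
    i = 0

    def parse_atom() -> Node:
        nonlocal i
        tok = s[i]; i += 1
        return Node("atom", [Node(tok)])

    def parse_bond() -> List[Node]:
        nonlocal i
        if i < len(s) and s[i] in "=#":
            tok = s[i]; i += 1
            return [Node("bond", [Node(tok)])]
        return []

    def parse_chain() -> Node:
        nonlocal i
        children = [parse_atom()]
        while i < len(s) and s[i] != ")":
            if s[i] == "(":                        # branch in parentheses
                i += 1
                bond = parse_bond()
                sub = parse_chain()
                assert s[i] == ")"; i += 1
                children.append(Node("branch", bond + [sub]))
            else:                                  # continue the main chain
                bond = parse_bond()
                children.append(Node("chain_ext", bond + [parse_atom()]))
        return Node("chain", children)

    return parse_chain()

def show(node: Node, depth: int = 0) -> None:
    print("  " * depth + node.label)
    for child in node.children:
        show(child, depth + 1)

if __name__ == "__main__":
    # Acetic acid "CC(=O)O": the carbonyl group sits in its own branch
    # subtree, so local structure is explicit rather than implicit in
    # token order.
    show(parse_smiles("CC(=O)O"))
```

In this toy representation, functional groups such as the carbonyl end up as distinct subtrees, which is the kind of hierarchical locality the abstract argues is lost in flat SMILES text.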


