Molecular Identification from AFM images using the IUPAC Nomenclature and Attribute Multimodal Recurrent Neural Networks

05/01/2022
by Jaime Carracedo-Cosme, et al.

Despite being the main tool for visualizing molecules at the atomic scale, AFM with CO-functionalized metal tips cannot chemically identify the molecules it observes. Here we present a strategy that addresses this challenging task with deep learning techniques. Instead of identifying a finite number of molecules with a traditional classification approach, we frame molecular identification as an image captioning problem. We design an architecture, composed of two multimodal recurrent neural networks, that identifies the structure and composition of an unknown molecule from a 3D-AFM image stack. The neural network is trained to provide the name of each molecule according to the IUPAC nomenclature rules. To train and test this algorithm we use the novel QUAM-AFM dataset, which contains almost 700,000 molecules and 165 million AFM images. The accuracy of the predictions is remarkable, achieving a high score on the cumulative 4-gram BLEU, a metric commonly used to evaluate machine-generated text.
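To make the evaluation metric concrete, here is a minimal sketch of a sentence-level cumulative 4-gram BLEU score for a predicted name against a single reference name. It is an illustration only, not the authors' evaluation code: the function names `ngrams` and `cumulative_bleu4`, the absence of smoothing, and the way the IUPAC name is split into tokens are all assumptions made for this example.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count all contiguous n-grams in a list of tokens."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def cumulative_bleu4(reference, candidate):
    """Cumulative BLEU-4 for one candidate against one reference (no smoothing)."""
    if not candidate:
        return 0.0
    log_precisions = []
    for n in range(1, 5):
        cand_counts = ngrams(candidate, n)
        ref_counts = ngrams(reference, n)
        total = sum(cand_counts.values())
        # Clipped matches: an n-gram is credited at most as often as it
        # appears in the reference.
        matched = sum(min(c, ref_counts[g]) for g, c in cand_counts.items())
        if total == 0 or matched == 0:
            # Without smoothing, any empty precision drives the score to zero.
            return 0.0
        log_precisions.append(math.log(matched / total))
    # Brevity penalty: penalize candidates shorter than the reference.
    c_len, r_len = len(candidate), len(reference)
    brevity = 1.0 if c_len > r_len else math.exp(1 - r_len / c_len)
    # Geometric mean of the 1- to 4-gram precisions with uniform weights.
    return brevity * math.exp(sum(log_precisions) / 4)


# Hypothetical tokenization of two IUPAC names (an assumption for this sketch).
reference = ["1", "-", "chloro", "-", "2", "-", "fluoro", "benz", "ene"]
wrong_halogen = ["1", "-", "bromo", "-", "2", "-", "fluoro", "benz", "ene"]

exact_score = cumulative_bleu4(reference, reference)
partial_score = cumulative_bleu4(reference, wrong_halogen)
print(exact_score, partial_score)
```

An exact match scores 1, while a name with one wrong substituent scores strictly between 0 and 1 because several 3-grams and 4-grams spanning the wrong token no longer match. This is why the cumulative 4-gram variant is a stricter measure than unigram overlap alone.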


research
05/26/2021

Predicting Aqueous Solubility of Organic Molecules Using Deep Learning Models with Varied Molecular Representations

Determining the aqueous solubility of molecules is a vital step in many ...
research
09/03/2021

IMG2SMI: Translating Molecular Structure Images to Simplified Molecular-input Line-entry System

Like many scientific fields, new chemistry literature has grown at a sta...
research
01/08/2018

Graph Memory Networks for Molecular Activity Prediction

Molecular activity prediction is critical in drug design. Machine learni...
research
05/17/2023

Predicting Side Effect of Drug Molecules using Recurrent Neural Networks

Identification and verification of molecular properties such as side eff...
research
10/15/2022

Substructure-Atom Cross Attention for Molecular Representation Learning

Designing a neural network architecture for molecular representation is ...
research
01/28/2021

Automatic design of novel potential 3CL^pro and PL^pro inhibitors

With the goal of designing novel inhibitors for SARS-CoV-1 and SARS-CoV-...
research
08/16/2016

Authorship clustering using multi-headed recurrent neural networks

A recurrent neural network that has been trained to separately model the...
