Extracting Molecular Properties from Natural Language with Multimodal Contrastive Learning

07/22/2023
by   Romain Lacombe, et al.

Deep learning in computational biochemistry has traditionally focused on neural representations of molecular graphs; however, recent advances in language models highlight how much scientific knowledge is encoded in text. To bridge these two modalities, we investigate how molecular property information can be transferred from natural language to graph representations. We study the property prediction performance gains obtained by using contrastive learning to align neural graph representations with representations of textual descriptions of the molecules' characteristics. We implement neural relevance scoring strategies to improve text retrieval, introduce a novel chemically valid molecular graph augmentation strategy inspired by organic reactions, and demonstrate improved performance on downstream MoleculeNet property classification tasks. We achieve a +4.26 gain versus models pre-trained on the graph modality alone, and a +1.54 gain compared to the recently proposed MoMu model (Su et al. 2022), which is contrastively trained on molecular graphs and text.
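
As an illustration of the alignment step described in the abstract, the sketch below computes a symmetric InfoNCE-style contrastive loss over a batch of paired graph and text embeddings. The abstract does not state the exact objective, so the loss form, the temperature value, and every name here (`symmetric_info_nce`, `graph_emb`, `text_emb`) are assumptions made for illustration rather than the authors' implementation. The embeddings are toy vectors standing in for encoder outputs.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length (assumes the vector is non-zero)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def cross_entropy_diagonal(logits):
    """Mean cross-entropy where the correct column for row i is column i."""
    total = 0.0
    for i, row in enumerate(logits):
        row_max = max(row)  # subtract the max for numerical stability
        log_sum_exp = row_max + math.log(sum(math.exp(x - row_max) for x in row))
        total += log_sum_exp - row[i]
    return total / len(logits)


def symmetric_info_nce(graph_emb, text_emb, temperature=0.1):
    """Contrastive loss over a batch of paired graph/text embeddings.

    Row i of graph_emb and row i of text_emb are treated as a positive
    pair; every other pairing in the batch acts as a negative.
    The temperature default is an illustrative assumption.
    """
    g = [l2_normalize(v) for v in graph_emb]
    t = [l2_normalize(v) for v in text_emb]
    # Cosine similarity between every graph and every text, scaled by temperature.
    logits = [
        [sum(a * b for a, b in zip(gi, tj)) / temperature for tj in t]
        for gi in g
    ]
    logits_transposed = [list(col) for col in zip(*logits)]
    graph_to_text = cross_entropy_diagonal(logits)
    text_to_graph = cross_entropy_diagonal(logits_transposed)
    return 0.5 * (graph_to_text + text_to_graph)


# Toy batch: two "molecules" with two-dimensional embeddings.
graph_emb = [[1.0, 0.0], [0.0, 1.0]]
aligned = symmetric_info_nce(graph_emb, [[0.9, 0.1], [0.1, 0.9]])
shuffled = symmetric_info_nce(graph_emb, [[0.1, 0.9], [0.9, 0.1]])
```

In this toy batch, `aligned` comes out lower than `shuffled`, because the loss rewards each graph embedding for being closest to its own description. In a real setup the two inputs would come from a graph encoder and a text encoder trained jointly on this objective.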

research
02/19/2021

MolCLR: Molecular Contrastive Learning of Representations via Graph Neural Networks

Molecular machine learning bears promise for efficient molecule property...
research
07/14/2023

Can Large Language Models Empower Molecular Property Prediction?

Molecular property prediction has gained significant attention due to it...
research
09/24/2021

GeomGCL: Geometric Graph Contrastive Learning for Molecular Property Prediction

Recently many efforts have been devoted to applying graph neural network...
research
08/14/2023

GIT-Mol: A Multi-modal Large Language Model for Molecular Science with Graph, Image, and Text

Large language models have made significant strides in natural language ...
research
11/03/2022

A 3D-Shape Similarity-based Contrastive Approach to Molecular Representation Learning

Molecular shape and geometry dictate key biophysical recognition process...
research
09/18/2021

MM-Deacon: Multimodal molecular domain embedding analysis via contrastive learning

Molecular representation learning plays an essential role in cheminforma...
research
02/18/2022

Improving Molecular Contrastive Learning via Faulty Negative Mitigation and Decomposed Fragment Contrast

Deep learning has been a prevalence in computational chemistry and widel...
