MolFM: A Multimodal Molecular Foundation Model

06/06/2023
by Yizhen Luo, et al.

Molecular knowledge resides within three different modalities of information sources: molecular structures, biomedical documents, and knowledge bases. Effective incorporation of molecular knowledge from these modalities is of paramount importance for biomedical research. However, existing multimodal molecular foundation models are limited in capturing intricate connections between molecular structures and texts, and more importantly, none of them attempts to leverage the wealth of molecular expertise derived from knowledge graphs. In this study, we introduce MolFM, a multimodal molecular foundation model designed to facilitate joint representation learning from molecular structures, biomedical texts, and knowledge graphs. We propose cross-modal attention between atoms of molecular structures, neighbors of molecule entities, and semantically related texts to facilitate cross-modal comprehension. We provide a theoretical analysis showing that our cross-modal pre-training captures local and global molecular knowledge by minimizing the distance in the feature space between different modalities of the same molecule, as well as between molecules sharing similar structures or functions. MolFM achieves state-of-the-art performance on various downstream tasks. On cross-modal retrieval, MolFM outperforms existing models with 12.13% and 5.04% absolute gains under the zero-shot and fine-tuning settings, respectively. Furthermore, qualitative analysis showcases MolFM's implicit ability to provide grounding from molecular substructures and knowledge graphs. Code and models are available at https://github.com/BioFM/OpenBioMed.
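The idea of minimizing the feature-space distance between different modalities of the same molecule can be illustrated with a generic symmetric contrastive objective. The sketch below is not MolFM's actual loss or code; it assumes each molecule already has a structure embedding and a text embedding of equal dimension, and the function names (`symmetric_contrastive_loss`, `retrieve`) and the `temperature` value are hypothetical choices made for illustration only.

```python
import numpy as np


def l2_normalize(x, eps=1e-12):
    """Scale each row to unit length so dot products become cosine similarities."""
    return x / (np.linalg.norm(x, axis=1, keepdims=True) + eps)


def log_softmax(logits):
    """Numerically stable row-wise log-softmax."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))


def symmetric_contrastive_loss(struct_emb, text_emb, temperature=0.1):
    """Generic InfoNCE-style loss (an assumption, not the paper's objective).

    Row i of struct_emb and row i of text_emb describe the same molecule.
    The loss is low when each pair is more similar than all mismatched pairs.
    """
    s = l2_normalize(struct_emb)
    t = l2_normalize(text_emb)
    logits = s @ t.T / temperature          # pairwise similarity matrix
    idx = np.arange(len(s))                 # matching pairs lie on the diagonal
    loss_s2t = -log_softmax(logits)[idx, idx].mean()
    loss_t2s = -log_softmax(logits.T)[idx, idx].mean()
    return 0.5 * (loss_s2t + loss_t2s)


def retrieve(query_emb, candidate_emb):
    """Cross-modal retrieval: index of the most similar candidate per query."""
    sims = l2_normalize(query_emb) @ l2_normalize(candidate_emb).T
    return sims.argmax(axis=1)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    structures = rng.normal(size=(4, 8))    # stand-in structure embeddings
    texts = structures + 0.01 * rng.normal(size=(4, 8))  # nearly aligned texts
    print(symmetric_contrastive_loss(structures, texts))
    print(retrieve(structures, texts))
```

Under these assumptions, well-aligned pairs yield a lower loss than mismatched ones, and retrieval reduces to a nearest-neighbor lookup by cosine similarity in the shared feature space.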

research
05/29/2023

Deeply Coupled Cross-Modal Prompt Learning

Recent advancements in multimodal foundation models (e.g., CLIP) have ex...
research
10/12/2022

Hate-CLIPper: Multimodal Hateful Meme Classification based on Cross-modal Interaction of CLIP Features

Hateful memes are a growing menace on social media. While the image and ...
research
05/03/2023

MolKD: Distilling Cross-Modal Knowledge in Chemical Reactions for Molecular Property Prediction

How to effectively represent molecules is a long-standing challenge for ...
research
04/20/2022

Transformer Decoders with MultiModal Regularization for Cross-Modal Food Retrieval

Cross-modal image-recipe retrieval has gained significant attention in r...
research
05/31/2023

ManagerTower: Aggregating the Insights of Uni-Modal Experts for Vision-Language Representation Learning

Two-Tower Vision-Language (VL) models have shown promising improvements ...
research
11/19/2022

A survey on knowledge-enhanced multimodal learning

Multimodal learning has been a field of increasing interest, aiming to c...
