Geometry-aware Line Graph Transformer Pre-training for Molecular Property Prediction

09/01/2023
by Peizhen Bai, et al.

Molecular property prediction with deep learning has attracted considerable attention in recent years. Owing to the scarcity of labeled molecules, there has been growing interest in self-supervised learning methods that learn generalizable molecular representations from unlabeled data. Molecules are typically modeled as 2D topological graphs, yet their 3D geometry is known to play a crucial role in determining molecular function. In this paper, we propose Geometry-aware line graph transformer (Galformer) pre-training, a novel self-supervised learning framework that enhances molecular representation learning with both 2D and 3D modalities. Specifically, we first design a dual-modality line graph transformer backbone to encode the topological and geometric information of a molecule. The backbone incorporates effective structural encodings to capture graph structure from both modalities. We then devise two complementary pre-training tasks at the inter- and intra-modality levels. These tasks provide effective supervisory signals and extract discriminative 2D and 3D knowledge from unlabeled molecules. Finally, we evaluate Galformer against six state-of-the-art baselines on twelve molecular property prediction benchmarks via downstream fine-tuning. Experimental results show that Galformer consistently outperforms all baselines on both classification and regression tasks, demonstrating its effectiveness.
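
For readers unfamiliar with the line graph construction at the core of the backbone, the sketch below illustrates the general idea: each bond of the molecular graph becomes a node of the line graph, and two line-graph nodes are connected whenever their bonds share an atom. This is a minimal illustration using networkx, not the authors' implementation, and it omits the paper's featurization entirely.

```python
# Minimal sketch of the line graph construction (illustrative only; the
# paper's actual features and encoder are not reproduced here).
import networkx as nx

# Toy molecular graph for ethanol with hydrogens omitted: C1-C2-O.
mol = nx.Graph()
mol.add_edges_from([("C1", "C2"), ("C2", "O")])

# In the line graph, each bond becomes a node; two nodes are adjacent
# when the corresponding bonds share an atom (here, C2).
line = nx.line_graph(mol)
print(sorted(line.nodes()))  # [('C1', 'C2'), ('C2', 'O')]
print(sorted(line.edges()))  # [(('C1', 'C2'), ('C2', 'O'))]
```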

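The abstract does not spell out the pre-training objectives. One common way to realize an inter-modality task, however, is to align the 2D and 3D embeddings of the same molecule with a contrastive loss, treating other molecules in the batch as negatives. The following is a hypothetical PyTorch sketch of such an objective; the function name, embedding shapes, and temperature are illustrative assumptions, not the paper's specification.

```python
# Hypothetical inter-modality alignment sketch (NOT the paper's exact loss):
# pull together the 2D and 3D embeddings of the same molecule, push apart
# embeddings of different molecules, via a symmetric InfoNCE-style loss.
import torch
import torch.nn.functional as F

def inter_modality_contrastive_loss(emb_2d, emb_3d, temperature=0.1):
    """emb_2d, emb_3d: (batch, dim) outputs of the 2D and 3D encoders."""
    z2d = F.normalize(emb_2d, dim=-1)
    z3d = F.normalize(emb_3d, dim=-1)
    logits = z2d @ z3d.t() / temperature   # pairwise cosine similarities
    targets = torch.arange(z2d.size(0))    # matched pairs lie on the diagonal
    # Symmetric cross-entropy: 2D->3D and 3D->2D retrieval directions.
    return 0.5 * (F.cross_entropy(logits, targets)
                  + F.cross_entropy(logits.t(), targets))

# Toy usage with random tensors standing in for encoder outputs.
loss = inter_modality_contrastive_loss(torch.randn(8, 64), torch.randn(8, 64))
print(loss.item())
```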

