Molecular representation learning with language models and domain-relevant auxiliary tasks

11/26/2020
by Benedek Fabian, et al.

We apply a Transformer architecture, specifically BERT, to learn flexible, high-quality molecular representations for drug discovery problems. We study the impact of using different combinations of self-supervised tasks for pre-training, and present our results on the established Virtual Screening and QSAR benchmarks. We show that: i) the selection of appropriate self-supervised task(s) for pre-training has a significant impact on performance in subsequent downstream tasks such as Virtual Screening; ii) using auxiliary tasks with more domain relevance for chemistry, such as learning to predict calculated molecular properties, increases the fidelity of our learnt representations; iii) the molecular representations learnt by our model, MolBERT, improve upon the current state of the art on the benchmark datasets.
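To make the self-supervised pre-training setup concrete, here is a minimal, hypothetical sketch of BERT-style masked-token corruption applied to a SMILES string. The character-level tokenizer, the `mask_smiles_tokens` name, and the mask probability are illustrative assumptions, not the authors' implementation (which would pair this masked-language-modelling objective with auxiliary heads predicting calculated molecular properties).

```python
import random

MASK = "[MASK]"


def mask_smiles_tokens(smiles, mask_prob=0.15, seed=0):
    """BERT-style masked-token corruption of a SMILES string (illustrative).

    Uses naive character-level tokenization; real tokenizers group
    multi-character atoms such as 'Cl' and 'Br'. Returns (corrupted, labels),
    where labels[i] holds the original token at masked positions and None
    elsewhere, so the model is only trained to reconstruct masked tokens.
    """
    rng = random.Random(seed)
    corrupted, labels = [], []
    for tok in smiles:
        if rng.random() < mask_prob:
            corrupted.append(MASK)   # hide the token from the model
            labels.append(tok)       # reconstruction target
        else:
            corrupted.append(tok)
            labels.append(None)      # position excluded from the loss
    return corrupted, labels
```

During pre-training, each masked position contributes a cross-entropy term over the token vocabulary; positions with a `None` label are ignored.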


Related research

06/02/2022 - KPGT: Knowledge-Guided Pre-training of Graph Transformer for Molecular Property Prediction
Designing accurate deep learning models for molecular property predictio...

11/29/2022 - BARTSmiles: Generative Masked Language Models for Molecular Representations
We discover a robust self-supervised strategy tailored towards molecular...

11/12/2019 - SMILES Transformer: Pre-trained Molecular Fingerprint for Low Data Drug Discovery
In drug-discovery-related tasks such as virtual screening, machine learn...

09/01/2023 - Geometry-aware Line Graph Transformer Pre-training for Molecular Property Prediction
Molecular property prediction with deep learning has gained much attenti...

07/20/2023 - Fractional Denoising for 3D Molecular Pre-training
Coordinate denoising is a promising 3D molecular pre-training method, wh...

10/23/2019 - Emergent Properties of Finetuned Language Representation Models
Large, self-supervised transformer-based language representation models ...

02/19/2023 - Evaluating Representations with Readout Model Switching
Although much of the success of Deep Learning builds on learning good re...
