Enhancing Activity Prediction Models in Drug Discovery with the Ability to Understand Human Language

03/06/2023
by   Philipp Seidl, et al.

Activity and property prediction models are the central workhorses in drug discovery and materials science, but currently they have to be trained or fine-tuned for each new task. In principle, scientific language models could handle such low-data tasks without training or fine-tuning through their reported zero- and few-shot capabilities; however, their predictive quality at activity prediction is lacking. In this work, we envision a novel type of activity prediction model that can adapt to new prediction tasks at inference time by understanding textual information describing the task. To this end, we propose a new architecture with separate modules for chemical and natural language inputs, and a contrastive pre-training objective on data from large biochemical databases. In extensive experiments, we show that our method CLAMP yields improved predictive performance on few-shot learning benchmarks and zero-shot problems in drug discovery. We attribute the advances of our method to the modularized architecture and to our pre-training objective.
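The core idea of such a contrastive pre-training objective is to align embeddings of molecules with embeddings of textual task descriptions, so that matching pairs score higher than mismatched ones. The sketch below is a minimal, CLIP-style symmetric InfoNCE loss in NumPy; the function names, the `temperature` value, and the use of plain NumPy are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

def contrastive_loss(mol_emb, txt_emb, temperature=0.07):
    """CLIP-style symmetric InfoNCE loss over a batch of paired
    molecule/text embeddings (illustrative sketch, not CLAMP's code)."""
    # L2-normalize so the dot product equals cosine similarity
    mol = mol_emb / np.linalg.norm(mol_emb, axis=1, keepdims=True)
    txt = txt_emb / np.linalg.norm(txt_emb, axis=1, keepdims=True)
    logits = mol @ txt.T / temperature       # (batch, batch) similarity matrix
    labels = np.arange(len(logits))          # matching pairs lie on the diagonal

    def xent(l):
        # cross-entropy of the softmax over each row against the diagonal target
        l = l - l.max(axis=1, keepdims=True)  # subtract row max for stability
        logp = l - np.log(np.exp(l).sum(axis=1, keepdims=True))
        return -logp[labels, labels].mean()

    # average of molecule->text and text->molecule directions
    return 0.5 * (xent(logits) + xent(logits.T))
```

At inference time, a zero-shot prediction for a new task can then be read off as the similarity between a molecule's embedding and the embedding of the task's textual description, with no further training.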
