Transformers and the representation of biomedical background knowledge

02/04/2022
by Oskar Wysocki et al.

BioBERT and BioMegatron are Transformer models adapted to the biomedical domain by pre-training on publicly available biomedical corpora. As such, they have the potential to encode large-scale biological knowledge. We investigate the encoding and representation of biological knowledge in these models, and its potential utility for supporting inference in cancer precision medicine, namely the interpretation of the clinical significance of genomic alterations. We compare the performance of different Transformer baselines; we use probing to determine the consistency of encodings for distinct entities; and we use clustering methods to compare and contrast the internal properties of the embeddings for genes, variants, drugs and diseases. We show that these models do indeed encode biological knowledge, although some of it is lost in fine-tuning for specific tasks. Finally, we analyse how the models behave with respect to biases and imbalances in the dataset.
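The clustering analysis described above can be sketched in a few lines. This is a minimal, hypothetical illustration of the general approach, not the authors' pipeline: it uses synthetic vectors as stand-ins for the BioBERT/BioMegatron entity embeddings (which would normally be extracted from the model's hidden states) and clusters them with k-means, scoring the partition with the silhouette coefficient.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

# Synthetic stand-ins for contextual embeddings of biomedical entities.
# In the paper's setting these would be hidden-state vectors from
# BioBERT or BioMegatron; here we fabricate two well-separated groups
# (e.g. "gene-like" and "drug-like" vectors) purely to illustrate the
# clustering step.
rng = np.random.default_rng(0)
genes = rng.normal(loc=0.0, scale=0.5, size=(50, 768))
drugs = rng.normal(loc=3.0, scale=0.5, size=(50, 768))
embeddings = np.vstack([genes, drugs])

# Cluster the embeddings and measure how cleanly the partition
# separates the two entity types.
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(embeddings)
score = silhouette_score(embeddings, kmeans.labels_)
print(f"silhouette score: {score:.2f}")
```

A high silhouette score here indicates that the embedding space separates the entity types; applying the same measurement to real model embeddings is one way to probe whether entity-type structure is encoded.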


