MeDAL: Medical Abbreviation Disambiguation Dataset for Natural Language Understanding Pretraining

12/27/2020
by Zhi Wen, et al.

One of the biggest challenges prohibiting the use of many current NLP methods in clinical settings is the limited availability of public datasets. In this work, we present MeDAL, a large medical text dataset curated for abbreviation disambiguation and designed for natural language understanding pre-training in the medical domain. We pre-trained several models with common architectures on this dataset and showed empirically that such pre-training leads to improved performance and faster convergence when fine-tuning on downstream medical tasks.
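The pre-training objective is supervised abbreviation disambiguation: given a passage containing an abbreviation, a model predicts the correct expansion from the surrounding context. The sketch below illustrates one way such a setup could look in PyTorch; the architecture choice, vocabulary sizes, hyperparameters, and batch layout are illustrative assumptions, not the authors' released implementation.

```python
# Minimal sketch of abbreviation-disambiguation pre-training.
# Assumptions: vocabulary sizes, dimensions, and the toy batch below are
# illustrative only and do not come from the MeDAL release.
import torch
import torch.nn as nn

class AbbreviationDisambiguator(nn.Module):
    """BiLSTM encoder with a classification head over candidate expansions."""
    def __init__(self, vocab_size, num_expansions, embed_dim=256, hidden_dim=512):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim, padding_idx=0)
        self.encoder = nn.LSTM(embed_dim, hidden_dim, batch_first=True,
                               bidirectional=True)
        self.classifier = nn.Linear(2 * hidden_dim, num_expansions)

    def forward(self, token_ids, abbrev_pos):
        # token_ids: (batch, seq_len); abbrev_pos: (batch,) position of the abbreviation
        states, _ = self.encoder(self.embed(token_ids))            # (batch, seq_len, 2*hidden)
        idx = abbrev_pos.view(-1, 1, 1).expand(-1, 1, states.size(-1))
        abbrev_state = states.gather(1, idx).squeeze(1)            # (batch, 2*hidden)
        return self.classifier(abbrev_state)                       # logits over expansions

# One pre-training step: cross-entropy against the gold expansion label.
model = AbbreviationDisambiguator(vocab_size=30_000, num_expansions=20_000)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

tokens = torch.randint(1, 30_000, (8, 128))   # toy batch of token ids
positions = torch.randint(0, 128, (8,))       # abbreviation location per example
labels = torch.randint(0, 20_000, (8,))       # gold expansion ids

loss = loss_fn(model(tokens, positions), labels)
loss.backward()
optimizer.step()
```

After pre-training, the expansion classifier would typically be discarded and the encoder weights reused when fine-tuning on a downstream medical task, which is the transfer setup the abstract evaluates.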

Related research

12/10/2021
Unified Multimodal Pre-training and Prompt-based Tuning for Vision-Language Understanding and Generation
Most existing vision-language pre-training methods focus on understandin...

05/27/2023
An Investigation into the Effects of Pre-training Data Distributions for Pathology Report Classification
Pre-trained transformer models have demonstrated success across many nat...

04/24/2020
Data Annealing for Informal Language Understanding Tasks
There is a huge performance gap between formal and informal language und...

11/13/2019
Unsupervised Pre-training for Natural Language Generation: A Literature Review
Recently, unsupervised pre-training is gaining increasing popularity in ...

12/16/2022
Decoder Tuning: Efficient Language Understanding as Decoding
With the evergrowing sizes of pre-trained models (PTMs), it has been an ...

05/03/2020
How Can We Accelerate Progress Towards Human-like Linguistic Generalization?
This position paper describes and critiques the Pretraining-Agnostic Ide...

07/21/2022
Unsupervised pre-training of graph transformers on patient population graphs
Pre-training has shown success in different areas of machine learning, s...
