Generalized and Transferable Patient Language Representation for Phenotyping with Limited Data

02/24/2021
by Yuqi Si, et al.

The paradigm of representation learning through transfer learning has the potential to greatly enhance clinical natural language processing. In this work, we propose a multi-task pre-training and fine-tuning approach for learning generalized and transferable patient representations from medical language. The model is first pre-trained with different but related high-prevalence phenotypes and then fine-tuned on downstream target tasks. Our main contribution focuses on the impact this technique can have on low-prevalence phenotypes, a challenging task due to the dearth of data. We validate the representation obtained from pre-training, and fine-tune the multi-task pre-trained models on low-prevalence phenotypes, including 38 circulatory diseases, 23 respiratory diseases, and 17 genitourinary diseases. We find that multi-task pre-training increases learning efficiency and achieves consistently high performance across the majority of phenotypes. Most importantly, the multi-task pre-trained model is almost always either the best-performing model or performs tolerably close to it, a property we refer to as robustness. All of these results lead us to conclude that this multi-task transfer-learning architecture is a robust approach for developing generalized and transferable patient language representations across numerous phenotypes.
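The two-stage recipe described above, multi-task pre-training of a shared encoder on several high-prevalence phenotypes followed by fine-tuning on a low-prevalence target, can be sketched with NumPy. This is only a toy illustration on synthetic data, not the paper's actual architecture (which operates on medical language with a far richer encoder): the linear encoder, the 20-dimensional features, the number of tasks, and all hyperparameters below are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

# Synthetic stand-in for note-derived patient features: 200 patients, 20 dims,
# with 3 binary phenotype labels driven by the same underlying feature space.
X = rng.normal(size=(200, 20))
Y = (sigmoid(X @ rng.normal(size=(20, 3))) > 0.5).astype(float)

def bce(p, y):
    """Mean binary cross-entropy."""
    eps = 1e-9
    return -np.mean(y * np.log(p + eps) + (1 - y) * np.log(1 - p + eps))

# Stage 1: multi-task pre-training. A shared linear "encoder" feeds one
# classification head per high-prevalence phenotype (tasks 0 and 1 here).
enc = rng.normal(scale=0.1, size=(20, 8))     # shared patient representation
heads = rng.normal(scale=0.1, size=(8, 2))    # one head per pre-training task
lr, losses = 0.1, []
for _ in range(300):
    H = X @ enc                               # shared representation
    P = sigmoid(H @ heads)                    # per-task predictions
    losses.append(bce(P, Y[:, :2]))
    G = (P - Y[:, :2]) / len(X)               # dBCE/dlogits
    g_heads, g_enc = H.T @ G, X.T @ (G @ heads.T)
    heads -= lr * g_heads
    enc -= lr * g_enc

# Stage 2: fine-tuning on a low-prevalence target (task 2) with limited data.
# The pre-trained encoder is frozen; only a fresh head sees the 30 patients.
Xs, ys = X[:30], Y[:30, 2]
Hs = Xs @ enc
head2 = np.zeros(8)
for _ in range(300):
    p = sigmoid(Hs @ head2)
    head2 -= lr * Hs.T @ (p - ys) / len(Xs)

print(f"pre-training loss: {losses[0]:.3f} -> {losses[-1]:.3f}")
```

The design choice mirrored here is the one the abstract argues for: the representation learned jointly across related phenotypes is reused wholesale, so the low-prevalence task only has to fit a small head rather than the full model.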
