Medical Data Augmentation via ChatGPT: A Case Study on Medication Identification and Medication Event Classification

06/10/2023
by   Shouvon Sarker, et al.

Identifying key factors such as medications, diseases, and their relationships in electronic health records (EHRs) and clinical notes has a wide range of clinical applications. The N2C2 2022 competitions presented several tasks to promote the identification of such factors in EHRs using the Contextualized Medication Event Dataset (CMED), and pretrained large language models (LLMs) demonstrated exceptional performance on these tasks. This study explores the use of LLMs, specifically ChatGPT, for data augmentation to overcome the limited availability of annotated data for identifying key factors in EHRs. In addition, several pretrained BERT models, initially trained on large corpora such as Wikipedia and MIMIC, were fine-tuned on the augmented datasets to identify these factors. Experimental results on two EHR analysis tasks, medication identification and medication event classification, indicate that ChatGPT-based data augmentation improves performance on both.
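As a rough illustration of what LLM-based augmentation for medication identification could look like, the sketch below builds a paraphrase prompt and keeps only those generated sentences that still contain every annotated medication name. It is not the authors' pipeline: the prompt wording, the function names (`build_prompt`, `keeps_all_medications`, `augment`), and the stub generator `_fake_generate` are all hypothetical, and the actual call to ChatGPT is left as a caller-supplied function so that no particular API is assumed.

```python
import re
from typing import Callable, List


def build_prompt(sentence: str, medications: List[str], n: int = 3) -> str:
    """Build a paraphrase request (hypothetical wording, not from the paper)."""
    meds = ", ".join(medications)
    return (
        f"Rewrite the following clinical sentence in {n} different ways. "
        f"Keep these medication names exactly as written: {meds}.\n"
        f"Sentence: {sentence}"
    )


def keeps_all_medications(text: str, medications: List[str]) -> bool:
    """Return True if every medication appears in text as a whole token."""
    return all(
        re.search(rf"(?<!\w){re.escape(m)}(?!\w)", text, flags=re.IGNORECASE)
        for m in medications
    )


def augment(
    sentence: str,
    medications: List[str],
    generate: Callable[[str], List[str]],
) -> List[str]:
    """Return generated paraphrases that preserve all medication mentions.

    `generate` is any function mapping a prompt to candidate sentences,
    for example a thin wrapper around a chat-completion client.
    """
    candidates = generate(build_prompt(sentence, medications))
    seen = {sentence.strip().lower()}  # also drops echoes of the input
    kept = []
    for candidate in candidates:
        key = candidate.strip().lower()
        if key in seen:
            continue  # skip exact duplicates (case-insensitive)
        if keeps_all_medications(candidate, medications):
            kept.append(candidate.strip())
            seen.add(key)
    return kept


def _fake_generate(prompt: str) -> List[str]:
    """Stand-in for a real LLM call, used only to exercise the filter."""
    return [
        "The patient was started on metformin 500 mg twice daily.",
        "Patient began a diabetes drug.",  # medication name lost
        "the patient was started on metformin 500 mg twice daily.",  # duplicate
    ]


augmented = augment(
    "Patient started metformin 500 mg twice daily.",
    ["metformin"],
    _fake_generate,
)
```

Filtering on the annotated medication names is one simple way to keep generated sentences usable as labeled training examples; span offsets would still need to be recomputed for each kept sentence before fine-tuning a token-classification model.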


research
05/13/2020

ODVICE: An Ontology-Driven Visual Analytic Tool for Interactive Cohort Extraction

Increased availability of electronic health records (EHR) has enabled re...
research
10/11/2020

PHICON: Improving Generalization of Clinical Text De-identification Models via Data Augmentation

De-identification is the task of identifying protected health informatio...
research
03/05/2023

Effectiveness of Data Augmentation for Prefix Tuning with Limited Data

Recent work has demonstrated that tuning continuous prompts on large, fr...
research
03/27/2023

Adapting Pretrained Language Models for Solving Tabular Prediction Problems in the Electronic Health Record

We propose an approach for adapting the DeBERTa model for electronic hea...
research
11/13/2022

Textual Data Augmentation for Patient Outcomes Prediction

Deep learning models have demonstrated superior performance in various h...
research
01/13/2021

Adversarial Sample Enhanced Domain Adaptation: A Case Study on Predictive Modeling with Electronic Health Records

With the successful adoption of machine learning on electronic health re...
research
04/15/2021

Does BERT Pretrained on Clinical Notes Reveal Sensitive Data?

Large Transformers pretrained over clinical notes from Electronic Health...
