Large Language Models for Granularized Barrett's Esophagus Diagnosis Classification

08/16/2023
by Jenna Kefeli, et al.

Diagnostic codes for Barrett's esophagus (BE), a precursor to esophageal cancer, lack the granularity and precision needed for many research and clinical use cases. Extracting key diagnostic phenotypes from BE pathology reports requires laborious manual chart review. We developed a generalizable transformer-based method to automate this extraction. Using pathology reports from Columbia University Irving Medical Center with gastroenterologist-annotated targets, we performed binary dysplasia classification as well as granularized multi-class BE-related diagnosis classification. We used two clinically pre-trained large language models; the best model performed comparably to a highly tailored rule-based system developed on the same data. Binary dysplasia extraction achieves an F1-score of 0.964, and the multi-class model achieves an F1-score of 0.911. Our method is generalizable and faster to implement than a tailored rule-based approach.
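To make the reported metric concrete, here is a minimal sketch of how an F1-score can be computed for both a binary task and a multi-class task. It is not the authors' evaluation code. The label names are hypothetical, and the abstract does not say which averaging scheme was used for the multi-class score, so both macro and support-weighted averaging are shown as assumptions.

```python
from collections import Counter


def f1_per_class(y_true, y_pred):
    """Return {label: F1}, treating each label one-vs-rest."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted label gains a false positive
            fn[truth] += 1  # true label gains a false negative
    scores = {}
    for label in labels:
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0.0 when the denominator is 0
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores[label] = 2 * tp[label] / denom if denom else 0.0
    return scores


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 (assumed averaging scheme)."""
    scores = f1_per_class(y_true, y_pred)
    return sum(scores.values()) / len(scores)


def weighted_f1(y_true, y_pred):
    """Per-class F1 weighted by the number of true examples of each class."""
    scores = f1_per_class(y_true, y_pred)
    support = Counter(y_true)
    return sum(scores[label] * support[label] for label in scores) / len(y_true)


# Hypothetical labels for illustration only; not taken from the paper's data.
y_true = ["dysplasia", "dysplasia", "no_dysplasia", "no_dysplasia"]
y_pred = ["dysplasia", "no_dysplasia", "no_dysplasia", "no_dysplasia"]
binary_f1 = f1_per_class(y_true, y_pred)["dysplasia"]
```

For the binary task, the score of interest is the F1 of the positive class. For a multi-class task, macro averaging treats every diagnosis category equally, while weighted averaging lets frequent categories dominate, so the two can differ noticeably on imbalanced pathology data.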


