Validation of a Zero-Shot Learning Natural Language Processing Tool for Data Abstraction from Unstructured Healthcare Data

07/23/2023
by Basil Kaufmann, et al.

Objectives: To describe the development and validation of a zero-shot learning natural language processing (NLP) tool for abstracting data from unstructured text contained within PDF documents, such as those found within electronic health records.

Materials and Methods: A data abstraction tool based on the GPT-3.5 model from OpenAI was developed and compared to three physician human abstractors in terms of time to task completion and accuracy for abstracting data on 14 unique variables from a set of 199 de-identified radical prostatectomy pathology reports. The reports were processed by the software tool in vectorized and scanned formats to establish the impact of optical character recognition on data abstraction. The tool was assessed for superiority in data abstraction speed and for non-inferiority in accuracy.

Results: The human abstractors required a mean of 101 s per report for data abstraction, with times varying from 15 to 284 s. In comparison, the software tool required a mean of 12.8 s to process the vectorized reports and a mean of 15.8 s to process the scanned reports (P < 0.001). The overall accuracies of the three human abstractors were 94.7% [...] 2786 datapoints. The software tool had an overall accuracy of 94.2% [...] vectorized reports, proving to be non-inferior to the human abstractors at a margin of -10% [...] 88.7% [...] human abstractors.

Conclusion: The developed zero-shot learning NLP tool affords researchers levels of accuracy comparable to those of human abstractors, with significant time savings. Because it requires no task-specific model training, the tool is highly generalizable and can be used for a wide variety of data abstraction tasks, even outside the field of medicine.
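The abstract states that accuracy was assessed for non-inferiority at a margin, but it does not give the test procedure. The sketch below shows one common way such a check can be done: a one-sided Wald lower confidence bound on the difference of two proportions. The function name, the significance level of 0.025, the unpaired treatment of the datapoints, and the counts in the usage lines are all assumptions for illustration; they are not the study's method or data.

```python
from math import sqrt
from statistics import NormalDist


def non_inferiority_check(correct_tool, n_tool, correct_human, n_human,
                          margin=0.10, alpha=0.025):
    """Return (lower_bound, is_non_inferior) for tool vs. human accuracy.

    Hypothetical helper: the tool counts as non-inferior when the one-sided
    lower confidence bound of (tool accuracy - human accuracy) lies above
    -margin. Datapoints are treated as independent, which is a simplification.
    """
    p_tool = correct_tool / n_tool
    p_human = correct_human / n_human
    diff = p_tool - p_human
    # Wald standard error of the difference of two independent proportions.
    se = sqrt(p_tool * (1 - p_tool) / n_tool
              + p_human * (1 - p_human) / n_human)
    # One-sided critical value for the assumed significance level.
    z = NormalDist().inv_cdf(1 - alpha)
    lower_bound = diff - z * se
    return lower_bound, lower_bound > -margin


# Made-up counts, NOT taken from the study.
lower, ok = non_inferiority_check(940, 1000, 950, 1000)
print(f"lower bound = {lower:.3f}, non-inferior = {ok}")
```

Because the same reports were abstracted by both the tool and the humans, a paired analysis may be more appropriate in practice; the unpaired form is used here only to keep the sketch short.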

