Estimating Redundancy in Clinical Text

05/25/2021
by Thomas Searle, et al.

The current mode of use of Electronic Health Records (EHRs) elicits text redundancy. Clinicians often populate new documents by duplicating existing notes, then updating them accordingly. Data duplication can lead to the propagation of errors, inconsistencies and misreporting of care. Therefore, quantifying information redundancy can play an essential role in evaluating innovations that operate on clinical narratives. This work is a quantitative examination of information redundancy in EHR notes. We present and evaluate two strategies to measure redundancy: an information-theoretic approach and a lexicosyntactic and semantic model. We evaluate the measures by training large Transformer-based language models using clinical text from a large openly available US-based ICU dataset and a large multi-site UK-based Trust. Comparing the information-theoretic content of the trained models with that of open-domain language models shows that models trained on clinical text are 1.5x to 3x less efficient than those trained on open-domain corpora. Manual evaluation shows a high correlation with lexicosyntactic and semantic redundancy, with averages 43 to 65
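To make the two notions of redundancy in the abstract concrete, here is a minimal Python sketch. It is not the authors' method: the paper trains Transformer language models, whereas this sketch swaps in two much simpler stand-ins. The function names (`shannon_redundancy`, `ngram_overlap`) and the whitespace tokenisation are assumptions made for illustration only. The first function computes the classical Shannon redundancy of a token sequence from its unigram distribution; the second estimates how much of a new note was likely copied from an earlier one by counting shared word n-grams.

```python
import math
from collections import Counter


def unigram_entropy_bits(tokens):
    """Shannon entropy (bits per token) of the empirical unigram distribution."""
    counts = Counter(tokens)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def shannon_redundancy(tokens):
    """Redundancy 1 - H / H_max, where H_max = log2(vocabulary size).

    Illustrative stand-in only: a unigram model ignores context, unlike
    the Transformer language models described in the abstract.
    """
    vocab_size = len(set(tokens))
    if vocab_size <= 1:
        # A single repeated token is fully predictable; empty input has no redundancy.
        return 1.0 if tokens else 0.0
    return 1.0 - unigram_entropy_bits(tokens) / math.log2(vocab_size)


def ngrams(tokens, n):
    """All contiguous n-grams of a token list, as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def ngram_overlap(previous_tokens, new_tokens, n=3):
    """Fraction of n-grams in the new note that also occur in the previous note.

    A crude lexical proxy for copy-forward duplication (assumption: notes
    are already tokenised, here by whitespace).
    """
    new_grams = ngrams(new_tokens, n)
    if not new_grams:
        return 0.0
    seen = set(ngrams(previous_tokens, n))
    return sum(1 for g in new_grams if g in seen) / len(new_grams)


if __name__ == "__main__":
    previous = "patient stable no acute distress".split()
    new = "patient stable no acute distress today".split()
    print(shannon_redundancy(["a", "a", "a", "b"]))
    print(ngram_overlap(previous, new, n=3))
```

A high `ngram_overlap` between consecutive notes for the same patient would suggest copy-forward behaviour, while `shannon_redundancy` gives a corpus-level view of predictability. Neither captures semantic redundancy, which the abstract treats separately.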
