Cross-institution text mining to uncover clinical associations: a case study relating social factors and code status in intensive care medicine

01/16/2023
by Madhumita Sushil, et al.

Objective: Text mining of clinical notes embedded in electronic medical records is increasingly used to extract patient characteristics that are otherwise unavailable or only partly available, in order to assess their association with relevant health outcomes. Because the manual data labeling needed to develop text mining models is resource intensive, we investigated whether off-the-shelf text mining models developed at external institutions, together with limited within-institution labeled data, could be used to reliably extract study variables for association studies.

Materials and Methods: We developed multiple text mining models on different combinations of within-institution and external-institution data to extract social factors from discharge reports of intensive care patients. Subsequently, we assessed the associations between social factors and having a do-not-resuscitate/intubate code.

Results: Important differences were found between associations based on manually labeled social factors and those based on text-mined social factors in three out of five cases. Adapting external-institution text mining models using manually labeled within-institution data resulted in models with higher F1-scores, but not in meaningfully different associations.

Discussion: While text mining made it possible to scale analyses to larger samples, which led to the discovery of a larger number of associations, the estimates may be unreliable. Confirmation is needed with better text mining models, ideally on a larger manually labeled dataset.

Conclusion: The currently used text mining models were not sufficiently accurate to be used reliably in an association study. Model adaptation using within-institution data did not improve the estimates. Further research is needed to set conditions for reliable use of text mining in medical research.
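To illustrate why a text mining model with a seemingly good F1-score can still distort an association estimate, here is a minimal sketch. All counts, the sensitivity and specificity values, and the function names are hypothetical assumptions made for illustration; none of them come from the study, and the study's own statistical method is not stated here. The sketch assumes a binary social factor, a binary outcome (having a do-not-resuscitate/intubate code), and extraction errors that do not depend on the outcome.

```python
def f1_score(tp, fp, fn):
    """F1 from true positives, false positives, and false negatives."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def odds_ratio(a, b, c, d):
    """Odds ratio of a 2x2 table.

    a: factor present, outcome present   b: factor present, outcome absent
    c: factor absent,  outcome present   d: factor absent,  outcome absent
    """
    return (a * d) / (b * c)


def misclassify(a, b, c, d, sensitivity, specificity):
    """Expected 2x2 table when the factor is extracted with error.

    Assumes the error rates are the same for patients with and without
    the outcome (non-differential misclassification).
    """
    a_obs = sensitivity * a + (1 - specificity) * c
    c_obs = (1 - sensitivity) * a + specificity * c
    b_obs = sensitivity * b + (1 - specificity) * d
    d_obs = (1 - sensitivity) * b + specificity * d
    return a_obs, b_obs, c_obs, d_obs


# Hypothetical manually labeled table (illustrative counts only).
true_table = (40, 60, 20, 80)

# Hypothetical extractor: 80% sensitivity, 90% specificity.
sens, spec = 0.8, 0.9
mined_table = misclassify(*true_table, sensitivity=sens, specificity=spec)

# Extraction quality over the 100 patients with and 100 without the factor.
tp = sens * 100
fn = (1 - sens) * 100
fp = (1 - spec) * 100

print(f"F1 of extractor:          {f1_score(tp, fp, fn):.2f}")
print(f"Odds ratio (manual):      {odds_ratio(*true_table):.2f}")
print(f"Odds ratio (text-mined):  {odds_ratio(*mined_table):.2f}")
```

With these made-up numbers, the extractor reaches an F1-score of about 0.84, yet the expected odds ratio shrinks from about 2.67 to about 1.96. This is one simple mechanism by which associations based on text-mined variables can differ from those based on manual labels; real extraction errors may also depend on the outcome, which can bias estimates in either direction.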


