Advancing Italian Biomedical Information Extraction with Large Language Models: Methodological Insights and Multicenter Practical Application

06/08/2023
by Claudio Crema, et al.

The introduction of computerized medical records in hospitals has reduced burdensome operations such as manual writing and information retrieval. However, the data contained in medical records are still largely underutilized, primarily because extracting them from unstructured textual records takes time and effort. Information Extraction, a subfield of Natural Language Processing, can help clinical practitioners overcome this limitation by means of automated text-mining pipelines. In this work, we created the first Italian neuropsychiatric Named Entity Recognition dataset, PsyNIT, and used it to develop a Large Language Model for this task. Moreover, we conducted several experiments with three external independent datasets to implement an effective multicenter model, with an overall F1-score of 84.77 [...] 86.44 [...] annotation process and (ii) a fine-tuning strategy that combines classical methods with a "few-shot" approach. This allowed us to establish methodological guidelines that pave the way for future implementations in this field and allow Italian hospitals to tap into important research opportunities.

research
03/03/2020

Med7: a transferable clinical natural language processing model for electronic health records

The field of clinical natural language processing has been advanced sign...
research
06/27/2023

CamemBERT-bio: a Tasty French Language Model Better for your Health

Clinical data in hospitals are increasingly accessible for research thro...
research
01/06/2019

Named Entity Recognition in Electronic Health Records Using Transfer Learning Bootstrapped Neural Networks

Neural networks (NNs) have become the state of the art in many machine l...
research
10/02/2020

Multi-domain Clinical Natural Language Processing with MedCAT: the Medical Concept Annotation Toolkit

Electronic health records (EHR) contain large volumes of unstructured te...
research
03/08/2023

Does Synthetic Data Generation of LLMs Help Clinical Text Mining?

Recent advancements in large language models (LLMs) have led to the deve...
research
04/18/2023

A Biomedical Entity Extraction Pipeline for Oncology Health Records in Portuguese

Textual health records of cancer patients are usually protracted and hig...
research
04/27/2020

Automatic Textual Evidence Mining in COVID-19 Literature

We created this EVIDENCEMINER system for automatic textual evidence mini...
