Quantification of BERT Diagnosis Generalizability Across Medical Specialties Using Semantic Dataset Distance

08/14/2020
by Mihir P. Khambete, et al.

Deep learning models in healthcare may fail to generalize to data from unseen corpora. Additionally, no quantitative metric exists to predict how existing models will perform on new data. Previous studies demonstrated that NLP models of medical notes generalize variably between institutions, but did not examine other levels of healthcare organization. We measured the generalizability of SciBERT diagnosis sentiment classifiers between medical specialties using EHR sentences from MIMIC-III. Models trained on one specialty performed better on internal test sets than on mixed or external test sets (mean AUCs 0.92, 0.87, and 0.83, respectively; p = 0.016). Models trained on more specialties achieved better test performance (p < 1e-4). Model performance on new corpora was directly correlated with the similarity between train and test sentence content (p < 1e-4). Future studies should assess additional axes of generalization to ensure deep learning models fulfil their intended purpose across institutions, specialties, and practices.


