Publicly Available Clinical BERT Embeddings

04/06/2019
by   Emily Alsentzer, et al.

Contextual word embedding models such as ELMo (Peters et al., 2018) and BERT (Devlin et al., 2018) have dramatically improved performance for many natural language processing (NLP) tasks in recent months. However, these models have been minimally explored on specialty corpora, such as clinical text; moreover, in the clinical domain, no publicly available pre-trained BERT models yet exist. In this work, we address this need by exploring and releasing BERT models for clinical text: one for generic clinical text and another for discharge summaries specifically. We demonstrate that using a domain-specific model yields performance improvements on three common clinical NLP tasks as compared to nonspecific embeddings. We find that these domain-specific models are not as performant on two clinical de-identification tasks, and we argue that this is a natural consequence of the differences between de-identified source text and synthetically non-de-identified task text.
