GERNERMED++: Transfer Learning in German Medical NLP

06/29/2022
by Johann Frei, et al.

We present a statistical model for German medical natural language processing, trained for named entity recognition (NER) and released as an open, publicly available model. The work is a refined successor to our first GERNERMED model, which it substantially outperforms. We demonstrate that combining multiple techniques, namely transfer learning on pretrained deep language models (LMs), word alignment, and neural machine translation, yields strong entity recognition performance. Because open, public medical entity recognition models for German text are scarce, this work offers the German medical NLP research community a baseline model. Since our model is based on public English data, its weights are provided without legal restrictions on usage and distribution. The sample code and the statistical model are available at: https://github.com/frankkramer-lab/GERNERMED-pp
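The abstract names word alignment and neural machine translation but does not spell out how they interact. One plausible reading, offered here only as an assumption, is that entity annotations on English sentences are carried over to their German translations through token-level alignments. The minimal sketch below illustrates that idea. The function name `project_spans`, the example sentences, the labels, and the hand-written alignment pairs are all hypothetical and are not taken from the GERNERMED++ code base.

```python
from typing import List, Tuple

# (start_token, end_token_exclusive, label) over token indices
Span = Tuple[int, int, str]
# (source_token_index, target_token_index) pairs from a word aligner
Alignment = List[Tuple[int, int]]


def project_spans(spans: List[Span], alignment: Alignment) -> List[Span]:
    """Map entity spans from source tokens onto target tokens.

    Each source span becomes the smallest contiguous target span that
    covers all target tokens aligned to it. Spans without any aligned
    target token are dropped.
    """
    projected: List[Span] = []
    for start, end, label in spans:
        targets = sorted(t for s, t in alignment if start <= s < end)
        if not targets:
            continue  # nothing aligned: the annotation cannot be projected
        projected.append((targets[0], targets[-1] + 1, label))
    return projected


# Hypothetical English sentence and a German translation (illustration only).
src_tokens = ["Take", "aspirin", "100", "mg", "daily"]
tgt_tokens = ["Nehmen", "Sie", "täglich", "100", "mg", "Aspirin", "ein"]

# Hand-written alignment pairs standing in for the output of a real aligner.
alignment = [(0, 0), (0, 6), (1, 5), (2, 3), (3, 4), (4, 2)]

# Illustrative entity annotations on the English tokens.
spans = [(1, 2, "Drug"), (2, 4, "Strength"), (4, 5, "Frequency")]

for start, end, label in project_spans(spans, alignment):
    print(label, tgt_tokens[start:end])
```

Taking the smallest covering span keeps each projected entity contiguous, at the cost of occasionally swallowing unaligned tokens in between. A real pipeline would likely need to filter noisy alignments before using the projected data for fine-tuning.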


