Cross-Lingual Knowledge Transfer for Clinical Phenotyping

Clinical phenotyping enables the automatic extraction of clinical conditions from patient records, which can benefit doctors and clinics worldwide. However, current state-of-the-art models are mostly applicable to clinical notes written in English. We therefore investigate cross-lingual knowledge transfer strategies for performing this task in clinics that do not use English and have only a small amount of in-domain data available. We evaluate these strategies for a Greek and a Spanish clinic, leveraging clinical notes from different clinical domains such as cardiology, oncology, and the ICU. Our results reveal two strategies that outperform the state of the art: translation-based methods in combination with domain-specific encoders, and cross-lingual encoders plus adapters. We find that these strategies perform especially well for classifying rare phenotypes, and we advise on which method to prefer in which situation. Our results show that using multilingual data overall improves clinical phenotyping models and can compensate for data sparseness.
