Closing the Gap: Joint De-Identification and Concept Extraction in the Clinical Domain

05/19/2020
by Lukas Lange, et al.

Exploiting natural language processing in the clinical domain requires de-identification, i.e., anonymization of personal information in texts. However, current research considers de-identification and downstream tasks, such as concept extraction, only in isolation and does not study the effects of de-identification on other tasks. In this paper, we close this gap by reporting concept extraction performance on automatically anonymized data and investigating joint models for de-identification and concept extraction. In particular, we propose a stacked model with restricted access to privacy-sensitive information and a multitask model. We set the new state of the art on benchmark datasets in English (96.1 88.9
