SECNLP: A Survey of Embeddings in Clinical Natural Language Processing

03/04/2019
by   Kalyan KS, et al.

Traditional representations such as bag-of-words are high-dimensional and sparse, and they ignore word order as well as syntactic and semantic information. Distributed vector representations, or embeddings, map variable-length text to dense fixed-length vectors and capture prior knowledge that can be transferred to downstream tasks. Even though embeddings have become the de facto standard for representations in deep-learning-based NLP tasks in both the general and clinical domains, there is no survey paper that presents a detailed review of embeddings in clinical natural language processing. In this survey paper, we discuss various medical corpora and their characteristics as well as medical codes, and present a brief overview and comparison of popular embedding models. We classify clinical embeddings into nine types and discuss each type in detail. We discuss various evaluation methods, followed by possible solutions to various challenges in clinical embeddings. Finally, we conclude with some future directions that will advance research in clinical embeddings.
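
As a rough illustration of the contrast the abstract draws, the sketch below builds a bag-of-words count vector and a fixed-length dense vector for short clinical-style phrases. It is not taken from the survey: the vocabulary, the example phrases, the helper names `bag_of_words` and `mean_embedding`, and the randomly initialised lookup table are all assumptions made for the example (a real embedding table would be pretrained on a corpus), and mean pooling is only one simple way to get a fixed-length vector.

```python
import random


def bag_of_words(tokens, vocab):
    """Count vector with one slot per vocabulary word; word order is lost."""
    index = {word: i for i, word in enumerate(vocab)}
    vec = [0] * len(vocab)
    for tok in tokens:
        if tok in index:
            vec[index[tok]] += 1
    return vec


def mean_embedding(tokens, table, dim):
    """Average the dense vectors of known tokens; output length is always dim."""
    known = [table[tok] for tok in tokens if tok in table]
    if not known:
        return [0.0] * dim
    return [sum(vec[i] for vec in known) / len(known) for i in range(dim)]


# Hypothetical toy vocabulary and embedding table (random values, NOT pretrained).
vocab = ["patient", "denies", "chest", "pain", "fever", "reports", "no", "cough"]
dim = 4
rng = random.Random(0)
table = {word: [rng.uniform(-1.0, 1.0) for _ in range(dim)] for word in vocab}

a = "patient denies chest pain".split()
b = "pain chest denies patient".split()  # same words as a, different order
c = "patient reports fever".split()      # shorter phrase

bow_a = bag_of_words(a, vocab)
bow_b = bag_of_words(b, vocab)
emb_a = mean_embedding(a, table, dim)
emb_c = mean_embedding(c, table, dim)

print("BoW length grows with the vocabulary:", len(bow_a))
print("BoW is identical for reordered text:", bow_a == bow_b)
print("Dense vector length is fixed:", len(emb_a), len(emb_c))
```

The bag-of-words vector has one slot per vocabulary entry, so it becomes very long and mostly zero for a realistic vocabulary, while the dense vector keeps the same small length for inputs of any size. Mean pooling also discards word order; it is used here only to show the fixed-length property.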

research
02/22/2019

Enhancing Clinical Concept Extraction with Contextual Embedding

Neural network-based representations ("embeddings") have dramatically ad...
research
09/15/2021

Learning Mathematical Properties of Integers

Embedding words in high-dimensional vector spaces has proven valuable in...
research
11/03/2018

Learning Contextual Hierarchical Structure of Medical Concepts with Poincairé Embeddings to Clarify Phenotypes

Biomedical association studies are increasingly done using clinical conc...
research
02/18/2022

Evaluating the Construct Validity of Text Embeddings with Application to Survey Questions

Text embedding models from Natural Language Processing can map text data...
research
04/17/2021

Robust Embeddings Via Distributions

Despite recent monumental advances in the field, many Natural Language P...
research
10/01/2019

Writing habits and telltale neighbors: analyzing clinical concept usage patterns with sublanguage embeddings

Natural language processing techniques are being applied to increasingly...
research
06/25/2017

Automated text summarisation and evidence-based medicine: A survey of two domains

The practice of evidence-based medicine (EBM) urges medical practitioner...
