Writing habits and telltale neighbors: analyzing clinical concept usage patterns with sublanguage embeddings

10/01/2019
by Denis Newman-Griffis, et al.

Natural language processing techniques are being applied to increasingly diverse types of electronic health records, and can benefit from an in-depth understanding of the distinguishing characteristics of medical document types. We present a method for characterizing the usage patterns of clinical concepts among different document types, in order to capture semantic differences beyond the lexical level. By training concept embeddings on clinical documents of different types and measuring the differences in their nearest neighborhood structures, we are able to measure divergences in concept usage while correcting for noise in embedding learning. Experiments on the MIMIC-III corpus demonstrate that our approach captures clinically relevant differences in concept usage and provides an intuitive way to explore semantic characteristics of clinical document collections.
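
The idea of comparing nearest-neighbor structure across embedding spaces can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes two embedding matrices whose rows index the same concepts (one matrix per document type), uses cosine similarity, and scores each concept by the Jaccard overlap of its k nearest neighbors in the two spaces. The function names `nearest_neighbors` and `neighborhood_overlap`, the choice of Jaccard overlap, and the random demo data are all hypothetical choices made for illustration.

```python
import numpy as np


def nearest_neighbors(emb, k):
    """Indices of the k nearest rows (by cosine similarity) for each row of emb."""
    norms = np.linalg.norm(emb, axis=1, keepdims=True)
    unit = emb / np.clip(norms, 1e-12, None)  # guard against zero vectors
    sims = unit @ unit.T
    np.fill_diagonal(sims, -np.inf)  # a concept is not its own neighbor
    return np.argsort(-sims, axis=1)[:, :k]


def neighborhood_overlap(emb_a, emb_b, k=5):
    """Per-concept Jaccard overlap of k-NN sets in two embedding spaces.

    Assumes row i of emb_a and row i of emb_b represent the same concept.
    Low overlap suggests the concept is used differently in the two corpora.
    """
    nn_a = nearest_neighbors(emb_a, k)
    nn_b = nearest_neighbors(emb_b, k)
    overlaps = []
    for row_a, row_b in zip(nn_a, nn_b):
        set_a, set_b = set(row_a.tolist()), set(row_b.tolist())
        overlaps.append(len(set_a & set_b) / len(set_a | set_b))
    return np.array(overlaps)


# Demo on random vectors standing in for trained concept embeddings.
rng = np.random.default_rng(0)
emb_type_a = rng.normal(size=(50, 16))  # hypothetical: one document type
emb_type_b = rng.normal(size=(50, 16))  # hypothetical: another document type
scores = neighborhood_overlap(emb_type_a, emb_type_b, k=5)
print("mean overlap across concepts:", scores.mean())
```

One plausible way to account for noise in embedding learning, again an assumption rather than the paper's stated procedure, is to train several replicate embeddings per document type with different random seeds and compare the cross-type overlap against the overlap between replicates of the same type.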
