Hierarchical Annotation for Building A Suite of Clinical Natural Language Processing Tasks: Progress Note Understanding

by Yanjun Gao, et al.
Loyola University Chicago
University of Wisconsin-Madison
Harvard University

Applying natural language processing methods to electronic health record (EHR) data is a growing field. Existing corpora and annotations focus on modeling textual features and relation prediction. However, there is a paucity of annotated corpora built to model clinical diagnostic thinking, a process involving text understanding, domain knowledge abstraction, and reasoning. This work introduces a hierarchical annotation schema with three stages to address clinical text understanding, clinical reasoning, and summarization. We created an annotated corpus based on an extensive collection of publicly available daily progress notes, a type of EHR documentation that is collected in time series in a problem-oriented format. The conventional format for a progress note follows the Subjective, Objective, Assessment and Plan (SOAP) headings. We also define a new suite of tasks, Progress Note Understanding, comprising three tasks that utilize the three annotation stages. The novel suite of tasks was designed to train and evaluate future NLP models for clinical text understanding, clinical knowledge representation, inference, and summarization.



