Topic Modeling for Classification of Clinical Reports

06/19/2017
by Efsun Sarioglu Kayi, et al.

Electronic health records (EHRs) contain important clinical information about patients. Efficient and effective use of this information could supplement or even replace manual chart review as a means of studying and improving the quality and safety of healthcare delivery. However, some of these clinical data are in the form of free text and require pre-processing before use in automated systems. A common source of free text is radiology reports, which radiologists typically dictate to explain their interpretations. We sought to demonstrate machine learning classification of computed tomography (CT) imaging reports into binary outcomes, i.e. positive and negative for fracture, using both regular text classification and classifiers based on topic modeling. Topic modeling provides interpretable themes (topic distributions) in reports, a representation that is more compact than the commonly used bag-of-words representation and can be processed faster than raw text in subsequent automated processes. We demonstrate new classifiers based on this topic modeling representation of the reports. The aggregate topic classifier (ATC) and the confidence-based topic classifier (CTC) each use a single topic, selected from the training dataset according to different measures, to classify the reports in the test dataset. Alternatively, the similarity-based topic classifier (STC) measures the similarity between the reports' topic distributions to determine the predicted class. Our proposed topic modeling-based classifier systems are shown to be competitive with existing text classification techniques and provide an efficient and interpretable representation.
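To make the similarity-based idea concrete, here is a minimal sketch of classifying a report from its topic distribution. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not say which similarity measure or decision rule STC uses, so cosine similarity and a nearest-report rule are assumed here. The function names and the three-topic distributions are hypothetical, and the topic distributions are taken as already computed by some topic model.

```python
import math


def cosine_similarity(p, q):
    """Cosine similarity between two topic distributions of equal length."""
    dot = sum(a * b for a, b in zip(p, q))
    norm_p = math.sqrt(sum(a * a for a in p))
    norm_q = math.sqrt(sum(b * b for b in q))
    if norm_p == 0.0 or norm_q == 0.0:
        return 0.0
    return dot / (norm_p * norm_q)


def predict_by_similarity(train_topics, train_labels, test_topic):
    """Return the label of the training report most similar to the test report.

    Assumption: a nearest-report rule; the abstract does not specify the rule.
    """
    best_label, best_score = None, -1.0
    for topics, label in zip(train_topics, train_labels):
        score = cosine_similarity(topics, test_topic)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical topic distributions (three topics per report), made up for
# illustration; in practice they would come from a fitted topic model.
train_topics = [
    [0.80, 0.15, 0.05],
    [0.70, 0.20, 0.10],
    [0.10, 0.30, 0.60],
    [0.05, 0.25, 0.70],
]
train_labels = ["positive", "positive", "negative", "negative"]

test_topic = [0.75, 0.15, 0.10]
prediction = predict_by_similarity(train_topics, train_labels, test_topic)
print(prediction)
```

Each report here is reduced to a handful of topic proportions instead of a vocabulary-sized bag-of-words vector, which is the compactness the abstract refers to.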

research
05/08/2020

Comparative Analysis of Text Classification Approaches in Electronic Health Records

Text classification tasks which aim at harvesting and/or organizing info...
research
07/06/2016

Bag of Tricks for Efficient Text Classification

This paper explores a simple and efficient baseline for text classificat...
research
01/13/2023

Natural Language Processing of Aviation Occurrence Reports for Safety Management

Occurrence reporting is a commonly used method in safety management syst...
research
10/31/2019

Human-centric Metric for Accelerating Pathology Reports Annotation

Pathology reports contain useful information such as the main involved o...
research
02/18/2021

Regular Expressions for Fast-response COVID-19 Text Classification

Text classifiers are at the core of many NLP applications and use a vari...
research
11/14/2018

From Free Text to Clusters of Content in Health Records: An Unsupervised Graph Partitioning Approach

Electronic Healthcare records contain large volumes of unstructured data...
research
07/07/2018

From Text to Topics in Healthcare Records: An Unsupervised Graph Partitioning Methodology

Electronic Healthcare Records contain large volumes of unstructured data...
