Comparative Analysis of Text Classification Approaches in Electronic Health Records

05/08/2020
by Aurelie Mascio, et al.

Text classification tasks that harvest and/or organize information from electronic health records are pivotal to supporting clinical and translational research. However, they present specific challenges compared to other classification tasks, notably because of the particular medical lexicon and language used in clinical records. Recent advances in embedding methods have shown promising results for several clinical tasks, yet there is no exhaustive comparison of such approaches with other commonly used word representations and classification models. In this work, we analyse the impact of various word representations, text pre-processing steps and classification algorithms on performance across four different text classification tasks. The results show that traditional approaches, when tailored to the specific language and structure of the text inherent to the classification task, can match or exceed the performance of more recent ones based on contextual embeddings such as BERT.
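To make the idea of a "traditional" word representation concrete, the following is a minimal sketch of a TF-IDF bag-of-words representation paired with a nearest-centroid classifier, written with the Python standard library only. It is not the authors' pipeline: the tokenizer, the classifier choice, the function names, the example notes and the labels are all assumptions made purely for illustration.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase and split on whitespace (real clinical text needs more care)."""
    return text.lower().split()


def fit_idf(docs):
    """Compute a smoothed inverse document frequency for every training token."""
    n = len(docs)
    df = Counter(tok for doc in docs for tok in set(tokenize(doc)))
    return {tok: math.log((1 + n) / (1 + count)) + 1.0 for tok, count in df.items()}


def tfidf(text, idf):
    """Return an L2-normalised TF-IDF vector as a dict; unseen tokens are dropped."""
    counts = Counter(tok for tok in tokenize(text) if tok in idf)
    vec = {tok: c * idf[tok] for tok, c in counts.items()}
    norm = math.sqrt(sum(v * v for v in vec.values()))
    return {tok: v / norm for tok, v in vec.items()} if norm else {}


def fit_centroids(docs, labels, idf):
    """Average the TF-IDF vectors of each class into one centroid per label."""
    sums = defaultdict(lambda: defaultdict(float))
    sizes = Counter(labels)
    for doc, label in zip(docs, labels):
        for tok, v in tfidf(doc, idf).items():
            sums[label][tok] += v
    return {
        label: {tok: v / sizes[label] for tok, v in vec.items()}
        for label, vec in sums.items()
    }


def predict(text, idf, centroids):
    """Pick the label whose centroid has the largest dot product with the text."""
    vec = tfidf(text, idf)

    def score(label):
        centroid = centroids[label]
        return sum(v * centroid.get(tok, 0.0) for tok, v in vec.items())

    return max(centroids, key=score)


# Hypothetical toy notes and labels, invented for this sketch only.
train_docs = [
    "patient reports chest pain and palpitations",
    "ecg shows atrial fibrillation",
    "patient reports low mood and poor sleep",
    "ongoing anxiety and low mood",
]
train_labels = ["cardiology", "cardiology", "psychiatry", "psychiatry"]

idf = fit_idf(train_docs)
centroids = fit_centroids(train_docs, train_labels, idf)

print(predict("chest pain with palpitations", idf, centroids))
print(predict("low mood and anxiety", idf, centroids))
```

In a sketch like this, tailoring the approach to the text would mostly happen in `tokenize` and in the choice of vocabulary, which is where domain-specific pre-processing of clinical language would plug in.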


