Improving Medical Short Text Classification with Semantic Expansion Using Word-Cluster Embedding

12/05/2018
by Ying Shen, et al.

Automatic text classification (TC) can be applied to real-world problems such as classifying in-patient discharge summaries and medical text reports, which helps make medical documents easier for doctors to understand. However, sentences in electronic medical records (EMRs) are shorter than those in general-domain text, which leads to a lack of semantic features and to semantic ambiguity. To tackle this challenge, we propose adding word-cluster embeddings to a deep neural network to improve short text classification. Concretely, we first use hierarchical agglomerative clustering to cluster the word vectors in the semantic space. We then calculate the center vector of each cluster, which represents the implicit topic information of the words in that cluster. Finally, we expand each word vector with its cluster center vector and implement classifiers using a CNN and an LSTM, respectively. To evaluate the proposed method, we conduct experiments on the public TREC data set and on a medical short-sentence data set that we constructed and released. The experimental results demonstrate that the proposed method outperforms state-of-the-art baselines in short sentence classification in both the medical domain and the general domain.
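The three steps in the abstract (cluster the word vectors, compute each cluster's center, expand each word vector with its center) can be illustrated with a minimal sketch. It is not the authors' implementation. The abstract does not state the linkage criterion or how the expansion is performed, so centroid linkage and simple concatenation are assumptions here, and the function names and toy vectors are hypothetical.

```python
from math import dist  # Euclidean distance, available in Python 3.8+


def centroid(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def agglomerative_clusters(vectors, num_clusters):
    """Naive bottom-up clustering with centroid linkage (an assumption).

    Starts with one cluster per vector and repeatedly merges the two
    clusters whose centroids are closest until num_clusters remain.
    Returns a list of clusters, each a list of indices into `vectors`.
    """
    clusters = [[i] for i in range(len(vectors))]
    while len(clusters) > num_clusters:
        centers = [centroid([vectors[i] for i in c]) for c in clusters]
        best = None  # (distance, index_a, index_b) of the closest pair
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = dist(centers[a], centers[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return clusters


def expand_embeddings(word_vectors, num_clusters):
    """Concatenate each word vector with the center of its cluster."""
    words = list(word_vectors)
    vectors = [word_vectors[w] for w in words]
    expanded = {}
    for cluster in agglomerative_clusters(vectors, num_clusters):
        center = centroid([vectors[i] for i in cluster])
        for i in cluster:
            expanded[words[i]] = list(vectors[i]) + center
    return expanded


# Toy 2-dimensional "embeddings" with made-up values, for illustration only.
toy = {
    "fever": [0.9, 0.1],
    "cough": [0.8, 0.2],
    "tablet": [0.1, 0.9],
    "capsule": [0.2, 0.8],
}
expanded = expand_embeddings(toy, num_clusters=2)
```

In this sketch each expanded vector is twice the original dimension: the first half is the word's own embedding and the second half is shared by every word in the same cluster. The expanded vectors would then replace the plain word vectors as input to a CNN or LSTM classifier. The naive pairwise search recomputes all centroid distances at every merge, so for a real vocabulary a library implementation of hierarchical clustering would be the practical choice.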

research
02/01/2019

tax2vec: Constructing Interpretable Features from Taxonomies for Short Text Classification

The use of background knowledge remains largely unexploited in many text...
research
01/20/2020

Short Text Classification via Term Graph

Short text classification is a method for classifying short sentence with...
research
11/24/2020

Neural Text Classification by Jointly Learning to Cluster and Align

Distributional text clustering delivers semantically informative represe...
research
09/13/2021

Embedding Convolutions for Short Text Extreme Classification with Millions of Labels

Automatic annotation of short-text data to a large number of target labe...
research
08/30/2017

End-to-end Learning for Short Text Expansion

Effectively making sense of short texts is a critical task for many real...
research
05/21/2021

Word-level Text Highlighting of Medical Texts for Telehealth Services

The medical domain is often subject to information overload. The digitiz...
research
02/26/2019

Semantic Hilbert Space for Text Representation Learning

Capturing the meaning of sentences has long been a challenging task. Cur...
