An Automated Method to Enrich Consumer Health Vocabularies Using GloVe Word Embeddings and An Auxiliary Lexical Resource

05/18/2021
by   Mohammed Ibrahim, et al.
0

Background: Clear language makes communication easier between any two parties. A layman may have difficulty communicating with a professional due to not understanding the specialized terms common to the domain. In healthcare, it is rare to find a layman knowledgeable in medical terminology which can lead to poor understanding of their condition and/or treatment. To bridge this gap, several professional vocabularies and ontologies have been created to map laymen medical terms to professional medical terms and vice versa. Objective: Many of the presented vocabularies are built manually or semi-automatically requiring large investments of time and human effort and consequently the slow growth of these vocabularies. In this paper, we present an automatic method to enrich laymen's vocabularies that has the benefit of being able to be applied to vocabularies in any domain. Methods: Our entirely automatic approach uses machine learning, specifically Global Vectors for Word Embeddings (GloVe), on a corpus collected from a social media healthcare platform to extend and enhance consumer health vocabularies (CHV). Our approach further improves the CHV by incorporating synonyms and hyponyms from the WordNet ontology. The basic GloVe and our novel algorithms incorporating WordNet were evaluated using two laymen datasets from the National Library of Medicine (NLM), Open-Access Consumer Health Vocabulary (OAC CHV) and MedlinePlus Healthcare Vocabulary. Results: The results show that GloVe was able to find new laymen terms with an F-score of 48.44 basic GloVe with an average F-score of 61 Furthermore, the enhanced GloVe showed a statistical significance over the two ground truth datasets with P<.001.

READ FULL TEXT
research
03/31/2020

Enriching Consumer Health Vocabulary Using Enhanced GloVe Word Embedding

Open-Access and Collaborative Consumer Health Vocabulary (OAC CHV, or CH...
research
06/25/2018

Mapping Unparalleled Clinical Professional and Consumer Languages with Embedding Alignment

Mapping and translating professional but arcane clinical jargons to cons...
research
06/14/2022

CHQ-Summ: A Dataset for Consumer Healthcare Question Summarization

The quest for seeking health information has swamped the web with consum...
research
05/09/2018

Incorporating Subword Information into Matrix Factorization Word Embeddings

The positive effect of adding subword information to word embeddings has...
research
05/18/2020

VerifyMed – A blockchain platform for transparent trust in virtualized healthcare: Proof-of-concept

Patients living in a digitized world can now interact with medical profe...
research
02/04/2019

Unsupervised Clinical Language Translation

As patients' access to their doctors' clinical notes becomes common, tra...
research
05/31/2016

Determining the Characteristic Vocabulary for a Specialized Dictionary using Word2vec and a Directed Crawler

Specialized dictionaries are used to understand concepts in specific dom...

Please sign up or login with your details

Forgot password? Click here to reset