Integration of Domain Knowledge using Medical Knowledge Graph Deep Learning for Cancer Phenotyping

01/05/2021
by Mohammed Alawad, et al.

A key component of deep learning (DL) for natural language processing (NLP) is word embeddings. Word embeddings that effectively capture the meaning and context of the words they represent can significantly improve the performance of downstream DL models on various NLP tasks. Many existing word embedding techniques capture the context of words based on word co-occurrence in documents and text; however, they often cannot capture broader domain-specific relationships between concepts that may be crucial for the NLP task at hand. In this paper, we propose a method to integrate external knowledge from medical terminology ontologies into the context captured by word embeddings. Specifically, we use a medical knowledge graph, such as the Unified Medical Language System (UMLS), to find connections between clinical terms in cancer pathology reports. This approach aims to minimize the distance between connected clinical concepts in the embedding space. We evaluate the proposed approach using a Multitask Convolutional Neural Network (MT-CNN) to extract six cancer characteristics (site, subsite, laterality, behavior, histology, and grade) from a dataset of 900K cancer pathology reports. The results show that the MT-CNN model that uses our domain-informed embeddings outperforms the same MT-CNN using standard word2vec embeddings across all tasks, improving the overall micro- and macro-F1 scores by 4.97% and 22.5%, respectively.
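The abstract does not spell out the training objective, so the following is only an illustrative sketch of the general idea of pulling the embeddings of graph-connected terms closer together. It swaps in a simple retrofitting-style averaging update (in the spirit of Faruqui et al.'s retrofitting), which is not claimed to be the authors' method. The function names `retrofit` and `euclidean`, the weights `alpha` and `beta`, and the toy terms and edge are all assumptions made for illustration; no real UMLS data is used.

```python
import math


def retrofit(embeddings, edges, alpha=1.0, beta=1.0, iterations=10):
    """Return new vectors in which graph-connected terms sit closer together.

    Assumed update (retrofitting-style, NOT the paper's stated objective):
    each connected term becomes a weighted average of its original vector
    (weight alpha) and the current vectors of its neighbours (weight beta each).
    """
    # Undirected neighbour map, restricted to terms that have a vector.
    neighbours = {term: set() for term in embeddings}
    for a, b in edges:
        if a in embeddings and b in embeddings and a != b:
            neighbours[a].add(b)
            neighbours[b].add(a)

    # Work on copies so the caller's vectors are left untouched.
    original = {term: list(vec) for term, vec in embeddings.items()}
    current = {term: list(vec) for term, vec in embeddings.items()}

    for _ in range(iterations):
        for term, nbrs in neighbours.items():
            if not nbrs:
                continue  # terms without graph connections keep their vector
            dim = len(original[term])
            total = [alpha * original[term][k] for k in range(dim)]
            for nbr in nbrs:
                for k in range(dim):
                    total[k] += beta * current[nbr][k]
            denom = alpha + beta * len(nbrs)
            current[term] = [value / denom for value in total]
    return current


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


# Hypothetical toy vectors and a single hypothetical knowledge-graph edge.
toy_vectors = {
    "carcinoma": [1.0, 0.0],
    "adenocarcinoma": [0.0, 1.0],
    "left": [-1.0, -1.0],
}
toy_edges = [("carcinoma", "adenocarcinoma")]

adjusted = retrofit(toy_vectors, toy_edges)
before = euclidean(toy_vectors["carcinoma"], toy_vectors["adenocarcinoma"])
after = euclidean(adjusted["carcinoma"], adjusted["adenocarcinoma"])
print(f"distance before: {before:.3f}, after: {after:.3f}")
```

In this sketch the two connected terms move toward each other while the unconnected term keeps its original vector. The `alpha` term anchors each vector to its co-occurrence-based starting point, so the graph adjusts the embedding rather than replacing it.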


