Gextext: Unsupervised Knowledge Modelling in Biomedical Literature

11/06/2019
by Robert O'Shea, et al.
PURPOSE: Literature review is a complex task that requires expert analysis of unstructured data. Automating this process computationally presents a valuable opportunity for high-throughput knowledge extraction and meta-analysis. Currently available methods are limited to detecting explicit, short-context relationships. We address this limitation with Gextext, which extracts a knowledge graph of latent relationships directly from unstructured text.

METHODS: Let C be a corpus of n text chunks. Let V_target be a set of query terms and V_random a random selection of terms in C. Let X indicate the occurrence of V_target and V_random in C. Gextext learns a graph G(V,E) by correlation thresholding on the covariance matrix of X, where thresholds are estimated from the correlations with randomly selected terms. Gextext was benchmarked against GloVe on tasks in which embedding distance matrices were correlated against real-world similarity matrices. A general corpus was generated from 5,000 randomly selected Wikipedia articles and a biomedical corpus from 961 research papers on stroke.

RESULTS: Embeddings generated by Gextext preserved relative geographical distances between countries (Gextext: rho = 0.255, p < 2.22e-16; GloVe: rho = 0.086, p = 1.859e-09) and capital cities (Gextext: rho = 0.282, p < 2.22e-16; GloVe: rho = 0.093, p = 8.0805e-11). Gextext embeddings organised drug names by shared target (Gextext: rho = 0.456, p < 2.22e-16; GloVe: rho = 0.091, p = 0.00087) and stroke phenotypes by body system (Gextext: rho = 0.446, p < 2.22e-16; GloVe: rho = 0.129, p = 1.7464e-11).

CONCLUSIONS: Gextext extracts latent relationships from unstructured text, enabling fully unsupervised automation of the literature review process.
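
The graph-learning step described under METHODS can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`occurrence_matrix`, `pearson`, `learn_graph`), the whitespace tokenisation, and the rule that sets the threshold at a quantile of the absolute target-versus-random correlations are all assumptions made for illustration, since the abstract states only that thresholds are estimated from correlations with randomly selected terms.

```python
import random
from itertools import combinations
from statistics import mean


def occurrence_matrix(chunks, terms):
    """Binary matrix X: X[i][j] is 1 if terms[j] occurs in chunk i, else 0."""
    tokenised = [set(chunk.lower().split()) for chunk in chunks]
    return [[1 if term in tokens else 0 for term in terms] for tokens in tokenised]


def pearson(x, y):
    """Pearson correlation of two equal-length columns (0.0 if either is constant)."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / (var_x * var_y) ** 0.5


def learn_graph(chunks, target_terms, n_random=5, quantile=0.95, seed=0):
    """Return (edges, threshold) for a term graph over target_terms.

    Assumption: the threshold is the given quantile of the absolute
    correlations between target terms and randomly drawn corpus terms.
    """
    rng = random.Random(seed)
    vocabulary = sorted(
        {word for chunk in chunks for word in chunk.lower().split()} - set(target_terms)
    )
    random_terms = rng.sample(vocabulary, min(n_random, len(vocabulary)))

    terms = list(target_terms) + random_terms
    X = occurrence_matrix(chunks, terms)
    columns = {term: [row[j] for row in X] for j, term in enumerate(terms)}

    # Empirical null: how strongly do query terms correlate with random terms?
    null = sorted(
        abs(pearson(columns[t], columns[r])) for t in target_terms for r in random_terms
    )
    threshold = null[min(len(null) - 1, int(quantile * len(null)))] if null else 0.0

    # Keep an edge only where the correlation exceeds the null threshold.
    edges = {}
    for a, b in combinations(target_terms, 2):
        r = pearson(columns[a], columns[b])
        if r > threshold:
            edges[(a, b)] = r
    return edges, threshold


if __name__ == "__main__":
    toy_corpus = [
        "stroke aphasia patient cohort",
        "stroke aphasia trial outcome",
        "paris france river city",
        "paris france museum tower",
    ]
    edges, threshold = learn_graph(toy_corpus, ["stroke", "aphasia", "paris", "france"])
    print(threshold, edges)
```

In this sketch the random terms act as an empirical null distribution, so an edge is kept only when two query terms co-occur more consistently than a query term does with an arbitrary word from the corpus.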

research
11/06/2019

Gextext: Disease Network Extraction from Biomedical Literature

PURPOSE: We propose a fully unsupervised method to learn latent disease ...
research
12/21/2020

Towards Incorporating Entity-specific Knowledge Graph Information in Predicting Drug-Drug Interactions

Off-the-shelf biomedical embeddings obtained from the recently released ...
research
07/24/2020

COVID-19 Knowledge Graph: Accelerating Information Retrieval and Discovery for Scientific Literature

The coronavirus disease (COVID-19) has claimed the lives of over 350,000...
research
04/21/2023

BERT Based Clinical Knowledge Extraction for Biomedical Knowledge Graph Construction and Analysis

Background : Knowledge is evolving over time, often as a result of new d...
research
01/17/2021

A Literature Review of Recent Graph Embedding Techniques for Biomedical Data

With the rapid development of biomedical software and hardware, a large ...
research
06/07/2023

SKG: A Versatile Information Retrieval and Analysis Framework for Academic Papers with Semantic Knowledge Graphs

The number of published research papers has experienced exponential grow...
research
07/28/2022

Knowledge-Driven Mechanistic Enrichment of the Preeclampsia Ignorome

Preeclampsia is a leading cause of maternal and fetal morbidity and mort...
