Building a PubMed knowledge graph
PubMed is an essential resource for the medical domain, but useful concepts are either difficult to extract or ambiguous, which has significantly hindered knowledge discovery. To address this issue, we constructed a PubMed knowledge graph (PKG) by extracting bio-entities from 29 million PubMed abstracts, disambiguating author names, integrating funding data through the National Institutes of Health (NIH) ExPORTER, collecting affiliation history and educational background of authors from ORCID, and identifying fine-grained affiliation data from MapAffil. By integrating these credible multi-source data, we created connections among the bio-entities, authors, articles, affiliations, and funding. Data validation revealed that the BioBERT deep learning method of bio-entity extraction significantly outperformed the state-of-the-art models based on the F1 score (by 0.51%), with the author name disambiguation (AND) achieving an F1 score of 98.09%. PKG can trigger broader innovations, not only enabling us to measure scholarly impact, knowledge usage, and knowledge transfer, but also assisting us in profiling authors and organizations based on their connections with bio-entities. The PKG is freely available on Figshare (https://figshare.com/s/6327a55355fc2c99f3a2, simplified version that excludes PubMed raw data) and the TACC website (http://er.tacc.utexas.edu/datasets/ped, full version).
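To make the linking idea concrete, the following is a minimal sketch of how the five kinds of records named in the abstract (bio-entities, authors, articles, affiliations, and funding) could be connected in a small typed graph. It is an illustration only and does not reflect the PKG's actual schema or storage format: the class `TinyKnowledgeGraph`, its methods, the node-type labels, and all identifiers such as `author:A1` are hypothetical.

```python
from collections import defaultdict


class TinyKnowledgeGraph:
    """Toy undirected graph with typed nodes (hypothetical, not the PKG schema)."""

    def __init__(self):
        self.node_type = {}            # node id -> type label
        self.edges = defaultdict(set)  # node id -> set of neighbouring node ids

    def add_node(self, node_id, node_type):
        self.node_type[node_id] = node_type

    def add_edge(self, a, b):
        # Refuse to link nodes that were never registered.
        for node_id in (a, b):
            if node_id not in self.node_type:
                raise KeyError(f"unknown node: {node_id}")
        self.edges[a].add(b)
        self.edges[b].add(a)

    def neighbors(self, node_id, node_type=None):
        """Return sorted neighbours, optionally filtered by type label."""
        return sorted(
            n for n in self.edges[node_id]
            if node_type is None or self.node_type[n] == node_type
        )

    def author_entities(self, author_id):
        """Bio-entities reachable from an author through their articles."""
        found = set()
        for article in self.neighbors(author_id, "article"):
            found.update(self.neighbors(article, "bio_entity"))
        return sorted(found)


# Made-up records, one or two per node type.
kg = TinyKnowledgeGraph()
kg.add_node("author:A1", "author")
kg.add_node("article:P1", "article")
kg.add_node("article:P2", "article")
kg.add_node("entity:BRCA1", "bio_entity")
kg.add_node("entity:TP53", "bio_entity")
kg.add_node("affil:U1", "affiliation")
kg.add_node("grant:G1", "funding")

kg.add_edge("author:A1", "article:P1")     # authorship
kg.add_edge("author:A1", "article:P2")
kg.add_edge("article:P1", "entity:BRCA1")  # entity mentioned in an abstract
kg.add_edge("article:P2", "entity:TP53")
kg.add_edge("author:A1", "affil:U1")       # affiliation history
kg.add_edge("article:P1", "grant:G1")      # funding acknowledgement

author_profile = kg.author_entities("author:A1")
```

Profiling an author by bio-entities then becomes a two-hop traversal (author to articles, articles to entities). This is why disambiguated author names matter: if two people shared one author node, their entity profiles would be merged.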