Information Extraction from Scientific Literature for Method Recommendation
As a research community grows, more and more papers are published each year. As a result, there is increasing demand for better methods of finding relevant papers, automatically understanding their key ideas, and recommending potential methods for a target problem. Despite advances in search engines, it is still hard to identify new technologies that match a researcher's needs. Because of the large variety of domains and the extremely limited annotated resources, there has been relatively little work on leveraging natural language processing for scientific recommendation. In this proposal, we aim to make scientific recommendations by extracting scientific terms from a large collection of scientific papers and organizing the terms into a knowledge graph. In preliminary work, we trained a scientific term extractor on a small amount of annotated data and obtained state-of-the-art performance by applying multiple semi-supervised approaches to a large number of unannotated papers. We propose to construct the knowledge graph in a way that makes minimal use of hand-annotated data, using only the extracted terms, unsupervised relational signals such as co-occurrence, and structured external resources such as Wikipedia. Latent relations between scientific terms can then be learned from the graph. Recommendations will be made through graph inference for both observed and unobserved relational pairs.
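As a rough illustration of how co-occurrence could serve as an unsupervised relational signal, the sketch below builds a small term graph and ranks the neighbours of a query term. It is not the method proposed here: the toy papers, the choice of pointwise mutual information (PMI) as the edge weight, and the function names `build_cooccurrence_graph` and `recommend` are all assumptions made for the example, and it scores only observed pairs, so it does not cover inference over unobserved ones.

```python
from collections import Counter
from itertools import combinations
import math

# Hypothetical toy input: each paper is reduced to the set of terms a
# term extractor found in it. The terms are made up for illustration.
papers = [
    {"machine translation", "attention", "LSTM"},
    {"machine translation", "attention"},
    {"speech recognition", "LSTM", "CTC"},
    {"speech recognition", "CTC"},
    {"machine translation", "LSTM"},
]


def build_cooccurrence_graph(papers):
    """Return {(term_a, term_b): PMI} for every pair seen together in a paper."""
    n = len(papers)
    term_counts = Counter(term for paper in papers for term in paper)
    # Sort each paper's terms so a pair always appears in the same order.
    pair_counts = Counter(
        pair for paper in papers for pair in combinations(sorted(paper), 2)
    )
    graph = {}
    for (a, b), count in pair_counts.items():
        p_ab = count / n
        p_a = term_counts[a] / n
        p_b = term_counts[b] / n
        graph[(a, b)] = math.log(p_ab / (p_a * p_b))
    return graph


def recommend(graph, query, k=3):
    """Rank the observed neighbours of `query` by edge weight, highest first."""
    scores = {}
    for (a, b), weight in graph.items():
        if a == query:
            scores[b] = weight
        elif b == query:
            scores[a] = weight
    # Ties are broken alphabetically so the ranking is deterministic.
    return sorted(scores, key=lambda term: (-scores[term], term))[:k]


graph = build_cooccurrence_graph(papers)
print(recommend(graph, "speech recognition"))
```

In this toy corpus "CTC" ranks above "LSTM" for the query "speech recognition", because it appears almost exclusively alongside that task while "LSTM" is spread across both tasks. A term that never co-occurs with the query receives no score at all, which is the gap that inference over unobserved pairs is meant to fill.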