The Semantic Scholar Open Data Platform

01/24/2023
by Rodney Kinney, et al.

The volume of scientific output is creating an urgent need for automated tools to help scientists keep up with developments in their field. Semantic Scholar (S2) is an open data platform and website aimed at accelerating science by helping scholars discover and understand scientific literature. We combine public and proprietary data sources using state-of-the-art techniques for scholarly PDF content extraction and automatic knowledge graph construction to build the Semantic Scholar Academic Graph, the largest open scientific literature graph to date, with 200M+ papers, 80M+ authors, 550M+ paper-authorship edges, and 2.4B+ citation edges. The graph includes advanced semantic features such as structurally parsed text, natural language summaries, and vector embeddings. In this paper, we describe the components of the S2 data processing pipeline and the associated APIs offered by the platform. We will update this living document to reflect changes as we add new data offerings and improve existing services.
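
As a rough illustration of the graph shape the abstract describes (paper nodes, author nodes, paper-authorship edges, and citation edges), here is a minimal in-memory sketch. Everything in it is an assumption made for illustration: the class name `ToyAcademicGraph`, its fields, its methods, and the sample identifiers are hypothetical and do not reflect the actual S2 schema, pipeline, or API.

```python
from dataclasses import dataclass, field


@dataclass
class ToyAcademicGraph:
    """Hypothetical toy model of a literature graph (not the S2 schema)."""

    papers: dict = field(default_factory=dict)    # paper_id -> title
    authors: dict = field(default_factory=dict)   # author_id -> name
    authorship: set = field(default_factory=set)  # (author_id, paper_id) edges
    citations: set = field(default_factory=set)   # (citing_id, cited_id) edges

    def add_author(self, author_id, name):
        self.authors[author_id] = name

    def add_paper(self, paper_id, title, author_ids=()):
        # Register the paper node and one authorship edge per listed author.
        self.papers[paper_id] = title
        for author_id in author_ids:
            self.authorship.add((author_id, paper_id))

    def add_citation(self, citing_id, cited_id):
        # Citation edges are directed: citing paper -> cited paper.
        self.citations.add((citing_id, cited_id))

    def cited_by(self, paper_id):
        # Ids of papers that cite `paper_id`, sorted for stable output.
        return sorted(c for c, d in self.citations if d == paper_id)

    def papers_of(self, author_id):
        # Ids of papers linked to `author_id` by an authorship edge.
        return sorted(p for a, p in self.authorship if a == author_id)


# Made-up sample data, for illustration only.
graph = ToyAcademicGraph()
graph.add_author("a1", "Author One")
graph.add_paper("p1", "First paper", ["a1"])
graph.add_paper("p2", "Second paper", ["a1"])
graph.add_citation("p2", "p1")
```

With this toy data, `graph.cited_by("p1")` lists the papers citing `p1`, and `graph.papers_of("a1")` lists the papers attributed to author `a1`. A graph at the scale quoted in the abstract would need a very different storage design; the sketch only shows the node and edge types.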

research
11/07/2019

S2ORC: The Semantic Scholar Open Research Corpus

We introduce S2ORC, a large contextual citation graph of English-languag...
research
08/07/2023

SemOpenAlex: The Scientific Landscape in 26 Billion RDF Triples

We present SemOpenAlex, an extensive RDF knowledge graph that contains o...
research
11/29/2022

Improving astroBERT using Semantic Textual Similarity

The NASA Astrophysics Data System (ADS) is an essential tool for researc...
research
11/07/2019

GORC: A large contextual citation graph of academic papers

We introduce the Semantic Scholar Graph of References in Context (GORC),...
research
06/03/2021

CitationIE: Leveraging the Citation Graph for Scientific Information Extraction

Automatically extracting key information from scientific documents has t...
research
08/25/2019

Unsupervised Construction of Knowledge Graphs From Text and Code

The scientific literature is a rich source of information for data minin...
research
05/06/2018

Construction of the Literature Graph in Semantic Scholar

We describe a deployed scalable system for organizing published scientif...
