Investigating Software Usage in the Social Sciences: A Knowledge Graph Approach

03/24/2020
by David Schindler, et al.

Knowledge about the software used in scientific investigations is necessary for several reasons, including provenance of results, measuring software impact to credit developers, and bibliometric software citation analysis in general. Additionally, information about whether and how the software and its source code are available allows an assessment of the state and role of open source software in science. While such analyses can be done manually, large-scale analyses require automated methods of information extraction and linking. In this paper, we present SoftwareKG, a knowledge graph that contains information about software mentions from more than 51,000 scientific articles from the social sciences. A silver standard corpus, created by a distant and weak supervision approach, and a gold standard corpus, created by manual annotation, were used to train an LSTM-based neural network to identify software mentions in scientific articles. The model achieves an F-score of .82 for exact matches. As a result, we identified more than 133,000 software mentions. For entity disambiguation, we used the public domain knowledge base DBpedia. Furthermore, we linked the entities of the knowledge graph to other knowledge bases such as the Microsoft Academic Knowledge Graph, the Software Ontology, and Wikidata. Finally, we illustrate how SoftwareKG can be used to assess the role of software in the social sciences.
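To make the "exact matches" criterion concrete, the following is a minimal sketch of how an exact-match F-score over entity spans is commonly computed. It is not the authors' evaluation code: the span representation, the function name `exact_match_f1`, and the toy data are assumptions made purely for illustration. A predicted mention counts as correct only if its start offset, end offset, and label all agree with a gold annotation.

```python
from typing import Set, Tuple

# Hypothetical span representation: (start_token, end_token, label).
Span = Tuple[int, int, str]


def exact_match_f1(gold: Set[Span], predicted: Set[Span]) -> float:
    """Return the F1 score, counting only spans that match exactly."""
    true_positives = len(gold & predicted)
    if true_positives == 0:
        # Covers empty inputs and the case of no overlap at all.
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(gold)
    return 2 * precision * recall / (precision + recall)


# Toy data (invented): the second prediction overshoots the gold
# boundary by one token, so it does not count as an exact match.
gold = {(0, 4, "software"), (10, 11, "software"), (20, 22, "software")}
predicted = {(0, 4, "software"), (10, 12, "software"), (20, 22, "software")}

score = exact_match_f1(gold, predicted)
print(f"exact-match F1: {score:.2f}")
```

In this toy case two of three predictions match exactly, so precision and recall are both 2/3. Under a relaxed (partial-overlap) criterion the boundary error would be forgiven, which is why exact-match scores are the stricter of the two.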


