The Missing Path: Diagnosing Incompleteness in Linked Data

05/16/2020
by Marie Destandau, et al.

The Semantic Web is an interoperable ecosystem where data producers, such as libraries, public institutions, communities, and companies, publish and link heterogeneous resources. To support this heterogeneity, its format, RDF, makes it possible to describe collections of items that share some attributes but not necessarily all of them. This flexibility leads to incompleteness and inconsistencies in how information is represented, which in turn lead to unreliable query results. To make their data reliable and usable, Linked Data producers need to provide the best possible level of completeness. We propose a novel visualization tool, "The Missing Path", to support data producers in diagnosing incompleteness in their data. It relies on dimensionality reduction techniques to create a map of RDF entities based on missing paths, revealing clusters of entities that are missing the same paths. The novelty of our work lies in describing the entities of interest as vectors of aggregated RDF paths of a fixed length. We show that identifying groups of items sharing a similar structure helps users find the cause of incompleteness for entire groups and allows them to decide whether and how it should be resolved. We describe our iterative design process and our evaluation with Wikidata contributors.
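
The idea of describing entities as vectors of paths can be illustrated with a minimal sketch. It is not the authors' implementation: the entity names, the predicates, and the helper functions `path_vectors` and `group_by_missing` are hypothetical, and the dimensionality reduction step is replaced here by a much simpler exact grouping of entities that miss the same paths.

```python
from collections import defaultdict

# Hypothetical toy data: each entity maps to the set of property paths it has.
# A path is a tuple of predicates; all names below are illustrative only.
ENTITY_PATHS = {
    "ex:painting1": {("ex:creator",), ("ex:creator", "ex:birthDate"), ("ex:date",)},
    "ex:painting2": {("ex:creator",), ("ex:date",)},
    "ex:painting3": {("ex:creator",), ("ex:date",)},
    "ex:painting4": {("ex:creator",), ("ex:creator", "ex:birthDate")},
}


def path_vectors(entity_paths):
    """Return (sorted path list, {entity: 0/1 tuple}) where 1 means the path is present."""
    all_paths = sorted(set().union(*entity_paths.values()))
    vectors = {
        entity: tuple(1 if path in paths else 0 for path in all_paths)
        for entity, paths in entity_paths.items()
    }
    return all_paths, vectors


def group_by_missing(entity_paths):
    """Group entities by the exact set of paths they are missing.

    This is a simplified stand-in for the map described in the abstract:
    instead of projecting the vectors to 2D, entities with identical
    missing-path signatures are placed in the same group.
    """
    all_paths, vectors = path_vectors(entity_paths)
    groups = defaultdict(list)
    for entity, vector in vectors.items():
        missing = frozenset(p for p, bit in zip(all_paths, vector) if bit == 0)
        groups[missing].append(entity)
    return dict(groups)


if __name__ == "__main__":
    for missing, entities in group_by_missing(ENTITY_PATHS).items():
        print(sorted(missing), "->", entities)
```

In this toy example, the two entities that lack the creator's birth date fall into one group, which is the kind of shared structure a producer could then inspect and resolve for the whole group at once.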


