FiLiPo: A Sample Driven Approach for Finding Linkage Points between RDF Data and APIs (Extended Version)

03/10/2021
by Tobias Zeimetz, et al.

Data integration is an important task in creating comprehensive RDF knowledge bases (KBs). Many data sources are used to extend a given dataset or to correct errors. Since several data providers make their data publicly available only via Web APIs, these APIs must also be included in the integration process. However, APIs often come with limitations in terms of access frequency and speed due to latencies and other constraints. On the other hand, APIs always provide access to the latest data. So far, integrating APIs has been mainly a manual task due to the heterogeneity of API responses. To tackle this problem, we present the FiLiPo (Finding Linkage Points) system, which automatically finds connections (i.e., linkage points) between data provided by APIs and local knowledge bases. FiLiPo is an open-source, sample-driven schema matching system that models API services as parameterized queries. Furthermore, our approach is able to find valid input values for APIs automatically (e.g., IDs) and can determine valid alignments between KBs and APIs. Our results on ten pairs of KBs and APIs show that FiLiPo performs well in terms of precision and recall and outperforms the current state-of-the-art system.
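
To make the idea of sample-driven matching more concrete, the following is a minimal sketch and not FiLiPo's actual implementation. It assumes that an API can be treated as a function of one input value (a parameterized query), that KB records are flat dictionaries, and that a simple string-similarity threshold is enough to decide whether two values match. The names `find_linkage_points`, `flatten`, `similar`, and `fake_api`, as well as the thresholds, are hypothetical and chosen only for illustration.

```python
from collections import Counter
from difflib import SequenceMatcher


def flatten(obj, prefix=""):
    """Yield (path, value) pairs for every leaf of a nested JSON-like object."""
    if isinstance(obj, dict):
        for key, val in obj.items():
            yield from flatten(val, f"{prefix}.{key}" if prefix else key)
    elif isinstance(obj, list):
        for item in obj:
            yield from flatten(item, f"{prefix}[*]")
    else:
        yield prefix, str(obj)


def similar(a, b, threshold=0.9):
    """Assumed matching rule: case-insensitive similarity ratio above a threshold."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio() >= threshold


def find_linkage_points(kb_records, call_api, input_property, min_support=0.5):
    """Probe the API with sampled KB values and count which KB property
    agrees with which response path. Returns {(property, path): support}."""
    votes = Counter()
    answered = 0
    for record in kb_records:
        response = call_api(record[input_property])  # API as parameterized query
        if response is None:  # the sampled value was not a valid input
            continue
        answered += 1
        leaves = list(flatten(response))
        matched = {
            (prop, path)
            for prop, kb_value in record.items()
            for path, api_value in leaves
            if similar(str(kb_value), api_value)
        }
        votes.update(matched)  # each pair counts at most once per sample
    if answered == 0:
        return {}
    return {
        pair: count / answered
        for pair, count in votes.items()
        if count / answered >= min_support
    }


# Hypothetical sample data and a stand-in for a real Web API.
KB_SAMPLE = [
    {"doi": "10.1/a", "title": "Data Integration", "year": "2020"},
    {"doi": "10.1/b", "title": "Schema Matching", "year": "2021"},
]

_FAKE_RESPONSES = {
    "10.1/a": {"id": "10.1/a", "info": {"name": "Data integration.", "published": 2020}},
    "10.1/b": {"id": "10.1/b", "info": {"name": "Schema matching", "published": 2021}},
}


def fake_api(doi):
    """Return a canned response, or None for an unknown input value."""
    return _FAKE_RESPONSES.get(doi)


if __name__ == "__main__":
    for pair, support in find_linkage_points(KB_SAMPLE, fake_api, "doi").items():
        print(pair, support)
```

In this sketch, a pair such as `("title", "info.name")` is reported only if the KB value and the API value agree for a sufficient share of the sampled records, which is one simple way to tolerate noisy or missing responses. A real system would also have to cope with rate limits, latencies, and differing value formats, as the abstract notes.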


