OBDA for the Web: Creating Virtual RDF Graphs On Top of Web Data Sources
Due to Variety, Web data come in many different structures and formats, with HTML tables and REST APIs (e.g., social media APIs) being among the most popular ones. A big subset of Web data is also characterised by Velocity, as data gets frequently updated so that consumers can obtain the most up-to-date version of the respective datasets. At the moment, though, these data sources are not effectively supported by Semantic Web tools. To address variety and velocity, we propose Ontop4theWeb, a system that maps Web data of various formats into virtual RDF triples, thus allowing for querying them on-the-fly without materializing them as RDF. We demonstrate how Ontop4theWeb can use SPARQL to uniformly query popular, but heterogeneous Web data sources, like HTML tables and Web APIs. We showcase our approach in a number of use cases, such as Twitter, Foursquare, Yelp and HTML tables. We carried out a thorough experimental evaluation which verifies the high efficiency of our framework, which goes beyond the current state-of-the-art in this area, in terms of both functionality and performance.
READ FULL TEXT