Schemaless Queries over Document Tables with Dependencies

by Mustafa Canim, et al.

Unstructured enterprise data such as reports, manuals and guidelines often contain tables. The traditional way of integrating data from these tables is through a two-step process of table detection/extraction and mapping the table layouts to an appropriate schema. This can be an expensive process. In this paper we show that by using semantic technologies (RDF/SPARQL and database dependencies) paired with a simple but powerful way to transform tables with non-relational layouts, it is possible to offer query answering services over these tables with minimal manual work or domain-specific mappings. Our method enables users to exploit data in tables embedded in documents with little effort, not only for simple retrieval queries, but also for structured queries that require joining multiple interrelated tables.
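To make the idea of schemaless querying concrete, the following is a minimal sketch and not the paper's actual transformation: it assumes each table cell can be stored as a (row, column header, value) triple, and it stands in plain Python tuples for RDF and a hand-written join for SPARQL. The function names (`table_to_triples`, `match`, `join_on`) and the sample tables are hypothetical and chosen only for illustration.

```python
# Illustrative sketch only: plain tuples stand in for RDF triples, and the
# helper names and sample data are hypothetical, not taken from the paper.

def table_to_triples(table_id, header, rows):
    """Turn each cell into a (subject, predicate, object) triple.

    The row identifier is the subject and the column header is the predicate,
    so no table-specific schema mapping is needed.
    """
    triples = []
    for i, row in enumerate(rows):
        subject = f"{table_id}/row{i}"
        for column, value in zip(header, row):
            triples.append((subject, column, value))
    return triples


def match(triples, predicate):
    """Return {subject: object} for all triples with the given predicate."""
    return {s: o for s, p, o in triples if p == predicate}


def join_on(triples_a, key_a, triples_b, key_b):
    """Pair rows of two tables whose key columns hold the same value."""
    left = match(triples_a, key_a)
    right = match(triples_b, key_b)
    return [(sa, sb) for sa, va in left.items()
            for sb, vb in right.items() if va == vb]


# Two small, invented tables that share a supplier name.
parts = table_to_triples("parts", ["part", "supplier"],
                         [["bolt", "Acme"], ["nut", "Globex"]])
suppliers = table_to_triples("suppliers", ["name", "city"],
                             [["Acme", "Austin"], ["Globex", "Boston"]])

# Query: for each part, in which city is its supplier located?
pairs = join_on(parts, "supplier", suppliers, "name")
part_names = match(parts, "part")
cities = match(suppliers, "city")
result = sorted((part_names[p], cities[s]) for p, s in pairs)
print(result)
```

In this toy setting the join across the two tables needs only the column headers, which is the kind of query the abstract describes; a real system would presumably express the same pattern in SPARQL over an RDF store.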



Semantic Table Retrieval using Keyword and Table Queries

Tables on the Web contain a vast amount of knowledge in a structured for...

Integrating and querying similar tables from PDF documents using deep learning

Large amount of public data produced by enterprises are in semi-structur...

TabVec: Table Vectors for Classification of Web Tables

There are hundreds of millions of tables in Web pages that contain usefu...

Query Significance in Databases via Randomizations

Many sorts of structured data are commonly stored in a multi-relational ...

Synthesizing Mapping Relationships Using Table Corpus

Mapping relationships, such as (country, country-code) or (company, stoc...

The quantification of Simpsons paradox and other contributions to contingency table theory

The analysis of contingency tables is a powerful statistical tool used i...

Exploring Query Results

Users typically interact with a database by asking queries and examining...