
Schemaless Queries over Document Tables with Dependencies

11/21/2019
by Mustafa Canim, et al.

Unstructured enterprise data such as reports, manuals, and guidelines often contain tables. The traditional way of integrating data from these tables is a two-step process: table detection and extraction, followed by mapping the table layouts to an appropriate schema. This process can be expensive. In this paper, we show that by using semantic technologies (RDF/SPARQL and database dependencies) paired with a simple but powerful way to transform tables with non-relational layouts, it is possible to offer query answering services over these tables with minimal manual work or domain-specific mappings. Our method enables users to exploit data in tables embedded in documents with little effort, not only for simple retrieval queries, but also for structured queries that require joining multiple interrelated tables.
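
As a rough illustration of the general idea of querying tables without a hand-built schema, the sketch below flattens two small tables into subject-predicate-object triples and answers a question that needs a join across them. It is not the paper's method: the table contents, the function names (`table_to_triples`, `match`, `value_of`, `country_of_product`), and the plain-Python matching used in place of RDF/SPARQL are all assumptions made for illustration.

```python
def table_to_triples(table_name, header, rows):
    """Flatten a table into (subject, predicate, object) triples.

    Each row becomes a subject; each column header becomes a predicate.
    """
    triples = []
    for i, row in enumerate(rows):
        subject = f"{table_name}/row{i}"
        for column, value in zip(header, row):
            triples.append((subject, column, value))
    return triples


def match(triples, predicate, obj=None):
    """Return the subjects that have `predicate`, optionally with value `obj`."""
    return {s for s, p, o in triples if p == predicate and (obj is None or o == obj)}


def value_of(triples, subject, predicate):
    """Return the first object stored for (subject, predicate), or None."""
    for s, p, o in triples:
        if s == subject and p == predicate:
            return o
    return None


# Hypothetical tables, as they might be extracted from two documents.
products = table_to_triples(
    "products", ["product", "vendor"], [["X100", "Acme"], ["Z9", "Globex"]]
)
vendors = table_to_triples(
    "vendors", ["vendor", "country"], [["Acme", "US"], ["Globex", "DE"]]
)
graph = products + vendors


def country_of_product(graph, product):
    """Join the two tables on the shared 'vendor' value, without naming either table."""
    for row in match(graph, "product", product):
        vendor = value_of(graph, row, "vendor")
        for vendor_row in match(graph, "vendor", vendor):
            country = value_of(graph, vendor_row, "country")
            if country is not None:
                return country
    return None
```

The query refers only to column names and values, never to a table name or a fixed schema, which is the property the abstract calls schemaless. A real system would express the same join in SPARQL over an RDF store rather than with Python loops.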
