The Complexity of Aggregates over Extractions by Regular Expressions

02/20/2020
by   Johannes Doleschal, et al.

Regular expressions with capture variables, also known as "regex formulas," extract relations of spans (intervals identified by their start and end indices) from text. Building on these, Fagin et al. introduced regular document spanners, which are the closure of regex formulas under Relational Algebra. In this work, we study the computational complexity of querying text by aggregate functions, such as sum, average, or quantiles, on top of regular document spanners. To this end, we formally define aggregate functions over regular document spanners and analyze the computational complexity of computing the aggregates both exactly and approximately. Specifically, we show that in a restricted case all aggregates can be computed in polynomial time. In general, exact computation is intractable, but some aggregates can still be approximated with fully polynomial-time randomized approximation schemes (FPRAS).
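To make the setting concrete, here is a minimal sketch of extracting a relation of spans with capture variables and then aggregating over it. It is an illustration only, not the paper's formalism: it assumes Python's `re` module as a stand-in for regex formulas (`finditer` returns only non-overlapping matches, whereas a spanner considers all matches), and the names `extract_spans`, `span_text`, `document`, and `pattern` are hypothetical.

```python
import re
from statistics import mean, median


def extract_spans(pattern, document):
    """Return one row per match, mapping each capture variable to its span.

    A span is a (start, end) pair of indices into the document.
    Assumption: re.finditer yields non-overlapping matches only, which is
    narrower than the all-matches semantics of document spanners.
    """
    regex = re.compile(pattern)
    # Order the capture variables by their position in the pattern.
    names = sorted(regex.groupindex, key=regex.groupindex.get)
    return [
        {name: match.span(name) for name in names}
        for match in regex.finditer(document)
    ]


def span_text(document, span):
    """Return the substring of the document that a span selects."""
    start, end = span
    return document[start:end]


# Hypothetical input: item names followed by quantities.
document = "apple 3, pear 10, fig 5"
pattern = r"(?P<item>[a-z]+) (?P<qty>\d+)"

# The extracted relation: a list of rows of spans.
relation = extract_spans(pattern, document)

# Interpret the text under each 'qty' span as a number, then aggregate.
quantities = [int(span_text(document, row["qty"])) for row in relation]
total = sum(quantities)
average = mean(quantities)
middle = median(quantities)
```

In this toy case the relation has one row per match, so the aggregates are cheap to compute directly. The hardness results in the abstract concern general regular spanners, where the extracted relation can be far larger than the document.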


research
01/14/2019

Complexity Bounds for Relational Algebra over Document Spanners

We investigate the complexity of evaluating queries in Relational Algebr...
research
12/21/2017

Recursive Programs for Document Spanners

A document spanner models a program for Information Extraction (IE) as a...
research
08/30/2019

Weight Annotation in Information Extraction

The framework of document spanners abstracts the task of information ext...
research
03/15/2020

Grammars for Document Spanners

A new grammar-based language for defining information-extractors from te...
research
11/20/2017

XSAT of Linear CNF Formulas

Open questions with respect to the computational complexity of linear CN...
research
04/12/2023

Skyline Operators for Document Spanners

When extracting a relation of spans (intervals) from a text document, a ...
