IDEBench: A Benchmark for Interactive Data Exploration

04/07/2018
by Philipp Eichmann, et al.

Existing benchmarks for analytical database systems, such as TPC-DS and TPC-H, are designed for static reporting scenarios. Their main metric is the performance of running individual SQL queries over a synthetic database. In this paper, we argue that such benchmarks are not suitable for evaluating database workloads that originate from interactive data exploration (IDE) systems, where most queries are ad hoc, not based on predefined reports, and built incrementally. As our main contribution, we present a novel benchmark called IDEBench that can be used to evaluate the performance of database systems for IDE workloads. Unlike traditional benchmarks for analytical database systems, our benchmark aims to provide workloads and datasets that are more meaningful for benchmarking IDE query engines, with a particular focus on metrics that capture the trade-off between query performance and the quality of the result. As a second contribution, this paper evaluates and discusses the performance results of selected IDE query engines using our benchmark. The study includes two commercial systems, two research prototypes (IDEA, approXimateDB/XDB), and one traditional analytical database system (MonetDB).

research
03/20/2021

Greenplum: A Hybrid Database for Transactional and Analytical Workloads

Demand for enterprise data warehouse solutions to support real-time Onli...
research
12/24/2021

Fine-Tuning Data Structures for Analytical Query Processing

We introduce a framework for automatically choosing data structures to s...
research
11/15/2018

Model-based Approximate Query Processing

Interactive visualizations are arguably the most important tool to explo...
research
10/19/2020

DBA bandits: Self-driving index tuning under ad-hoc, analytical workloads with safety guarantees

Automating physical database design has remained a long-term interest in...
research
01/22/2018

Smoke: Fine-grained Lineage at Interactive Speed

Data lineage describes the relationship between individual input and out...
research
04/22/2020

Qd-tree: Learning Data Layouts for Big Data Analytics

Corporations today collect data at an unprecedented and accelerating sca...
research
05/22/2019

Exploring Query Results

Users typically interact with a database by asking queries and examining...
