Cuttlefish: A Lightweight Primitive for Adaptive Query Processing

02/26/2018
by Tomer Kaftan, et al.

Modern data processing applications execute increasingly sophisticated analysis that requires operations beyond traditional relational algebra. As a result, operators in query plans grow in diversity and complexity. Designing query optimizer rules and cost models to choose physical operators for all of these novel logical operators is impractical. To address this challenge, we develop Cuttlefish, a new primitive for adaptively processing online query plans that explores candidate physical operator instances during query execution and exploits the fastest ones using multi-armed bandit reinforcement learning techniques. We prototype Cuttlefish in Apache Spark and adaptively choose operators for image convolution, regular expression matching, and relational joins. Our experiments show Cuttlefish-based adaptive convolution and regular expression operators can reach 72-99% of the throughput of an all-knowing oracle that always selects the optimal algorithm, even when individual physical operators are up to 105x slower than the optimal. Additionally, Cuttlefish achieves join throughput improvements of up to 7.5x compared with Spark SQL's query optimizer.
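The explore-and-exploit idea in the abstract can be illustrated with a small sketch. It uses epsilon-greedy selection, chosen here only for brevity as one simple multi-armed bandit strategy; the abstract does not say which bandit algorithm Cuttlefish uses. The class name `EpsilonGreedyTuner` and its methods `choose`, `observe`, and `run` are hypothetical names for illustration, not Cuttlefish's API.

```python
import random
import time


class EpsilonGreedyTuner:
    """Hypothetical tuner: picks among candidate physical operators,
    favoring the one with the lowest observed mean runtime."""

    def __init__(self, operators, epsilon=0.1, seed=0):
        self.operators = list(operators)   # candidate implementations
        self.epsilon = epsilon             # probability of exploring
        self.rng = random.Random(seed)
        self.counts = [0] * len(self.operators)
        self.total_time = [0.0] * len(self.operators)

    def choose(self):
        # Try every operator once before exploiting any of them.
        for i, n in enumerate(self.counts):
            if n == 0:
                return i
        # Explore: occasionally pick a random operator.
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(len(self.operators))
        # Exploit: pick the operator with the lowest mean runtime so far.
        means = [t / n for t, n in zip(self.total_time, self.counts)]
        return min(range(len(means)), key=means.__getitem__)

    def observe(self, index, elapsed):
        # Record the cost (seconds) of one execution of operator `index`.
        self.counts[index] += 1
        self.total_time[index] += elapsed

    def run(self, item):
        # Choose an operator, time it on one input, and feed the cost back.
        i = self.choose()
        start = time.perf_counter()
        result = self.operators[i](item)
        self.observe(i, time.perf_counter() - start)
        return result
```

In this sketch, each call to `run` processes one input with whichever operator the tuner picks, so the choice can shift during execution as runtime observations accumulate. A real system would also need to handle distributed workers and runtimes that depend on the input, which this sketch ignores.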


