Optimizing Machine Learning Inference Queries with Correlative Proxy Models

01/02/2022
by   Zhihui Yang, et al.

We consider accelerating machine learning (ML) inference queries on unstructured datasets. Expensive operators such as feature extractors and classifiers are deployed as user-defined functions (UDFs), which are opaque to classic query optimization techniques such as predicate push-down. Recent optimization schemes (e.g., Probabilistic Predicates, or PP) assume independence among the query predicates, build a proxy model for each predicate offline, and rewrite the query by injecting these cheap proxy models in front of the expensive ML UDFs. In this manner, inputs that are unlikely to satisfy the query predicates are filtered out early and bypass the ML UDFs. We show that enforcing the independence assumption in this context may result in sub-optimal plans. In this paper, we propose CORE, a query optimizer that better exploits predicate correlations to accelerate ML inference queries. Our solution builds the proxy models online for a new query and leverages a branch-and-bound search process to reduce the building costs. Results on three real-world text, image and video datasets show that CORE improves the query throughput by up to 63 compared to running the queries as they are.
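The proxy-filtering idea described above can be illustrated with a minimal Python sketch. This is not CORE's or PP's implementation: `expensive_udf`, `cheap_proxy`, `run_query`, and the threshold value are hypothetical stand-ins, and the "models" are toy arithmetic rules chosen only so that the effect on the number of UDF calls is easy to see.

```python
from typing import Callable, Iterable, List


def expensive_udf(x: int) -> bool:
    """Hypothetical stand-in for a costly ML classifier; counts its own calls."""
    expensive_udf.calls += 1
    return x % 10 == 0


expensive_udf.calls = 0


def cheap_proxy(x: int) -> float:
    """Hypothetical proxy model: a cheap score estimating whether the predicate holds."""
    return 1.0 if x % 5 == 0 else 0.0


def run_query(
    items: Iterable[int],
    proxy: Callable[[int], float],
    udf: Callable[[int], bool],
    threshold: float,
) -> List[int]:
    """Apply the cheap proxy first, then run the expensive UDF only on survivors."""
    survivors = [x for x in items if proxy(x) >= threshold]
    return [x for x in survivors if udf(x)]


# Assumed toy workload: 100 inputs, threshold 0.5.
results = run_query(range(100), cheap_proxy, expensive_udf, threshold=0.5)
print(len(results), "matches using", expensive_udf.calls, "UDF calls instead of 100")
```

In this toy setup the proxy never rejects an input that the UDF would accept, so the result is exact; a real proxy model is approximate, and the threshold trades filtering rate against missed results.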


