Monotonic Cardinality Estimation of Similarity Selection: A Deep Learning Approach

by Yaoshu Wang, et al.

Owing to their ability to capture underlying data distributions, deep learning techniques have recently been applied to a range of traditional database problems. In this paper, we investigate the use of deep learning for cardinality estimation of similarity selection. Solving this problem accurately and efficiently is essential to many data management applications, especially query optimization. Moreover, some applications require the estimated cardinality to be consistent and interpretable, so an estimate that is monotonic w.r.t. the query threshold is preferred. We propose a novel and generic method that can be applied to any data type and distance function. Our method consists of a feature extraction model and a regression model. The feature extraction model transforms the original data and threshold to a Hamming space, in which a deep learning-based regression model exploits the incremental property of cardinality w.r.t. the threshold to achieve both accuracy and monotonicity. We develop a training strategy tailored to our model as well as techniques for fast estimation. We also discuss how to handle updates. We demonstrate the accuracy and efficiency of our method through experiments, and show how it improves the performance of a query optimizer.
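The incremental property mentioned in the abstract can be illustrated with a minimal sketch. It is not the paper's model: it assumes the data has already been mapped to bit vectors in a Hamming space, and it uses exact counting as a stand-in for the learned regression model. The names `hamming`, `monotonic_estimate`, and `exact_increment` are hypothetical. The idea shown is only that summing non-negative per-distance increments up to the threshold yields an estimate that cannot decrease as the threshold grows.

```python
from typing import Callable, Sequence


def hamming(a: Sequence[int], b: Sequence[int]) -> int:
    """Number of positions where two equal-length bit vectors differ."""
    return sum(x != y for x, y in zip(a, b))


def monotonic_estimate(
    query: Sequence[int],
    tau: int,
    increment_fn: Callable[[Sequence[int], int], float],
) -> float:
    """Sum per-distance increments for distances 0..tau.

    Each increment is clamped at zero, so the running sum is
    non-decreasing in tau whatever increment_fn returns.
    """
    return sum(max(0.0, increment_fn(query, d)) for d in range(tau + 1))


def exact_increment(data):
    """Stand-in for a learned regressor (assumption): counts the records
    at Hamming distance exactly d from the query."""
    def fn(query, d):
        return float(sum(1 for x in data if hamming(x, query) == d))
    return fn


# Toy data set of 4-bit vectors; distances to the query are 0, 1, 2, 4.
data = [(0, 0, 0, 0), (0, 0, 0, 1), (0, 0, 1, 1), (1, 1, 1, 1)]
query = (0, 0, 0, 0)

estimates = [
    monotonic_estimate(query, tau, exact_increment(data)) for tau in range(5)
]
print(estimates)  # [1.0, 2.0, 3.0, 3.0, 4.0]
```

In a learned setting, `increment_fn` would be replaced by a model's per-distance output. The clamp is one simple way to keep increments non-negative, and it is chosen here for illustration only.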




Learned Cardinalities: Estimating Correlated Joins with Deep Learning

We describe a new deep learning approach to cardinality estimation. MSCN...

Consistent and Flexible Selectivity Estimation for High-dimensional Data

Selectivity estimation aims at estimating the number of database objects...

An End-to-End Learning-based Cost Estimator

Cost and cardinality estimation is vital to query optimizer, which can g...

An Empirical Analysis of Deep Learning for Cardinality Estimation

We implement and evaluate deep learning for cardinality estimation by st...

A General Cardinality Estimation Framework for Subgraph Matching in Property Graphs

Many techniques have been developed for the cardinality estimation probl...

Multi-Attribute Selectivity Estimation Using Deep Learning

Selectivity estimation - the problem of estimating the result size of qu...
