Discovering Domain Orders through Order Dependencies

05/28/2020
by   Reza Karegar, et al.
University of Waterloo
York University
AT&T
Ryerson University

Much real-world data comes with explicitly defined domain orders; e.g., lexicographic order for strings, numeric order for integers, and chronological order for time. Our goal is to discover implicit domain orders that we do not already know; for instance, that the order of months in the lunar calendar is Corner < Apricot < Peach, and so on. To do so, we enhance data profiling methods by discovering implicit domain orders in data through order dependencies (ODs). We first identify tractable special cases and then proceed towards the most general case, which we prove is NP-complete. Nevertheless, we show that the general case can be effectively handled by a SAT solver. We also propose an interestingness measure to rank the discovered implicit domain orders. Finally, we report on the results of an experimental evaluation using real-world datasets.
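To make the idea of an implicit domain order concrete, the following is a minimal sketch of one very simple case, not the algorithm from the paper. It assumes a column with a known order (a hypothetical `day_of_year`) and a column with no known order (a hypothetical `month`), and checks whether each unordered value occupies its own non-overlapping range of the ordered column. If so, sorting those ranges yields a candidate order. The function name `candidate_order` and the toy rows are invented for illustration.

```python
def candidate_order(rows, known, unknown):
    """Return the values of `unknown` in an order consistent with `known`,
    or None if the data does not imply a single such order.

    Simplifying assumption: an order is implied only when every value of
    `unknown` covers a range of `known` values that overlaps no other range.
    """
    spans = {}  # unknown value -> (smallest known value, largest known value)
    for row in rows:
        k, u = row[known], row[unknown]
        lo, hi = spans.get(u, (k, k))
        spans[u] = (min(lo, k), max(hi, k))

    # Sort the unknown values by the range they cover in the known column.
    ordered = sorted(spans, key=lambda u: spans[u])

    # Adjacent ranges must be strictly separated; otherwise give up.
    for prev, nxt in zip(ordered, ordered[1:]):
        if spans[prev][1] >= spans[nxt][0]:
            return None
    return ordered


# Toy rows, made up for this example only.
rows = [
    {"day_of_year": 45, "month": "Apricot"},
    {"day_of_year": 3, "month": "Corner"},
    {"day_of_year": 70, "month": "Peach"},
    {"day_of_year": 20, "month": "Corner"},
    {"day_of_year": 50, "month": "Apricot"},
]

print(candidate_order(rows, "day_of_year", "month"))
# → ['Corner', 'Apricot', 'Peach']
```

When the ranges interleave, this sketch simply reports that no order was found. The abstract indicates that the general problem is NP-complete and is handled with a SAT solver, which this sketch does not attempt.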

