Discovering Domain Orders through Order Dependencies

by Reza Karegar, et al.
University of Waterloo
York University
Ryerson University

Much real-world data comes with explicitly defined domain orders, e.g., lexicographic order for strings, numeric order for integers, and chronological order for time. Our goal is to discover implicit domain orders that are not already known; for instance, that the months of the Lunar calendar are ordered Corner < Apricot < Peach, and so on. To do so, we enhance data profiling methods by discovering implicit domain orders in data through order dependencies (ODs). We first identify tractable special cases and then proceed to the most general case, which we prove is NP-complete. Nevertheless, we show that the general case can be handled effectively by a SAT solver. We also propose an interestingness measure to rank the discovered implicit domain orders. Finally, we report the results of an experimental evaluation on real-world datasets.
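To make the idea of an implicit domain order concrete, here is a minimal sketch of one simple situation: an attribute with a known order (a hypothetical numeric month index) determines an attribute with no known order (the month name). This is an illustration only, not the paper's algorithm; the function name `derive_implicit_order` and the sample rows are assumptions made for the example.

```python
from typing import Hashable, List, Optional, Sequence, Tuple


def derive_implicit_order(
    rows: Sequence[Tuple], known_idx: int, target_idx: int
) -> Optional[List[Hashable]]:
    """Derive an order over the target attribute from an already ordered attribute.

    Illustrative assumption: a candidate order exists when (a) each value of the
    known attribute maps to exactly one target value, and (b) after sorting by
    the known attribute, every target value forms a single contiguous run.
    Returns the target values in the induced order, or None otherwise.
    """
    determined = {}  # known value -> target value, to check condition (a)
    order: List[Hashable] = []
    seen = set()
    for row in sorted(rows, key=lambda r: r[known_idx]):
        key, value = row[known_idx], row[target_idx]
        if determined.setdefault(key, value) != value:
            return None  # the same known value maps to two target values
        if order and order[-1] == value:
            continue  # still inside the current run
        if value in seen:
            return None  # the value reappears after another one: no single run
        seen.add(value)
        order.append(value)
    return order


# Hypothetical sample data: (month index, month name).
rows = [(3, "Peach"), (1, "Corner"), (2, "Apricot"), (1, "Corner"), (3, "Peach")]
result = derive_implicit_order(rows, known_idx=0, target_idx=1)

# Interleaved values admit no consistent order under this simple check.
conflict = derive_implicit_order([(1, "a"), (2, "b"), (3, "a")], 0, 1)
```

The sketch covers only this easy case. The general problem described in the abstract, where orders must be reconciled across many partially conflicting observations, is the one shown to be NP-complete and handled with a SAT solver.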


