Interactive Set Discovery

11/18/2021
by Arif Hasnat, et al.

We study the problem of set discovery: given a few example tuples of a desired set, find that set in a collection of sets. A challenge is that the example tuples may not uniquely identify a set, so a large number of candidate sets may be returned. Our focus is on an interactive approach to set discovery, in which additional example tuples from the candidate sets are shown and the user either accepts or rejects them as members of the target set. The goal is to find the target set with the fewest user interactions. The problem can be cast as an optimization problem in which we seek a decision tree that guides the search to the target set with the fewest questions for the user to answer. We propose a general algorithm that is capable of reaching an optimal solution, and two variations of it that strike a balance between solution quality and running time. We also propose a novel pruning strategy that safely reduces the search space without introducing false negatives. We evaluate the efficiency and effectiveness of our algorithms through an extensive experimental study on both real and synthetic datasets, comparing them to previous approaches in the literature. We show that our pruning strategy reduces the running time of the search algorithms by 2-5 orders of magnitude.
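
To make the interaction loop concrete, the following is a minimal sketch of the setting, not the algorithms proposed in the paper. It assumes a simple greedy heuristic that asks about the element whose membership splits the remaining candidate sets most evenly; the names `pick_question`, `discover`, and `oracle`, and the toy collection, are hypothetical and introduced only for illustration.

```python
from typing import Callable, Dict, Hashable, List, Optional, Set, Tuple


def pick_question(candidates: Dict[str, Set[Hashable]]) -> Optional[Hashable]:
    """Pick the element whose membership splits the candidates most evenly.

    This is a greedy halving heuristic (an assumption of this sketch),
    not an optimal decision-tree construction.
    """
    universe = set().union(*candidates.values())
    n = len(candidates)
    best, best_score = None, None
    for element in sorted(universe, key=repr):  # sorted for determinism
        k = sum(1 for s in candidates.values() if element in s)
        if k == 0 or k == n:
            continue  # answer would not eliminate any candidate
        score = abs(2 * k - n)  # 0 means a perfect half/half split
        if best_score is None or score < best_score:
            best, best_score = element, score
    return best


def discover(
    collection: Dict[str, Set[Hashable]],
    examples: Set[Hashable],
    oracle: Callable[[Hashable], bool],
) -> Tuple[List[str], int]:
    """Narrow the candidate sets by asking membership questions.

    `oracle(element)` stands in for the user accepting or rejecting an
    element as a member of the target set.
    """
    # Candidates are the sets that contain every example tuple.
    candidates = {name: s for name, s in collection.items() if examples <= s}
    questions = 0
    while len(candidates) > 1:
        element = pick_question(candidates)
        if element is None:
            break  # remaining candidates are indistinguishable
        answer = oracle(element)
        questions += 1
        candidates = {
            name: s for name, s in candidates.items() if (element in s) == answer
        }
    return sorted(candidates), questions


# Toy collection (hypothetical data): the example {1} matches all four sets.
collection = {"A": {1, 2, 3}, "B": {1, 2, 4}, "C": {1, 5}, "D": {1, 2, 3, 6}}
found, asked = discover(collection, {1}, lambda e: e in collection["D"])
print(found, asked)
```

On this toy input the single example matches every set, and two questions suffice to isolate the target. A greedy choice like this one is only a baseline; the abstract's point is that the question order can instead be chosen by searching for a decision tree that minimizes the number of questions.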

