Learning what matters - Sampling interesting patterns

02/07/2017
by Vladimir Dzyuba, et al.

In exploratory data mining, local structure in data can be described by patterns and discovered by mining algorithms. Although many solutions have been proposed to address the redundancy problems in pattern mining, most of them either provide succinct pattern sets or take the interests of the user into account, but not both. Consequently, the analyst has to invest substantial effort in identifying the patterns that are relevant to her specific interests and goals. To address this problem, we propose a novel approach that combines pattern sampling with interactive data mining. In particular, we introduce the LetSIP algorithm, which builds upon recent advances in 1) weighted sampling in SAT and 2) learning to rank in interactive pattern mining. Specifically, it exploits user feedback to directly learn the parameters of the sampling distribution that represents the user's interests. The resulting system allows efficient, interleaved learning and sampling, and thus supports user-specific, anytime data exploration. We compare the performance of the proposed algorithm to the state of the art in interactive pattern mining by emulating the interests of a user. In this comparison, LetSIP demonstrates favourable trade-offs concerning both quality-diversity and exploitation-exploration.
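The interleaving of sampling and learning described in the abstract can be illustrated with a toy loop. This is only a minimal sketch under strong assumptions and is not the LetSIP implementation: it samples from a small, explicitly enumerated pattern set instead of using weighted sampling in SAT, and it uses a simple perceptron-style pairwise update as a stand-in for the learning-to-rank component. The pattern names, the two-dimensional feature vectors, and the functions `score`, `sample_patterns`, and `update_weights` are all hypothetical.

```python
import math
import random


def score(weights, features):
    """Linear ranking score of one pattern (dot product)."""
    return sum(w * f for w, f in zip(weights, features))


def sample_patterns(patterns, weights, k, rng):
    """Draw k pattern names (with replacement), each with probability
    proportional to exp(score). Assumption: the pattern set is small
    enough to enumerate, which real pattern spaces are not."""
    qualities = [math.exp(score(weights, feats)) for feats in patterns.values()]
    return rng.choices(list(patterns), weights=qualities, k=k)


def update_weights(weights, preferred, other, lr=1.0):
    """Perceptron-style pairwise update (a stand-in for learning to rank):
    if the current weights do not rank `preferred` strictly above `other`,
    move the weights toward the preferred pattern's features."""
    if score(weights, preferred) <= score(weights, other):
        return [w + lr * (p - o) for w, p, o in zip(weights, preferred, other)]
    return list(weights)


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical patterns with made-up features (e.g. frequency, length).
    patterns = {
        "p1": [0.9, 0.2],
        "p2": [0.4, 0.8],
        "p3": [0.1, 0.5],
    }
    hidden_user = [1.0, -0.5]   # emulated user interest, unknown to the learner
    weights = [0.0, 0.0]        # start from a uniform sampling distribution

    for _ in range(20):
        a, b = sample_patterns(patterns, weights, 2, rng)
        if a == b:
            continue
        # Emulated feedback: the user prefers the pattern with the higher hidden score.
        if score(hidden_user, patterns[a]) < score(hidden_user, patterns[b]):
            a, b = b, a
        weights = update_weights(weights, patterns[a], patterns[b])

    print("learned weights:", weights)
```

Each round draws patterns from the current distribution, collects a preference from the emulated user, and immediately updates the parameters that define the next round's distribution, which is the sense in which learning and sampling are interleaved.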


