A Bayesian Network Model for Interesting Itemsets

10/14/2015
by Jaroslav Fowkes et al.
Mining itemsets that are the most interesting under a statistical model of the underlying data is a commonly used and well-studied technique for exploratory data analysis, with the most recent interestingness models exhibiting state-of-the-art performance. Continuing this highly promising line of work, we propose what is, to the best of our knowledge, the first generative model over itemsets, in the form of a Bayesian network, together with an associated novel measure of interestingness. Our model efficiently infers interesting itemsets directly from the transaction database using structural EM, in which the E-step employs the greedy approximation to weighted set cover. Our approach is theoretically simple, straightforward to implement, and trivially parallelizable, and it retrieves itemsets whose quality is comparable to, if not better than, that of existing state-of-the-art algorithms, as we demonstrate on several real-world datasets.
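
To make the E-step idea concrete, here is a minimal sketch of the standard greedy approximation to weighted set cover, applied to a single transaction. It is not the authors' implementation: the function name, the example data, and the cost values are hypothetical, and treating each cost as something like a negative log-probability of the itemset is an assumption made only for illustration.

```python
def greedy_weighted_set_cover(transaction, candidate_costs):
    """Greedily cover `transaction` with candidate itemsets.

    `candidate_costs` maps frozenset itemsets to positive costs
    (hypothetical values; assumed here to play the role of weights).
    Returns the chosen itemsets in the order they were picked.
    """
    # Only itemsets fully contained in the transaction may be used.
    usable = {s: c for s, c in candidate_costs.items() if s <= transaction}
    uncovered = set(transaction)
    chosen = []
    while uncovered:
        best, best_ratio = None, float("inf")
        for itemset, cost in usable.items():
            gain = len(uncovered & itemset)
            if gain == 0:
                continue  # covers nothing new
            ratio = cost / gain  # cost per newly covered item
            if ratio < best_ratio:
                best, best_ratio = itemset, ratio
        if best is None:
            raise ValueError("candidates cannot cover the transaction")
        chosen.append(best)
        uncovered -= best
    return chosen


# Hypothetical toy data, for illustration only.
transaction = frozenset({"bread", "milk", "eggs", "beer"})
candidate_costs = {
    frozenset({"bread", "milk"}): 0.8,
    frozenset({"eggs"}): 0.5,
    frozenset({"beer"}): 0.7,
    frozenset({"bread"}): 0.9,
    frozenset({"milk", "eggs"}): 2.0,
    frozenset({"milk", "diapers"}): 0.1,  # not a subset, so never used
}
cover = greedy_weighted_set_cover(transaction, candidate_costs)
```

Because each transaction is covered independently in this sketch, the loop over transactions could be distributed across workers, which is one plausible reading of the "trivially parallelizable" claim above.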


research
02/16/2016

A Subsequence Interleaving Model for Sequential Pattern Mining

Recent sequential pattern mining methods have used the minimum descripti...
research
12/24/2018

Bayesian Causal Inference

We address the problem of two-variable causal inference. This task is to...
research
02/02/2021

Mining Feature Relationships in Data

When faced with a new dataset, most practitioners begin by performing ex...
research
05/08/2018

Mining Top-k Sequential Patterns in Database Graphs: A New Challenging Problem and a Sampling-based Approach

In many real world networks, a vertex is usually associated with a trans...
research
03/15/2022

MoReL: Multi-omics Relational Learning

Multi-omics data analysis has the potential to discover hidden molecular...
research
12/12/2012

Finding Optimal Bayesian Networks

In this paper, we derive optimality results for greedy Bayesian-network ...
