Hybrid ASP-based Approach to Pattern Mining

08/22/2018
by Sergey Paramonov, et al.

Detecting small sets of relevant patterns from a given dataset is a central challenge in data mining. The relevance of a pattern is based on user-provided criteria; typically, all patterns that satisfy certain criteria are considered relevant. Rule-based languages like Answer Set Programming (ASP) seem well-suited for specifying such criteria in the form of constraints. Although progress has been made both on solving individual mining problems and on developing generic mining systems, existing methods focus either on scalability or on generality. In this paper we take steps towards combining local constraints (frequency, size, cost) and global constraints (condensed representations such as maximal, closed, and skyline patterns) in a generic and efficient way. We present a hybrid approach for itemset, sequence, and graph mining that uses dedicated, highly optimized mining systems to detect frequent patterns and then filters the results using declarative ASP. To further demonstrate the generic nature of our hybrid framework, we apply it to the problem of approximately tiling a database. Experiments on real-world datasets show the effectiveness of the proposed method and computational gains for itemset, sequence, and graph mining, as well as for approximate tiling. Under consideration in Theory and Practice of Logic Programming (TPLP).
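
The two-step idea described above (first enumerate frequent patterns, then filter them with global constraints) can be illustrated for itemsets with a minimal Python sketch. This is not the authors' implementation: the brute-force enumeration only stands in for a dedicated mining system, the plain Python filters only stand in for the declarative ASP step, and the function names `frequent_itemsets`, `filter_closed`, and `filter_maximal` are hypothetical.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Step 1 (stand-in for a dedicated miner): enumerate frequent itemsets.

    Returns a dict mapping each frequent itemset (frozenset) to its support,
    i.e. the number of transactions that contain it.
    """
    items = sorted(set().union(*transactions))
    result = {}
    for size in range(1, len(items) + 1):
        found_any = False
        for candidate in combinations(items, size):
            itemset = frozenset(candidate)
            support = sum(1 for t in transactions if itemset <= t)
            if support >= min_support:
                result[itemset] = support
                found_any = True
        if not found_any:
            # Anti-monotonicity: if no itemset of this size is frequent,
            # no larger itemset can be frequent either.
            break
    return result


def filter_closed(patterns):
    """Step 2 (stand-in for the ASP filter): keep closed patterns only.

    A frequent pattern is closed if no proper superset has the same support.
    """
    return {
        p: s
        for p, s in patterns.items()
        if not any(p < q and s == patterns[q] for q in patterns)
    }


def filter_maximal(patterns):
    """Step 2 variant: keep patterns that have no frequent proper superset."""
    return {
        p: s
        for p, s in patterns.items()
        if not any(p < q for q in patterns)
    }


# Toy dataset, invented for illustration only.
transactions = [
    frozenset("abc"),
    frozenset("ab"),
    frozenset("ac"),
    frozenset("a"),
]
frequent = frequent_itemsets(transactions, min_support=2)
closed = filter_closed(frequent)
maximal = filter_maximal(frequent)
print(sorted("".join(sorted(p)) for p in closed))
print(sorted("".join(sorted(p)) for p in maximal))
```

The point of the split is that the local constraint (frequency) is handled once by the enumeration step, while each global constraint is a separate filter over the same candidate set, so changing the condensed representation only means swapping the filter.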


