Discovering Closed and Maximal Embedded Patterns from Large Tree Data

12/26/2020
by Xiaoying Wu, et al.

We address the problem of summarizing embedded tree patterns extracted from large data trees by defining and mining closed and maximal embedded unordered tree patterns from a single large data tree. We first design an embedded frequent pattern mining algorithm extended with a local closedness checking technique; this algorithm is called closedEmbTM-eager because it eagerly eliminates non-closed patterns. To reduce the generation of intermediate patterns, we devise pruning rules that proactively detect and prune branches of the pattern search space that do not correspond to closed patterns. Incorporating these rules into the extended embedded pattern miner yields a new algorithm, closedEmbTM-prune, for mining all the closed and maximal embedded frequent patterns from large data trees. Our extensive experiments on synthetic and real large-tree datasets demonstrate that, on dense datasets, closedEmbTM-prune not only generates a complete closed and maximal pattern set that is substantially smaller than the set generated by the embedded pattern miner, but also runs much faster, with negligible pattern-pruning overhead.
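
To make the notions of closed and maximal patterns concrete, here is a minimal sketch of the standard definitions: a frequent pattern is closed if no proper superpattern has the same support, and maximal if no proper superpattern is frequent. This is not the algorithm from the paper. The function name `closed_and_maximal`, the `is_proper_subpattern` predicate, and the toy data are all hypothetical, and item sets with the subset relation stand in for embedded unordered tree patterns and tree inclusion. The sketch also assumes support is anti-monotone, so a superpattern never has higher support than its subpattern.

```python
from typing import Callable, Dict, Hashable, List, Tuple, TypeVar

P = TypeVar("P", bound=Hashable)


def closed_and_maximal(
    support: Dict[P, int],
    is_proper_subpattern: Callable[[P, P], bool],
) -> Tuple[List[P], List[P]]:
    """Post-filter a set of frequent patterns (hypothetical helper).

    `support` maps every frequent pattern to its support.
    `is_proper_subpattern(p, q)` is True when p is strictly contained in q.
    """
    closed: List[P] = []
    maximal: List[P] = []
    for p, sup in support.items():
        # Frequent proper superpatterns of p.
        supers = [q for q in support if is_proper_subpattern(p, q)]
        # Maximal: no frequent proper superpattern exists.
        if not supers:
            maximal.append(p)
        # Closed: every frequent proper superpattern has strictly lower support.
        if all(support[q] < sup for q in supers):
            closed.append(p)
    return closed, maximal


# Toy stand-in data: item sets instead of tree patterns (an assumption).
frequent = {
    frozenset("a"): 4,
    frozenset("b"): 3,
    frozenset("ab"): 3,
    frozenset("ac"): 2,
}
closed, maximal = closed_and_maximal(frequent, lambda p, q: p < q)
```

In this toy input, `{b}` is not closed because `{a, b}` has the same support, and every maximal pattern is also closed. A quadratic post-filter like this only illustrates the definitions; it says nothing about how closedness is checked or how the search space is pruned during mining.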

research
04/21/2009

FastLMFI: An Efficient Approach for Local Maximal Patterns Propagation and Maximal Patterns Superset Checking

Maximal frequent patterns superset checking plays an important role in t...
research
12/30/2018

Mining Maximal Dynamic Spatial Co-Location Patterns

A spatial co-location pattern represents a subset of spatial features wh...
research
07/24/2023

Detection of Common Subtrees with Identical Label Distribution

Frequent pattern mining is a relevant method to analyse structured data,...
research
12/28/2018

HUOPM: High Utility Occupancy Pattern Mining

Mining useful patterns from varied types of databases is an important re...
research
12/17/2021

cgSpan: Closed Graph-Based Substructure Pattern Mining

gSpan is a popular algorithm for mining frequent subgraphs. cgSpan (clos...
research
06/09/2019

Proposition d'une nouvelle approche d'extraction des motifs fermés fréquents

This work is done as part of a master's thesis project. The increase in ...
research
04/26/2021

Boolean Reasoning-Based Biclustering for Shifting Pattern Extraction

Biclustering is a powerful approach to search for patterns in data, as i...
