Mining Feature Relationships in Data

02/02/2021
by Andrew Lensen, et al.

When faced with a new dataset, most practitioners begin with exploratory data analysis to discover interesting patterns and characteristics within the data. Techniques such as association rule mining are commonly applied to uncover relationships between features (attributes) of the data. However, association rules are primarily designed for binary or categorical data, because they rely on rule-based machine learning. A large proportion of real-world data is continuous in nature, and discretising such data leads to inaccurate and less informative association rules. In this paper, we propose an alternative approach called feature relationship mining (FRM), which uses genetic programming to automatically discover symbolic relationships between continuous or categorical features in data. To the best of our knowledge, our proposed approach is the first symbolic approach with the explicit goal of discovering relationships between features. Empirical testing on a variety of real-world datasets shows that the proposed method finds high-quality, simple feature relationships that are easily interpreted and provide clear, non-trivial insight into data.
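
To make the idea of a symbolic feature relationship concrete, the following is a minimal sketch and not the method from the paper. It swaps genetic programming for a brute-force enumeration of two-feature expressions and scores each candidate by its absolute Pearson correlation with a target feature. The scoring choice, the function names, and the synthetic features (`width`, `height`, `noise`, `area`) are all assumptions made for illustration.

```python
import itertools
import math
import random


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


# Hypothetical operator set; a GP system would compose these into deeper trees.
OPS = {
    "+": lambda a, b: a + b,
    "-": lambda a, b: a - b,
    "*": lambda a, b: a * b,
}


def best_relationship(data, target):
    """Return (score, expression) for the best 'f_i op f_j' match to `target`.

    `data` maps feature names to equal-length lists of floats.
    This exhaustive search is a stand-in for genetic programming (assumption).
    """
    others = [name for name in data if name != target]
    best_score, best_expr = 0.0, None
    for a, b in itertools.combinations(others, 2):
        for symbol, fn in OPS.items():
            predicted = [fn(x, y) for x, y in zip(data[a], data[b])]
            score = abs(pearson(predicted, data[target]))
            if score > best_score:
                best_score, best_expr = score, f"{a} {symbol} {b}"
    return best_score, best_expr


# Synthetic continuous data (illustrative only): area is width * height plus noise.
rng = random.Random(0)
width = [rng.uniform(1, 10) for _ in range(200)]
height = [rng.uniform(1, 10) for _ in range(200)]
data = {
    "width": width,
    "height": height,
    "noise": [rng.uniform(1, 10) for _ in range(200)],
    "area": [w * h + rng.gauss(0, 0.1) for w, h in zip(width, height)],
}

score, expr = best_relationship(data, "area")
print(f"area ~ {expr} (|r| = {score:.3f})")
```

Because the features stay continuous, the search can recover the multiplicative relationship directly, whereas an association rule over discretised bins could only approximate it with coarse interval rules.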

research
04/10/2021

Discovering Categorical Main and Interaction Effects Based on Association Rule Mining

With the growing size of data sets, feature selection becomes increasing...
research
10/21/2020

uARMSolver: A framework for Association Rule Mining

The paper presents a novel software framework for Association Rule Minin...
research
07/14/2021

MARC: Mining Association Rules from datasets by using Clustering models

Association rules are useful to discover relationships, which are mostly...
research
10/14/2015

A Bayesian Network Model for Interesting Itemsets

Mining itemsets that are the most interesting under a statistical model ...
research
10/24/2022

Path association rule mining

Graph association rule mining is a data mining technique used for discov...
research
02/23/2010

Redundancy, Deduction Schemes, and Minimum-Size Bases for Association Rules

Association rules are among the most widely employed data analysis metho...
