Random Intersection Chains

04/10/2021
by Qiuqiang Lin, et al.

Interactions between features sometimes play an important role in prediction tasks, but considering all possible interactions imposes an extremely heavy computational burden. Categorical features complicate matters further, because one-hot encoding makes the input extremely high-dimensional and sparse. Inspired by association rule mining, we propose Random Intersection Chains, a method for selecting interactions of categorical features. It uses random intersections to detect frequent patterns and then selects the most meaningful among them. First, a number of chains are generated, in which each node is the intersection of the previous node and a randomly chosen observation. The frequencies of the patterns in the tail nodes are estimated by maximum likelihood, and the patterns with the largest estimated frequencies are selected. Their confidence is then calculated using Bayes' theorem, and the most confident patterns are returned. We show that if the number and length of the chains are chosen appropriately, the patterns in the tail nodes are indeed the most frequent ones in the data set. We analyze the computational complexity of the proposed algorithm and prove the convergence of the estimators. A series of experiments verifies the efficiency and effectiveness of the algorithm.
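The chain construction described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and several details are assumptions made for illustration: all function names are hypothetical; each observation is represented as a frozenset of (feature, value) pairs; chains are generated separately within each class; and the frequency estimate is a simplified stand-in for the paper's maximum likelihood estimator. It relies on the observation that, when rows are drawn with replacement, a pattern with frequency p survives in the tail of a chain of the given length with probability p raised to that length.

```python
import random
from itertools import combinations

def tail_nodes(rows, n_chains, length, rng):
    """Build chains and return their tail nodes.

    Each chain starts at a randomly chosen row; every later node is the
    previous node intersected with another randomly chosen row
    (rows are sampled with replacement).
    """
    tails = []
    for _ in range(n_chains):
        node = set(rng.choice(rows))
        for _ in range(length - 1):
            node &= rng.choice(rows)
        tails.append(frozenset(node))
    return tails

def estimate_frequency(pattern, tails, length):
    """Assumed estimator: the share of tails containing the pattern
    estimates p ** length, so take the length-th root."""
    share = sum(pattern <= tail for tail in tails) / len(tails)
    return share ** (1.0 / length)

def candidate_patterns(tails, max_order):
    """Collect every sub-pattern of up to max_order items found in a tail."""
    found = set()
    for tail in tails:
        for k in range(1, max_order + 1):
            found.update(frozenset(c) for c in combinations(sorted(tail), k))
    return found

def confidence(pattern, tails_by_class, priors, length, target):
    """Bayes' theorem: P(target | pattern) from per-class frequency estimates."""
    joint = {c: estimate_frequency(pattern, tails_by_class[c], length) * priors[c]
             for c in priors}
    total = sum(joint.values())
    return joint[target] / total if total > 0 else 0.0

# Toy data (made up): every positive row has a=1 and b=1; no negative row has b=1.
rng = random.Random(0)
pos = [frozenset({("a", 1), ("b", 1), ("c", i % 3)}) for i in range(30)]
neg = [frozenset({("a", i % 2), ("b", 0), ("c", i % 3)}) for i in range(30)]
length = 3
tails_by_class = {1: tail_nodes(pos, 200, length, rng),
                  0: tail_nodes(neg, 200, length, rng)}
priors = {1: 0.5, 0: 0.5}

# Rank candidate interactions from the positive-class tails by confidence.
ranked = sorted(candidate_patterns(tails_by_class[1], max_order=2),
                key=lambda s: confidence(s, tails_by_class, priors, length, 1),
                reverse=True)
for pattern in ranked[:3]:
    print(sorted(pattern), confidence(pattern, tails_by_class, priors, length, 1))
```

In this sketch, longer chains shrink the tail nodes, so only patterns shared by many rows survive; the number of chains controls how stable the frequency estimates are. Choosing both well is the question the abstract says the paper addresses.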


