LazyBum: Decision tree learning using lazy propositionalization

09/11/2019
by   Jonas Schouterden, et al.
0

Propositionalization is the process of summarizing relational data into a tabular (attribute-value) format. The resulting table can next be used by any propositional learner. This approach makes it possible to apply a wide variety of learning methods to relational data. However, the transformation from relational to propositional format is generally not lossless: different relational structures may be mapped onto the same feature vector. At the same time, features may be introduced that are not needed for the learning task at hand. In general, it is hard to define a feature space that contains all and only those features that are needed for the learning task. This paper presents LazyBum, a system that can be considered a lazy version of the recently proposed OneBM method for propositionalization. LazyBum interleaves OneBM's feature construction method with a decision tree learner. This learner both uses and guides the propositionalization process. It indicates when and where to look for new features. This approach is similar to what has elsewhere been called dynamic propositionalization. In an experimental comparison with the original OneBM and with two other recently proposed propositionalization methods (nFOIL and MODL, which respectively perform dynamic and static propositionalization), LazyBum achieves a comparable accuracy with a lower execution time on most of the datasets.

READ FULL TEXT

page 1

page 2

page 3

page 4

research
08/18/2017

Induction of Decision Trees based on Generalized Graph Queries

Usually, decision tree induction algorithms are limited to work with non...
research
11/08/2019

An Experimental Comparison of Old and New Decision Tree Algorithms

This paper presents a detailed comparison of a recently proposed algorit...
research
09/17/2021

Decision Tree Learning with Spatial Modal Logics

Symbolic learning represents the most straightforward approach to interp...
research
09/24/2015

CRDT: Correlation Ratio Based Decision Tree Model for Healthcare Data Mining

The phenomenal growth in the healthcare data has inspired us in investig...
research
03/27/2023

Lifting uniform learners via distributional decomposition

We show how any PAC learning algorithm that works under the uniform dist...
research
06/08/2020

Propositionalization and Embeddings: Two Sides of the Same Coin

Data preprocessing is an important component of machine learning pipelin...
research
01/16/2018

Learning Features For Relational Data

Feature engineering is one of the most important but tedious tasks in da...

Please sign up or login with your details

Forgot password? Click here to reset