Learning Features For Relational Data

01/16/2018
by   Hoang Thanh Lam, et al.

Feature engineering is one of the most important but most tedious tasks in data science projects. This work studies the automation of feature learning for relational data. We first prove theoretically that learning relevant features from relational data for a given predictive analytics problem is NP-hard. However, we show empirically that an efficient rule-based approach, which predefines transformations a priori based on heuristics, can extract very useful features from relational data. Indeed, the proposed approach outperformed state-of-the-art solutions by a significant margin. We further introduce a deep neural network that automatically learns appropriate transformations of relational data into a representation that predicts the target variable well, rather than relying on transformations predefined a priori by users. In extensive experiments with Kaggle competitions, the proposed methods could win late medals. To the best of our knowledge, this is the first time an automated system could win medals in Kaggle competitions with complex relational data.
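To make the idea of predefined transformations over relational data concrete, here is a minimal sketch of one common pattern: joining a child table to the main table by key and summarizing a numeric column with a fixed set of aggregations. This is not the authors' rule set, which the abstract does not specify. The toy tables, the column names, the `AGGREGATIONS` dictionary, and the `aggregate_child` helper are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical toy schema: a main table of customers (holding the target)
# and a child table of orders linked to it by customer_id.
customers = [
    {"customer_id": 1, "churned": 0},
    {"customer_id": 2, "churned": 1},
    {"customer_id": 3, "churned": 0},
]
orders = [
    {"customer_id": 1, "amount": 20.0},
    {"customer_id": 1, "amount": 40.0},
    {"customer_id": 2, "amount": 5.0},
]

# Predefined transformations (the "rules"). This particular set is an
# assumption for illustration, not the set used in the paper.
AGGREGATIONS = {"count": len, "sum": sum, "mean": mean, "max": max}


def aggregate_child(main_rows, child_rows, key, column):
    """Group child rows by `key` and summarize `column` for each main row."""
    groups = defaultdict(list)
    for row in child_rows:
        groups[row[key]].append(row[column])

    features = []
    for row in main_rows:
        values = groups.get(row[key], [])
        feats = {key: row[key]}
        for name, fn in AGGREGATIONS.items():
            # count is defined for an empty group; other aggregates are not.
            feats[f"{column}_{name}"] = fn(values) if values or name == "count" else None
        features.append(feats)
    return features


features = aggregate_child(customers, orders, "customer_id", "amount")
```

Each main-table row ends up with one flat feature vector (for example `amount_count`, `amount_mean`), which a standard supervised model could then consume. Rows without any linked child records get a count of zero and `None` for the remaining aggregates, leaving the imputation choice to the downstream model.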
