Schema Independent Relational Learning

08/16/2015
by Jose Picado, et al.

Learning novel concepts and relations from relational databases is an important problem with many applications in database systems and machine learning. Relational learning algorithms learn the definition of a new relation in terms of existing relations in the database. However, the same data set may be represented under different schemas for various reasons, such as efficiency, data quality, and usability. Unfortunately, the output of current relational learning algorithms tends to vary substantially with the choice of schema, in terms of both learning accuracy and efficiency. This variation complicates their off-the-shelf application. In this paper, we introduce and formalize the property of schema independence of relational learning algorithms, and study both the theoretical and empirical dependence of existing algorithms on the common class of (de)composition schema transformations. We study both sample-based learning algorithms, which learn from sets of labeled examples, and query-based algorithms, which learn by asking queries to an oracle. We prove that current relational learning algorithms are generally not schema independent. For query-based learning algorithms, we show that (de)composition transformations influence their query complexity. We propose Castor, a sample-based relational learning algorithm that achieves schema independence by leveraging data dependencies. We support the theoretical results with an empirical study that demonstrates the schema dependence or independence of several algorithms on existing benchmark and real-world datasets under (de)compositions.
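As a rough illustration of what a (de)composition is, the sketch below stores the same toy data under two schemas: one relation, and two projections that share a key. It then checks that joining the projections recovers the original relation, and that one target definition, written once per schema, returns the same answers under both. All relation names, attributes, tuples, and the `senior` definition are hypothetical and not taken from the paper, and the code is not Castor or any other learning algorithm. It only shows the equivalence that a schema-independent learner would be expected to preserve in its output.

```python
# Hypothetical toy data; relation and attribute names are illustrative only.
# Composed schema: student(name, phase, years), with `name` acting as a key.
student = {
    ("ann", "post_quals", 4),
    ("bob", "pre_quals", 1),
    ("cat", "post_quals", 6),
}

def decompose(rel):
    """Split student(name, phase, years) into two projections sharing `name`."""
    phase = {(n, p) for n, p, _ in rel}
    years = {(n, y) for n, _, y in rel}
    return phase, years

def compose(phase, years):
    """Natural join on `name`, rebuilding the single-relation schema."""
    return {(n, p, y) for n, p in phase for m, y in years if n == m}

def senior_composed(rel):
    """senior(x) :- student(x, post_quals, y), y >= 4   (composed schema)."""
    return {n for n, p, y in rel if p == "post_quals" and y >= 4}

def senior_decomposed(phase, years):
    """The same definition rewritten over the decomposed schema."""
    return {
        n
        for n, p in phase
        for m, y in years
        if n == m and p == "post_quals" and y >= 4
    }

student_phase, student_years = decompose(student)

# The decomposition loses no information here because `name` determines
# both `phase` and `years` in this toy instance.
lossless = compose(student_phase, student_years) == student

# Equivalent definitions return the same answers under both schemas.
same_answers = senior_composed(student) == senior_decomposed(
    student_phase, student_years
)

print(lossless, same_answers)
```

The definition over the decomposed schema needs an extra join, so the two versions differ in clause length even though they mean the same thing. Differences of this kind are one plausible reason a learner's behavior could change with the schema.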


