sql4ml: A declarative end-to-end workflow for machine learning

07/29/2019
by Nantia Makrynioti, et al.

We present sql4ml, a system for expressing supervised machine learning (ML) models in SQL and automatically training them in TensorFlow. The primary motivation for this work is the observation that many data science tasks involve a back-and-forth between a relational database that stores the data and a machine learning framework. Data preprocessing and feature engineering typically happen in a database, whereas learning is usually executed in separate ML libraries. This fragmented workflow requires users to juggle different programming paradigms and software systems. With sql4ml, the user can express both feature engineering and ML algorithms in SQL, while the system translates this code to an appropriate representation for training inside a machine learning framework. We describe our translation method, present experimental results from applying it to three well-known ML algorithms, and discuss the usability benefits of concentrating the entire workflow on the database side.
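
To make the idea of expressing an ML model in SQL more concrete, the following is a minimal sketch of a linear regression prediction and mean squared error loss written as SQL views and run through Python's built-in sqlite3 module. The table and view names (features, weights, targets, predictions, loss) and the toy data are assumptions chosen for illustration; they are not taken from sql4ml itself, and the sketch does not show the translation to TensorFlow or any training step.

```python
import sqlite3

# In-memory database standing in for the relational store (illustrative only).
conn = sqlite3.connect(":memory:")

# Hypothetical schema: one row per (example, feature) pair, one weight per feature.
conn.executescript("""
CREATE TABLE features (row_id INTEGER, feature_id INTEGER, value REAL);
CREATE TABLE weights  (feature_id INTEGER, value REAL);
CREATE TABLE targets  (row_id INTEGER, value REAL);
""")

# Toy data, made up for this sketch.
conn.executemany(
    "INSERT INTO features VALUES (?, ?, ?)",
    [(1, 1, 1.0), (1, 2, 2.0), (2, 1, 3.0), (2, 2, 4.0)],
)
conn.executemany("INSERT INTO weights VALUES (?, ?)", [(1, 0.5), (2, 0.25)])
conn.executemany("INSERT INTO targets VALUES (?, ?)", [(1, 1.0), (2, 3.0)])

# The model is stated declaratively: a prediction is the weighted sum of an
# example's features, and the loss is the mean squared prediction error.
conn.executescript("""
CREATE VIEW predictions AS
  SELECT f.row_id AS row_id, SUM(f.value * w.value) AS value
  FROM features f JOIN weights w ON f.feature_id = w.feature_id
  GROUP BY f.row_id;

CREATE VIEW loss AS
  SELECT AVG((p.value - t.value) * (p.value - t.value)) AS value
  FROM predictions p JOIN targets t ON p.row_id = t.row_id;
""")

predictions = conn.execute(
    "SELECT row_id, value FROM predictions ORDER BY row_id"
).fetchall()
loss = conn.execute("SELECT value FROM loss").fetchone()[0]

print(predictions)
print(loss)
```

In this sketch the weights are fixed values; in a workflow like the one the abstract describes, they would be the parameters that the ML framework updates during training.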
