
DiffML: End-to-end Differentiable ML Pipelines

07/04/2022
by Benjamin Hilprecht, et al.

In this paper, we present our vision of differentiable ML pipelines, called DiffML, which automate the construction of ML pipelines in an end-to-end fashion. DiffML jointly trains not just the ML model itself but the entire pipeline, including data preprocessing steps such as data cleaning and feature selection. Our core idea is to formulate all pipeline steps in a differentiable way so that the entire pipeline can be trained using backpropagation. This is a non-trivial problem that opens up many new research questions. To show the feasibility of this direction, we demonstrate initial ideas and a general principle for formulating typical preprocessing steps, such as data cleaning, feature selection, and dataset selection, as differentiable programs that are learned jointly with the ML model. Moreover, we discuss a research roadmap and the core challenges that have to be tackled systematically to enable fully differentiable ML pipelines.
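To make the idea of a differentiable preprocessing step concrete, here is a minimal sketch of feature selection learned jointly with a model. It is an illustration under our own assumptions, not the formulation from the paper: each feature gets a sigmoid gate, the gates multiply the inputs of a linear model, and both gates and weights are updated by gradient descent on a mean squared error plus a sparsity penalty on the gates. The names `make_data` and `train_gated_linear`, the toy dataset, and the hyperparameters `steps`, `lr`, and `lam` are all hypothetical.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def make_data(n=400, seed=0):
    """Toy data (assumption): only feature 0 drives the target."""
    rng = random.Random(seed)
    xs = [[rng.uniform(-1.0, 1.0) for _ in range(3)] for _ in range(n)]
    ys = [2.0 * x[0] for x in xs]
    return xs, ys


def train_gated_linear(xs, ys, steps=500, lr=0.1, lam=0.05):
    """Jointly learn feature gates and linear weights by gradient descent.

    Prediction: sum_j w[j] * sigmoid(a[j]) * x[j]
    Loss:       mean squared error + lam * sum_j sigmoid(a[j])
    """
    n, d = len(xs), len(xs[0])
    w = [0.1] * d   # model weights
    a = [0.0] * d   # gate logits; sigmoid(0) = 0.5 means "undecided"
    losses = []
    for _ in range(steps):
        g = [sigmoid(v) for v in a]
        grad_w = [0.0] * d
        grad_a = [0.0] * d
        mse = 0.0
        for x, y in zip(xs, ys):
            err = sum(w[j] * g[j] * x[j] for j in range(d)) - y
            mse += err * err / n
            for j in range(d):
                # Chain rule through the gated input w[j] * g[j] * x[j].
                grad_w[j] += 2.0 * err * g[j] * x[j] / n
                grad_a[j] += 2.0 * err * w[j] * x[j] * g[j] * (1.0 - g[j]) / n
        for j in range(d):
            # Sparsity penalty pushes every gate toward zero.
            grad_a[j] += lam * g[j] * (1.0 - g[j])
            w[j] -= lr * grad_w[j]
            a[j] -= lr * grad_a[j]
        losses.append(mse)
    return [sigmoid(v) for v in a], w, losses


xs, ys = make_data()
gates, weights, losses = train_gated_linear(xs, ys)
print("gates:", [round(v, 3) for v in gates])
print("first / last MSE:", round(losses[0], 4), round(losses[-1], 4))
```

In this toy setting the gate of the informative feature should stay open while the gates of the two noise features shrink, because only the informative feature receives a data gradient that outweighs the penalty. Since the penalty applies to the gates alone, the model can partly evade it by growing the weights, so a more careful version would likely also regularize the weights or discretize the gates after training.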


06/10/2019

Making Classical Machine Learning Pipelines Differentiable: A Neural Translation Approach

Classical Machine Learning (ML) pipelines often comprise of multiple ML ...
11/17/2016

DSAC - Differentiable RANSAC for Camera Localization

RANSAC is an important algorithm in robust optimization and a central bu...
07/15/2022

Modeling Quality and Machine Learning Pipelines through Extended Feature Models

The recently increased complexity of Machine Learning (ML) methods, led ...
12/15/2020

Amazon SageMaker Autopilot: a white box AutoML solution at scale

AutoML systems provide a black-box solution to machine learning problems...
07/27/2022

Learning with Combinatorial Optimization Layers: a Probabilistic Approach

Combinatorial optimization (CO) layers in machine learning (ML) pipeline...
06/07/2022

SubStrat: A Subset-Based Strategy for Faster AutoML

Automated machine learning (AutoML) frameworks have become important too...
02/28/2023

Towards Personalized Preprocessing Pipeline Search

Feature preprocessing, which transforms raw input features into numerica...