Modeling Quality and Machine Learning Pipelines through Extended Feature Models

07/15/2022
by Giordano d'Aloisio, et al.

The recent increase in the complexity of Machine Learning (ML) methods has made it necessary to lighten both research and industry development processes. ML pipelines have become an essential tool for domain experts, data scientists, and researchers, allowing them to easily combine several ML models to cover the full analytic process starting from raw datasets. Over the years, several solutions have been proposed to automate the building of ML pipelines, most of them focused on semantic aspects and characteristics of the input dataset. However, an approach that takes into account the new quality concerns of ML systems (such as fairness, interpretability, and privacy) is still missing. In this paper, we first identify key quality attributes of ML systems from the literature. We then propose a new engineering approach for quality ML pipelines that extends the Feature Models meta-model. The presented approach makes it possible to model ML pipelines, their quality requirements (on the whole pipeline and on single phases), and the quality characteristics of the algorithms used to implement each pipeline phase. Finally, we demonstrate the expressiveness of our model on the classification problem.
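To make the idea of attaching quality information to a feature model more concrete, the following is a minimal sketch and not the meta-model defined in the paper. It assumes that each pipeline phase is a feature whose children are alternative algorithms, that each algorithm is tagged with the quality attributes it supports, and that a pipeline-level requirement is met when at least one selected algorithm provides it. All class names, function names, algorithm labels, and quality tags here are hypothetical and chosen only for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Algorithm:
    """Leaf feature: one algorithm tagged with the quality attributes it supports."""
    name: str
    qualities: set[str] = field(default_factory=set)


@dataclass
class Phase:
    """Pipeline phase: a feature whose children are alternative algorithms."""
    name: str
    alternatives: list[Algorithm]
    required: set[str] = field(default_factory=set)  # phase-level requirements


@dataclass
class Pipeline:
    """Root feature: ordered phases plus pipeline-level quality requirements."""
    phases: list[Phase]
    required: set[str] = field(default_factory=set)


def is_valid(pipeline: Pipeline, selection: dict[str, str]) -> bool:
    """Check a configuration (phase name -> chosen algorithm name).

    Assumption: a phase-level requirement must be met by the algorithm chosen
    for that phase; a pipeline-level requirement must be met by at least one
    chosen algorithm anywhere in the pipeline.
    """
    provided: set[str] = set()
    for phase in pipeline.phases:
        chosen = next(
            (a for a in phase.alternatives if a.name == selection.get(phase.name)),
            None,
        )
        # Every phase needs exactly one known algorithm that meets its requirements.
        if chosen is None or not phase.required <= chosen.qualities:
            return False
        provided |= chosen.qualities
    return pipeline.required <= provided


# Illustrative classification pipeline; the quality tags are assumptions.
pipeline = Pipeline(
    phases=[
        Phase(
            name="preprocessing",
            alternatives=[
                Algorithm("scaler"),
                Algorithm("reweighing", {"fairness"}),
            ],
        ),
        Phase(
            name="classification",
            alternatives=[
                Algorithm("decision_tree", {"interpretability"}),
                Algorithm("neural_network"),
            ],
            required={"interpretability"},
        ),
    ],
    required={"fairness"},
)
```

Under these assumptions, a configuration that selects `reweighing` and `decision_tree` is accepted, while swapping in `scaler` fails the pipeline-level fairness requirement and swapping in `neural_network` fails the phase-level interpretability requirement.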


