Quantitative Overfitting Management for Human-in-the-loop ML Application Development with ease.ml/meter

06/01/2019
by Frances Ann Hubis, et al.

Simplifying machine learning (ML) application development, including distributed computation, programming interfaces, resource management, model selection, etc., has attracted intense interest recently. These research efforts have significantly improved the efficiency and the degree of automation of developing ML models. In this paper, we take a first step in an orthogonal direction, towards automated quality management for human-in-the-loop ML application development. We build ease.ml/meter, a system that can automatically detect and measure the degree of overfitting during the whole lifecycle of ML application development. ease.ml/meter returns overfitting signals with strong probabilistic guarantees, based on which developers can take appropriate actions. In particular, ease.ml/meter provides principled guidelines for simple yet nontrivial questions regarding the desired validation and test data sizes, which are among the most common questions raised by developers. The fact that ML application development is typically a continuous procedure makes matters worse: the validation and test data sets can lose their statistical power quickly due to multiple accesses, especially in the presence of adaptive analysis. ease.ml/meter addresses these challenges by leveraging a collection of novel techniques and optimizations, resulting in practically tractable data sizes without compromising the probabilistic guarantees. We present the design and implementation details of ease.ml/meter, as well as a detailed theoretical analysis and an empirical evaluation of its effectiveness.
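To make the test-set-size question concrete, the following is a minimal sketch based on the standard Hoeffding inequality with a union bound. It is not the method used by ease.ml/meter, which the abstract says relies on additional techniques and optimizations. The function name `hoeffding_sample_size` and its parameters are hypothetical, and the bound assumes a loss bounded in [0, 1] and models that are fixed before the test set is consulted (that is, no adaptive analysis).

```python
import math


def hoeffding_sample_size(epsilon: float, delta: float, num_evaluations: int = 1) -> int:
    """Return a test set size n sufficient for a Hoeffding-style guarantee.

    Assumptions (illustrative, not taken from the paper):
      - the per-example loss lies in [0, 1];
      - the num_evaluations models are chosen independently of the test set.

    With probability at least 1 - delta, every one of the num_evaluations
    empirical losses is within epsilon of its true value when
        n >= ln(2 * num_evaluations / delta) / (2 * epsilon**2).
    """
    if not (0.0 < epsilon < 1.0):
        raise ValueError("epsilon must be in (0, 1)")
    if not (0.0 < delta < 1.0):
        raise ValueError("delta must be in (0, 1)")
    if num_evaluations < 1:
        raise ValueError("num_evaluations must be at least 1")
    return math.ceil(math.log(2 * num_evaluations / delta) / (2 * epsilon ** 2))


# One evaluation versus ten evaluations on the same test set.
single = hoeffding_sample_size(epsilon=0.05, delta=0.01)
repeated = hoeffding_sample_size(epsilon=0.05, delta=0.01, num_evaluations=10)
print(single, repeated)
```

Even in this simplified non-adaptive setting, the required size grows with the number of times the test set is accessed. When later models depend on earlier test results, this union bound no longer applies directly, which is the situation the abstract describes as especially damaging to statistical power.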


research
09/02/2022

When Bioprocess Engineering Meets Machine Learning: A Survey from the Perspective of Automated Bioprocess Development

Machine learning (ML) has significantly contributed to the development o...
research
08/24/2017

Ease.ml: Towards Multi-tenant Resource Sharing for Machine Learning Workloads

We present ease.ml, a declarative machine learning service platform we b...
research
10/16/2020

On Automatic Feasibility Study for Machine Learning Application Development with ease.ml/snoopy

In our experience working with domain experts who are using today's Auto...
research
04/16/2018

Accelerating Human-in-the-loop Machine Learning: Challenges and Opportunities

Development of machine learning (ML) workflows is a tedious process of i...
research
03/01/2019

Continuous Integration of Machine Learning Models with ease.ml/ci: Towards a Rigorous Yet Practical Treatment

Continuous integration is an indispensable step of modern software engin...
research
09/15/2022

A Continual Development Methodology for Large-scale Multitask Dynamic ML Systems

The traditional Machine Learning (ML) methodology requires to fragment t...
