Challenges and Pitfalls of Machine Learning Evaluation and Benchmarking

04/29/2019
by Cheng Li, et al.

An increasingly complex and diverse collection of Machine Learning (ML) models and hardware/software stacks, collectively referred to as "ML artifacts", is being proposed, producing a diverse ML landscape. These innovations have outpaced researchers' ability to analyze, study, and adapt them. The problem is exacerbated by complicated and sometimes non-reproducible ML evaluation procedures. ML artifacts are commonly shared through repositories where artifact authors post ad-hoc code and some documentation but often omit information that others need to reproduce their results. As a result, users cannot compare their own results against the artifact authors' claims or adapt the model to their own use. This paper discusses common challenges and pitfalls of ML evaluation and benchmarking; the discussion can serve as a guideline for ML model authors when sharing ML artifacts, and for system developers when benchmarking or designing ML systems.
