Model predictivity assessment: incremental test-set selection and accuracy evaluation

07/08/2022
by Elias Fekhari, et al.

Unbiased assessment of the predictivity of models learnt by supervised machine-learning methods requires knowledge of the learned function over a reserved test set (not used by the learning algorithm). The quality of the assessment depends, naturally, on the properties of the test set and on the error statistic used to estimate the prediction error. In this work we tackle both issues, proposing a new predictivity criterion that carefully weights the individual observed errors to obtain a global error estimate, and using incremental experimental design methods to "optimally" select the test points on which the criterion is computed. Several incremental constructions are studied, including greedy-packing (coffee-house design), support points and kernel herding techniques. Our results show that the incremental and weighted versions of the latter two, based on Maximum Mean Discrepancy concepts, yield superior performance. An industrial test case provided by the historical French electricity supplier (EDF) illustrates the practical relevance of the methodology, indicating that it is an efficient alternative to expensive cross-validation techniques.
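
As a rough illustration of the greedy-packing (coffee-house) idea mentioned above, the sketch below selects test points one at a time, each time taking the candidate whose distance to the nearest already-used point is largest. It is a minimal sketch under stated assumptions, not the authors' implementation: the function name `greedy_packing`, the use of a finite candidate pool, the Euclidean distance, and the choice to treat the training points as already occupied are all assumptions made here for illustration.

```python
import math


def greedy_packing(candidates, train_points, n_test):
    """Return indices of n_test candidates chosen by greedy max-min distance.

    Assumption (not from the source): test points are drawn from a finite
    candidate pool, and training points count as already-occupied locations.
    """
    if n_test > len(candidates):
        raise ValueError("n_test cannot exceed the number of candidates")

    # Distance from each candidate to its nearest occupied point so far.
    # With no training points, every candidate starts at infinity.
    min_dist = [
        min((math.dist(c, t) for t in train_points), default=math.inf)
        for c in candidates
    ]

    selected = []
    available = set(range(len(candidates)))
    for _ in range(n_test):
        # Pick the candidate farthest from everything chosen so far
        # (ties are broken by the lowest index).
        best = max(sorted(available), key=lambda i: min_dist[i])
        selected.append(best)
        available.remove(best)
        # The new point is now occupied: refresh nearest-point distances.
        for i in available:
            min_dist[i] = min(min_dist[i], math.dist(candidates[i], candidates[best]))
    return selected


# Hypothetical toy data: one training point and four candidate test points.
train = [(0.0, 0.0)]
candidates = [(0.1, 0.1), (1.0, 1.0), (0.5, 0.5), (1.0, 0.0)]
print(greedy_packing(candidates, train, 3))  # -> [1, 3, 2]
```

Because each point is added without moving the earlier ones, the selection is incremental: the first k indices returned form a valid test set of size k for every k up to `n_test`. The support-points and kernel-herding constructions named in the abstract follow the same incremental pattern but optimize a Maximum Mean Discrepancy criterion instead of a max-min distance; they are not shown here.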
