Sequential algorithmic modification with test data reuse

03/21/2022
by Jean Feng, et al.

After the initial release of a machine learning algorithm, the model can be fine-tuned by retraining on subsequently gathered data, adding newly discovered features, or making other changes. Each modification introduces a risk of deteriorating performance and must be validated on a test dataset. It may not always be practical to assemble a new dataset for testing each modification, especially when most modifications are minor or are implemented in rapid succession. Recent work has shown how one can repeatedly test modifications on the same dataset and protect against overfitting by (i) discretizing test results along a grid and (ii) applying a Bonferroni correction to adjust for the total number of modifications considered by an adaptive developer. However, the standard Bonferroni correction is overly conservative when most modifications are beneficial and/or highly correlated. This work investigates more powerful approaches using alpha-recycling and sequentially rejective graphical procedures (SRGPs). We introduce novel extensions that account for correlation between adaptively chosen algorithmic modifications. In empirical analyses, the SRGPs control the error rate of approving unacceptable modifications and approve a substantially higher number of beneficial modifications than previous approaches.
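To make the contrast between a Bonferroni correction and alpha-recycling concrete, the sketch below compares the two on a fixed, prespecified sequence of modifications. It is a simplified illustration and not the procedure from the paper: it uses the classical fallback procedure with equal weights, ignores adaptivity and correlation between modifications, and the p-values are hypothetical. Each p-value tests the null hypothesis that a modification is unacceptable, so rejecting the null means approving the modification. The function names `bonferroni` and `fallback_with_recycling` are chosen here for illustration only.

```python
def bonferroni(p_values, alpha=0.05):
    """Approve modification i if its p-value is at most alpha / m."""
    m = len(p_values)
    return [p <= alpha / m for p in p_values]


def fallback_with_recycling(p_values, alpha=0.05):
    """Fallback procedure with equal weights (a simple alpha-recycling scheme).

    Hypotheses are tested in a prespecified order. Each starts with
    alpha / m; when a hypothesis is rejected, the level it was tested at
    is passed on to the next one. After a non-rejection, nothing is
    carried forward and the next test falls back to alpha / m.
    """
    m = len(p_values)
    base = alpha / m
    carried = 0.0
    approved = []
    for p in p_values:
        level = base + carried
        reject = p <= level
        approved.append(reject)
        # Recycle the full test level only after an approval.
        carried = level if reject else 0.0
    return approved


# Hypothetical p-values for five successive modifications (assumed data).
p_values = [0.004, 0.015, 0.02, 0.30, 0.011]

print(bonferroni(p_values))               # [True, False, False, False, False]
print(fallback_with_recycling(p_values))  # [True, True, True, False, False]
```

In this example the Bonferroni correction approves only the first modification, because every test is held to 0.05 / 5 = 0.01. The recycling scheme approves the first three, since each approval raises the level available to the next test (0.01, then 0.02, then 0.03), which reflects why recycling tends to help when most modifications are beneficial.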


research
12/28/2019

Approval policies for modifications to Machine Learning-Based Software as a Medical Device: A study of bio-creep

Successful deployment of machine learning algorithms in healthcare requi...
research
02/20/2020

Familywise Error Rate Control by Interactive Unmasking

We propose a method for multiple hypothesis testing with familywise erro...
research
04/01/2011

Towards an automated query modification assistant

Users who need several queries before finding what they need can benefit...
research
03/31/2023

Online Modifications for Event-based Signal Temporal Logic Specifications

In this paper we present a grammar and control synthesis framework for o...
research
12/14/2020

Learning how to approve updates to machine learning algorithms in non-stationary settings

Machine learning algorithms in healthcare have the potential to continua...
research
02/23/2021

Do Transformer Modifications Transfer Across Implementations and Applications?

The research community has proposed copious modifications to the Transfo...
research
07/15/2021

Optimal sports betting strategies in practice: an experimental review

We investigate the most popular approaches to the problem of sports bett...