Approval policies for modifications to Machine Learning-Based Software as a Medical Device: A study of bio-creep

12/28/2019
by Jean Feng, et al.

Successful deployment of machine learning algorithms in healthcare requires careful assessment of their performance and safety. To date, the FDA approves locked algorithms prior to marketing and requires future updates to undergo separate premarket reviews. However, this negates a key feature of machine learning: the ability to learn from a growing dataset and improve over time. This paper frames the design of an approval policy, which we refer to as an automatic algorithmic change protocol (aACP), as an online hypothesis testing problem. Because this process is closely analogous to noninferiority testing of new drugs, we investigate how repeated testing and adoption of modifications might lead to gradual deterioration in prediction accuracy, known as "bio-creep" in the drug development literature. We consider simple policies that one might plausibly adopt but that do not necessarily offer any error-rate guarantees, as well as policies that do provide error-rate control. For the latter, we define two online error rates appropriate for this context: Bad Approval Count (BAC) and Bad Approval and Benchmark Ratios (BABR). We control these rates in the simple setting of a constant population and data source using the policies aACP-BAC and aACP-BABR, which combine alpha-investing, group-sequential, and gate-keeping methods. In simulation studies, bio-creep regularly occurred when using policies with no error-rate guarantees, whereas aACP-BAC and aACP-BABR controlled the rate of bio-creep without substantially impairing our ability to approve beneficial modifications.
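
To make the online-testing framing concrete, the sketch below illustrates one ingredient of such a policy: a Foster–Stine style alpha-investing loop in which each proposed modification is tested against the currently approved model, alpha-wealth is spent on each test, and wealth is partially replenished when a modification is approved. The class name, betting rule, and payout value are illustrative assumptions; the paper's aACP-BAC and aACP-BABR policies additionally combine this idea with group-sequential and gate-keeping methods that are not shown here.

```python
class AlphaInvestingACP:
    """Minimal sketch of an alpha-investing approval loop: each proposed
    modification j is tested against the currently approved model, alpha-
    wealth is spent on the test, and wealth is partially replenished when
    a modification is approved (Foster & Stine, 2008, style rule)."""

    def __init__(self, alpha=0.05, payout=None):
        self.alpha = alpha                    # target online error level
        self.wealth = alpha                   # initial alpha-wealth W(0)
        self.payout = alpha if payout is None else payout  # reward per approval
        self.approved = []                    # indices of approved modifications

    def _bet(self):
        # One simple betting rule: spend half of the remaining wealth,
        # capped so a failed test can never drive the wealth negative.
        return min(0.5 * self.wealth, self.wealth / (1.0 + self.wealth))

    def test_modification(self, j, p_value):
        """Approve modification j if the noninferiority-style null is
        rejected at the invested level alpha_j; update the wealth either way."""
        alpha_j = self._bet()
        if p_value <= alpha_j:
            self.wealth += self.payout        # earn wealth back on approval
            self.approved.append(j)
            return True
        self.wealth -= alpha_j / (1.0 - alpha_j)  # pay for a failed test
        return False


# Example: screen a stream of proposed modifications.
acp = AlphaInvestingACP(alpha=0.05)
for j, p in enumerate([0.30, 0.001, 0.20, 0.0004]):
    print(f"modification {j}: approved={acp.test_modification(j, p)}")
```

Because the wealth grows only when a modification is approved, a long run of poor proposals exhausts the budget and makes further approvals harder, which is what limits the accumulation of bad approvals over time.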

Related research

- Learning how to approve updates to machine learning algorithms in non-stationary settings (12/14/2020): Machine learning algorithms in healthcare have the potential to continua...
- Sequential algorithmic modification with test data reuse (03/21/2022): After initial release of a machine learning algorithm, the model can be ...
- Online multiple hypothesis testing for reproducible research (08/24/2022): Modern data analysis frequently involves large-scale hypothesis testing,...
- Familywise Error Rate Control by Interactive Unmasking (02/20/2020): We propose a method for multiple hypothesis testing with familywise erro...
- Beyond the Two-Trials Rule (07/10/2023): The two-trials rule for drug approval requires "at least two adequate an...
- Robust incorporation of historical information with known type I error rate inflation (11/30/2022): Bayesian clinical trials can benefit from available historical information...
- Learn then Test: Calibrating Predictive Algorithms to Achieve Risk Control (10/03/2021): We introduce Learn then Test, a framework for calibrating machine learni...
