Did the Model Change? Efficiently Assessing Machine Learning API Shifts

07/29/2021
by   Lingjiao Chen, et al.

Machine learning (ML) prediction APIs are increasingly widely used. An ML API can change over time due to model updates or retraining. This presents a key challenge in the usage of the API because it is often not clear to the user if and how the ML model has changed. Model shifts can affect downstream application performance and also create oversight issues (e.g., if consistency is desired). In this paper, we initiate a systematic investigation of ML API shifts. We first quantify the performance shifts from 2020 to 2021 of popular ML APIs from Google, Microsoft, Amazon, and others on a variety of datasets. We identified significant model shifts in 12 out of 36 cases we investigated. Interestingly, we found several datasets where the API's predictions became significantly worse over time. This motivated us to formulate the API shift assessment problem at a more fine-grained level: estimating how the API model's confusion matrix changes over time when the data distribution is constant. Monitoring confusion matrix shifts using standard random sampling can require a large number of samples, which is expensive as each API call costs a fee. We propose a principled adaptive sampling algorithm, MASA, to efficiently estimate confusion matrix shifts. MASA can accurately estimate the confusion matrix shifts in commercial ML APIs using up to 90% fewer samples than random sampling. This work establishes ML API shifts as an important problem to study and provides a cost-effective approach to monitor such shifts.
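To make the task concrete, below is a minimal sketch of confusion-matrix shift estimation with a simple adaptive sampling heuristic. This is not the paper's MASA algorithm: the `query_api_old` / `query_api_new` callables and the labeled pool are hypothetical placeholders, and the allocation rule (spend more budget on classes whose empirical rows disagree most so far) is only an illustrative stand-in for a principled adaptive scheme.

```python
# Illustrative sketch only; NOT the MASA algorithm from the paper.
import random
from collections import defaultdict

def estimate_confusion_shift(pool, query_api_old, query_api_new,
                             classes, budget=200, warmup_per_class=5):
    """Estimate the L1 difference between two confusion matrices.

    pool: dict mapping each true label to a list of inputs with that label.
    query_api_old / query_api_new: callables returning a predicted label
        (stand-ins for two snapshots of an ML API over time).
    """
    counts_old = defaultdict(lambda: defaultdict(int))
    counts_new = defaultdict(lambda: defaultdict(int))
    spent = defaultdict(int)  # number of samples drawn per true class

    def sample(label):
        x = random.choice(pool[label])
        counts_old[label][query_api_old(x)] += 1
        counts_new[label][query_api_new(x)] += 1
        spent[label] += 1

    # Warm-up: a few uniform samples per true class.
    for label in classes:
        for _ in range(warmup_per_class):
            sample(label)

    def row_disagreement(label):
        # Current estimate of how much this class's row of the confusion
        # matrix has shifted between the two API snapshots.
        n = spent[label]
        return sum(abs(counts_old[label][c] / n - counts_new[label][c] / n)
                   for c in classes)

    # Adaptive phase: direct the remaining budget at the classes that
    # currently look most shifted (a simple heuristic allocation rule).
    remaining = budget - warmup_per_class * len(classes)
    for _ in range(max(remaining, 0)):
        sample(max(classes, key=row_disagreement))

    # Aggregate the estimated shift over all confusion-matrix cells.
    return sum(abs(counts_old[l][c] / spent[l] - counts_new[l][c] / spent[l])
               for l in classes for c in classes)
```

Compared with uniform random sampling over the whole pool, an allocation rule like this concentrates API calls where the per-class estimates are noisiest or most divergent, which is the intuition behind why adaptive schemes can need far fewer (paid) API calls for the same estimation accuracy.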


