Amazon SageMaker Model Monitor: A System for Real-Time Insights into Deployed Machine Learning Models

11/26/2021
by David Nigenda, et al.

With the increasing adoption of machine learning (ML) models and systems in high-stakes settings across different industries, guaranteeing a model's performance after deployment has become crucial. Monitoring models in production is a critical aspect of ensuring their continued performance and reliability. We present Amazon SageMaker Model Monitor, a fully managed service that continuously monitors the quality of machine learning models hosted on Amazon SageMaker. Our system automatically detects data, concept, bias, and feature attribution drift in models in real time and provides alerts so that model owners can take corrective actions and thereby maintain high-quality models. We describe the key requirements obtained from customers, the system design and architecture, and the methodology for detecting different types of drift. Further, we provide quantitative evaluations followed by use cases, insights, and lessons learned from more than 1.5 years of production deployment.

research
07/13/2020

Monitoring and explainability of models in production

The machine learning lifecycle extends beyond the deployment stage. Moni...
research
09/07/2021

Amazon SageMaker Clarify: Machine Learning Bias Detection and Explainability in the Cloud

Understanding the predictions made by machine learning (ML) models and t...
research
07/29/2021

Concept for a Technical Infrastructure for Management of Predictive Models in Industrial Applications

With the increasing number of created and deployed prediction models and...
research
11/11/2022

A monitoring framework for deployed machine learning models with supply chain examples

Actively monitoring machine learning models during production operations...
research
08/25/2022

Adaptive Learning for Service Monitoring Data

Service monitoring applications continuously produce data to monitor the...
research
02/07/2019

ML Health: Fitness Tracking for Production Models

Deployment of machine learning (ML) algorithms in production for extende...
research
04/28/2021

MLDemon: Deployment Monitoring for Machine Learning Systems

Post-deployment monitoring of the performance of ML systems is critical ...
