Serverless inferencing on Kubernetes

07/14/2020
by Clive Cox, et al.

Organisations are increasingly putting machine learning models into production at scale. The growing popularity of serverless scale-to-zero paradigms presents an opportunity for deploying machine learning models to help mitigate infrastructure costs when many models may not be in continuous use. We discuss the KFServing project, which builds on the Knative serverless paradigm to provide a serverless machine learning inference solution with a consistent and simple interface for data scientists to deploy their models. We show how it solves the challenges of autoscaling GPU-based inference and discuss some of the lessons learnt from using it in production.
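To make that "consistent and simple interface" concrete, here is a minimal sketch of deploying a model as a KFServing InferenceService custom resource using the official Kubernetes Python client. The resource name, namespace, model URI, and GPU request are illustrative assumptions, not taken from the paper; the v1alpha2 schema shown matches the KFServing releases contemporary with this abstract, and setting minReplicas to 0 is what lets Knative scale an idle model down to zero pods.

```python
# Sketch: create a KFServing InferenceService via the Kubernetes Python client.
# All names, the namespace, and the storageUri below are illustrative.
from kubernetes import client, config

config.load_kube_config()  # or load_incluster_config() when running in a pod
api = client.CustomObjectsApi()

inference_service = {
    "apiVersion": "serving.kubeflow.org/v1alpha2",
    "kind": "InferenceService",
    "metadata": {"name": "flowers-demo", "namespace": "models"},
    "spec": {
        "default": {
            "predictor": {
                # minReplicas 0 enables Knative scale-to-zero: the predictor
                # pods are removed when no inference traffic arrives.
                "minReplicas": 0,
                "tensorflow": {
                    "storageUri": "gs://kfserving-samples/models/tensorflow/flowers",
                    # Hypothetical GPU request; the autoscaler then adds and
                    # removes GPU-backed pods with demand.
                    "resources": {"limits": {"nvidia.com/gpu": "1"}},
                },
            }
        }
    },
}

# InferenceService is a CRD, so it is created through the custom-objects API.
api.create_namespaced_custom_object(
    group="serving.kubeflow.org",
    version="v1alpha2",
    namespace="models",
    plural="inferenceservices",
    body=inference_service,
)
```

The same resource could equally be written as a YAML manifest and applied with kubectl; the point of the design is that a data scientist only declares the model framework and its storage location, while KFServing and Knative handle routing, autoscaling, and scale-to-zero.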


