Serving and Optimizing Machine Learning Workflows on Heterogeneous Infrastructures

05/10/2022
by Yongji Wu et al.

With the advent of ubiquitous deployment of smart devices and the Internet of Things, data sources for machine learning inference have increasingly moved to the edge of the network. Existing machine learning inference platforms typically assume a homogeneous infrastructure and do not take into account the more complex and tiered computing infrastructure that includes edge devices, local hubs, edge datacenters, and cloud datacenters. On the other hand, recent AutoML efforts have provided viable solutions for model compression, pruning, and quantization for heterogeneous environments; for a machine learning model, now we may easily find or even generate a series of models with different tradeoffs between accuracy and efficiency. We design and implement JellyBean, a system for serving and optimizing machine learning inference workflows on heterogeneous infrastructures. Given service-level objectives (e.g., throughput, accuracy), JellyBean picks the most cost-efficient models that meet the accuracy target and decides how to deploy them across different tiers of infrastructures. Evaluations show that JellyBean reduces the total serving cost of visual question answering by up to 58%, and of vehicle tracking from the NVIDIA AI City Challenge by up to 36%, compared with prior model selection and worker assignment solutions. JellyBean also outperforms prior ML serving systems (e.g., Spark on the cloud) by up to 5x in serving costs.
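The core selection step the abstract describes, choosing the most cost-efficient model variant that still meets the accuracy target, can be sketched as a simple filter-then-minimize over a catalog of variants. This is an illustrative sketch only; the `ModelVariant` fields, names, and numbers are assumptions for demonstration, not JellyBean's actual API or planner (which also handles throughput SLOs and worker assignment across tiers).

```python
# Hypothetical sketch of SLO-driven model selection: among model variants
# with different accuracy/cost tradeoffs (e.g., produced by compression,
# pruning, or quantization), pick the cheapest one that meets the
# accuracy target. All names and numbers below are illustrative.
from dataclasses import dataclass

@dataclass
class ModelVariant:
    name: str
    accuracy: float        # expected accuracy on the target task
    cost_per_query: float  # serving cost (e.g., $ per 1k queries)

def pick_model(variants, accuracy_slo):
    """Return the most cost-efficient variant meeting the accuracy SLO."""
    feasible = [v for v in variants if v.accuracy >= accuracy_slo]
    if not feasible:
        raise ValueError("no variant meets the accuracy SLO")
    return min(feasible, key=lambda v: v.cost_per_query)

variants = [
    ModelVariant("resnet152", accuracy=0.78, cost_per_query=1.00),
    ModelVariant("resnet50",  accuracy=0.75, cost_per_query=0.40),
    ModelVariant("mobilenet", accuracy=0.70, cost_per_query=0.10),
]
print(pick_model(variants, accuracy_slo=0.74).name)  # -> resnet50
```

In the full system, this per-model choice would be one component of a joint optimization that also decides which infrastructure tier (edge device, local hub, edge datacenter, or cloud) runs each stage of the workflow.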


