RIBBON: Cost-Effective and QoS-Aware Deep Learning Model Inference using a Diverse Pool of Cloud Computing Instances

07/23/2022
by Baolin Li, et al.

Deep learning model inference is a key service in many businesses and scientific discovery processes. This paper introduces RIBBON, a novel deep learning inference serving system that meets two competing objectives: a quality-of-service (QoS) target and cost-effectiveness. The key idea behind RIBBON is to intelligently employ a diverse set of cloud computing instances (heterogeneous instances) to meet the QoS target while maximizing cost savings. RIBBON devises a Bayesian Optimization-driven strategy that helps users build the optimal set of heterogeneous instances for their model inference service needs on cloud computing platforms, and it demonstrates its superiority over existing inference serving systems that use homogeneous instance pools. RIBBON saves up to 16% of the inference service cost for different learning models, including emerging deep learning recommender system models and drug-discovery enabling models.
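
To make the heterogeneous-pool idea concrete, the following is a minimal sketch and not RIBBON's actual method. It assumes a hypothetical catalogue of three instance types with made-up prices and throughputs, treats the QoS target as a simple aggregate-capacity threshold, and uses exhaustive enumeration as a stand-in for the Bayesian Optimization search described in the abstract. All names (`INSTANCE_TYPES`, `cheapest_pool`, and so on) are illustrative.

```python
from itertools import product

# Hypothetical catalogue: name -> (price in cents/hour, queries/second that one
# instance can serve within the latency target). Illustrative numbers only;
# they are not measurements from the paper.
INSTANCE_TYPES = {
    "gpu_large": (300, 100),
    "cpu_medium": (80, 30),
    "cpu_small": (20, 6),
}


def pool_cost(pool):
    """Total hourly cost (cents) of a pool given as {instance name: count}."""
    return sum(count * INSTANCE_TYPES[name][0] for name, count in pool.items())


def pool_capacity(pool):
    """Aggregate queries/second of the pool under the toy additive model."""
    return sum(count * INSTANCE_TYPES[name][1] for name, count in pool.items())


def cheapest_pool(target_qps, max_per_type=5, allowed=None):
    """Enumerate every pool and return the cheapest one that meets the target.

    Exhaustive search replaces Bayesian Optimization here because the toy
    search space is tiny. Returns None if no pool meets the target.
    """
    names = list(allowed or INSTANCE_TYPES)
    best = None
    for counts in product(range(max_per_type + 1), repeat=len(names)):
        pool = dict(zip(names, counts))
        if pool_capacity(pool) < target_qps:
            continue  # toy QoS check: pool cannot sustain the offered load
        if best is None or pool_cost(pool) < pool_cost(best):
            best = pool
    return best


if __name__ == "__main__":
    mixed = cheapest_pool(130)
    homogeneous = cheapest_pool(130, allowed=["cpu_medium"])
    print("heterogeneous:", mixed, pool_cost(mixed))
    print("homogeneous:  ", homogeneous, pool_cost(homogeneous))
```

Under these made-up numbers, the mixed pool (four `cpu_medium` plus two `cpu_small`) costs 360 cents per hour, while the cheapest single-type pool that meets the same target costs 400. A real system would replace the capacity threshold with measured tail latency, which is expensive to evaluate and is the reason a sample-efficient search such as Bayesian Optimization is attractive.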

research
10/12/2022

Building Heterogeneous Cloud System for Machine Learning Inference

Online inference is becoming a key service product for many businesses, ...
research
05/10/2022

Serving and Optimizing Machine Learning Workflows on Heterogeneous Infrastructures

With the advent of ubiquitous deployment of smart devices and the Intern...
research
12/14/2021

MCDS: AI Augmented Workflow Scheduling in Mobile Edge Cloud Computing Systems

Workflow scheduling is a long-studied problem in parallel and distribute...
research
02/24/2023

Uncertainty-Aware Workload Prediction in Cloud Computing

Predicting future resource demand in Cloud Computing is essential for ma...
research
12/11/2020

Analyzing the Performance of Smart Industry 4.0 Applications on Cloud Computing Systems

Cloud-based Deep Neural Network (DNN) applications that make latency-sen...
research
04/14/2022

RobustScaler: QoS-Aware Autoscaling for Complex Workloads

Autoscaling is a critical component for efficient resource utilization w...
research
05/11/2019

Improving Robustness of Heterogeneous Serverless Computing Systems Via Probabilistic Task Pruning

Cloud-based serverless computing is an increasingly popular computing pa...